Automatically fix code vulnerabilities with AI
SnykSec
Posted on October 15, 2024
In Snyk’s library of SecRel customer workshops, we have one called Breaking AI. This workshop covers how generative AI assistants, like Copilot and Codium, can help developers write code faster. The big punchline of the workshop is that AI assistants are like junior developers fresh out of their coding boot camps: super eager and helpful, but you really want to check their code (no shade to junior developers; we need them, and they are awesome!). In the workshop, we show how combining AI assistants and Snyk gives you a superpower: writing code quickly and securely.
To quote a famous chef, Snyk has now “kicked it up a notch” with its Deep Code AI Fix capability. Snyk Code has always given good remediation advice when it finds a security vulnerability. It would even give you three good references to other open-source projects that had the same vulnerability and show you how those projects fixed it. Thanks to carefully curated data from Snyk’s security research team and a hybrid AI model that combines generative AI, symbolic AI, and machine learning, Snyk can now auto-fix many common security vulnerabilities right from the comfort of your favorite IDE.
In this blog, I’ll demonstrate how you can best take advantage of Deep Code AI Fix (DCAIF) in a Java example project we use in our workshops. The source code can be found here. NOTE: This project is purposely vulnerable in several ways and is not suitable for use in any production environment.
Fun with a conference scheduling app
If you’ve ever been to a tech conference or spoken at one, you know that one of the most challenging aspects is to create and present the schedule.
The reference app is a terrible version of this (I am NOT a good front-end developer!), but it is useful for showing how code can technically work and be insecure at the same time.
It’s a Spring Boot app using the Thymeleaf template engine. At startup, it generates random speakers, talk titles, and talk description data using the popular faker library. The talk titles and descriptions all come from the text of the book The Hitchhiker’s Guide to the Galaxy. NOTE: The titles and descriptions may not all be 100% safe for work. You’ve been warned!
Each generated speaker for the event also has a dedicated page where you can see a list of their talks at the conference. We'll focus on this part of the app.
To compile and run the app, execute the following:
mvn clean install
mvn spring-boot:run
When you run the app, in addition to the usual Spring Boot banner and other output, you’ll see something like this:
Access talks for heath.davis at: http://localhost:8081/talks?username=heath.davis
Access talks for russell.bernier at: http://localhost:8081/talks?username=russell.bernier
Access talks for kenyetta.jones at: http://localhost:8081/talks?username=kenyetta.jones
Access talks for howard.bailey at: http://localhost:8081/talks?username=howard.bailey
Access talks for buddy.jast at: http://localhost:8081/talks?username=buddy.jast
Access talks for jeanice.kertzmann at: http://localhost:8081/talks?username=jeanice.kertzmann
Access talks for deborah.hamill at: http://localhost:8081/talks?username=deborah.hamill
Access talks for horacio.renner at: http://localhost:8081/talks?username=horacio.renner
Access talks for winfred.schuster at: http://localhost:8081/talks?username=winfred.schuster
Access talks for tommie.hane at: http://localhost:8081/talks?username=tommie.hane
Access talks for micah at: http://localhost:8081/talks?username=micah
NOTE: It won’t be exactly like this, as the speakers and their talks are randomly generated each time you start the app.
If you click on or copy/paste one of the links, you’ll see a speaker page with a list of their talks:
Let’s play: find the vulnerability
Take a look at TalkController.java. This is how a speaker’s page is rendered. Can you spot the vulnerability?
We don’t want you to have to rely on your eyeballs! We want developers to move fast AND write secure code.
I use IntelliJ IDEA to write my Java applications, along with Snyk’s IDE extension. The good news is that there are Snyk IDE extensions for all the popular IDEs, supporting a wide array of programming languages.
If you want to follow along and don’t have a Snyk account, you can create one for free here.
Did I mention that this app has a lot of vulnerabilities in it? That’s because we use it in our workshops. The one I want to focus on is the one in TalkController.java:
1. Shows that a Cross-site Scripting (XSS) vulnerability was found.
2. Shows a diff from one of three open-source projects with the same vulnerability. This shows how it was fixed (the green part).
3. The red squiggly line indicates where the vulnerability is in the code.
This is all super helpful information for resolving the issue. However, notice a little lightning bolt icon (⚡️) to the left of the vulnerability on #1. That indicates that this vulnerability can be fixed automatically.
If you hang your mouse over the red squiggly, you’ll see the option to Fix this issue:
Before we look at the Deep Code AI Fix result, let’s look at what Copilot would do for us.
Here’s the prompt I gave Copilot on line 42 and the code it injects:
// guard against XSS
username = username.replaceAll("<", "&lt;").replaceAll(">", "&gt;");
This may work against naive attacks, but will it work against all types of XSS attacks? I’m not so sure. In fact, when I re-run the Snyk scan, the XSS vulnerability is still there.
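To make the doubt concrete, here is a minimal sketch of one case where a filter that only rewrites angle brackets falls short. It assumes the filter's intent is to turn `<` and `>` into HTML entities, and it uses a hypothetical `renderInAttribute` helper that places the value inside an HTML attribute. That helper is not taken from the workshop project; it only illustrates that the right escaping depends on where the value ends up.

```java
public class NaiveXssFilterDemo {

    // A filter in the spirit of the suggestion above: only angle brackets are rewritten.
    public static String naiveFilter(String input) {
        return input.replaceAll("<", "&lt;").replaceAll(">", "&gt;");
    }

    // Hypothetical template fragment (an assumption for illustration):
    // the value is placed inside a double-quoted HTML attribute.
    public static String renderInAttribute(String value) {
        return "<input type=\"text\" value=\"" + value + "\">";
    }

    public static void main(String[] args) {
        // This payload contains no angle brackets, so the filter leaves it untouched.
        // The leading quote closes the attribute and the rest adds an event handler.
        String payload = "\" onfocus=\"alert(1)\" autofocus=\"";

        String html = renderInAttribute(naiveFilter(payload));
        System.out.println(html);
        // prints: <input type="text" value="" onfocus="alert(1)" autofocus="">
    }
}
```

The double quotes pass straight through the filter, so the attacker-controlled text breaks out of the attribute without ever needing a `<` or `>`.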
Let’s take a look at the Snyk fix:
String usernameStr = "" + HtmlUtils.htmlEscape(username) + "'s talks";
Now, this I like. It uses an idiomatic and expected approach with a utility class that ships with the Spring Framework: HtmlUtils. I’m confident that this will thoroughly escape the input, username, before it reaches the page. A new Snyk scan shows that the vulnerability is no longer there.
Thumbs up, indeed! Not only did the Snyk auto fix resolve the vulnerability, but it did so in a way that matches the framework in use - Spring Boot in this case.
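For readers who want to see what HTML escaping does to a hostile username, here is a small plain-Java sketch. The `escapeHtml` method is a hand-written stand-in, not Spring's implementation; it assumes only the five characters that matter most in HTML text and attribute contexts, and the real `HtmlUtils.htmlEscape` may cover more. The sample username is made up for illustration.

```java
public class HtmlEscapeSketch {

    // Hand-written stand-in for an HTML escaper (NOT Spring's code):
    // replaces the five characters with special meaning in HTML.
    public static String escapeHtml(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#39;");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up hostile value standing in for the username request parameter.
        String username = "<script>alert('xss')</script>";

        String usernameStr = escapeHtml(username) + "'s talks";
        System.out.println(usernameStr);
        // prints: &lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;'s talks
    }
}
```

Because quotes and ampersands are escaped along with the angle brackets, the browser displays the value as text instead of interpreting it as markup. In a real Spring project you would call the library method rather than maintain a function like this yourself.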
DCAIF won’t slow you down
In this post, I showed you how Snyk’s Deep Code AI Fix goes beyond just remediation advice and can make a fix on your behalf. You can also confirm the benefit of the fix by seeing that the original security vulnerability is no longer there.
At Snyk, we love the AI assistants, but they are not very good at security. Our research shows that the Gen AIs out there tend to generate insecure code at about the same rate that humans do (around 40%). This makes sense, as pure Gen AI solutions like Copilot train their models primarily on existing code. Snyk’s hybrid approach, which includes symbolic AI, Gen AI, machine learning, and curated training from our security team, gives the results a security-first approach.