Case Study

Patchstack transforms vulnerability detection with Gemini, finding novel threats in open-source code


Discover how the team at vulnerability intelligence startup Patchstack uses Gemini’s large context window to find security threats other tools miss.

When entrepreneur (and retired ethical hacker) Oliver Sild went on a cybersecurity subreddit in 2016, he had no idea he’d meet developer and formerly aspiring chef Dave Jong, or that their shared passion for open-source vulnerability intelligence would inspire them to co-found Patchstack in 2017.

The Patchstack team now consists of 35 fully remote employees across 17 countries. With Oliver at the helm as CEO, the team has already provided more than 12,000 vulnerability-specific mitigation rules.

As the Patchstack team knows first-hand, the collaborative and transparent nature of open-source code has always made it more susceptible to security risks. But as the internet evolves, so do the volume and complexity of threats.

The challenge: An exponential rise in vulnerable code

With AI code generation tools picking up steam, open-source code can now be created much faster than before. Take WordPress plugins, for instance. Submissions to the WordPress repository have already increased by 87% in 2025, magnifying the inherent weaknesses of open-source code.

As Oliver explains, “This problem is now exacerbated by developers who utilize AI to generate code but often lack the necessary security expertise to validate it.” The skyrocketing volume of code, coupled with increased vibe coding, has led to more security threats. In 2024 alone, 7,966 new WordPress vulnerabilities were discovered and 96% of them were in plugins.

The Patchstack team realized traditional static analysis tools that assess codebases file by file couldn’t keep up for two main reasons: scale and context. That is, these tools analyze code in silos and can incorrectly flag code as dangerous because they can’t detect how other parts of the program safeguard data.

Additionally, these traditional static analysis tools lack a nuanced understanding of the WordPress ecosystem. For instance, they don’t reliably distinguish admin-only functions from public-facing features. This leads to false positives and creates bottlenecks.
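To make the context problem concrete, here is a toy sketch, not Patchstack's tooling or any real scanner. It assumes a hypothetical two-file plugin in which one file echoes request input through a helper and a second file defines that helper using WordPress's `esc_html`. The two regex rules and both function names are illustrative assumptions.

```python
import re

# Hypothetical two-file plugin: the handler echoes input, the helper escapes it.
PLUGIN = {
    "handler.php": "<?php echo clean_name($_GET['name']);",
    "helpers.php": "<?php function clean_name($v) { return esc_html($v); }",
}

# Deliberately crude rules, for illustration only.
SOURCE = re.compile(r"\$_(GET|POST|REQUEST)\b")
SANITIZER = re.compile(r"\b(esc_html|esc_attr|sanitize_text_field)\s*\(")


def scan_file_by_file(files):
    """Flag every file that reads request input with no sanitizer in that same file."""
    return sorted(
        name
        for name, code in files.items()
        if SOURCE.search(code) and not SANITIZER.search(code)
    )


def scan_whole_codebase(files):
    """Same rule, but a sanitizer anywhere in the plugin counts as a safeguard."""
    has_sanitizer = any(SANITIZER.search(code) for code in files.values())
    return [] if has_sanitizer else scan_file_by_file(files)


if __name__ == "__main__":
    print("file by file:", scan_file_by_file(PLUGIN))
    print("whole codebase:", scan_whole_codebase(PLUGIN))
```

The siloed scan flags `handler.php` because it never sees `helpers.php`. The whole-codebase rule here is just as naive in the other direction, since it does not trace whether the sanitizer actually sits on the data path. That gap is the kind of cross-file reasoning the article says the team needed.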

“Our security researchers would be spending hours per plugin on code review, generating massive backlogs and leaving critical vulnerabilities undetected for weeks,” says Oliver. “With vulnerabilities being weaponized in a matter of hours, we needed a better way.”

The solution: Automating vulnerability detection and validation

Patchstack's AI Code Reviewer, built with Gemini, analyzes open-source code to find and report security vulnerabilities.

The Patchstack team set out to build a smarter analysis tool that could analyze entire codebases at once. This tool also needed to digest massive amounts of detailed, proprietary data gathered by ethical hackers over almost five years.

The team started by experimenting with a wide range of models. “We initially tried fine-tuning open-source models including Llama 3.1 8B, CodeBERT, CodeT5, and CodeLlama. But they failed to identify vulnerabilities, understand complex codebase structures, and had context length limitations,” says Oliver.

That’s when the team turned to Gemini 2.5 Pro.

“Gemini 2.5 Pro significantly outperformed these models with superior contextual understanding, accurate vulnerability detection, and the large context windows necessary for analyzing complex architectures,” Oliver shares.

A nimble team built the custom AI Code Reviewer on Google Cloud Platform (GCP). Throughout the build, they prioritized making the tool adaptable so that it could evolve with future Gemini advancements. They also found that Gemini’s capabilities and GCP’s ease of integration were critical to keeping up with the ever-increasing pace and scale of open-source code generation.

“Gemini 2.5 Pro addressed the challenges where traditional static analysis tools failed,” Oliver explains. “The GCP integration allowed rapid deployment and scaling, while Gemini's processing speed and low latency enabled real-time vulnerability detection that dramatically improved response times.”

The final Patchstack tool consists of two parts: the Scanner and the Verifier. The Scanner uses Gemini to map out the plugin’s entire code structure and selects the most relevant files for investigation. The Verifier then reviews the plugin’s full structure and vulnerability report to create a targeted strategy for assessing whether there’s a security threat.
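The two-stage flow described above can be sketched roughly as follows. This is a minimal illustration, not Patchstack's implementation. The `Model` type is a hypothetical stand-in for whatever function sends a prompt to Gemini and returns text, and the prompt wording, function names, and parameters are all assumptions.

```python
from typing import Callable, Dict, List

# Hypothetical stand-in for a model call: takes a prompt, returns the reply text.
Model = Callable[[str], str]


def scanner(files: Dict[str, str], model: Model) -> List[str]:
    """Show the model the plugin's file map and keep the files it names."""
    listing = "\n".join(f"{name} ({len(code)} chars)" for name, code in files.items())
    reply = model("List the files most relevant to a security review:\n" + listing)
    picked = [line.strip() for line in reply.splitlines()]
    # Discard anything the model names that is not actually in the plugin.
    return [name for name in picked if name in files]


def verifier(files: Dict[str, str], selected: List[str], report: str, model: Model) -> str:
    """Give the model the full structure, the report, and the selected sources."""
    structure = "\n".join(sorted(files))
    sources = "\n\n".join(f"# {name}\n{files[name]}" for name in selected)
    prompt = (
        "Plugin structure:\n" + structure
        + "\n\nVulnerability report:\n" + report
        + "\n\nSelected files:\n" + sources
        + "\n\nIs this a genuine threat or a false positive?"
    )
    return model(prompt)
```

Passing the model in as a plain callable keeps the sketch runnable with a stub and avoids assuming anything about a particular client library.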

Flowchart of the Patchstack AI Code Reviewer's three-part analysis: mapping the code to generate a strategy, gathering contextual understanding, and delivering a final verdict.

As the Verifier performs targeted static analysis on the selected files, it puts Gemini’s large context window to work, determining how data flows between the files and whether security controls are in place. Its final output, a structured JSON response, identifies vulnerable code, details security issues, and assigns a severity level based on the contextualized risk. In short, the Verifier determines whether a vulnerability is a genuine threat or a false positive.
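As an illustration of what consuming such a structured response might look like, here is a hedged sketch. The article does not publish the schema, so the field names (`file`, `issue`, `severity`, `is_vulnerable`) and the severity scale are assumptions made for this example.

```python
import json
from dataclasses import dataclass

# Assumed severity scale; the real tool's levels are not stated in the article.
SEVERITIES = ("low", "medium", "high", "critical")


@dataclass
class Verdict:
    file: str            # file containing the flagged code (hypothetical field)
    issue: str           # short description of the security issue (hypothetical field)
    severity: str        # one of SEVERITIES
    is_vulnerable: bool  # True for a genuine threat, False for a false positive


def parse_verdict(raw: str) -> Verdict:
    """Parse and validate one JSON verdict; raise ValueError on a bad payload."""
    data = json.loads(raw)
    severity = str(data["severity"]).lower()
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    if not isinstance(data["is_vulnerable"], bool):
        raise ValueError("is_vulnerable must be a JSON boolean")
    return Verdict(
        file=str(data["file"]),
        issue=str(data["issue"]),
        severity=severity,
        is_vulnerable=data["is_vulnerable"],
    )
```

Validating the payload before acting on it is a common precaution with model-generated JSON, since a well-formed response is not guaranteed.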

The results: Faster and more accurate security analysis

The AI Code Reviewer has streamlined the Patchstack team’s workflow. “This tool has been used internally by the security researchers performing the initial code analysis and vulnerability triage that previously required hours of manual review,” says Oliver. The drastic reduction in manual workload has enabled the team to scale high-quality assessments.

Patchstack’s Gemini-based tool has already identified approximately 10 new Common Vulnerabilities and Exposures (CVEs) that were previously undetectable by automated means. “The fact that the AI Code Reviewer is capable of finding real-world vulnerabilities early in its development is remarkable,” Oliver explains.

What’s next: Scaling open-source security hyper-automation

Oliver and the Patchstack team plan to expand their use of AI Code Reviewer beyond WordPress to other CMS platforms and programming languages. To gain real-world feedback as they scale, the Patchstack team is leaning into the Google for Startups Cloud Program. “Thanks to the credits Google offers, we’ve made our AI Code Reviewer free for some of our users so we can test, improve, and optimize the tool,” Oliver shares.

The team is also leveraging Gemini’s multimodal capabilities to analyze documentation and configuration alongside code and execute more complex, multi-step vulnerability analysis. This is a huge step toward their ultimate goal: using AI to cover the full vulnerability life cycle, from detection to patching.

“The current system is capable of autonomously finding new vulnerabilities and validating them,” Oliver explains. “The next step is to start mitigating the vulnerabilities. That's where we’ll essentially see self-healing software.”

To other founders diving into AI, Oliver offers this advice: “Don’t overthink it. Start with a focused use case where contextual understanding matters the most. For us, Gemini transformed a major operational bottleneck into a competitive advantage.”

Learn more about how the Patchstack team tackles cyberattacks.
