Trusted with AI security since the beginning

From prompt injection to data exfiltration to responsible AI failures - we've tested for these risks at scale, inside the platforms that defined the market. We build our own AI-driven tools to go deeper, and our clients trust us with their most sensitive systems.

From models to plugins to infrastructure

We examine the entire product ecosystem - AI models, LLM programs, plugins, and supporting cloud infrastructure - through penetration testing, vulnerability assessments, and compliance reviews.

Prompt Injection

Direct injection, indirect manipulation, and cross-boundary prompt injection attacks that exploit how your LLM processes inputs.

Jailbreaking

Testing output security controls to prevent your product from sharing sensitive or prohibited information when manipulated.

Responsible AI

Evaluating AI behavior against responsible AI standards to ensure your product acts in a safe, trustworthy, and ethical way. Our AI governance framework development helps establish these standards from the start.

Security Controls

Finding design flaws, weaknesses, and lax guardrails through testing that puts your LLM through real-world stress scenarios.

Training Data

Auditing the data at the heart of your model to ensure it's sound, safe, and accurate.

Plugin Risks

Checking the integrity of your LLM against flawed or risky plugins and interactions between separate components.

Black box, gray box, and white box testing

Our penetration testing incorporates the latest research - from universal jailbreaking techniques to gradient-based attacks - providing thorough analysis of LLM vulnerabilities. We continuously update our methods as new research emerges. For adversarial scenarios that go beyond prompt testing, our AI-focused red team engagements simulate full attack chains against your AI systems.

Black Box

Simulated real-world attacks with no internal knowledge. We probe and stress-test every element of the LLM from the attacker's perspective, including LLM-vs-LLM testing where we use other models to attack your product.

Gray Box

Working with your development team to understand system prompts and input integration. This enables us to identify hotspots for prompt injection and find vulnerabilities like resource overconsumption and unsafe credential handling.

White Box

Full access to model weights for deep analysis. We use techniques like Greedy Coordinate Gradient, GBDA, and HotFlip to test adversarial robustness at the most fundamental level.

Agentic AI Security & Responsible Deployment Guide

Our guide for engineering and security teams building autonomous AI systems. Covers architecture patterns, identity management, data security, guardrails, and infrastructure recommendations.

Read the full guide

Our process

Step 1

Scoping

We assess the attack surface and define security objectives for your specific needs. We identify the most important features for testing and present a detailed proposal with fixed pricing.

Step 2

Kickoff

We dive into the architecture and code with your team, develop a prioritized test plan, set up communication channels, and establish a weekly status meeting.

Step 3

Execution

Targeted code review, surgical runtime testing, and infrastructure analysis - powered by our Nemesis platform. We're looking for meaningful issues that matter to you.

Step 4

Reporting

Detailed findings on vulnerabilities including successful prompt engineering and jailbreaking, along with thematic and design issues, followed by an in-person or remote presentation.

Microsoft chooses Casaba to test M365 Copilot

Since January 2024, Microsoft has selected Casaba to perform ongoing security assessments of Copilot AI assistants across the M365 product suite. Our work has spanned multiple engagements over two years, covering AI/LLM security risks aligned with the OWASP Top Ten for LLMs.

Read the full case study

Frequently asked questions

What is AI penetration testing?
AI penetration testing is a security assessment focused on AI-powered applications, including large language models, AI agents, and machine learning systems. It identifies vulnerabilities like prompt injection, data leakage, and unauthorized access that are unique to AI systems.
How do you test LLMs for security vulnerabilities?
We use a combination of adversarial prompt testing, code-level analysis of the application layer, API security testing, and architecture review. Our approach goes beyond automated prompt scanning to examine how the LLM integrates with the broader application and data sources.
What is prompt injection?
Prompt injection is a vulnerability where an attacker crafts input that causes an AI model to ignore its instructions, reveal sensitive information, or perform unintended actions. We test for both direct prompt injection and indirect prompt injection through external data sources.
What frameworks guide your AI security testing?
Our methodology draws from the OWASP Top 10 for LLM Applications, MITRE ATLAS, and our own research from years of testing AI systems for major technology companies. We adapt our approach to each client's specific AI implementation.
Can you test AI agents and tool-calling systems?
Yes. We test agentic AI systems including tool integration security, MCP server configurations, privilege escalation paths, and cross-boundary prompt injection where external data sources can influence agent behavior.

Need your AI tested?

We've been doing this longer than most. Let's talk about what your system needs.

Get in touch