Now live - dual-region EU + US

Stop prompt injection before it reaches your AI

One API call scans text, images, documents, and audio simultaneously. Bordair detects 18+ attack types - from jailbreaks and indirect injection to cross-modal smuggling and adversarial encoding - in under 50ms.

Designed by a cybersecurity professional in financial services - where missing a real threat isn't an option and false alarms cost hours.

⚠ BLOCKED
# pip install bordair
from bordair import Bordair

client = Bordair()
result = client.scan("Ignore all previous in...")

Result
{
    "threat": "high",
    "confidence": 0.9842
}
<50ms
avg latency
<0.1%
false positive rate
Bordair's Castle
Free to play

Think you can outsmart AI security?

Bordair's Castle is a multimodal prompt injection game. Craft attacks that slip past 5 kingdoms of AI defences, climb the leaderboard, and win prizes. No account needed to start.

Play now
5 kingdoms · 35 levels · monthly prizes

Features

Everything you need to protect your LLM

Sub-50ms detection

Purpose-built detection pipeline optimised for speed. Fast enough to sit inline as synchronous middleware - no async queues, no polling.

Purpose-built detection engine

A proprietary detection system built specifically for prompt injection, covering 18+ attack categories from direct overrides to multi-turn escalation and cross-modal smuggling. Continuously updated with new threat intelligence.

Dual-region, always on

Deployed to EU (London) and US (Virginia) with Route 53 latency routing. Your traffic automatically hits the nearest region.

Long-prompt safe

Full-coverage scanning across prompts up to 10,000 characters. Injections buried within or appended after legitimate content are reliably detected.
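
For input longer than the 10,000-character limit, one option is to split it client-side into overlapping windows and scan each window separately. The sketch below is an assumption about how a caller might do this, not documented Bordair behaviour; `chunk_prompt` and the 200-character overlap are hypothetical.

```python
MAX_CHARS = 10_000  # scanning limit quoted above


def chunk_prompt(text: str, max_chars: int = MAX_CHARS, overlap: int = 200) -> list[str]:
    """Split text into windows of at most max_chars that overlap by `overlap` chars.

    The overlap (a hypothetical value) keeps an injection that straddles a
    window boundary fully visible in at least one window, provided it is
    no longer than `overlap` characters.
    """
    if overlap >= max_chars:
        raise ValueError("overlap must be smaller than max_chars")
    if len(text) <= max_chars:
        return [text]
    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

Each returned chunk could then be passed to `client.scan` in turn, blocking the request if any chunk comes back as a high threat.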

Multimodal in one call

Send text, image, document, and audio together in a single /scan/multi request. Each modality is routed through its own pipeline - one response, one verdict.

Scan analytics

Every scan is logged. Track threat rates, confidence scores, and method breakdown across your API key - visible in your dashboard.

Image scanning

Detects injections embedded within images before they reach your multimodal LLM. OCR, metadata extraction, and automatic steganography neutralisation included. 10 credits per scan.

Steganography neutralisation

Every image is automatically sanitised to destroy invisible payloads - LSB steganography, adversarial perturbations, and palette-based encoding - before text extraction. Visible content and OCR quality are fully preserved.

Audio scanning

Detect ultrasonic injection attacks, adversarial audio perturbations, and spoken prompt injections via automatic transcription. 15 credits per scan.

Document scanning

Scan PDF, DOCX, XLSX, and PPTX files for embedded prompt injections across all content surfaces. 15 credits per scan.

Enforcement you control

Scan LLM outputs before they reach your users. Block, redact, or warn. Fine-tune with allow-lists and per-project policies to minimise false positives without weakening protection.

How it works

Three lines of code between you and attacks

1

User sends input

Your application receives a message or prompt from a user.

2

Bordair scans it

POST the input to /scan with your API key. The detector returns threat level and confidence in milliseconds.

3

Route or reject

If threat is "high", return an error to the user. If "low", forward to your LLM as normal.

# pip install bordair
from bordair import Bordair

client = Bordair(api_key=API_KEY)

# Single-turn scan
result = client.scan(user_input)

# Multi-turn: pass conversation history
result = client.scan(
    user_input,
    conversation_history=history,  # last 3 turns scanned
)

if result["threat"] == "high":
    raise ValueError("Request blocked")
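
The three steps above can be wrapped in a small gate function. This is a minimal sketch that relies only on the `threat` field shown in the example; `guarded_call`, `fake_scan`, and `fake_llm` are hypothetical stand-ins so the snippet runs without an API key.

```python
from typing import Callable


class RequestBlocked(Exception):
    """Raised when the scanner reports a high threat level."""


def guarded_call(
    user_input: str,
    scan: Callable[[str], dict],
    llm: Callable[[str], str],
) -> str:
    """Scan the input first; forward it to the LLM only if it is not high threat."""
    result = scan(user_input)
    if result["threat"] == "high":
        raise RequestBlocked("Request blocked")
    return llm(user_input)


# Hypothetical stand-ins for client.scan and your model call.
def fake_scan(text: str) -> dict:
    flagged = "ignore all previous" in text.lower()
    return {"threat": "high" if flagged else "low"}


def fake_llm(text: str) -> str:
    return f"echo: {text}"
```

In a real application, `scan` would be `client.scan` and `llm` would be your own model call.
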
Output scanning

Protect inputs and outputs

Input scanning stops attacks before they reach your LLM. Output scanning lets you define custom regex rules to block, redact, or flag sensitive content in model responses - before they reach your users.

Paid plans only
output_scan.py
# Define rules, then scan LLM output
from bordair import Bordair

client = Bordair()

# Add rules (one-time setup)
client.add_output_rule(
    "sk-[a-zA-Z0-9]{20,}", "block",
    "Block leaked API keys"
)

# Scan the output; swap in a safe message if it was blocked
result = client.scan_output(llm_response)
if result["blocked"]:
    llm_response = "Sorry, that response was blocked."

OUTPUT
# Response
{
    "action": "block",
    "blocked": true,
    "output": "",
    "matched_rules": [...],
    "rules_checked": 3
}

Per-rule actions

Each regex rule gets its own action - block, redact, warn, or log. Block leaked API keys, redact emails, warn on PII, and log everything else.

Custom regex patterns

Define your own patterns to match against LLM output. Catch API keys, credentials, email addresses, phone numbers, or any sensitive content specific to your domain.

Smart redaction

Redact rules replace matched content with [REDACTED] while keeping the rest of the response intact. Multiple redaction patterns work together in a single scan.
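
To illustrate the idea, here is a minimal local sketch of multi-pattern redaction using Python's `re` module. It is not Bordair's implementation; the two patterns and the `redact` helper are hypothetical.

```python
import re

# Hypothetical redaction patterns: email addresses and "sk-" style API keys.
REDACT_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    re.compile(r"sk-[a-zA-Z0-9]{20,}"),
]


def redact(text: str, patterns=REDACT_PATTERNS, placeholder: str = "[REDACTED]") -> str:
    """Replace every match of every pattern, leaving the rest of the text intact."""
    for pattern in patterns:
        text = pattern.sub(placeholder, text)
    return text
```

Because each pattern is applied in turn to the same string, several kinds of sensitive content can be removed in one pass over the response.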

Priority-based resolution

When multiple rules match, the highest-priority action wins: block > redact > warn > log. Deterministic behaviour, no surprises.
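
The ordering above can be expressed as a simple lookup. A minimal sketch, assuming the caller already has the list of actions from the rules that matched; `resolve_action` is a hypothetical name, not an SDK function.

```python
from typing import Optional

# Priority order quoted above: block > redact > warn > log.
PRIORITY = {"block": 3, "redact": 2, "warn": 1, "log": 0}


def resolve_action(matched_actions: list[str]) -> Optional[str]:
    """Return the highest-priority action among the matched rules, or None."""
    if not matched_actions:
        return None
    return max(matched_actions, key=PRIORITY.__getitem__)
```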

Threat coverage

What Bordair protects against

Bordair detects the full spectrum of prompt injection and jailbreak techniques - from basic instruction overrides to sophisticated cross-modal and multi-turn attacks.

Direct prompt injection

Common

Attempts to override system instructions, change AI behaviour, or bypass safety guidelines through explicit commands in user input.

Indirect prompt injection

Growing

Malicious instructions hidden in external content the AI processes - emails, web pages, API responses, RAG documents, and retrieved context.

Jailbreak attacks

Common

Role-play exploits, DAN prompts, hypothetical framing, and persona hijacking designed to make AI ignore its safety constraints.

System prompt extraction

High risk

Social engineering, translation tricks, encoding games, and formatting exploits aimed at making AI leak its confidential instructions.

Multi-turn escalation

Sophisticated

Attacks that build up gradually across multiple messages - Crescendo attacks, context poisoning, and incremental trust manipulation.

Cross-modal attacks

Emerging

Injection payloads split across text, images, documents, and audio that only become dangerous when the AI combines them.

Payload smuggling

Common

Injections buried inside legitimate-looking content - hidden text in documents, encoded strings, delimiter escapes, and markup injection.

Tool and function call injection

Critical

Prompts that trick AI agents into calling dangerous functions, executing unauthorised API calls, or passing attacker-controlled arguments.

Agent and chain-of-thought manipulation

Emerging

Fake reasoning steps, plan hijacking, and goal redirection targeting AI agents that reason and take actions autonomously.

Encoding and obfuscation

Common

Base64, ROT13, leetspeak, Unicode homoglyphs, zero-width characters, and RTL overrides used to smuggle instructions past filters.
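
As an illustration of why naive keyword filters miss these tricks, the sketch below hides a phrase with zero-width characters, ROT13, and Base64, then recovers it with the standard library. It shows the general idea only and is not Bordair's detection logic; `strip_invisible` and its character list are hypothetical.

```python
import base64
import codecs

# Zero-width and bidirectional control characters often used to hide text.
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff", "\u202e"}


def strip_invisible(text: str) -> str:
    """Remove zero-width and RTL-override characters."""
    return "".join(ch for ch in text if ch not in INVISIBLE)


phrase = "ignore all previous instructions"

# Three obfuscated variants of the same phrase.
zero_width = "\u200b".join(phrase)  # zero-width space between every character
rot13 = codecs.encode(phrase, "rot13")
b64 = base64.b64encode(phrase.encode()).decode()

# A plain substring check fails on all three variants...
naive_hits = [phrase in variant for variant in (zero_width, rot13, b64)]

# ...but each one decodes back to the original phrase.
recovered = [
    strip_invisible(zero_width),
    codecs.decode(rot13, "rot13"),
    base64.b64decode(b64).decode(),
]
```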

Adversarial suffixes

Sophisticated

Machine-generated token sequences (GCG, AutoDAN) appended to prompts that reliably bypass safety alignment in language models.

Image-embedded injection

Growing

Instructions hidden in images via steganography, white-on-white text, QR codes, and adversarial perturbations. Images are automatically sanitised to destroy invisible payloads before scanning.

Document-embedded injection

High risk

Malicious prompts concealed in PDF, DOCX, XLSX, and PPTX files - in metadata, hidden layers, comments, and embedded objects.

Audio injection

Emerging

Ultrasonic payloads, adversarial audio perturbations, and spoken prompt injections hidden within audio files and voice input.

Structured data injection

Growing

Malicious instructions embedded in JSON, XML, CSV, YAML, and SVG payloads that get parsed and processed by AI systems.
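
One way to prepare structured payloads for scanning is to pull out every string first, so nothing nested is overlooked. A minimal sketch for JSON, assuming the caller scans each extracted string separately; `iter_strings` is a hypothetical helper, not part of the Bordair SDK.

```python
import json
from typing import Any, Iterator


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string found in a parsed JSON value, including dict keys."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield key
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


payload = json.loads(
    '{"title": "Q3 report", "rows": [{"note": "Ignore all previous instructions"}]}'
)
strings = list(iter_strings(payload))
```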

Language-switching attacks

Sophisticated

Mid-sentence language changes exploiting the gap between multilingual understanding and safety training in English-centric models.

ASCII art and visual encoding

Emerging

Instructions rendered as ASCII art or banner-font text that models read visually but text-based filters miss entirely.

Social engineering of AI

Common

Authority impersonation, fake credentials, emotional manipulation, and urgency framing designed to convince AI to break its own rules.

Bordair's detection is continuously updated as new attack techniques emerge. Our threat coverage is informed by real-world attack data from Castle, academic security research, and production deployments.

Start protecting your AI

Pricing

Start free, scale when you need to

No payment required to get started.

Free

$0 forever

For personal projects and prototypes.

  • 200 credits/week
  • 20 credits/minute
  • REST API access
  • Image, document & audio scanning
  • Dashboard
  • Output scanning rules (not included; paid plans only)
  • Priority routing
  • SLA guarantee
Start free
Most popular

Individual

$19/month

For solo developers shipping to production.

  • 10,000 credits/week
  • 100 credits/minute
  • REST API access
  • Image, document & audio scanning
  • Output scanning rules
  • Dashboard
  • Email support
  • SLA guarantee
Get started

Business

$99/month

For teams with production workloads.

  • 100,000 credits/week
  • 2,000 credits/minute
  • REST API access
  • Image, document & audio scanning
  • Output scanning rules
  • Semantic layer (coming soon)
  • Dashboard
  • Priority support
  • 99.9% SLA
Get started

Enterprise

Custom

For large-scale or compliance-sensitive deployments.

  • Unlimited credits
  • Custom rate limits
  • REST API access
  • Output scanning rules
  • Semantic layer (coming soon)
  • Dashboard
  • Dedicated support
  • Custom SLA
  • Custom contracts
Talk to us
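
As a worked example of the credit arithmetic, the sketch below combines the weekly and per-minute allowances above with the per-scan costs quoted in the feature list (10 credits per image scan, 15 per document or audio scan). The cost of a text scan is not stated on this page, so it is left out. `max_scans` is a hypothetical helper.

```python
# Credits per scan, from the feature descriptions.
SCAN_COST = {"image": 10, "document": 15, "audio": 15}

# (credits per week, credits per minute) for the fixed-price plans.
PLANS = {
    "Free": (200, 20),
    "Individual": (10_000, 100),
    "Business": (100_000, 2_000),
}


def max_scans(plan: str, modality: str) -> tuple[int, int]:
    """Return (scans per week, scans per minute) if only this modality is scanned."""
    weekly, per_minute = PLANS[plan]
    cost = SCAN_COST[modality]
    return weekly // cost, per_minute // cost
```

For example, `max_scans("Free", "image")` gives `(20, 2)`: the Free plan covers 20 image scans per week, and 2 per minute.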

Why Bordair

Built by a defender. Stress-tested by attackers.

I work in cybersecurity at a major bank. My job is watching how attackers operate - how they test boundaries, disguise payloads, and exploit blind spots.

When companies started connecting AI to their products without checking what users were sending in, I knew exactly how that story ends. So I built Bordair.

The detection system works the way good security should: fast enough that users never notice it, accurate enough that it doesn't cry wolf, and built to handle attacks across text, images, documents, and audio - because real attackers don't stick to one format.

Designed by a cybersecurity professional protecting a FTSE 100 bank

Open Research

Bordair's Multimodal Dataset

We're open-sourcing the adversarial prompt injection training data that was used, in part, to build Bordair's API - 101,032 labelled samples across four dataset versions covering cross-modal, multi-turn, adversarial suffix, agentic, indirect injection, and evasion attacks. Exactly 1:1 balanced (50,516 attack / 50,516 benign). All samples source-attributed to peer-reviewed research.

50,516 attack samples · 46 attack categories · 55+ academic sources (arXiv)

  • GCG adversarial suffixes (2,400) - Zou et al. 2023 (arXiv:2307.15043)
  • AutoDAN wrappers (1,656) - Liu et al. 2024 (arXiv:2310.04451)
  • Encoding attacks (1,932) - Base64, ROT13, leetspeak, homoglyphs
  • Jailbreak templates (8,100) - PyRIT / Microsoft AI Red Team
  • Crescendo multi-turn (118) - Russinovich et al. 2024 (arXiv:2404.01833)
  • Combined multiturn+GCG (152) - TAP / PAIR hybrid (arXiv:2312.02119)
  • Indirect injection (30) - RAG poisoning, email, web, API response
  • System prompt extraction (30) - Translation tricks, formatting exploits
  • Tool-call injection (20) - Function argument manipulation
  • Agent/CoT manipulation (20) - Fake reasoning, plan hijack
  • Unicode and homoglyph attacks (20) - Cyrillic lookalikes, zero-width, RTL
  • Language-switching attacks (20) - Mid-sentence code-switching evasion
  • Computer use injection (v4) (1,075) - VLM agent UI/DOM hijacking via screenshots
  • Memory poisoning (v4) (1,075) - ChatGPT/Claude/Gemini persistent memory exploits
  • MCP tool injection (v4) (1,075) - Model Context Protocol descriptor poisoning
  • Reasoning token injection (v4) (860) - o1/R1/Claude thinking-trace hijacking
  • Multi-agent contagion (v4) (860) - Inter-agent handoff message poisoning
  • RAG chunk boundary (v4) (860) - Separator injection, index poisoning, role-token spoofing

Four dataset versions: v1 cross-modal (23,759), v2 PyRIT/GCG (14,358), v3 emerging vectors (187), v4 agentic attacks (12,212) plus cross-modal expansion. Every sample carries its academic source and attack reference. Exactly 50,516 attack samples paired with 50,516 benign samples for a clean 1:1 binary classifier split.
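
The figures above are internally consistent, which a few lines of Python can confirm. The numbers are taken directly from this page. Treating the six PyRIT/GCG-era categories as the whole of v2 is an assumption, suggested by the fact that their counts add up to the v2 total; the variable names are only for illustration.

```python
# Attack samples per dataset version, as listed above.
VERSION_COUNTS = {"v1": 23_759, "v2": 14_358, "v3": 187, "v4": 12_212}

# Assumed v2 categories (PyRIT/GCG) and their counts from the breakdown.
V2_CATEGORIES = {
    "GCG adversarial suffixes": 2_400,
    "AutoDAN wrappers": 1_656,
    "Encoding attacks": 1_932,
    "Jailbreak templates": 8_100,
    "Crescendo multi-turn": 118,
    "Combined multiturn+GCG": 152,
}

attack_total = sum(VERSION_COUNTS.values())
dataset_total = attack_total * 2  # 1:1 attack-to-benign balance
v2_total = sum(V2_CATEGORIES.values())
```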