Multilingual Prompt Injection: Attacks in Every Language

Most prompt injection defences are built for English. Attackers know this, and they exploit it by writing their injection payloads in other languages. A German "Ignoriere alle vorherigen Anweisungen" or Chinese "忽略之前的指令" achieves the same result as "Ignore all previous instructions" but may slip past English-only detection.

How it works

Multilingual injection takes several forms:

Full translation: The entire payload is written in a non-English language
Code-switching: A benign English prefix followed by an injection in a different script (e.g., Russian Cyrillic or Arabic script)
Mixed-language: Key injection terms are in a foreign language while the rest is English

Languages covered by attackers

Our training data includes injection patterns in German, French, Spanish, Italian, Portuguese, Chinese (simplified and traditional), Japanese, Arabic, Russian, and Korean. These are not just translations of English patterns; they use native grammatical structures and phrasing.

Prevalence

Multilingual injection is documented in arXiv research on multilingual prompt attacks. It is particularly relevant for global applications that accept input in multiple languages.

Severity: High

Multilingual injections are just as dangerous as English ones. The severity is amplified by the fact that many defenders do not test for them, creating a false sense of security.

How Bordair detects it

Bordair has dedicated patterns for injection phrases in 10 languages. Our ML model, built on DeBERTa v3 which supports over 100 languages, handles multilingual input natively. We also detect code-switching patterns where non-Latin script is mixed into primarily Latin text.