Steganographic Injection: Invisible Attacks in Images

Steganographic injection represents the cutting edge of image-based prompt injection. Unlike white text attacks which embed visible (if hard to spot) text, steganographic techniques encode instructions directly in the pixel values of an image. The image looks completely normal to human eyes.

How it works

The Invisible Injections paper (arXiv 2507.22304) demonstrates that injection payloads can be encoded in the least significant bits of image pixels. Vision-language models (VLMs) process these pixel patterns and can be influenced by the encoded instructions, even though no visible text is present.

Adversarial perturbation

The CrossInject paper (ACM MM 2025) takes this further with adversarial perturbation alignment. By carefully modifying image pixels to align with injection text embeddings, the attacker creates images that steer model behaviour without any text at all. The perturbations are invisible to human observers but meaningful to the model's vision encoder.

Defence challenges

Steganographic injection is hard to defend against because:

OCR cannot detect it because there is no text to read
Metadata scanning cannot detect it because the payload is in the pixel data
The image passes all standard visual inspection

Current defences

Research-stage defences include image preprocessing (compression, noise addition) that disrupts steganographic encoding, and adversarial robustness training that makes models less susceptible to perturbation-based attacks. Bordair's image pipeline includes preprocessing steps that mitigate known steganographic techniques.

Steganographic Injection: Invisible Attacks in Images

How it works

Adversarial perturbation

Defence challenges

Current defences

Protect your LLM application