Indirect Prompt Injection Into LLMs Using Images and Sounds

989 views

Black Hat

2 months ago

Multi-modal Large Language Models (LLMs) are advanced artificial intelligence models that produce contextually rich responses by combining inputs of several types (text, audio, images). Bard already relies on such an architecture, and the next generation of ChatGPT is expected to rely on it as well.
In this talk, we demonstrate how images and audio samples can be used for indirect prompt and instruction injection against (unmodified and benign) multi-modal LLMs. An attacker generates an adversarial perturbation corresponding to the prompt and blends it into an image or audio recording. When the user asks the (unmodified, benign) model about the perturbed image or audio, the perturbation steers the model to output the attacker-chosen text and/or make the subsequent dialog follow the attacker's instruction....
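The core idea of the abstract above, optimizing a small perturbation so that an unmodified model produces an attacker-chosen output, can be illustrated on a toy scale. The sketch below is not the speakers' implementation: it assumes a random linear softmax classifier as a stand-in for the multi-modal model, and every name in it (`craft_perturbation`, `W`, `eps`, `alpha`) is hypothetical. It uses projected signed-gradient steps, a standard adversarial-example technique, to push a flattened 32x32 "image" toward a chosen target class while keeping each pixel change within `eps`.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D vector of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def craft_perturbation(W, x, target, eps=0.2, alpha=0.02, steps=50):
    """Return a perturbation delta with |delta_i| <= eps that steers the toy
    model W toward `target` (hypothetical helper, illustrative only)."""
    delta = np.zeros_like(x)
    onehot = np.zeros(W.shape[0])
    onehot[target] = 1.0
    for _ in range(steps):
        p = softmax(W @ (x + delta))
        # Gradient of the cross-entropy loss toward `target` w.r.t. the input.
        grad = W.T @ (p - onehot)
        # Signed-gradient descent step, then project back into the eps-ball.
        delta = np.clip(delta - alpha * np.sign(grad), -eps, eps)
        # Keep the perturbed "image" inside the valid pixel range [0, 1].
        delta = np.clip(x + delta, 0.0, 1.0) - x
    return delta


rng = np.random.default_rng(0)
W = rng.normal(size=(5, 32 * 32))   # assumed toy model: 5 classes, linear
x = rng.uniform(size=32 * 32)       # assumed benign input, pixels in [0, 1]

clean_pred = int(np.argmax(W @ x))
target = (clean_pred + 1) % 5       # attacker picks any other class
delta = craft_perturbation(W, x, target)
adv_pred = int(np.argmax(W @ (x + delta)))

print("clean prediction:", clean_pred)
print("attacker target: ", target)
print("after perturbing:", adv_pred)
```

In the setting the talk describes, the optimization target would be the text generated by the language model rather than a class label, and the gradient would flow through the model's image or audio encoder. The toy classifier only shows why a bounded change to the input can be enough to control the output.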
By: Ben Nassi, Eugene Bagdasaryan
Full Abstract and Presentation Materials:
www.blackhat.com/eu-23/briefi...
