aifaq.wtf

"How do you know about all this AI stuff?"
I just read tweets, buddy.

#jailbreaks

Page 1 of 1

ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs

#ASCII art   #jailbreaks   #prompt injection   #hacks   #prompt engineering   #lol   #link  

I honestly though that ASCII art didn't work that well for LLMs! But maybe they're just bad at generating it, not reading it? In this case, the semantics of building a bomb makes it through the alignment force field:

ArtPrompt attack

And yeah, it's still bad at generating ASCII art. So at least we can still employ humans for one thing.

Build a bombe

Build a bombe

Gandalf | Lakera – Test your prompting skills to make Gandalf reveal secret information.

#games   #prompt injection   #jailbreaks   #link  

I ran across the "trick the LLM" game again and realized I never posted it here! It's great.

Tricking the LLM into revealing the password

@johnowhitaker on October 11, 2023

#models   #jailbreaks   #tweets