Alex Robey

I am a postdoc at CMU with J. Zico Kolter. I received my Ph.D. from Penn in 2024, where I worked with Hamed Hassani & George J. Pappas.

My goal is to make AI safe for people to use. Because modern AI systems are so complex and capable, making them safe requires tools from several areas of math and engineering. My research draws in particular on statistics, optimization, and control theory.

Reach me at arobey(at)andrew(dot)cmu(dot)edu.

News

Our work on jailbreaking robots was covered by Forbes.
Slides from my talk on jailbreaking robots at ICRA '25.
Our paper on safety pretraining is online.
Our paper on antidistillation sampling is online.
SmoothLLM was accepted at TMLR.
My thesis won the 2024 Charles Hallac and Sarah Keil Wolf Award.
Slides from my talk on jailbreaking LLMs at IEEE SaTML '25.
Our tutorial on jailbreaking LLMs was accepted at ICML '25.
I was named a Rising Star in Cyber-Physical Systems by the NSF.
Slides & recording from my talk at the Paris AI Security Forum '25.
Slides & recording from my talk at IASEAI '25.
SmoothLLM was covered by MIT Technology Review.
Our paper on jailbreaking robots was accepted at ICRA '25.
Slides from my talk at Microsoft Research.
Slides from my lecture in AI & Criminal Justice at UBC.

2025

Slides from my talk at the NeurIPS '24 AdvML workshop.
Our paper introducing the PAIR jailbreak was accepted at SaTML '25.
Our work on jailbreaking robots was covered by WIRED.
Slides from my lecture in Verifiable ML (CS 7180) at Northeastern.
Our work on jailbreaking robots was covered by IEEE Spectrum.
We won the Best Poster Award at a workshop on robotic safety.
Slides from my talk on jailbreaking at USC.
Our paper on jailbreaking LLM-controlled robots is online.
I was named a Rising Star in Adversarial ML at NeurIPS '24.
JailbreakBench was accepted to the benchmarks track at NeurIPS '24.
I joined Gray Swan as a research contractor.
I started a postdoc at CMU with J. Zico Kolter.
Slides from my Ph.D. thesis defense.
Our public policy proposal on red teaming received an oral at ICML '24.
Our work on red teaming was covered in The Washington Post.
Our paper on non-zero-sum adversarial training was accepted at ICLR '24.
Slides from my lecture in Trustworthy ML (CIS 7000) at Penn.
I will be an instructor at Swarthmore College for the spring term.
Our work on jailbreaking LLMs was covered by WIRED.