Publications

LLM-controlled robots

Jailbreaking LLM-Controlled Robots

Alexander Robey, Zachary Ravichandran, Vijay Kumar, Hamed Hassani, George J. Pappas

PAIR

Jailbreaking Black Box Large Language Models in Twenty Queries

Patrick Chao, Alexander Robey, Edgar Dobriban, Hamed Hassani, George J. Pappas, Eric Wong

JailbreakBench

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

Patrick Chao*, Edoardo Debenedetti*, Alexander Robey*, Maksym Andriushchenko*, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramèr, Hamed Hassani, Eric Wong

Text-to-image

Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation

Yutong He, Alexander Robey, Naoki Murata, Yiding Jiang, Joshua Williams, George J. Pappas, Hamed Hassani, Yuki Mitsufuji, Ruslan Salakhutdinov, J. Zico Kolter

Safe Harbor

A Safe Harbor for AI Evaluation and Red Teaming

Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, Peter Henderson

Semantic smoothing

Defending Large Language Models against Jailbreak Attacks via Semantic Smoothing

Jiabao Ji*, Bairu Hou*, Alexander Robey*, George J. Pappas, Hamed Hassani, Yang Zhang, Eric Wong, Shiyu Chang

Model-based verification

Data-Driven Modeling and Verification of Perception-Based Autonomous Systems

Thomas Waite, Alexander Robey, Hamed Hassani, George J. Pappas, Radoslav Ivanov

SmoothLLM

SmoothLLM: Defending Large Language Models against Jailbreaking Attacks

Alexander Robey, Eric Wong, Hamed Hassani, George J. Pappas

Non-zero-sum AT

Adversarial Training Should Be Cast as a Non-Zero-Sum Game

Alexander Robey*, Fabian Latorre*, George J. Pappas, Hamed Hassani, Volkan Cevher

ROCBF

Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations

Lars Lindemann, Alexander Robey, Lejun Jiang, Satyajeet Das, Stephen Tu, Nikolai Matni

Distribution shift verification

Toward Certified Robustness Against Real-World Distribution Shifts

Haoze Wu*, Teruhiro Tagomori*, Alexander Robey*, Fengjun Yang*, Nikolai Matni, George J. Pappas, Hamed Hassani, Corina Păsăreanu, Clark Barrett

QRM

Probable Domain Generalization via Quantile Risk Minimization

Cian Eastwood*, Alexander Robey*, Shashank Singh, Julius von Kügelgen, Hamed Hassani, George J. Pappas, Bernhard Schölkopf

Stable imitation learning

On the Sample Complexity of Stability Constrained Imitation Learning

Stephen Tu, Alexander Robey, Tingnan Zhang, Nikolai Matni

Chordally sparse LipSDP

Chordal Sparsity for Lipschitz Constant Estimation of Deep Neural Networks

Anton Xue, Lars Lindemann, Alexander Robey, Hamed Hassani, George J. Pappas, Rajeev Alur

Long-tailed robustness

Do Deep Networks Transfer Invariances Across Classes?

Allan Zhou*, Fahim Tajwar*, Alexander Robey, Tom Knowles, George J. Pappas, Hamed Hassani, Chelsea Finn

Probabilistic robustness

Probabilistically Robust Learning: Balancing Average- and Worst-case Performance

Alexander Robey, Luiz F. O. Chamon, George J. Pappas, Hamed Hassani

Semi-infinite robustness

Adversarial Robustness with Semi-Infinite Constrained Learning

Alexander Robey*, Luiz F. O. Chamon*, George J. Pappas, Hamed Hassani, Alejandro Ribeiro

MBDG

Model-Based Domain Generalization

Alexander Robey, George J. Pappas, Hamed Hassani

CDCG

Optimal Algorithms for Submodular Maximization with Distributed Constraints

Alexander Robey, Arman Adibi, Brent Schlotfeldt, Hamed Hassani, George J. Pappas

RHCBF

Learning Robust Hybrid Control Barrier Functions for Uncertain Systems

Alexander Robey*, Lars Lindemann*, Stephen Tu, Nikolai Matni

HCBF

Learning Hybrid Control Barrier Functions from Data

Lars Lindemann, Haimin Hu, Alexander Robey, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

CBF

Learning Control Barrier Functions from Expert Demonstrations

Alexander Robey*, Haimin Hu*, Lars Lindemann, Hanwen Zhang, Dimos V. Dimarogonas, Stephen Tu, Nikolai Matni

Adversarial trade-off

Provable Tradeoffs in Adversarially Robust Classification

Edgar Dobriban, Hamed Hassani, David Hong, Alexander Robey

MBRDL

Model-Based Robust Deep Learning: Generalizing to Natural, Out-of-Distribution Data

Alexander Robey, Hamed Hassani, George J. Pappas

LipSDP

Efficient and Accurate Estimation of Lipschitz Constants for Deep Neural Networks

Mahyar Fazlyab, Alexander Robey, Hamed Hassani, Manfred Morari, George J. Pappas

Fourier Ptychography

Optimal Physical Preprocessing for Example-Based Super-Resolution

Alexander Robey, Vidya Ganapati