# Chris Cundy

**Contact:** chris.j.cundy@gmail.com
**GitHub:** [C-J-Cundy](https://github.com/C-J-Cundy)

---

## Experience

### Research Scientist
**FAR.AI** | Berkeley, California, USA | June 2024–Present

Characterising and mitigating catastrophic risks from frontier AI systems. Main responsibilities:
- Leading research projects and determining the direction for teams of researchers and engineers
- Designing and implementing training and evaluation schemes with state-of-the-art models, conducting empirical studies, and communicating results through papers and presentations
- As a research team lead, ensuring that research aims are consistent with FAR's mission and that my direct reports are supported in their output and professional development

### Research Scientist Intern
**Technical AI Safety Team, DeepMind** | London, UK | June 2022–September 2022

Investigating robust and reliable machine learning in theory and at scale:
- Investigated susceptibility of autoregressive models to 'delusions', where unobserved latent variables lead to incorrect probabilistic judgments
- Developed a theoretical model for delusions; investigated delusions at scale by analysing performance of DeepMind's Gato (a large generalist, multi-task autoregressive model) on custom environments

### Visiting Scholar
**Future of Humanity Institute, University of Oxford** | Oxford, UK | October 2017–January 2018

Developing algorithms to predict human judgments. Supervised by Owain Evans and Andreas Stuhlmüller:
- Designed algorithms to collate quick, noisy human judgments to predict the answer to complicated tasks which would typically require deliberation

### Visiting Scholar
**Center for Human-Compatible AI, University of California, Berkeley** | Berkeley, California, USA | June–September 2017

Supervised by Daniel Filan & Stuart Russell, researching topics in AI safety:
- Extended previous work on inverse reinforcement learning to the hierarchical setting. Formalized the problem, derived theoretical results, performed experiments on data, and presented the work at an ICML workshop

---

## Education

### PhD - Computer Science
**Stanford University** | Stanford, California, USA | 2018–2024
- Advised by Stefano Ermon
- Investigating topics in inverse reinforcement learning, sequence modelling and variational inference
- Thesis: *Beyond Maximum Likelihood: Distribution-Aware Machine Learning*

### MEng - Computer Science
**University of Cambridge** | Cambridge, UK | 2016–2017
- Grade: Distinction
- Modules: Data Science, Probabilistic Machine Learning, Network Analytics
- Thesis: *Investigating Variational Gaussian Process State-Space Models with Gaussian Likelihood*. Supervised by Carl E. Rasmussen

### BA - Natural Sciences (Physics)
**University of Cambridge** | Cambridge, UK | 2013–2016
- Grade: 1st
- Modules: Physics, Maths, Chemistry, Computer Science

---

## Selected Publications

**The Obfuscation Atlas: Mapping Where Honesty Emerges in RLVR with Deception Probes**
Mohammad Taufeeque, Stefan Heimersheim, Adam Gleave, Chris Cundy | Preprint, 2026

**Auditing Games for Sandbagging**
Jordan Taylor, Sid Black, Dillon Bowen, Thomas Read, Satvik Golechha, Alex Zelenka-Martin, Oliver Makins, Connor Kissane, Kola Ayonrinde, Jacob Merizian, Samuel Marks, Chris Cundy, Joseph Bloom | Technical Report, 2025

**Preference Learning with Lie Detectors can Induce Honesty or Evasion**
Chris Cundy, Adam Gleave | NeurIPS 2025

**SequenceMatch: Imitation Learning for Autoregressive Sequence Modelling with Backtracking**
Chris Cundy, Stefano Ermon | ICLR 2024

**Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients**
Chris Cundy, Rishi Desai, Stefano Ermon | AISTATS 2024

**LMPriors: Pre-Trained Language Models as Task-Specific Priors**
Kristy Choi*, Chris Cundy*, Sanjari Srivastava, Stefano Ermon | First Workshop on Foundation Models for Decision Making, NeurIPS 2022

**BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery**
Chris Cundy, Aditya Grover, Stefano Ermon | NeurIPS 2021

**Flexible Approximate Inference via Stratified Normalizing Flows**
Chris Cundy, Stefano Ermon | UAI 2020

**Parallelizing Linear Recurrent Neural Nets over Sequence Length**
Eric Martin, Chris Cundy | ICLR 2018

---

## Additional Publications

**Sharpe Ratio-Guided Active Learning for Preference Optimization in RLHF**
Syrine Belakaria, Joshua Kazdan, Charles Marx, Chris Cundy, Willie Neiswanger, Sanmi Koyejo, Barbara E Engelhardt, Stefano Ermon | COLM 2025

**A physics-informed machine learning model for the prediction of drop breakup in two-phase flows**
Chris Cundy, Shahab Mirjalili, Charlélie Laurent, Stefano Ermon, Gianluca Iaccarino, Ali Mani | International Journal of Multiphase Flow, 2024

**Neural Networks and the Chomsky Hierarchy**
Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, Li Kevin Wenliang, Elliot Catt, Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, Pedro A. Ortega | ICLR 2023

**Towards a foundation model for geospatial artificial intelligence**
Gengchen Mai, Chris Cundy, Kristy Choi, Yingjie Hu, Ni Lao, Stefano Ermon | Proceedings of the 30th International Conference on Advances in Geographic Information Systems, 2022

**IQ-Learn: Inverse soft-Q Learning for Imitation**
Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon | NeurIPS 2021

**Exploring Hierarchy-Aware Inverse Reinforcement Learning**
Chris Cundy, Daniel Filan | Workshop on Goal Specifications for Reinforcement Learning, ICML 2018

**Predicting Slow Judgment**
Owain Evans, Andreas Stuhlmüller, Ryan Carey, Neal Jean, Andrew Schreiber, Girish Sastry, Chris Cundy | Aligned Artificial Intelligence Workshop, NeurIPS 2017

---

## Service

**Participant, EU AI Act Code of Practice Working Groups 2 and 4** | 2025
Participated, as an independent expert, in working groups 2 and 4 for the development of the EU AI Act Code of Practice (CoP). Through written and oral contributions, I advocated for the importance of pre- and post-mitigation model evaluations, as outlined in an earlier position paper I authored.

**Teaching Assistant—CS228 (Probabilistic Graphical Models)** | Stanford University | 2023

**Head Teaching Assistant—CS228 (Probabilistic Graphical Models)** | Stanford University | 2022
Received award for excellence (awarded to top 5% of Teaching Assistants).

**Project Supervisor** | Supervised Project for Alignment Research (SPAR), Stanford AI Alignment | 2023
Supervised five undergraduates on a project finding scaling laws in prompt injections. Presented work at the 7th Center for Human-Compatible AI workshop.

**Project Supervisor** | Undergraduate Research Program, Stanford Existential Risk Initiative | 2021
Served as supervisor for an undergraduate project on forecasting AI progress.

**Reviewer** | 2020–Present
Reviewed for the following venues: UAI (2020, 2022, 2025, 2026), ICML (2019, 2020, 2023, 2025, 2026), ICLR (2021–2026), NeurIPS (2021–2025), AAAI (Safe and Robust AI track) (2023–2024), TMLR (2025, 2026).

---

## Relevant Awards

**Winner, OpenAI Preparedness Challenge** | March 2024
- One of the top ten submissions for the OpenAI Preparedness Challenge, for submitting *the most unique, while still being probable, potentially catastrophic misuse of the [OpenAI API]*
- Developed a proof-of-concept showing how GPT-4V, and speech-to-text with GPT-4, could be used to parse vast amounts of unlabelled surveillance data, finding actionable insights for blackmail or insider trading
- Prize: $25,000 in OpenAI credits
