About
I am a research scientist at IBM Research in the trustworthy AI group. My primary research interests lie in the steerability of generative models, multi-agent coordination and alignment, and the dynamics of human-AI interaction.
Prior to joining IBM Research, I was a postdoctoral research associate in the Coordinated Science Lab at the University of Illinois at Urbana–Champaign, where I worked with Tamer Başar and Cedric Langbort on problems in reinforcement learning, control, and games. I obtained my PhD in electrical engineering and computer science from the University of Michigan under Demos Teneketzis.
Model steerability and safety
I share the view that model steerability (i.e., controllability) is a more direct and sustainable target for AI safety than model alignment: it forces us to ask how corrigibility and oversight can be preserved as model capabilities grow. Understanding how steerable a model is toward particular behaviors also informs which capabilities should be constrained. See Kush Varshney's interview and Helen Toner's post for more on these ideas.
To better understand model steerability in general, I've been leading the development of the AI Steerability 360 toolkit. The toolkit supports the design of new steering methods, analysis of how much a model can be controlled (i.e., steered), and empirical study of any unintended side effects introduced by steering.
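As a toy illustration of what "measuring steerability" can mean (this is not the toolkit's API; every name below is hypothetical), one simple formulation is to quantify how much an additive intervention on a model's logits shifts probability mass toward a target behavior:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def steer(logits, direction, strength):
    """Toy steering intervention: shift logits along a direction vector."""
    return [x + strength * d for x, d in zip(logits, direction)]

def steerability_gain(logits, direction, strength, target_idx):
    """Probability mass gained on the target behavior after steering."""
    before = softmax(logits)[target_idx]
    after = softmax(steer(logits, direction, strength))[target_idx]
    return after - before

# Three hypothetical "behaviors"; steer toward the third (index 2).
logits = [2.0, 1.0, 0.5]
direction = [0.0, 0.0, 1.0]
gain = steerability_gain(logits, direction, strength=2.0, target_idx=2)
print(f"probability gain on target behavior: {gain:.3f}")
```

A larger gain for a given intervention strength indicates a model that is more steerable toward that behavior; the same comparison run on off-target behaviors is one crude way to surface unintended side effects.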
Multi-agent systems
Another focus of my research is multi-agent systems: specifically, how we can effectively monitor whether collectives of LLM-based agents are behaving as intended, and how we can intervene if their goals begin to drift. Because agentic AI systems are being deployed in environments that interface with humans, the tools used to understand these systems must draw on theory from domains beyond engineering. Check out our workshop for relevant discussions.