I am a research scientist at IBM Research in the trustworthy AI group. My primary research interests lie in multi-agent coordination and alignment, fairness of learning algorithms in sequential decision environments, and human-AI interaction (e.g., via language models).

Prior to joining IBM Research, I was a postdoctoral research associate in the Coordinated Science Lab at the University of Illinois at Urbana–Champaign, where I worked with Tamer Başar and Cedric Langbort on problems in reinforcement learning, control, and games. I obtained my PhD in electrical engineering and computer science at the University of Michigan under Demos Teneketzis.



Erik Miehling

erik.miehling at ibm.com

Research Scientist
IBM Research
Dublin, Ireland

News and updates

  • [Oct 2024]:
    • Our paper on evaluating the prompt steerability of LLMs was accepted to the Pluralistic Alignment workshop at NeurIPS 2024. Preprint to appear shortly.
    • Our attack atlas paper was accepted to the Red Teaming GenAI workshop at NeurIPS 2024. Check out the preprint.
    • Our paper on conversational maxims for human-AI interactions was accepted to EMNLP 2024 Findings!
  • [Sept 2024]:
    • Check out our new preprint describing how to use activation steering to control refusal behavior in language models.
  • [Aug 2024]:
    • Serving as a program committee member for AAAI 2025.
  • [June 2024]:
    • Posted a preprint (submitted to EMNLP 2024) on the development of black-box contrastive explanation methods for LLMs. We apply these methods to enable explainability in open-text generation, automated red-teaming, and conversational AI.
    • Posted a preprint (submitted to EMNLP 2024) proposing a set of conversational maxims for human-AI interactions. We demonstrate that some current language models possess an internal prioritization of principles that significantly impacts their ability to interpret the maxims.
    • Served as a program committee member for AIES 2024.
  • [Feb 2024]:
    • Check out a preprint of our paper "Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations". We outline some of the ongoing efforts at IBM Research in building light-weight natural language classifiers.
    • Served as a program committee member for FAccT 2024, HCXAI 2024, CHIWORK 2024, and RecSys 2024.