Hey there!

My name is Coby, and I’m an AI safety researcher with interests in interpretability, model psychology, and model organisms of misalignment. I have a broader interest in most things related to AI ethics, analytical philosophy, math, and biosafety. In my free time, I love composing and performing music, dancing Argentine tango, and (ski) mountaineering. I’m currently doing my undergrad at Yale.

See my research or personal writing.


News

  • February 2026 — We have started work on a research project on emergent misalignment.
  • January 2026 — I’m now on the board of Yale AI Alignment and a facilitator for its intro fellowship, and I am the intro fellowship manager for Yale Effective Altruism (YEA).
  • December 2025 — We finished our privacy neuron editing project, comparing inference-time interventions for suppressing PII leakage from LLMs.
  • October 2025 — I am now on the board for Yale Effective Altruism (YEA), and I am a facilitator for our intro fellowship.
  • August 2025 — We wrapped up extensions of our SPAR project, producing a manifold neural block with differentiable Betti numbers and a simplex-constrained transformer.
  • May 2025 — I finished SPAR and presented our findings (so far) at the poster session. We are continuing to work on the project beyond the program.
  • February 2025 — I was accepted to the SPAR program under Dr. Ronak Mehta to research models that are inherently interpretable.
  • November 2024 — Our activation steering data extraction project placed 7th in the LLM Privacy Contest at NeurIPS.
  • May 2016 — I have graduated from the third grade; I could not have achieved this momentous milestone without the support of my beloved family and friends.
  • February 2009 — I am enormously excited to share that I have just turned two years old.