Coby Kassner

Hey there!

My name is Coby, and I’m an AI safety researcher with interests in interpretability, model psychology, and model organisms of misalignment. I have a broader interest in most things related to AI ethics, analytical philosophy, math, and biosafety. In my free time, I love composing and performing music, dancing Argentine tango, and (ski) mountaineering. I’m currently doing my undergrad at Yale.

See my research or personal writing. If you’re at Yale, I’d love to grab a meal.

News

February 2026 — We have started a research project on emergent misalignment.
January 2026 — I’m now on the board of Yale AI Alignment and a facilitator for its intro fellowship, and I am the intro fellowship manager for YEA.
December 2025 — We finished our privacy neuron editing project, comparing inference-time interventions for suppressing PII leakage from LLMs.
October 2025 — I am now on the board of Yale Effective Altruism (YEA), and I am a facilitator for our intro fellowship.
August 2025 — We wrapped up extensions of our SPAR project, producing a manifold neural block with differentiable Betti numbers and a simplex-constrained transformer.
May 2025 — I finished SPAR and presented our findings (so far) at the poster session. We are continuing to work on the project beyond the program.
February 2025 — I was accepted to the SPAR program under Dr. Ronak Mehta to research models that are inherently interpretable.
November 2024 — Our activation steering data extraction project placed 7th in the LLM Privacy Contest at NeurIPS.
May 2016 — I have graduated the third grade; I could not have achieved this momentous milestone without the support of my beloved family and friends.
February 2009 — I am enormously excited to share that I have just turned two years old.

cobylk.io
