Research
Currently working in the Jacob Andreas lab on optimizing language models to collaborate directly with humans via reinforcement learning (RL).
In the past, I worked in the Isola lab on in-context learning and inner optimization in transformers, and before that in the Tegmark and Solar-Lezama labs, all at MIT.
Papers:
- Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents. Kaivalya Hariharan*, Uzay Girit*, Atticus Wang, Jacob Andreas. CoLM, 2025. A methodology to synthetically generate software tasks of arbitrary difficulty, with a detailed analysis of agent behavior and static measures of difficulty.
- The Quantization Model of Neural Scaling. Eric J. Michaud, Ziming Liu, Uzay Girit, Max Tegmark. NeurIPS, 2023. Modeling capability emergence in terms of structure in the task distribution of language.
- Lower Data Diversity Accelerates Training: Case Studies in Synthetic Tasks. Suhas Kotha*, Uzay Girit*, Tanishq Kumar*, Gaurav Ghosal, Aditi Raghunathan. In submission. Investigating an in-context learning (ICL) generalization phenomenon where memorization acts as a pathway to generalization.
- Between the Bars: Gradient-based Jailbreaks are Bugs that Induce Features. Kaivalya Hariharan*, Uzay Girit*. Accepted to the NeurIPS 2024 ATTRIB and Red Teaming workshops. An analysis of structure and patterns in language model adversaries.
Software
- New things coming soon.
- Archivy, a popular self-hostable and extensible knowledge management tool.
- Espial, an engine for automated organization and discovery of personal knowledge. See the demo here.
- Dust, where I worked in summer 2023 with Stanislas Polu, back when the company was only four people. I optimized parts of the Rust backend and built a new power-user AI collaboration product.
- AdiosCorona, a resource on COVID-19 guidelines that I built with a group of French scientists and that delivered information to millions of people during the pandemic.
See GitHub for more.
Copyright © 2019-2025 uzpg