Projects

My main project right now is my work as CTO of Fulcrum Research. We are building products and doing research on how to oversee the computations of AI systems. This involves both orchestrating and scaling the use of agents without losing control, and being able to precisely answer questions about agent behavior. With Fulcrum, I made the following software:

Lunette, a product/infrastructure to run and audit model evaluations
Orchestra and Quibbler, open-source tooling for a new kind of coding, where the role of the human in the loop is to communicate intent and design

More coming soon!

Research

As an undergrad, I was fortunate to work with many researchers in MIT CSAIL.

In the Jacob Andreas lab, I studied how to optimize language models for direct collaboration with humans via RL, and also did research on long-horizon software evals. I also worked in the Isola lab on in-context learning and inner optimization in transformers, and before that in the Tegmark and Solar-Lezama labs, all at MIT.

Papers:

Breakpoint: Scalable evaluation of system-level reasoning in LLM code agents. Kaivalya Hariharan*, Uzay Girit*, Atticus Wang, Jacob Andreas. CoLM, 2025. A methodology to synthetically generate software tasks of arbitrary difficulty, with a detailed analysis of agent behavior and static measures of difficulty.
The Quantization Model of Neural Scaling. Eric J. Michaud, Ziming Liu, Uzay Girit, Max Tegmark. NeurIPS, 2023. Modeling capability emergence in terms of structure in the task distribution of language.
Lower Data Diversity Accelerates Training: Case Studies in Synthetic Tasks. Suhas Kotha*, Uzay Girit*, Tanishq Kumar*, Gauran Ghosal, Aditi Raghunathan. In submission. Investigating an ICL generalization phenomenon where memorization acts as a pathway to generalization.
Between the Bars: Gradient-based Jailbreaks are Bugs that induce Features. Kaivalya Hariharan*, Uzay Girit*. Accepted to NeurIPS ATTRIB 2024, Redteaming Adversaries. An analysis of structure and patterns in language model adversaries.

Software

New things coming soon
Archivy, popular self-hostable and extensible knowledge management software.
Espial, an engine for automated organization and discovery of personal knowledge. See the demo here.
Dust, where I worked in summer 2023 with Stanislas Polu when the team was only 4 people. I optimized parts of the Rust backend and built a new power-user AI collaboration product.
AdiosCorona, a general resource on COVID guidelines I built with a group of French scientists, which delivered information to millions of people during the pandemic.

See GitHub for more.