Research interests

  • Mechanistic interpretability of Transformers and LLMs
  • Latent experts / subnetworks and modular reasoning
  • Evaluation of long-form reasoning and creativity (LongWriter / HelloBench / WritingBench)
  • Causal RL and self-consistent data generation for reasoning