PhD Student, Computer Science
The University of Texas at Austin
saisurya [at] cs [dot] utexas [dot] edu
Hi, I'm Sai! I am a fifth-year PhD student in Computer Science at UT Austin, advised by Prof. Inderjit S. Dhillon. My research goal is to build data-efficient LLMs through (a) efficient architectures for long-context understanding and reasoning, and (b) optimization algorithms that get the best out of each batch. My work usually has a linear algebraic flavour, using theoretical insights to build algorithms with strong empirical performance, and some of these ideas have found their way into Google, Meta, and Microsoft.
Before my PhD, I spent two years at Microsoft Research collaborating with Neeraj Kayal, Ankit Garg, and Venkata N. Padmanabhan, which is where I got hooked on linear algebra and machine learning. I completed my B.Tech in CS at IIT Kharagpur. I have been fortunate to intern at Google Ads, Google DeepMind, Meta (FAIR), and IBM Research, where I met amazing collaborators including Rohan Anil, Manzil Zaheer, Cho-Jui Hsieh, and Abhijit Mishra.
A deep dive into how preconditioning the attention matrix fixes attention noise in long-context LLMs.