I am an incoming Ph.D. student in Computer Science at UNC Chapel Hill, where I will be advised by Prof. Shashank Srivastava. My research spans Machine Learning and Natural Language Processing, with a focus on interpretability and explainability of ML models, post-training of LLMs, and belief-behavior misalignment in LLMs. My work has been published at ICML 2026 and EMNLP 2025. I am currently a Research Fellow at the Max Planck Institute for Software Systems, working on understanding and training LLMs to balance conflicting behaviors such as creativity and factuality, under the supervision of Prof. Abhilasha Ravichander. I am pursuing my Bachelor's in Computer Science and M.S. by Research in Computational Linguistics at IIIT Hyderabad.
Advisor: Prof. Shashank Srivastava
Directly Optimizing Natural Language Explanations for Behavioral Faithfulness: Simulatability and Recoverability
Advaith Malladi, Shashank Srivastava
International Conference on Machine Learning 2026 (ICML 2026)
[Paper]

Explaining Differences Between Model Pairs in Natural Language through Sample Learning
Advaith Malladi, Rakesh R. Menon, Yuvraj Jain, Shashank Srivastava
Empirical Methods in Natural Language Processing 2025 (EMNLP 2025)
[Paper]

My research spans Machine Learning and Natural Language Processing. I am especially interested in:
Under the guidance of Prof. Abhilasha Ravichander, I am working on understanding how conflicting behaviors such as creativity and factuality interact in LLMs, and training Mixture-of-Experts LLMs to dynamically alternate between such behaviors at a token level.
Under the guidance of Prof. Shashank Srivastava, I have worked on generating faithful natural language explanations of differences between model pairs (accepted at EMNLP 2025), training latent space monitors to detect hallucinations and deception in LLMs, and training language models to generate explanations directly optimized for behavioral faithfulness using RL techniques such as GRPO (accepted at ICML 2026).
Under the guidance of Prof. Radhika Mamidi, I have worked on using Natural Language Inference to detect hallucinations in Definition Modelling, Machine Translation, and Paraphrase Generation (SemEval 2024, 2025). I am currently studying the generalizability of latent space monitors in merged language models and developing techniques to merge such monitors.
I used the planning capabilities of LLMs to generate a dataset of synthetic contact-center call transcripts, then trained smaller models on this synthetic data and compared their performance against models trained on real human-generated call transcripts.
I collaborated with the NLP team at Subtl.AI to build their in-house Retrieval-Augmented Generation (RAG) pipeline. I also created synthetic datasets using LLMs to evaluate the retrieval step of the pipeline.