Advaith Malladi

Incoming Ph.D. in Computer Science @ UNC Chapel Hill  |  ML Research @ UNC Chapel Hill, MPI-SWS, IIIT Hyderabad
advaithmalladi02@gmail.com

I am an incoming Ph.D. student in Computer Science at UNC Chapel Hill, where I will be advised by Prof. Shashank Srivastava. My research spans Machine Learning and Natural Language Processing, with a focus on interpretability and explainability of ML models, post-training of LLMs, and belief-behavior misalignment in LLMs. My work has been published at ICML 2026 and EMNLP 2025. I am currently a Research Fellow at the Max Planck Institute for Software Systems, working on understanding and training LLMs to balance conflicting behaviors such as creativity and factuality, under the supervision of Prof. Abhilasha Ravichander. I am pursuing my Bachelor's in Computer Science and M.S. by Research in Computational Linguistics at IIIT Hyderabad.

Education

University of North Carolina at Chapel Hill

Incoming Ph.D. in Computer Science

Advisor: Prof. Shashank Srivastava

August 2026 -

International Institute of Information Technology - Hyderabad

B.Tech in Computer Science (Hons) + M.S. by Research in Computational Linguistics
Advised by Dr. Radhika Mamidi in the integrated bachelor's and master's by research program
CGPA: 8.47/10  |  TOEFL: 115/120
Relevant Coursework:
  • Introduction to NLP
  • Advanced NLP
  • Information Retrieval and Extraction
  • Cognitive Science and AI
  • Computational Linguistics
  • Computational Psycholinguistics
2021 - 2026

Publications

Directly Optimizing Natural Language Explanations for Behavioral Faithfulness: Simulatability and Recoverability

Advaith Malladi, Shashank Srivastava

International Conference on Machine Learning 2026 (ICML 2026)

[Paper]

Explaining Differences Between Model Pairs in Natural Language through Sample Learning

Advaith Malladi, Rakesh R. Menon, Yuvraj Jain, Shashank Srivastava

Empirical Methods in Natural Language Processing 2025 (EMNLP 2025)

[Paper]

Research Interests

My research spans Machine Learning and Natural Language Processing. I am especially interested in:

  • Interpretability and Explainability of ML Models. Understanding the internal mechanisms of ML models and developing methods to make their decision-making processes transparent and interpretable to humans.
  • Generating Faithful Natural Language Explanations for Model Behavior. Training language models to generate explanations that are directly optimized for faithfulness and simulatability, enabling humans to better understand and predict model behavior.
  • Post-Training of LLMs using RLHF Techniques. Exploring reinforcement learning from human feedback techniques such as GRPO to align LLMs with desired behaviors after pretraining.
  • Belief-Behavior Misalignment and Deception in LLMs. Investigating how and why LLMs exhibit misalignment between their internal beliefs and external behaviors, including studying deception and training latent space monitors to detect such misalignment.
  • Conflicting Behaviors in LLMs. Understanding how contradicting properties such as creativity and factuality interact within LLMs, and training models to dynamically alternate between such behaviors at a token level.
  • Mixture-of-Experts (MoE) Language Models. Exploring the architecture and training of MoE models to enable specialized expert routing, particularly for handling conflicting behavioral objectives.
  • Analyzing Chain-of-Thought in Thinking-Based Models. Investigating the misalignment between the chain-of-thought reasoning and the final answers of thinking-based models — where a correct CoT may lead to an incorrect answer, or a correct answer may stem from a flawed CoT.

Research Experience

Research Fellow

Under the guidance of Prof. Abhilasha Ravichander, I am working on understanding how conflicting behaviors such as creativity and factuality interact in LLMs, and training Mixture-of-Experts LLMs to dynamically alternate between such behaviors at a token level.

March 2026 - August 2026

Undergraduate Researcher

Under the guidance of Prof. Shashank Srivastava, I have worked on generating faithful natural language explanations to explain differences between model pairs (accepted at EMNLP 2025), training latent space monitors to detect hallucinations and deception in LLMs, and training language models to generate explanations directly optimized for behavioral faithfulness using RLHF techniques such as GRPO (accepted at ICML 2026).

January 2024 - Present

Undergraduate Researcher

Under the guidance of Prof. Radhika Mamidi, I have worked on utilizing Natural Language Inference to detect hallucinations in Definition Modelling, Machine Translation, and Paraphrase Generation (SemEval 2024, 2025). I am currently working on the generalizability of latent space monitors in merged language models and on techniques to merge latent space monitors.

January 2024 - Present

Industry Experience

Machine Learning Research Intern

I worked on utilizing the planning capabilities of LLMs to generate a dataset of synthetic contact-center call transcripts. I utilized these synthetic call transcripts to train smaller models and evaluated their performance compared to models trained on real human-generated call transcripts.

March 2024 - July 2024

Machine Learning Intern

I collaborated with the NLP team at Subtl.AI to build the in-house Retrieval Augmented Generation (RAG) pipeline. I also created synthetic datasets using LLMs to evaluate the retrieval step of the RAG pipeline.

September 2023 - February 2024

Teaching Experience

Teaching Assistant

  • 2025  |  CS4.406 Information Retrieval and Extraction
  • 2025  |  CL2.404 Computational Psycholinguistics
  • 2024  |  CS7.501 Advanced Natural Language Processing
  • 2024  |  CS7.401 Introduction to Natural Language Processing
  • 2023  |  CS1.301 Algorithm Analysis and Design
August 2023 - December 2025

Relevant Projects

Retrieval Guided Code Generation Leveraging the Planning Capabilities of LLMs

  • We used retrieval-augmented generation (RAG) to improve code generation by providing relevant examples.
  • We incorporated LLM planning capabilities to generate structured pseudocode before actual code generation.
  • We found that retrieval with planning significantly improves code accuracy and coherence compared to zero-shot generation.
  • Link: Code Generation
Oct 2024 - Nov 2024

Prompt-tuned vs. Fine-tuned Models: Which Better Accounts for Brain Language (and Vision) Representations?

  • We compared fine-tuned and prompt-tuned representations from decoder language models to determine which better accounts for the brain's language representations.
  • We extended the same idea to vision encoder models, implementing vision prompt tuning along the way.
  • We found that prompt-tuned text representations align more closely with the brain's representations, while for vision, fine-tuned representations are closer.
  • Link: Fine Tune vs Prompt Tune
Mar 2024 - Apr 2024

Retrieval Augmented Multimodal Factual Verification

  • Given a claim, we built a pipeline to retrieve the relevant paragraphs, tables, and images from Wikipedia, which serve as evidence.
  • After retrieving the evidence, we used two kinds of models for claim verification: FactBERT+DINO and BridgeTower (presented at AAAI 2023).
  • Link: Multimodal Factual Verification
Aug 2023 - Nov 2023

Parameter Efficient Prompt Tuning of GPT-2

Oct 2023 - Nov 2023

Attention Is All You Need

  • Implemented the encoder-decoder architecture presented in the "Attention Is All You Need" paper from scratch, without using ready-made encoder-decoder modules.
  • Built a machine translation system for English to French.
  • Link: Eng2French
Sep 2023 - Oct 2023

Decoder Language Model

Aug 2023 - Sep 2023

Evaluating Discourse Coherence in Paragraphs

  • Built a stacked LSTM model that detects topic coherence and temporal/sequential coherence in a paragraph with an accuracy of 0.8.
  • The model was trained to recognize two kinds of discourse coherence, temporal and topical, by inducing different kinds of negative sampling.
  • Link: Textual Coherence
Jan 2023 - Apr 2023

Embeddings from Language Models (ELMo)

  • Trained a stacked Bi-LSTM model on masked language modelling to learn contextual word embeddings for downstream tasks.
  • Link: ELMO
Apr 2023 - Apr 2023

Awards

  • 2025  |  Dean's Research Award, IIIT Hyderabad
  • 2024  |  Dean's Research Award, IIIT Hyderabad
  • 2024  |  Dean's Merit List (top 20%), IIIT Hyderabad
  • 2024  |  7th Place in Amazon ML Challenge, Amazon
  • 2023  |  1st Place in Megathon'23, IIIT Hyderabad & Qualcomm

Service

Reviewing

  • 2025  |  ACL Rolling Review, Reviewer
  • 2024  |  ACL Rolling Review, Reviewer

University Positions

  • 2023 - 2025  |  Apex Body Member, IIIT Hyderabad
  • 2022 - 2023  |  Undersecretary, Clubs Council, IIIT Hyderabad