Advaith Malladi

Incoming Ph.D. in Computer Science @ UNC Chapel Hill  |  ML Research @ UNC Chapel Hill, MPI-SWS, IIIT Hyderabad
advaithmalladi02@gmail.com

I am an incoming Ph.D. student in Computer Science at UNC Chapel Hill, where I will be advised by Prof. Shashank Srivastava. My research spans Machine Learning and Natural Language Processing, with a focus on interpretability and explainability of ML models, post-training of LLMs, and belief-behavior misalignment in LLMs. My work has been published at ICML 2026 and EMNLP 2025. I am currently a Research Fellow at the Max Planck Institute for Software Systems, working on understanding and training LLMs to balance conflicting behaviors such as creativity and factuality, under the supervision of Prof. Abhilasha Ravichander. I am pursuing my Bachelor's in Computer Science and M.S. by Research in Computational Linguistics at IIIT Hyderabad.

Education

University of North Carolina at Chapel Hill

Incoming Ph.D. in Computer Science

Advisor: Prof. Shashank Srivastava

August 2026 -

International Institute of Information Technology - Hyderabad

B.Tech in Computer Science (Hons) + M.S. by Research in Computational Linguistics
Advised by Dr. Radhika Mamidi in the integrated bachelor's and master's by research program
CGPA: 8.47/10  |  TOEFL: 115/120
Relevant Coursework:
  • Introduction to NLP
  • Advanced NLP
  • Information Retrieval and Extraction
  • Cognitive Science and AI
  • Computational Linguistics
  • Computational Psycholinguistics
2021 - 2026

Publications

Directly Optimizing Natural Language Explanations for Behavioral Faithfulness: Simulatability and Recoverability

Advaith Malladi, Shashank Srivastava

International Conference on Machine Learning 2026 (ICML 2026)

[Paper]

Explaining Differences Between Model Pairs in Natural Language through Sample Learning

Advaith Malladi, Rakesh R. Menon, Yuvraj Jain, Shashank Srivastava

Empirical Methods in Natural Language Processing 2025 (EMNLP 2025)

[Paper]

Research Interests

My research spans Machine Learning and Natural Language Processing. I am especially interested in:

  • Interpretability and Explainability of ML Models. Understanding the internal mechanisms of ML models and developing methods to make their decision-making processes transparent and interpretable to humans.
  • Generating Faithful Natural Language Explanations for Model Behavior. Training language models to generate explanations that are directly optimized for faithfulness and simulatability, enabling humans to better understand and predict model behavior.
  • Post-Training of LLMs using RLHF Techniques. Exploring reinforcement learning from human feedback techniques such as GRPO to align LLMs with desired behaviors after pretraining.
  • Belief-Behavior Misalignment and Deception in LLMs. Investigating how and why LLMs exhibit misalignment between their internal beliefs and external behaviors, including studying deception and training latent space monitors to detect such misalignment.
  • Conflicting Behaviors in LLMs. Understanding how contradicting properties such as creativity and factuality interact within LLMs, and training models to dynamically alternate between such behaviors at a token level.
  • Mixture-of-Experts (MoE) Language Models. Exploring the architecture and training of MoE models to enable specialized expert routing, particularly for handling conflicting behavioral objectives.
  • Analyzing Chain-of-Thought in Thinking-Based Models. Investigating the misalignment between the chain-of-thought reasoning and the final answers of thinking-based models — where a correct CoT may lead to an incorrect answer, or a correct answer may stem from a flawed CoT.

Research Experience

Research Fellow

Under the guidance of Prof. Abhilasha Ravichander, I am working on understanding how conflicting behaviors such as creativity and factuality interact in LLMs, and training Mixture-of-Experts LLMs to dynamically alternate between such behaviors at a token level.

March 2026 - August 2026

Undergraduate Researcher

Under the guidance of Prof. Shashank Srivastava, I have worked on generating faithful natural language explanations to explain differences between model pairs (accepted at EMNLP 2025), training latent space monitors to detect hallucinations and deception in LLMs, and training language models to generate explanations directly optimized for behavioral faithfulness using RLHF techniques such as GRPO (accepted at ICML 2026).

January 2024 - Present

Undergraduate Researcher

Under the guidance of Prof. Radhika Mamidi, I have worked on utilizing Natural Language Inference to detect hallucinations in Definition Modelling, Machine Translation, and Paraphrase Generation (SemEval 2024, 2025). I am currently working on the generalizability of latent space monitors in merged language models and on techniques to merge latent space monitors.

January 2024 - Present

Industry Experience

Machine Learning Research Intern

I worked on utilizing the planning capabilities of LLMs to generate a dataset of synthetic contact-center call transcripts. I utilized these synthetic call transcripts to train smaller models and evaluated their performance compared to models trained on real human-generated call transcripts.

March 2024 - July 2024

Machine Learning Intern

I collaborated with the NLP team at Subtl.AI to build the in-house Retrieval Augmented Generation (RAG) pipeline. I also created synthetic datasets using LLMs to evaluate the retrieval step of the RAG pipeline.

September 2023 - February 2024

Teaching Experience

Teaching Assistant

  • 2025  |  CS4.406 Information Retrieval and Extraction
  • 2025  |  CL2.404 Computational Psycholinguistics
  • 2024  |  CS7.501 Advanced Natural Language Processing
  • 2024  |  CS7.401 Introduction to Natural Language Processing
  • 2023  |  CS1.301 Algorithm Analysis and Design
August 2023 - December 2025

Relevant Projects

Retrieval Guided Code Generation Leveraging the Planning Capabilities of LLMs

  • We used retrieval-augmented generation (RAG) to improve code generation by providing relevant examples.
  • We incorporated LLM planning capabilities to generate structured pseudocode before actual code generation.
  • We found that retrieval with planning significantly improves code accuracy and coherence compared to zero-shot generation.
  • Link: Code Generation
Oct 2024 - Nov 2024

Prompt-tuned vs. Fine-tuned Models: Which Better Accounts for Brain Language (and Vision) Representations?

  • We compared fine-tuned and prompt-tuned representations from decoder language models to determine which better accounts for the brain's language representations.
  • We extended the same idea to vision encoder models, implementing vision prompt tuning along the way.
  • We found that prompt-tuned text representations align more closely with the brain's representations, while for vision, fine-tuned representations are closer.
  • Link: Fine Tune vs Prompt Tune
Mar 2024 - Apr 2024

Retrieval Augmented Multimodal Factual Verification

  • Given a claim, we built a pipeline to retrieve the relevant paragraphs, tables, and images from Wikipedia, which serve as evidence.
  • After retrieving the evidence, we used two kinds of models for claim verification: FactBERT+DINO and BridgeTower (presented at AAAI 2023).
  • Link: Multimodal Factual Verification
Aug 2023 - Nov 2023

Parameter Efficient Prompt Tuning of GPT-2

Oct 2023 - Nov 2023

Attention Is All You Need

  • Implemented the encoder-decoder architecture presented in the "Attention Is All You Need" paper from scratch, without using ready-made encoder-decoder modules.
  • Built a machine translation system for English to French.
  • Link: Eng2French
Sep 2023 - Oct 2023

Decoder Language Model

Aug 2023 - Sep 2023

Evaluating Discourse Coherence in Paragraphs

  • Built a stacked LSTM model that detects topic coherence and temporal/sequential coherence in a paragraph with an accuracy of 0.8.
  • The model was trained to recognize two kinds of discourse coherence, temporal and topical, by inducing different kinds of negative sampling.
  • Link: Textual Coherence
Jan 2023 - Apr 2023

Embeddings from Language Models (ELMo)

  • Trained a stacked Bi-LSTM model on masked language modelling to learn contextual word embeddings for downstream tasks.
  • Link: ELMO
Apr 2023 - Apr 2023

Awards

  • 2025  |  Dean's Research Award, IIIT Hyderabad
  • 2024  |  Dean's Research Award, IIIT Hyderabad
  • 2024  |  Dean's Merit List (top 20%), IIIT Hyderabad
  • 2024  |  7th Place in Amazon ML Challenge, Amazon
  • 2023  |  1st Place in Megathon'23, IIIT Hyderabad & Qualcomm

Service

Reviewing

  • 2025  |  ACL Rolling Review, Reviewer
  • 2024  |  ACL Rolling Review, Reviewer

University Positions

  • 2023 - 2025  |  Apex Body Member, IIIT Hyderabad
  • 2022 - 2023  |  Undersecretary, Clubs Council, IIIT Hyderabad