I am a Senior Applied Scientist at Microsoft Office AI, where I focus on advancing intelligent document understanding and generation to improve workplace productivity. I earned my Ph.D. in Computer Science from Purdue University, working with Prof. Lin Tan on applying Artificial Intelligence techniques to Software Engineering.
I have broad interests in applied artificial intelligence. My research addresses challenges in how software is created, maintained, and used in daily work. Specifically, I develop AI-driven methods for code generation, automated program repair, and vulnerability detection, with the aim of accelerating implementation, reducing errors, and improving reliability. At Microsoft, I extend these approaches to build document intelligence systems that help knowledge workers understand, draft, and edit documents more efficiently, reducing repetitive tasks and freeing time for higher-level creative and analytical work.
Publications
- [ACL-2025] Can Language Models Replace Programmers? REPOCOD Says 'Not Yet'
Shanchao Liang, Nan Jiang, Yiran Hu, Lin Tan
- [ACL-2025] WAFFLE: Multi-Modal Model for Automated Front-End Development
Shanchao Liang, Nan Jiang, Shangshu Qian, Lin Tan
- [ICLR-2025] Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning
Nan Jiang, Chengxiao Wang, Kevin Liu, Xiangzhe Xu, Lin Tan, Xiangyu Zhang
- [ICRA-2025] SELP: Generating Safe and Efficient Task Plans for Robot Agents with Large Language Models
Yi Wu, Zikang Xiong, Yiran Hu, Shreyash S. Iyengar, Nan Jiang, Aniket Bera, Lin Tan, Suresh Jagannathan
- [OOPSLA-2025] Show Me Why It's Correct: Saving 1/3 of Debugging Time in Program Repair with Interactive Runtime Comparison
Ruixin Wang, Zhongkai Zhao, Le Fang, Nan Jiang, Yiling Lou, Lin Tan, Tianyi Zhang
- [AAAI-2025] LATTE: Improving Latex Recognition for Tables and Formulae with Iterative Refinement
Nan Jiang, Shanchao Liang, Chengxiao Wang, Jiannan Wang, Lin Tan
- [NeurIPS-2024] Training LLMs to Better Self-Debug and Explain Code
Nan Jiang, Xiaopeng Li, Shiqi Wang, Qiang Zhou, Soneya Binta Hossain, Baishakhi Ray, Varun Kumar, Xiaofei Ma, Anoop Deoras
- [CCS-2024] ReSym: Harnessing LLMs to Recover Variable and Data Structure Symbols from Stripped Binaries
🏆 Won ACM SIGSAC Distinguished Paper
Danning Xie, Zhuo Zhang, Nan Jiang, Xiangzhe Xu, Lin Tan, Xiangyu Zhang
- [SANER-2025] How Effective are Large Language Models in Generating Software Specifications?
Danning Xie, Byungwoo Yoo, Nan Jiang, Mijung Kim, Lin Tan, Xiangyu Zhang, Judy S. Lee
- [NDSS-2025] Unleashing the Power of Generative Model in Recovering Variable Names from Stripped Binary
Xiangzhe Xu, Zhuo Zhang, Zian Su, Ziyang Huang, Shiwei Feng, Yapeng Ye, Nan Jiang, Danning Xie, Siyuan Cheng, Lin Tan, Xiangyu Zhang
- [FSE-2024] A Deep Dive into Large Language Models for Automated Bug Localization and Repair
Soneya Binta Hossain, Nan Jiang, Qiang Zhou, Xiaopeng Li, Wen-Hao Chiang, Yingjun Lyu, Hoan Nguyen, Omer Tripp
- [ISSTA-2023] How Effective Are Neural Networks for Fixing Security Vulnerabilities
Yi Wu, Nan Jiang, Hung Viet Pham, Thibaud Lutellier, Jordan Davis, Lin Tan, Petr Babkin, Sameena Shah
- [ICSE-2023] Impact of Code Language Models on Automated Program Repair
Nan Jiang, Kevin Liu, Thibaud Lutellier, Lin Tan
- [ICSE-2023] KNOD: Domain Knowledge Distilled Tree Decoder for Automated Program Repair
Nan Jiang, Thibaud Lutellier, Yiling Lou, Lin Tan, Dan Goldwasser, Xiangyu Zhang
- [ICSE-2021] CURE: Code-Aware Neural Machine Translation for Automatic Program Repair
Nan Jiang, Thibaud Lutellier, Lin Tan
- [NMI-2022 (Journal)] Quantifying Spatial Homogeneity of Urban Road Networks via Graph Neural Networks
Jiawei Xue, Nan Jiang, Senwei Liang, Qiyuan Pang, Takahiro Yabe, Satish V. Ukkusuri, Jianzhu Ma
Preprints
- Collu-Bench: A Benchmark for Predicting Language Model Hallucinations in Code
Nan Jiang, Qi Li, Lin Tan, Tianyi Zhang
Services
- PC Member, AAAI 2026
- PC Member, ASE 2025
- PC Member, CIKM 2025
- PC Member, NeurIPS 2025 Datasets & Benchmark Track
- PC Member, FORGE 2025 Data and Benchmarking Track (Co-located with ICSE 2025)
- PC Member, ICLR 2025
- PC Member, Workshop on Automated Program Repair (Co-located with ICSE 2025)
- PC Member, ASE 2024 Artifact Evaluation Track
- PC Member, SCAM 2024 Research Track
- PC Member, FSE 2023 Artifact Evaluation Track
- Reviewer, TSE 2024 (five times), TSE 2025
- Reviewer, TOSEM 2023, TOSEM 2024 (seven times)
- Reviewer, EISEJ 2024 (twice)
- Reviewer, Open Research Europe 2024
Teaching
- Teaching Assistant, CS408 Software Testing, Purdue University (Fall 2023, Fall 2022)
- Teaching Assistant, CS251 Data Structures and Algorithms, Purdue University (Spring 2021)
- Teaching Assistant, CS380 Python Programming, Purdue University (Fall 2020)
- Teaching Assistant, CS180 Problem Solving and Object-Oriented Programming, Purdue University (Spring 2020)
Education
- Ph.D. in Computer Science, Purdue University, 2025 (expected)
- B.S. in Computer Science, Peking University, 2019