Di Zhang
I am a PhD candidate at Fudan University, working on chemical AI, multimodal reasoning, test-time search, reinforcement learning, and foundation models for scientific discovery. My recent work includes ChemLLM, ChemVLM, MCTSr, Llama-Berry, Critic-V, Chem-R, and related systems for reasoning and chemistry.
Since 2026, I have been a Research Resident and Head of Agent Model at MindLab. Previously, I interned at NVIDIA Research (2025 to 2026) and Shanghai AI Lab (2023 to 2025), worked full-time as a machine learning developer at Alibaba (2022 to 2023), and earned my Master of Engineering at the USTC Robotics Lab (2019 to 2022). I also interned at Ant Group and MIT Han Lab.
Research
My current research centers on three directions I have pursued since 2025:
- LLM reasoning: test-time scaling, reinforcement learning, tree search, self-evaluation, critic models, and controllable reasoning, represented by Llama-Berry, Control-R, SELT, Chem-R, Critic-V, and TinyEye.
- Scientific intelligence: foundation models and reasoning systems for chemistry, materials science, molecules, and scientific discovery, represented by ChemVLM, Mol-R1, MolReflect, ChemAgent, MOOSE-Chem3, CMPhysBench, and LoRA-Chem.
- Agentic learning: tool-using agents, memory, retrieval-time critique, scalable training/serving infrastructure, and protocols for agentic model development, represented by MinT, delta-mem, Retrieval Is Not Enough, and MCP-based reasoning.
Selected Publications
Visit Google Scholar for the complete publication list.

Llama-Berry: Pairwise Optimization for Olympiad-Level Mathematical Reasoning via o1-like Monte Carlo Tree Search
Di Zhang, Jianbo Wu, Jingdi Lei, Tong Che, Jiatong Li, Tong Xie, Xiaoshui Huang, Shufei Zhang, Marco Pavone, Yuqiang Li, and others.
Proceedings of NAACL-HLT 2025, Long Papers, pages 7315-7337.

Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning
Di Zhang, Jingdi Lei, Junxian Li, Xunzhi Wang, Yujie Liu, Zonglin Yang, Jiatong Li, Weida Wang, Suorong Yang, Jianbo Wu, and others.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2025, pages 9050-9061.


ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area
Junxian Li, Di Zhang, Xunzhi Wang, Zeying Hao, Jingdi Lei, Qian Tan, Cai Zhou, Wei Liu, Yaotian Yang, Xinrui Xiong, and others.
Proceedings of the AAAI Conference on Artificial Intelligence, 39(1), 415-423, 2025.


Publications
2026
- MinT: Managed Infrastructure for Training and Serving Millions of LLMs.
Mind Lab, Song Cao, Vic Cao, Andrew Chen, Kaijie Chen, Cleon Cheng, Steven Chiang, Kaixuan Fan, Hera Feng, Huan Feng, and others. arXiv preprint arXiv:2605.13779, 2026. HF Paper / arXiv / PDF
- delta-mem: Efficient Online Memory for Large Language Models.
Jingdi Lei, Di Zhang, Junxian Li, Weida Wang, Kaixuan Fan, Xiang Liu, Qihan Liu, Xiaoteng Ma, Baian Chen, Soujanya Poria. arXiv preprint arXiv:2605.12357, 2026. HF Paper / arXiv / PDF
- Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text.
Ximing Lu, David Acuna, Jaehun Jung, Jian Hu, Di Zhang, Shizhe Diao, Yunheng Zou, Shaokun Zhang, Brandon Cui, Mingjie Liu, and others. arXiv preprint arXiv:2601.22975, 2026. arXiv / PDF / Project
- Retrieval Is Not Enough: Enhancing RAG Through Test-Time Critique and Optimization.
Jiaqi Wei, Hao Zhou, Xiang Zhang, Di Zhang, Zijie Qiu, Noah Wei, Jinzhe Li, Wanli Ouyang, Siqi Sun. Advances in Neural Information Processing Systems, 38:21484-21520, 2026. OpenReview / PDF
- IAG: Input-Aware Backdoor Attack on VLM-Based Visual Grounding.
Junxian Li, Beining Xu, Simin Chen, Jiatong Li, Jingdi Lei, Haodong Zhao, Di Zhang‡. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 27872-27883, 2026. HF Paper / arXiv / PDF
- MolReflect: Towards In-Context Fine-Grained Alignments Between Molecules and Texts.
Jiatong Li, Yunqing Liu, Wei Liu, Jingdi Lei, Di Zhang, Wenqi Fan, Dongzhan Zhou, Yuqiang Li, Qing Li. IEEE Transactions on Knowledge and Data Engineering, 2026. HF Paper / arXiv / PDF / Code
2025
- TinyEye: Sharpening Visual Reasoning of Tiny Models with Offline Policy Optimization.
Di Zhang, Junxian Li, Shihao Wang, Weida Wang, Guo Chen, Hao Zhang, Shizhe Diao, Mingjie Liu, Ximing Lu, Jaehun Jung, and others. OpenReview, 2025. OpenReview / PDF
- Error-Free Linear Attention Is a Free Lunch: Exact Solution from Continuous-Time Dynamics.
Jingdi Lei, Di Zhang, Soujanya Poria. arXiv preprint arXiv:2512.12602, 2025. arXiv / PDF
- Chem-R: Learning to Reason as a Chemist.
Weida Wang, Benteng Chen, Di Zhang, Wanhao Liu, Shuchen Pu, Ben Gao, Jin Zeng, Xiaoyong Wei, Tianshu Yu, Shuzhou Sun, and others. arXiv preprint arXiv:2510.16880, 2025. arXiv / PDF
- NVIDIA Nemotron Nano V2 VL.
Amala Sanjay Deshmukh, Kateryna Chumachenko, Tuomas Rintamaki, Matthieu Le, Tyler Poon, Danial Mohseni Taheri, Ilia Karmanov, Guilin Liu, Jarno Seppanen, Guo Chen, and others. arXiv preprint arXiv:2511.03929, 2025. arXiv / PDF / Model
- NVIDIA Isaac GR00T N1.6.
NVIDIA. NVIDIA News, 2025. News
- CMPhysBench: A Benchmark for Evaluating Large Language Models in Condensed Matter Physics.
Weida Wang, Dongchen Huang, Jiatong Li, Tengchao Yang, Ziyang Zheng, Di Zhang, Dong Han, Benteng Chen, Binzhao Luo, Zhiyu Liu, and others. arXiv preprint arXiv:2508.18124, 2025. arXiv / PDF / Code
- Your Reward Function for RL Is Your Best PRM for Search: Unifying RL and Search-Based TTS.
Can Jin, Yang Zhou, Qixin Zhang, Hongwu Peng, Di Zhang, Zihan Dong, Marco Pavone, Ligong Han, Zhang-Wei Hong, Tong Che, and others. arXiv preprint arXiv:2508.14313, 2025. arXiv / PDF
- Mol-R1: Towards Explicit Long-CoT Reasoning in Molecule Discovery.
Jiatong Li, Weida Wang, Qinggang Zhang, Junxian Li, Di Zhang, Changmeng Zheng, Shufei Zhang, Xiaoyong Wei, Qing Li. arXiv preprint arXiv:2508.08401, 2025. arXiv / PDF
- Scaling Up RL: Unlocking Diverse Reasoning in LLMs via Prolonged Training.
Mingjie Liu, Shizhe Diao, Jian Hu, Ximing Lu, Xin Dong, Hao Zhang, Alexander Bukharin, Shaokun Zhang, Jiaqi Zeng, Makesh Narsimhan Sreedhar, and others. arXiv preprint arXiv:2507.12507, 2025. arXiv / PDF
- SELT: Self-Evaluation Tree Search for LLMs with Task Decomposition.
Mengsong Wu, Di Zhang, Yuqiang Li, Dongzhan Zhou, Wenliang Chen. arXiv preprint arXiv:2506.07557, 2025. arXiv / PDF / Code
- Control-R: Towards Controllable Test-Time Scaling.
Di Zhang, Weida Wang, Junxian Li, Xunzhi Wang, Jiatong Li, Jianbo Wu, Jingdi Lei, Haonan He, Peng Ye, Shufei Zhang, and others. arXiv preprint arXiv:2506.00189, 2025. HF Paper / arXiv / PDF
- Exploring the Application of Model Context Protocol for Enhanced Reasoning in Large Language Models.
Jianbo Wu, Di Zhang, Wei Shu, Jie Liu. ICML 2025 Workshop NewInML Poster, 2025. ICML
- ChemAgent: Enhancing LLMs for Chemistry and Materials Science Through Tree-Search Based Tool Learning.
Mengsong Wu, YaFei Wang, Di Zhang, Yidong Ming, Yuqi An, Yuwei Wan, Wenliang Chen, Binbin Lin, Yuqiang Li, Tong Xie, Dongzhan Zhou. OpenReview, 2025. HF Paper / arXiv / PDF / Code
- LoRA-Chem: Modular Machine Learning for Multitask Prediction in Organic Reactions.
Ben Gao, Penghui Li, Di Zhang, Qian Tan, Wanhao Liu, Xunzhi Wang, Junxian Li, Shufei Zhang, Dongzhan Zhou, Yuqiang Li, and others. CCS Chemistry, 1-9, 2025. DOI / Code / Model
- MOOSE-Chem3: Toward Experiment-Guided Hypothesis Ranking via Simulated Experimental Feedback.
Wanhao Liu, Zonglin Yang, Jue Wang, Lidong Bing, Di Zhang, Dongzhan Zhou, Yuqiang Li, Houqiang Li, Erik Cambria, Wanli Ouyang. NeurIPS 2025 AI for Science Workshop, 2025. arXiv / PDF / OpenReview / NeurIPS
- Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning.
Di Zhang, Jingdi Lei, Junxian Li, Xunzhi Wang, Yujie Liu, Zonglin Yang, Jiatong Li, Weida Wang, Suorong Yang, Jianbo Wu, and others. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 9050-9061, 2025. HF Paper / arXiv / CVF PDF
- Llama-Berry: Pairwise Optimization for Olympiad-Level Mathematical Reasoning via o1-like Monte Carlo Tree Search.
Di Zhang, Jianbo Wu, Jingdi Lei, Tong Che, Jiatong Li, Tong Xie, Xiaoshui Huang, Shufei Zhang, Marco Pavone, Yuqiang Li, and others. Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, Long Papers, 7315-7337. HF Paper / arXiv / PDF
- ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area.
Junxian Li, Di Zhang, Xunzhi Wang, Zeying Hao, Jingdi Lei, Qian Tan, Cai Zhou, Wei Liu, Yaotian Yang, Xinrui Xiong, and others. Proceedings of the AAAI Conference on Artificial Intelligence, 39(1):415-423, 2025. HF Paper / arXiv / AAAI PDF / Code / Model
2024 and Earlier
- Biology Instructions: A Dataset and Benchmark for Multi-Omics Sequence Understanding Capability of Large Language Models.
Haonan He, Yuchen Ren, Yining Tang, Ziyang Xu, Junxian Li, Minghao Yang, Di Zhang, Dong Yuan, Tao Chen, Shufei Zhang, and others. EMNLP 2025 Findings. HF Paper / arXiv / ACL PDF
- Accessing GPT-4 Level Mathematical Olympiad Solutions via Monte Carlo Tree Self-Refine with Llama-3 8B.
Di Zhang, Xiaoshui Huang, Dongzhan Zhou, Yuqiang Li, Wanli Ouyang. arXiv preprint arXiv:2406.07394, 2024. HF Paper / arXiv / PDF
- ChemLLM: A Chemical Large Language Model.
D. Zhang, W. Liu, Q. Tan, J. Chen, H. Yan, Y. Yan, J. Li, W. Huang, X. Yue, D. Zhou. arXiv preprint arXiv:2402.06852, 2024. HF Paper / arXiv / PDF / Model
- Sentiment Analysis Dataset for Service-Oriented Places Like Electric Power Supply Offices.
Bo Zhang, Chenguang Li, Di Zhang, Bin Lu, Kaibao Zhou, Jing Zhang, Qiming Zhu, Xiaoping Chen. Journal of Computer Applications, 42(S1):37-42, 2022. Wanfang
- Target Selection Model for Robot Interaction and Robot Interaction System.
Bo Zhang, Bin Lyu, Huizhou Liu, Yu Ouyang, Qian Zhao, Di Zhang, Rongya Chen, Xiaoping Chen, Liang Tang, Songlin Zuo, and others. CN Patent CN114,399,529 B, 2022. Patent
- Design and Implementation of Safety and Robustness of Mobile Service Robot Navigation in Complex Pedestrian Scenarios.
Di Zhang. Master's thesis, University of Science and Technology of China, 2022.
- Juvenile State Hypothesis: What We Can Learn from Lottery Ticket Hypothesis Researches?
Di Zhang. arXiv preprint arXiv:2109.03862, 2021. arXiv / PDF
- Water Supply Engineering Design Scheme 2 of Municipal Services District of C City.
Di Zhang. Thesis, Hefei University of Technology, 2019.
Experience
- MindLab, Research Residency, Head of Agent Model, 2026-present.
- NVIDIA Research, Research Intern, 2025-2026.
- Shanghai AI Lab, Research Intern, 2023-2025.
- Alibaba Inc., Machine Learning Developer, 2022-2023.
- Ant Group, Intern, 2021.
- MIT Han Lab, Intern, 2021-2022.
Education
- Fudan University, PhD Candidate, 2023-present.
- University of Science and Technology of China, Master of Engineering, Robotics Lab, 2019-2022.