Zirui Zhao  赵子瑞
ziruiz [at] u.nus.edu
Zirui Zhao is a final-year CS PhD candidate at the National University of Singapore, advised by Prof. Wee Sun Lee and Prof. David Hsu. He received his B.Eng. from Xi'an Jiaotong University. His research focuses on AI decision-making and reasoning.
He will be joining Salesforce AI Research Singapore as a research scientist in May 2025.
CV  / 
Google Scholar  / 
LinkedIn  / 
Github
-
Automatic Curriculum Expert Iteration for Reliable LLM Reasoning
Zirui Zhao,
Hanze Dong,
Amrita Saha,
Caiming Xiong,
Doyen Sahoo
ICLR, 2025
arXiv /
openreview
Auto-CEI pushes and estimates the limits of LLM reasoning capacity, then aligns the LLM's assertive and conservative response behaviours with those limits for reliable reasoning.
-
On the Empirical Complexity of Reasoning and Planning in LLMs
Liwei Kang*,
Zirui Zhao*,
David Hsu,
Wee Sun Lee (*Equal contribution, listed in alphabetical order)
EMNLP Findings, 2024
arXiv
We propose an easy-to-use framework that leverages sample and computational complexity from machine learning theory to analyse reasoning and planning problems and to design and optimise LLM-based reasoning methods.
-
Large Language Models as Commonsense Knowledge for Large-Scale Task Planning
Zirui Zhao,
Wee Sun Lee,
David Hsu
NeurIPS, 2023
Also in RSS LTAMP Workshop, Best Paper Runner-up, 2023
project page /
arXiv /
code /
openreview /
bibtex
We use Large Language Models as both the commonsense world model and the heuristic policy within Monte Carlo Tree Search. The LLM world model provides MCTS with a commonsense prior belief over states for reasoned decision-making, while the LLM heuristic policy guides the search toward relevant parts of the tree, substantially reducing search complexity.
-
Differentiable Parsing and Visual Grounding of Natural Language Instructions for Object Placement
Zirui Zhao,
Wee Sun Lee,
David Hsu
ICRA, 2023
Also in CoRL LangRob Workshop, 2022
project page /
IEEE Xplore /
code /
video /
arXiv /
bibtex
We propose ParaGon for language-conditioned object placement. ParaGon integrates a parsing algorithm into an end-to-end trainable neural network. It is data-efficient and generalisable in learning compositional instructions, and robust to noisy, ambiguous language inputs.
-
Active Learning for Risk-sensitive Inverse Reinforcement Learning
Rui Chen,
Wenshuo Wang,
Zirui Zhao,
Ding Zhao
Tech Report, 2019
code /
arXiv
Risk-sensitive inverse reinforcement learning (RS-IRL) provides a general model of how humans assess the distribution of a stochastic outcome when the true distribution is unknown (ambiguous). This work enables an RS-IRL learner to actively query expert demonstrations for faster risk-envelope approximation.
-
Visual Semantic SLAM
Zirui Zhao, Yijun Mao, Yan Ding,
Pengju Ren, Nanning Zheng
CCHI, 2019
code /
arXiv
A semantic SLAM system that projects semantic labels onto the 3D point clouds generated by the ORB-SLAM algorithm.