Yesheng Zhang (Eason Zhang)

I am a PhD student in the Computer Vision Lab at Shanghai Jiao Tong University, under the supervision of Prof. Xu Zhao.

Before that, I received my Master's (2023) and Bachelor's (2020) degrees from the School of Biomedical Engineering and the Institute of Medical Robotics, Shanghai Jiao Tong University, advised by Prof. Dahong Qian and Prof. Xu Zhao.

Email / Google Scholar / GitHub

Research

I am passionate about 3D computer vision, including scene reconstruction and understanding.

Publications
Searching from Area to Point: A Hierarchical Framework for Semantic-Geometric Combined Feature Matching
Yesheng Zhang, Xu Zhao
Pattern Recognition (PR), 2026
paper | page | code

Semantic-geometric combined feature matching from a unified search perspective.

MESA: Effective Matching Redundancy Reduction by Semantic Area Segmentation
Yesheng Zhang, Shuhan Shen, Xu Zhao
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2026
paper | page | invited talk | code

Dense or sparse matching of every area and point between images for many downstream 3D tasks, using SAM.

Semantic-guided Camera Ray Regression for Visual Localization
Yesheng Zhang, Xu Zhao
International Conference on Computer Vision (ICCV), 2025
paper | page

Visual localization by regressing camera rays with semantic guidance.

MESA: Matching Everything by Segmenting Anything
Yesheng Zhang, Xu Zhao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
paper | page | video | code

Matching every area and point between images, using SAM.

Learning-Based Distortion Correction and Feature Detection for High Precision and Robust Camera Calibration
Yesheng Zhang, Xu Zhao, Dahong Qian
IEEE Robotics and Automation Letters (RA-L), 2022
IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2022
paper | code | video

We propose a learning-based camera calibration system that combines deep learning with traditional methods.

Preprints
Structuring GUI Elements through Vision Language Models: Towards Action Space Generation
Yi Xu, Yesheng Zhang, Jiajia Liu, Jingdong Chen
arXiv preprint, 2025
paper

An IoU-augmented maximum likelihood training paradigm for improved GUI element coordinate prediction in MLLMs.

Trajectory Entropy: Modeling Game State Stability from Multimodality Trajectory Prediction
Yesheng Zhang*, Wenjian Sun*, Yuheng Chen, Qingwei Liu, Qi Lin, Rui Zhang, Xu Zhao
arXiv preprint, 2025
paper

Trajectory Entropy quantifies game-state stability from multimodal trajectory predictions to improve level-k planning accuracy and efficiency.


Thanks to Jon Barron for sharing his website's source code.