Tong He
Email : tonghe90[at]gmail[dot]com
I am now a Research Fellow at Shanghai AI Lab, working with Prof. Ouyang Wanli and Prof. Qiao Yu. I was previously a Research Fellow at the Australian Institute for Machine Learning (AIML), the University of Adelaide, working with Prof. Chunhua Shen and Prof. Anton van den Hengel.

(Google scholar)

I received my PhD in computer science from the University of Adelaide, supervised by Chunhua Shen. I was a visiting student at MMLAB of the Chinese University of Hong Kong at Shenzhen under the supervision of Dr. Weilin Huang and Prof. Yu Qiao. We are looking for self-motivated PhD students (joint PhD program with SJTU, FDU, ZJU, USTC, etc.) and interns. If you are interested in joining us, please feel free to contact me with your CV!

News

  • Mar, 2025: Three papers have been accepted by ICCV2025.
  • Mar, 2025: We released our world model AETHER. Try it here.
  • Feb, 2025: Two papers have been accepted by CVPR2025
  • Jan, 2025: Six papers have been accepted by ICLR2025
  • Oct, 2024: Four papers have been accepted by NIPS2024
  • Oct, 2024: One paper has been accepted by T-PAMI
  • Ranked as Worldwide Top 2% Scientists by Stanford University (2024.10)
  • June, 2024: Five papers have been accepted by ECCV2024
  • Ranked as Worldwide Top 2% Scientists by Stanford University (2023.10)
  • Mar, 2024: Four papers have been accepted by CVPR2024
  • July, 2023: One paper on long-tail object recognition has been accepted by T-PAMI
  • July, 2023: One paper on point cloud pretraining (Ponder) has been accepted by ICCV2023
  • Mar, 2023: One paper on point cloud pretraining (CP3) has been accepted by T-PAMI
  • Mar, 2023: Four papers have been accepted by CVPR23
  • Oct, 2022: One paper has been accepted by SIGGRAPH ASIA.
  • Oct, 2022: The extended version of DyCo3D has been accepted by T-PAMI
  • July, 2022: One paper has been accepted by ECCV22
  • April, 2022: Check our latest instance segmentation paper for 3D point cloud.
  • March, 2021: One T-PAMI paper has been accepted.
  • March, 2021: One IJCV paper has been accepted.
  • March, 2021: Two CVPR papers have been accepted.
  • Nov, 2020: Received my Ph.D. degree; my thesis was awarded the Dean’s Commendation for Doctoral Thesis Excellence.
  • Oct, 2020: The extended version of FCOS has been accepted by T-PAMI.
  • July, 2020: Two ECCV papers have been accepted.
  • March, 2020: One CVPR paper has been accepted.

Recent Publications

π3: Scalable Permutation-Equivariant Visual Geometry Learning
Y. Wang, J. Zhou, H. Zhu, W. Chang, Y. Zhou, Z. Li, J. Chen, J. Pang, C. Shen and T. He*.
arxiv 2025, [PDF] [code] [project]
Sekai: A Video Dataset towards World Exploration
Z. Li, C. Li, ...T. He, J. Pang, Y. Qiao, Y. Jia, K. Zhang.
arxiv 2025, [PDF] [code] [project]
DeepVerse: 4D Autoregressive Video Generation as a World Model
J. Chen, H. Zhu, X. He, Y. Wang, J. Zhou, W. Chang, Y. Zhou, Z. Li, Z. Fu, J. Pang and T. He*.
arxiv 2025, [PDF] [code] [project]
Aether: Geometric-Aware Unified World Modeling
H. Zhu*, Y. Wang*, J. Zhou*, W. Chang*, Y. Zhou*, Z. Li*, J. Chen*, C. Shen, J. Pang and T. He**.
ICCV 2025, [PDF] [code] [project]
VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers
Y. Wang, H. Zhu, M. Liu, J. Yang, H. Fang and T. He*.
ICCV 2025, [PDF] [code] [project]
EgoAgent: A Joint Predictive Agent Model in Egocentric Worlds
L. Chen, Y. Wang, S. Tang, Q. Ma, T. He*, W. Ouyang, Z. Zhou, H. Bao and S. Peng.
ICCV 2025, [PDF]
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
H. Zhu, H. Yang, X. Wu, D. Huang, S. Zhang, X. He, T. He*, H. Zhao, C. Shen, Y. Qiao and W. Ouyang.
TPAMI 2025, [PDF] [code]
GoalFlow: Goal-Driven Flow Matching for Multimodal Trajectories Generation in End-to-End Autonomous Driving
Z. Xing, X. Zhang, Y. Hu, B. Jiang, T. He, Q. Zhang, X. Long and W. Yin.
CVPR 2025, [PDF] [code]
Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
J. Yang, H. Zhu, Y. Wang, G. Wu, T. He, L. Wang.
CVPR 2025, [PDF] [code]
Depth Any Video with Scalable Synthetic Data
H. Yang, D. Huang, W. Yin, C. Shen, H. Liu, X. He, B. Lin, W. Ouyang and T. He*.
ICLR 2025, [PDF] [code]
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
H. Zhu, H. Yang, Y. Wang, J. Yang, L. Wang and T. He*.
ICLR 2025, [PDF] [code]
Where Am I and What Will I See: An Auto-Regressive Model for Spatial Localization and View Prediction
J. Chen, D. Huang, W. Ye, W. Ouyang and T. He*.
ICLR 2025, [PDF] [code]
Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation
P. Gao, ... and T. He....
ICLR 2025, [PDF] [code]
ND-SDF: Learning Normal Deflection Fields for High-Fidelity Indoor Reconstruction
Z. Tang, W. Ye, Y. Wang, D. Huang, H. Bao, T. He* and G. Zhang*.
ICLR 2025, [PDF] [code]
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
Y. Chen, T. He*, D. Huang, W. Ye, S. Chen, J. Tang, Z. Cai, L. Yang, G. Yu, G. Lin and C. Zhang.
ICLR 2025, [PDF] [code]
NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction
Y. Wang, D. Huang, W. Ye*, G. Zhang, W. Ouyang and T. He*.
NIPS 2024, [PDF] [code]
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning
H. Zhu, Y. Wang, D. Huang, W. Ye, W. Ouyang and T. He*.
NIPS 2024, [PDF] [code]
GUPNet++: Geometry Uncertainty Propagation Network for Monocular 3D Object Detection
Y. Lu, X. Ma, L. Yang, T. Zhang, Y. Liu, Q. Chu, T. He*, Y. Li and W. Ouyang.
T-PAMI 2024, [PDF] [code]
GVGEN: Text-to-3D Generation with Volumetric Representation
X. He, J. Chen, S. Peng, D. Huang, Y. Li, X. Huang, C. Yuan, W. Ouyang and T. He*.
ECCV 2024, [PDF] [code]
Agent3D-Zero: An Agent for Zero-shot 3D Understanding
S. Zhang, D. Huang, J. Deng, S. Tang, W. Ouyang, T. He* and Y. Zhang*.
ECCV 2024, [PDF] [code]
DetToolChain: A New Prompting Paradigm to Unleash Detection Ability of MLLM
Y. Wu, Y. Wang, S. Tang, W. Wu, T. He, W. Ouyang, J. Wu and P. Torr.
ECCV 2024, [PDF] [code]
Pixel-GS: Density Control with Pixel-aware Gradient for 3D Gaussian Splatting
Z. Zhang, W. Hu, Y. Liao, T. He and H. Zhao.
ECCV 2024, [PDF] [code]
UniPad: A Universal Pre-Training Paradigm For Autonomous Driving
H. Yang, S. Zhang, D. Huang, X. Wu, H. Zhu, T. He*, S. Tang, H. Zhao, Q. Qiu, B. Lin, X. He and W. Ouyang.
CVPR 2024, [PDF] [code]
TASeg: Temporal Aggregation Network for LiDAR Semantic Segmentation
X. Wu, Y. Hou, X. Huang, B. Lin, T. He, X. Zhu, Y. Ma, B. Wu, H. Liu, D. Cai and W. Ouyang.
CVPR 2024, [PDF] [code]
DreamComposer: Controllable 3D Object Generation via Multi-View Conditions
Y. Yang, Y. Huang, X. Wu, Y. Guo, S. Zhang, H. Zhao, T. He and X. Liu.
CVPR 2024, [PDF] [code]
Point Transformer V3: Simpler, Faster, Stronger
X. Wu, L. Jiang, P. Wang, Z. Liu, X. Liu, Y. Qiao, W. Ouyang, T. He* and H. Zhao*.
CVPR 2024, [PDF] [code]

Professional activities

    Journals

    Transactions on Pattern Analysis and Machine Intelligence (T-PAMI)

    International Journal of Computer Vision (IJCV)

    IEEE Transactions on Image Processing (TIP)

    Pattern Recognition (PR)

    IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

    Conferences

    CVPR, ICCV, ECCV, NIPS, ICLR, AAAI, etc.

Last Updated on 26th Aug, 2019
