I am a Ph.D. candidate in Electrical and Computer Engineering at the University of California, Los Angeles (UCLA), advised by Prof. Achuta Kadambi. Previously, I obtained my M.S. in Electrical Engineering from Columbia University in 2021, where I was fortunate to be advised by Prof. John Wright. I received my B.E. in Electronic Information Engineering from the University of Electronic Science and Technology of China (UESTC) in 2019.
My research interests include 3D computer vision and generative AI. Currently, my research focuses on building spatial intelligence, including 3D/4D scene reconstruction (Gaussian Splatting), generation (diffusion models), understanding (vision foundation models), reasoning (vision language models), and interaction (agentic AI). I am also actively working with Prof. Leonidas Guibas at Stanford University and Prof. Atlas Wang at the University of Texas at Austin.
I am currently a Research Intern at Apple, and I was a Student Researcher at
in 2024. I will be on the job market and actively seeking full-time research scientist/engineer opportunities starting in 2026.
News
- 2025.04: X-Dyna was selected as a Highlight paper at CVPR 2025 (2.98% of 13008 submissions)!
- 2025.02: Two papers accepted to CVPR 2025.
- 2025.02: 4K4DGen (4D version of DreamScene360) was selected as a Spotlight at ICLR 2025 (5.1% of 11565 submissions)!
- 2025.01: One paper accepted to ICLR 2025.
- 2024.10: Awarded J.B. Fourier Scholar in Vision and Graphics from UCLA.
- 2024.09: One paper accepted to NeurIPS 2024.
- 2024.07: One paper accepted to ECCV 2024.
- 2024.04: Feature 3DGS was selected as a Highlight paper at CVPR 2024 (2.8% of 11532 submissions)!
- 2024.02: One paper accepted to CVPR 2024.
- 2023.02: One paper accepted to CVPR 2023.
- 2021.09: Awarded Graduate Dean's Scholar Award from UCLA.
- 2020.06: Awarded MS Honors Student from Columbia University.
Selected Publications
* indicates equal contribution

VLM4D: Towards Spatiotemporal Awareness in Vision Language Models
Shijie Zhou*, Alexander Vilesov*, Xuehai He*, Ziyu Wan, Shuwang Zhang, Aditya Nagachandra, Di Chang, Dongdong Chen, Xin Eric Wang, Achuta Kadambi
- We propose the first benchmark explicitly designed to rigorously evaluate the spatiotemporal reasoning capabilities of Vision-Language Models (VLMs).

Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
Shijie Zhou*, Hui Ren*, Yijia Weng, Shuwang Zhang, Zhen Wang, Dejia Xu, Zhiwen Fan, Suya You, Zhangyang Wang, Leonidas Guibas, Achuta Kadambi
- Building 4D interactive scenes with agentic AI from monocular videos, by dynamically distilling model-conditioned features and integrating 2D foundation models with LLMs in feedback loops.

DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting
Shijie Zhou*, Zhiwen Fan*, Dejia Xu*, Haoran Chang, Pradyumna Chari, Tejas Bharadwaj, Suya You, Zhangyang Wang, Achuta Kadambi
- We introduce a 3D scene generation pipeline that creates immersive scenes with full 360° coverage from text prompts of any level of specificity.

Feature 3DGS: Supercharging 3D Gaussian Splatting to Enable Distilled Feature Fields
Shijie Zhou, Haoran Chang*, Sicheng Jiang*, Zhiwen Fan, Zehao Zhu, Dejia Xu, Pradyumna Chari, Suya You, Zhangyang Wang, Achuta Kadambi
Paper | Project (CVPR 2024 Highlight)
- Feature 3DGS distills feature fields from 2D foundation models, opening the door to a brand new semantic, editable, and promptable explicit 3D scene representation.

Honors and Awards
- 2024 J.B. Fourier Scholar in Vision and Graphics, UCLA
- 2021 Graduate Dean's Scholar Award, UCLA
- 2020 MS Honors Student, Columbia University
- 2019 Outstanding Graduate, University of Electronic Science and Technology of China
- 2018 James Watt Scholar, University of Glasgow
Work Experience
- 2025.04 - now, Research Intern at Apple
- 2024.06 - 2024.11, Student Researcher at
- 2023.06 - 2023.09, Visiting Academic at USC Institute for Creative Technologies
Teaching
- Teaching Assistant @ UCLA: ECE188 Computer Vision, ECE113 Digital Signal Processing
- Teaching Assistant @ Columbia: EECS6690 Statistical Learning, ELEN6885 Reinforcement Learning
- Teaching Assistant @ UoG & UESTC: 1008 Microelectronic Systems, 3010 Team Design Project and Skills
Service
- Conference Reviewer: SIGGRAPH 2025, CVPR 2025, ICCV 2025, ICML 2025, ICLR 2025, 3DV 2025, NeurIPS 2025/2024, ECCV 2024
- Journal Reviewer: International Journal of Computer Vision, IEEE Transactions on Multimedia, Pattern Recognition
- Workshop Reviewer: AI for 3D Generation @ CVPR 2024, AI for 3D Content Creation @ ICCV 2023