Tianyuan Zhang 「张天远」

I am a second-year PhD student at MIT EECS, advised by Prof. Bill Freeman. Before that, I received my MS in Robotics from CMU, supervised by Prof. Srinivasa Narasimhan, and my undergraduate degree from Peking University, where I worked with Prof. Zhanxing Zhu, Dr. Xiangyu Zhang, and Prof. Hang Zhao.

Email: tianyuan [at] mit [dot] edu

I acknowledge that information asymmetry can significantly hinder research opportunities for junior students. If you're interested in chatting about life, research, or potential collaborations, feel free to email me.

CV / Google Scholar / Github / Attempts at photography

Research

I have research experience in machine learning, physics-based vision, computational imaging, and computer graphics.

My current focus is on video generation, world models and infinite context learning.

	Test-Time Training Done Right Tianyuan Zhang, Sai Bi, Yicong Hong, Kai Zhang, Fujun Luan, Songlin Yang, Kalyan Sunkavalli, William T. Freeman, Hao Tan arXiv, 2025 (New) project page / paper / code Hardware-friendly Test-Time Training boosts FLOPs utilization by 10x, enables larger state sizes and advanced optimizers, and can be implemented in PyTorch with just a few lines of code. Validated on novel view synthesis, language models, and AR video diffusion.
	RandAR: Decoder-only Autoregressive Visual Generation in Random Orders Ziqi Pang, Tianyuan Zhang, Fujun Luan, Yunze Man, Hao Tan, Kai Zhang, William T. Freeman, Yu-Xiong Wang CVPR, 2025 (Oral Presentation) project page / paper / github Next-token prediction in random orders for images.
	RelitLRM: Generative Relightable Radiance for Large Reconstruction Models Tianyuan Zhang, Zhengfei Kuang, Haian Jin, Zexiang Xu, Sai Bi, Hao Tan, He Zhang, Yiwei Hu, Milos Hasan, William T. Freeman, Kai Zhang, Fujun Luan ICLR, 2025 (Spotlight) project page / paper / code coming soon We build a probabilistic inverse rendering model that reconstructs and relights 3D objects from sparse input views. GPUs learn algorithms!
	LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias Haian Jin, Hanwen Jiang, Hao Tan, Kai Zhang, Sai Bi, Tianyuan Zhang, Fujun Luan, Noah Snavely, Zexiang Xu ICLR, 2025 (Oral Presentation) project page / paper / code coming soon Posed novel view synthesis with minimal 3D inductive bias.
	PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation Tianyuan Zhang, Hong-Xing "Koven" Yu, Rundi Wu, Brandon Y. Feng, Changxi Zheng, Noah Snavely, Jiajun Wu, William T. Freeman. ECCV, 2024 (Oral Presentation) project page / github / paper We bring static 3D objects to life by distilling material parameters from video generation models.
	Physically Compatible 3D Object Modeling from a Single Image Minghao Guo, Bohan Wang, Pingchuan Ma, Tianyuan Zhang, Crystal Elaine Owens, Chuang Gan, Joshua B. Tenenbaum, Kaiming He, Wojciech Matusik NeurIPS, 2024 (Spotlight) project page / paper Reconstruct 3D physical objects from single images by considering mechanical properties, external forces, and rest-shape geometry.
	Analyzing Physical Impacts using Transient Surface Wave Imaging Tianyuan Zhang, Mark Sheinin, Dorian Chan, Mark Rau, Matthew O'Toole, Srinivasa G. Narasimhan. CVPR, 2023 project page / github / paper / videos We image the "ripples" on solid surfaces caused by physical impacts, which contain information about the object's physical properties and its interaction with the environment. We showcase non-line-of-sight impact localization capabilities.
	Real-Time Intermediate Flow Estimation for Video Frame Interpolation Zhewei Huang, Tianyuan Zhang, Wen Heng, Boxin Shi, Shuchang Zhou ECCV, 2022 github / arXiv / demos We propose a real-time intermediate flow estimation (RIFE) method for video frame interpolation that runs at 30+ FPS for 2x 720p interpolation on a 2080Ti GPU.
	Embracing Single Stride 3D Object Detector with Sparse Transformer Lue Fan, Ziqi Pang, Tianyuan Zhang, Yu-Xiong Wang, Hang Zhao, Feng Wang, Naiyang Wang, Zhaoxiang Zhang CVPR, 2022 github / arXiv In contrast to 2D, object size in 3D does not exhibit long-tail distributions. We propose a single stride sparse Transformer (SST) for 3D object detection. We obtained impressive results on small objects.
	DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries Yue Wang, Vitor Guizilini, Tianyuan Zhang, Yilun Wang, Hang Zhao, Justin Solomon CoRL, 2021 github / arXiv A new paradigm for 3D object detection from multi-view 2D images.
	MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries Tianyuan Zhang, Xuanyao Chen, Yue Wang, Yilun Wang, Hang Zhao preprint, 2022 project page / github / arXiv End-to-end 3D tracking with multi-view cameras.
	FUTR3D: A Unified Sensor Fusion Framework for 3D Detection Xuanyao Chen, Tianyuan Zhang, Yue Wang, Yilun Wang, Hang Zhao preprint, 2022 project page / github / arXiv A unified framework for 3D detection from multi-sensor data. We achieved impressive results with multi-view cameras and one-beam LiDAR.
	Objects365: A Large-scale, High-quality Dataset for Object Detection Shuai Shao, Zeming Li, Tianyuan Zhang, Chao Peng, Gang Yu, Xiangyu Zhang, Jing Li, Jian Sun ICCV, 2019 project page / paper We provide a high-quality large-scale object detection dataset, with 365 categories, 638K images, and 10,101K bounding boxes
	You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle Dinghuai Zhang, Tianyuan Zhang, Yiping Lu, Zhanxing Zhu, Bin Dong NeurIPS*, 2019 arXiv / code Accelerating adversarial training using Pontryagin's Maximum Principle.
	Interpreting Adversarially Trained Convolutional Neural Networks Tianyuan Zhang, Zhanxing Zhu ICML, 2019 github / arXiv A discussion of the shape bias and texture bias of adversarially trained convolutional neural networks.

Professional Services

Reviewer: CVPR 2021, 2023; NeurIPS 2020; ICLR 2021, 2022, 2023 BlogPosts.

Last updated: Oct. 2024

Template

Template for photography page comes from this amazing guy