ZHANG Yuechen

Hi! I am a Research Scientist in Multi-Modal at Xiaomi MiMo.

I obtained my PhD from CUHK in 2025, advised by Prof. Jiaya Jia. Before that, I received my bachelor's degree from CUHK in 2021. I focus on autoregressive, interactive, and efficient video generation and intelligence. I also enjoy exploring special and interesting visual effects and applications in generation tasks.


Representative Research Works



Full Publications

AIGC

UnityVideo
UnityVideo: Unified Multi-Modal Multi-Task Learning for Enhancing World-Aware Video Generation
Jiehui Huang, Yuechen Zhang, Xu He, Yuan Gao, Zhi Cen, Bin Xia, Yan Zhou, Xin Tao, Pengfei Wan, Jiaya Jia
CVPR, 2026
DreamOmni2
DreamOmni2: Multimodal Instruction-based Editing and Generation
Bin Xia, Bohao Peng, Yuechen Zhang, Junjia Huang, Jiyang Liu, Jingyao Li, Haoru Tan, Sitong Wu, Chengyao Wang, Yitong Wang, Xinglong Wu, Bei Yu, Jiaya Jia
CVPR, 2026
Jenga
Training-Free Efficient Video Generation via Dynamic Token Carving
Yuechen Zhang, Jinbo Xing, Bin Xia, Shaoteng Liu, Bohao Peng, Xin Tao, Pengfei Wan, Eric Lo, Jiaya Jia
NeurIPS, 2025
MagicMirror
MagicMirror: ID-Preserved Video Generation in Video Diffusion Transformers
Yuechen Zhang*, Yaoyang Liu*, Bin Xia, Bohao Peng, Zexin Yan, Eric Lo, Jiaya Jia
ICCV, 2025
DreamOmni
DreamOmni: Unified Image Generation and Editing
Bin Xia, Yuechen Zhang, Jingyao Li, Chengyao Wang, Yitong Wang, Xinglong Wu, Bei Yu, Jiaya Jia
CVPR, 2025
ResMaster
ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance
Shuwei Shi, Wenbo Li, Yuechen Zhang, Jingwen He, Biao Gong, Yinqiang Zheng
AAAI, 2025
Make-Your-Video
Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance
Jinbo Xing, Menghan Xia, Yuxin Liu, Yuechen Zhang, Yong Zhang, Yingqing He, Hanyuan Liu, Haoxin Chen, Xiaodong Cun, Xintao Wang, Ying Shan, Tien-Tsin Wong
TVCG, 2025
Video-P2P
Video-P2P: Video Editing with Cross-attention Control
Shaoteng Liu, Yuechen Zhang, Wenbo Li, Zhe Lin, Jiaya Jia
CVPR, 2024
ControlNeXt
ControlNeXt: Powerful and Efficient Control for Image and Video Generation
Bohao Peng, Jian Wang, Yuechen Zhang, Wenbo Li, Ming-Chang Yang, Jiaya Jia
Preprint, 2024
RIVAL
Real-World Image Variation by Aligning Diffusion Inversion Chain
Yuechen Zhang, Jinbo Xing, Eric Lo, Jiaya Jia
NeurIPS (Spotlight), 2023
Ref-NPR
Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields
Yuechen Zhang, Zexin He, Jinbo Xing, Xufeng Yao, Jiaya Jia
CVPR, 2023
Flow-aware Synthesis: A Generic Motion Model for Video Frame Interpolation
Jinbo Xing*, Wenbo Hu*, Yuechen Zhang, Tien-Tsin Wong
Computational Visual Media (CVM), 2021

Controllable Large Language Models

Lyra
Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition
Zhisheng Zhong*, Chengyao Wang*, Yuqi Liu*, Senqiao Yang, Longxiang Tang, Yuechen Zhang, Jingyao Li, Tianyuan Qu, Yanwei Li, Yukang Chen, Shaozuo Yu, Sitong Wu, Eric Lo, Shu Liu, Jiaya Jia
ICCV, 2025
Mini-Gemini
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models
Yanwei Li*, Yuechen Zhang*, Chengyao Wang*, Zhisheng Zhong, Yixin Chen, Ruihang Chu, Shaoteng Liu, Jiaya Jia
TPAMI, 2025
Prompt Highlighter
Prompt Highlighter: Interactive Control for Multi-Modal LLMs
Yuechen Zhang, Shengju Qian, Bohao Peng, Shu Liu, Jiaya Jia
CVPR, 2024

Others

UniVA
UniVA: Universal Video Agent towards Open-Source Next-Generation Video Generalist
Zhengyang Liang, Daoan Zhang, Huichi Zhou, Rui Huang, Bobo Li, Yuechen Zhang, Shengqiong Wu, Xiaohan Wang, Jiebo Luo, Lizi Liao, Hao Fei
Preprint, 2025
KDiffusion
Progressively Knowledge Distillation via Re-parameterizing Diffusion Reverse Process
Xufeng Yao, Fanbin Lu, Yuechen Zhang, Xinyun Zhang, Wenqian Zhao, Bei Yu
AAAI, 2024
CodeTalker
CodeTalker: Speech-Driven 3D Facial Animation with Discrete Motion Prior
Jinbo Xing, Menghan Xia, Yuechen Zhang, Xiaodong Cun, Jue Wang, Tien-Tsin Wong
CVPR, 2023
R^2
R2Former: Probing Region Relationship in Semantic Segmentation Transformers
Yuechen Zhang, Tiancheng Shen*, Huaijia Lin, Lu Qi, Eric Lo, Jiaya Jia
Preprint, 2022
CRM
High Quality Segmentation for Ultra High-resolution Images
Tiancheng Shen, Yuechen Zhang, Lu Qi, Jason Kuen, Xingyu Xie, Jianlong Wu, Zhe Lin, Jiaya Jia
CVPR, 2022
PCL
PCL: Proxy-based Contrastive Learning for Domain Generalization
Xufeng Yao, Yang Bai, Xinyun Zhang, Yuechen Zhang, Qi Sun, Ran Chen, Ruiyu Li, Bei Yu
CVPR, 2022

Education & Work Experience

The Chinese University of Hong Kong Hong Kong
Doctor of Philosophy, Computer Science. Aug 2021 - Dec 2025
Supervisors: Prof. Jiaya Jia, Prof. Eric Lo
Bachelor of Computer Science Sep 2016 - Jul 2021
First Class Honours, ELITE Stream
Nanyang Technological University Singapore
[Exchange] GEM Trailblazer Exchange Program Jan 2019 - May 2019
Tsinghua University Beijing, China
[Exchange] Yao Class Summer Program Jul 2019 - Aug 2019
Xiaomi MiMo Beijing, China
Research Scientist, Multi-Modal Jan 2026 - Present
Multi-modal research and development.
WanX, Alibaba Hangzhou, China
Research Internship Jul 2025 - Sep 2025
A short research internship on long video generation.
Kling, Kuaishou Shenzhen, China
Research Internship Feb 2025 - Jun 2025
Research on efficient video generation.
LightSpeed, Tencent Hong Kong
Research Internship Feb 2024 - Jul 2024
Research on interactive video generation and customization.
SmartMore Hong Kong
Work Study Internship Jan 2020 - Jul 2025
Research on image segmentation for real-world industrial projects, including defect detection and high-precision instance segmentation of chip circuits. Developed an algorithm for 2D Data Matrix code recognition and decoding.

Awards & Community Contributions