Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers
More Showcases
[Video grid: nine examples, each showing a reference image and two generated videos (Reference, Video 1, Video 2).]
Comparisons
Reference
DynamiCrafter
EasyAnimate-I2V
CogVideoX-I2V
ID-Animator
Magic Mirror
A seasoned male police officer, wearing a crisp navy uniform adorned with badges, stands beside his patrol car, holding a radio to his mouth. His expression is focused and serious, reflecting the gravity of his communication. The scene is set in an urban environment, with the city skyline visible in the background, and the flashing lights of the patrol car casting a rhythmic glow. As he speaks into the radio, his other hand rests on his utility belt, showcasing his readiness and professionalism. The ambient sounds of distant traffic and the occasional chirp of the radio punctuate the scene, emphasizing the officer's role in maintaining order.
Reference
DynamiCrafter
EasyAnimate-I2V
CogVideoX-I2V
ID-Animator
Magic Mirror
An elderly man, with a determined expression, stands in a sunlit gym, wearing a gray tank top and black shorts, his muscles taut as he grips a heavy kettlebell. The room is filled with natural light streaming through large windows, casting shadows on the polished wooden floor. His face shows concentration and strength, highlighting his commitment to fitness. As he lifts the kettlebell with steady hands, the camera captures the sweat glistening on his brow, emphasizing his effort and resilience. The background features neatly arranged gym equipment, adding to the atmosphere of dedication and perseverance.
Reference
DynamiCrafter
EasyAnimate-I2V
CogVideoX-I2V
ID-Animator
Magic Mirror
A bearded man in his thirties, wearing a plaid shirt and jeans, sits at a rustic wooden bar, surrounded by an array of beer taps and vintage brewery decor. He carefully lifts a frosty pint glass filled with amber beer, examining its color and clarity against the warm, ambient lighting. He takes a slow, appreciative sip, his eyes closing momentarily as he savors the complex flavors. The camera captures the subtle smile of satisfaction on his face, highlighting the rich foam on his upper lip. The background hum of soft chatter and clinking glasses adds to the cozy, inviting atmosphere of the pub.
Reference
DynamiCrafter
EasyAnimate-I2V
CogVideoX-I2V
ID-Animator
Magic Mirror
A serene woman, dressed in a cozy oversized sweater and jeans, kneels on a lush green meadow, gently petting a friendly golden retriever. The dog's tail wags enthusiastically, its fur gleaming in the soft sunlight. Her face lights up with a warm smile as her hand moves tenderly over the dog's head and back. In the background, a picturesque landscape of rolling hills and blooming wildflowers enhances the tranquil scene. The golden retriever, with its tongue lolling out and eyes full of affection, leans into her touch, creating a heartwarming moment of connection and joy.
Reference
DynamiCrafter
EasyAnimate-I2V
CogVideoX-I2V
ID-Animator
Magic Mirror
A focused man in a contemporary gym performs a leg exercise, clad in a fitted black tank top and gray athletic shorts. The scene conveys the intensity of his workout, with sweat glistening on his brow and muscles visibly engaged. Positioned on a sleek leg press machine, he pushes against the resistance with determination. The ambient lighting accentuates his form, while the background showcases neatly arranged weights and exercise equipment. His expression reflects concentration and resolve, embodying the dedication and effort of his fitness journey.
Reference
DynamiCrafter
EasyAnimate-I2V
CogVideoX-I2V
ID-Animator
Magic Mirror
A serene woman, dressed in a flowing white blouse and light blue jeans, stands at a rustic wooden table in a sunlit room filled with greenery. She carefully selects vibrant blooms from a wicker basket, including roses, lilies, and daisies, and begins arranging them in a crystal vase. Sunlight filters through the window, casting a warm glow on her focused expression. As she works, her hands move gracefully, adjusting stems and leaves to create a harmonious bouquet. The scene transitions to a close-up of her hands tying a delicate ribbon around the vase, completing the arrangement with a touch of elegance. The final shot captures her stepping back to admire her creation, a satisfied smile on her face, with the room's natural beauty enhancing the tranquil atmosphere.
Personalized Videos with Style-Specific Prompts
Hover over a video to see its text prompt and style.
[Video grid: eight examples, each showing a reference image and one generated video (Reference, Video).]
Multi-shot Video Generation with One Character
A serene woman with delicate features wearing a flowing white blouse:
(1) practices gentle yoga stretches... (2) sits at her kitchen counter bathed in morning light... (3) is working at her writing desk near a window... (4) starts painting her artwork... (5) is preparing for the lunch...
Reference
Shot 1
Shot 2
Shot 3
Shot 4
Shot 5
A young woman with black shoulder-length hair, wearing a beige knit sweater, captured in shots with different aspects:
Reference
Shot 1
Shot 2
Shot 3
Shot 4
Shot 5
A bearded man in a yellow T-shirt, working at a wooden table in his workshop:
Reference
Shot 1
Shot 2
Shot 3
Shot 4
Shot 5
Module-wise Ablations
ID Reference | w/o facial embedding | w/o adaptive condition | Full Model |
---|---|---|---|

ID Reference | image pre-training only | video fine-tuning only | Full Model |
---|---|---|---|
Limitations
Our method has limitations: it cannot handle multi-person videos, and it struggles with fine-grained features. For example, it sometimes fails to accurately capture details such as eye color (case 1). Additionally, because its basic generation capability is bound to the base model, it may produce artifacts when dealing with complex physical motions (case 2).

Reference ID | Our failure case | Reference ID | Our failure case |
---|---|---|---|
Contact Us
Feel free to contact Yuechen Zhang at zhangyc@link.cuhk.edu.hk with any questions or for cooperation and communication.
If you find this work useful, please consider citing:
@article{zhang2025magicmirror,
  title={Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers},
  author={Yuechen Zhang and Yaoyang Liu and Bin Xia and Bohao Peng and Zexin Yan and Eric Lo and Jiaya Jia},
  journal={arXiv preprint arXiv:2501.03931},
  year={2025}
}