👋 About me [Updated 26/06/2025]

My name is Junhao Cheng (程钧豪). I received my bachelor’s degree from Sun Yat-sen University (SYSU) in 2025, supervised by Prof. Xiaodan Liang (梁小丹). I am now an MPhil student in Prof. Jing Liao (廖菁)’s lab at City University of Hong Kong. Before this, I had the privilege of interning in Prof. Ming-Hsuan Yang’s lab and working closely with him.

I am currently a research intern with the Kling team at Kuaishou. My research interests lie in Interactive AI. My current focus is designing novel applications for video understanding and generation, along with other interesting downstream tasks.

I am looking for research collaborations and PhD opportunities (Fall 2027). If you think there is anything interesting we could discuss, feel free to email me!

🔥 News

  • 2025.06: 🎉🎉 One paper was accepted to ICCV 2025.
  • 2025.06: Released Video-Holmes, which evaluates whether MLLMs can perform complex video reasoning like Holmes.
  • 2025.04: Released AnimeGamer (300+ stars ✨), which transforms characters from anime films into interactive entities with an MLLM.
  • 2024.06: Released AutoStudio (400+ stars ✨), which generates comic books with multi-character, multi-turn consistency.
  • 2024.05: 🎉🎉 One paper was accepted to ACL 2024.

💻 Internships

🎓 Education


2025-now

MPhil student, City University of Hong Kong

Supervisor: Jing Liao (廖菁)



2021-2025

Undergraduate student, Sun Yat-sen University

Supervisor: Xiaodan Liang (梁小丹)


📝 Publications


AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction


Junhao Cheng, Yuying Ge, Yixiao Ge, Jing Liao, Ying Shan

ICCV 2025 / Paper / Code

Object Isolated Attention for Consistent Story Visualization


Xiangyang Luo, Junhao Cheng, Yifan Xie, Xin Zhang, Tao Feng, Zhou Liu, Fei Ma, Fei Yu

ICME 2025 (CCF-B) / Paper

BD-Diff: Generative Diffusion Model for Image Deblurring on Unknown Domains with Blur-Decoupled Learning


Junhao Cheng, Wei-Ting Chen, Xi Lu, Ming-Hsuan Yang

arXiv 2025 / Paper / Code

AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation


Junhao Cheng, Xi Lu, Hanhui Li, Khun Loun Zai, Baiqiao Yin, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang

arXiv 2024 / Paper / Code

VisDiaHalBench: A Visual Dialogue Benchmark For Diagnosing Hallucination in Large Vision-Language Models


Qingxing Cao, Junhao Cheng, Xiaodan Liang, Liang Lin

ACL 2024 (CCF-A) / Paper / Code

TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation


Junhao Cheng, Baiqiao Yin, Kaixin Cai, Minbin Huang, Hanhui Li, Yuxin He, Xi Lu, Yue Li, Yifei Li, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang

arXiv 2024 / Paper / Code

Integrating Domain Knowledge into Transformer for Short-Term Wind Power Forecasting


Junhao Cheng, Xing Luo, Zhi Jin

Energy (JCR-Q1) / Paper