I am a Ph.D. candidate in Computer Science at the GenAI Center of Excellence, KAUST, advised by Prof. Bernard Ghanem. Prior to that, I obtained my master's degree from Shanghai Jiao Tong University and my bachelor's degree from Xi’an Jiaotong University. I interned at Meta AI as a Research Scientist Intern in 2025, working on adaptive reasoning for video-language models.

My recent work focuses on advancing multimodal large language models, including VideoAuto-R1, an adaptive auto-thinking video model, and BOLT, an effective frame-selection approach for long-form videos. I have also proposed a series of efficient, scalable temporal action detection methods (AdaTAD, ETAD, CausalTAD) and developed OpenTAD, the largest open-source framework for temporal action detection.

My research interests include:

  • Adaptive Video Reasoning: RL-based auto-thinking and agentic video reasoning
  • Video-Language Models: efficient modeling and adaptation for long videos
  • Temporal Grounding: end-to-end timestamp localization and action detection at scale

📢 I am actively seeking full-time research positions. Feel free to reach out if you have any opportunities: shuming.liu[at]kaust.edu.sa