I am a final-year Ph.D. student in Computer Science at the GenAI Center of Excellence, KAUST, advised by Prof. Bernard Ghanem. Before that, I obtained my master's degree from Shanghai Jiao Tong University and my bachelor's degree from Xi’an Jiaotong University.

My recent work focuses on advancing large video-language models, including VideoAuto-R1, an adaptive auto-thinking model, and BOLT, an effective frame-selection approach for long-form videos. I have also proposed a series of efficient, scalable temporal action detection methods (AdaTAD, ETAD, CausalTAD) and developed OpenTAD, the largest open-source framework for temporal action detection.

My research interests include:

  • Adaptive Video Reasoning: RL-based auto-thinking, agentic video reasoning
  • Efficient Video-Language Models: low-cost adaptation and inference-time efficiency for long videos
  • End-to-End Temporal Grounding: precise timestamp localization and action detection at scale

📢 I am actively seeking full-time research positions. Feel free to reach out if you have any opportunities: shuming.liu[at]kaust.edu.sa