I am a final-year Ph.D. student in Computer Science at the GenAI Center of Excellence, KAUST, advised by Prof. Bernard Ghanem. Prior to that, I obtained my master's degree from Shanghai Jiao Tong University and my bachelor's degree from Xi’an Jiaotong University.
My recent work focuses on advancing large video-language models, including VideoAuto-R1, an adaptive auto-thinking model, and BOLT, an effective frame-selection approach for long-form videos. I have also proposed a series of efficient, scalable temporal action detection methods (AdaTAD, ETAD, CausalTAD) and developed OpenTAD, the largest open-source framework for temporal action detection.
My research interests include:
- Adaptive Video Reasoning: RL-based auto-thinking, agentic video reasoning
- Efficient Video-Language Models: low-cost adaptation and inference-time efficiency for long videos
- End-to-End Temporal Grounding: precise timestamp localization and action detection at scale
📢 I am actively seeking full-time research positions. Feel free to reach out if you have any opportunities: shuming.liu[at]kaust.edu.sa