📝 Publications

VideoAuto-R1: Video Auto Reasoning via Thinking Once, Answering Twice
Shuming Liu, Bernard Ghanem, Vikas Chandra, Yunyang Xiong, et al.
CVPR 2026, [Project Page] [Code]

BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding
Shuming Liu, Chen Zhao, Tianqi Xu, Bernard Ghanem
CVPR 2025, [Code]

OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection
Shuming Liu, Chen Zhao, Bernard Ghanem, et al.
CVPRW 2025, [Code]

Harnessing Temporal Causality for Advanced Temporal Action Detection
Shuming Liu, Lin Sui, Chen-Lin Zhang, Fangzhou Mu, Chen Zhao, Bernard Ghanem
Technical Report, [Code]

End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames
Shuming Liu, Chen-Lin Zhang, Chen Zhao, Bernard Ghanem
CVPR 2024, [Code]

ETAD: Training Action Detection End to End on a Laptop
Shuming Liu, Mengmeng Xu, Chen Zhao, Xu Zhao, Bernard Ghanem
CVPRW 2023, [Code]

Mixture of States: Routing Token-Level Dynamics for Multimodal Generation
Haozhe Liu, Ding Liu, Mingchen Zhuge, Zijian Zhou, Tian Xie, Sen He, Yukang Yang, Shuming Liu, Yuren Cong, Jiadong Guo, Hongyu Xu, Ke Xu, Kam-Woh Ng, Juan C Pérez, Tao Xiang, Wei Liu, Shikun Liu, Jürgen Schmidhuber
CVPR 2026

TimeLoc: A Unified End-to-End Framework for Precise Timestamp Localization in Long Videos
Chen-Lin Zhang*, Lin Sui*, Shuming Liu*, Fangzhou Mu*, Zhangcheng Wang, Bernard Ghanem
arXiv 2025

ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders
Carlos Hinojosa, Shuming Liu, Bernard Ghanem
ECCV 2024, [Code]

Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning
Chen Zhao, Shuming Liu, Karttikeya Mangalam, Guocheng Qian, Fatimah Zohra, Abdulmohsen Alghannam, Jitendra Malik, Bernard Ghanem
CVPR 2024, [Code]

Look, Listen, and Attack: Backdoor Attacks Against Video Action Recognition
Hasan Hammoud, Shuming Liu, Mohammed Alkhrashi, Fahad AlBalawi, Bernard Ghanem
CVPRW 2024

Boundary-Denoising for Video Activity Localization
Mengmeng Xu, Mattia Soldan, Jialin Gao, Shuming Liu, Juan-Manuel Perez-Rua, Bernard Ghanem
ICLR 2024, [Code]

Re2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization
Chen Zhao, Shuming Liu, Karttikeya Mangalam, Bernard Ghanem
CVPR 2023, [Code]

Transferable Knowledge Based Multi-Granularity Fusion Network for Weakly Supervised Temporal Action Detection
Haisheng Su, Xu Zhao, Tianwei Lin, Shuming Liu, Zhilan Hu
TMM 2020