Authors: Kumar, Rahul; Baghel, Vipul; Singh, Sudhanshu; Yadav, Shivam; Srinivasan, Babji; Hegde, Ravi
Dates: 2026-01-07; 2026-01-07; 2025-07-16
DOI: 10.1007/978-3-032-08511-5_26
URI: http://repository.iitgn.ac.in/handle/IITG2025/33773
Abstract: Combat sports like MMA and boxing increasingly adopt computer vision for real-time, non-intrusive movement analysis. However, challenges remain due to high costs, environmental variability, and the complexity of fluid, unstructured actions. We propose a novel vision-based method for punch detection, demarcation, classification, and scoring in boxing. Key contributions include: (1) a well-annotated dataset of 6,915 punch clips across six categories, sourced from 20 YouTube sparring sessions featuring 18 athletes; and (2) a hierarchical framework integrating boundary detection with classification for precise action localization in free-form videos. Our model achieves 98% accuracy on training and 91% on testing data. The system is also validated in home-based, self-paced punching scenarios, showing promise for low-resource settings. Results suggest that high-quality training video analysis can improve technique and performance in combat sports and beyond.
Language: en-US
Keywords: Sports analytics; Computer vision; Human pose estimation; Remote training
Title: Real-time combat training analytics: skeleton-based temporal action localization in unstructured video
Type: Conference Paper