dc.contributor.author |
Baghel, Vipul |
|
dc.contributor.author |
Deb, Sagar Deep |
|
dc.contributor.author |
Nagisetti, Rithihas |
|
dc.contributor.author |
Srinivasan, Babji |
|
dc.contributor.author |
Hegde, Ravi S. |
|
dc.coverage.spatial |
India |
|
dc.date.accessioned |
2025-08-01T07:02:19Z |
|
dc.date.available |
2025-08-01T07:02:19Z |
|
dc.date.issued |
2024-12-19 |
|
dc.identifier.citation |
Baghel, Vipul; Deb, Sagar Deep; Nagisetti, Rithihas; Srinivasan, Babji and Hegde, Ravi S., "A novel hierarchical pipeline for fine-grained punch recognition in uncontrolled setting", in the 9th International Conference on Computer Vision and Image Processing (CVIP 2024), Chennai, IN, Dec. 19-21, 2024. |
|
dc.identifier.uri |
https://doi.org/10.1007/978-3-031-93688-3_19 |
|
dc.identifier.uri |
https://repository.iitgn.ac.in/handle/123456789/11709 |
|
dc.description.abstract |
Human Action Recognition (HAR) is a rapidly emerging topic in computer vision, with applications in athletics, healthcare, security, and sports performance analysis. In sports, HAR can be used to monitor athletes’ movements to facilitate skill development and decision-making training. However, for highly dynamic sports like boxing, automatic HAR becomes more difficult. Variations in lighting conditions, background, and camera perspective, together with rapid movements, compound the complexity of automatic HAR, and performance worsens further on videos recorded in uncontrolled environments. This manuscript proposes a hierarchical pipeline for robust sports activity recognition. A novel regression-based detection method is introduced using a Dual Stream Spatio-Temporal Transformer, and a 3D Convolutional Neural Network (CNN) is used for fine-grained classification. The approach begins with regression-based detection to identify punch instances within unedited boxing video streams. A fine-grained classification module then categorizes the detected punches into six basic punch types: jab, cross, lead hook, rear hook, lead uppercut, and rear uppercut. This hierarchical architecture enables efficient and accurate punch recognition in diverse real-world settings. Experimental results on various unedited YouTube videos show the effectiveness of the proposed approach, which achieves an overall mean detection accuracy of 96.09% and a mean classification accuracy of 88.25% on test videos. |
|
dc.description.statementofresponsibility |
by Vipul Baghel, Sagar Deep Deb, Rithihas Nagisetti, Babji Srinivasan and Ravi S. Hegde |
|
dc.language.iso |
en_US |
|
dc.publisher |
Springer |
|
dc.title |
A novel hierarchical pipeline for fine-grained punch recognition in uncontrolled setting |
|
dc.type |
Conference Paper |
|
dc.relation.journal |
9th International Conference on Computer Vision and Image Processing (CVIP 2024) |
|