UTAL-GNN: unsupervised temporal action localization using graph neural networks


dc.contributor.author Badatya, Bikash Kumar
dc.contributor.author Baghel, Vipul
dc.contributor.author Hegde, Ravi S.
dc.coverage.spatial United States of America
dc.date.accessioned 2025-09-04T07:14:08Z
dc.date.available 2025-09-04T07:14:08Z
dc.date.issued 2025-08
dc.identifier.citation Badatya, Bikash Kumar; Baghel, Vipul and Hegde, Ravi S., "UTAL-GNN: unsupervised temporal action localization using graph neural networks", arXiv, Cornell University Library, DOI: 10.48550/arXiv.2508.19647, Aug. 2025.
dc.identifier.issn 2331-8422
dc.identifier.uri https://doi.org/10.48550/arXiv.2508.19647
dc.identifier.uri https://repository.iitgn.ac.in/handle/123456789/11849
dc.description.abstract Fine-grained action localization in untrimmed sports videos presents a significant challenge due to rapid and subtle motion transitions over short durations. Existing supervised and weakly supervised solutions often rely on extensive annotated datasets and high-capacity models, making them computationally intensive and less adaptable to real-world scenarios. In this work, we introduce a lightweight and unsupervised skeleton-based action localization pipeline that leverages spatio-temporal graph neural representations. Our approach pre-trains an Attention-based Spatio-Temporal Graph Convolutional Network (ASTGCN) on a pose-sequence denoising task with blockwise partitions, enabling it to learn intrinsic motion dynamics without any manual labeling. At inference, we define a novel Action Dynamics Metric (ADM), computed directly from low-dimensional ASTGCN embeddings, which detects motion boundaries by identifying inflection points in its curvature profile. Our method achieves a mean Average Precision (mAP) of 82.66% and average localization latency of 29.09 ms on the DSV Diving dataset, matching state-of-the-art supervised performance while maintaining computational efficiency. Furthermore, it generalizes robustly to unseen, in-the-wild diving footage without retraining, demonstrating its practical applicability for lightweight, real-time action analysis systems in embedded or dynamic environments.
dc.description.statementofresponsibility by Bikash Kumar Badatya, Vipul Baghel and Ravi S. Hegde
dc.language.iso en_US
dc.publisher Cornell University Library
dc.title UTAL-GNN: unsupervised temporal action localization using graph neural networks
dc.type Article
dc.relation.journal arXiv
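
The abstract says the Action Dynamics Metric (ADM) locates motion boundaries at inflection points of its curvature profile. This record does not define the ADM or how the paper computes curvature, so the sketch below only illustrates the general idea on a made-up one-dimensional signal. The function name `curvature_inflections`, the `smooth_window` parameter, the use of a discrete second derivative as "curvature", and the tanh toy signal are all assumptions for illustration, not the authors' method.

```python
import numpy as np


def curvature_inflections(signal, smooth_window=1):
    """Return frame indices where the discrete second derivative changes sign.

    Hypothetical helper: `signal` stands in for a per-frame metric such as
    the ADM, whose actual definition is not given in this record.
    """
    x = np.asarray(signal, dtype=float)

    # Optional moving-average smoothing; edge padding keeps the length fixed.
    if smooth_window > 1:
        if smooth_window % 2 == 0:
            raise ValueError("smooth_window must be odd")
        half = smooth_window // 2
        padded = np.pad(x, half, mode="edge")
        kernel = np.ones(smooth_window) / smooth_window
        x = np.convolve(padded, kernel, mode="valid")

    # Assumption: approximate the curvature profile by the second derivative.
    curvature = np.gradient(np.gradient(x))

    # A strict sign flip between consecutive frames marks an inflection.
    # Frames where the second derivative is exactly zero are not reported.
    signs = np.sign(curvature)
    flips = np.where(signs[:-1] * signs[1:] < 0)[0] + 1
    return flips


# Toy signal (not real data): a smooth step with one inflection mid-sequence.
t = np.linspace(-3.0, 3.0, 60)
toy_signal = np.tanh(t)
boundaries = curvature_inflections(toy_signal)
print(boundaries)
```

On real, noisy per-frame values, a `smooth_window` larger than 1 would likely be needed to avoid spurious sign flips; the appropriate amount of smoothing is a tuning choice this record does not specify.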

