Research output: Contribution in Book/Report/Proceedings - With ISBN/ISSN › Conference contribution/Paper › peer-review
TY - GEN
T1 - Text-driven Video Acceleration
AU - De Souza Ramos, Washington
AU - Soriano Marcolino, Leandro
AU - Nascimento, Erickson R.
PY - 2024/9/30
Y1 - 2024/9/30
N2 - From the dawn of the digital revolution until today, data has grown exponentially, especially in images and videos. Smartphones and wearable devices with high storage and long battery life contribute to continuous recording and massive uploads to social media. This rapid increase in visual data, combined with users' limited time, demands methods to produce shorter videos that convey the same information. Semantic Fast-Forwarding reduces viewing time by adaptively accelerating videos and slowing down for relevant segments. However, current methods require predefined visual concepts or user supervision, which is costly and time-consuming. This work explores using textual data to create text-driven fast-forwarding methods that generate semantically meaningful videos without explicit user input. Our proposed approaches outperform baselines, achieving F1 Score improvements of up to 12.8 percentage points over the best competitors. Comprehensive user and ablation studies, along with quantitative and qualitative evaluations, confirm their superiority. Visual results are available at https://youtu.be/cOYqumJQOY and https://youtu.be/u6ODTv7-9C4.
AB - From the dawn of the digital revolution until today, data has grown exponentially, especially in images and videos. Smartphones and wearable devices with high storage and long battery life contribute to continuous recording and massive uploads to social media. This rapid increase in visual data, combined with users' limited time, demands methods to produce shorter videos that convey the same information. Semantic Fast-Forwarding reduces viewing time by adaptively accelerating videos and slowing down for relevant segments. However, current methods require predefined visual concepts or user supervision, which is costly and time-consuming. This work explores using textual data to create text-driven fast-forwarding methods that generate semantically meaningful videos without explicit user input. Our proposed approaches outperform baselines, achieving F1 Score improvements of up to 12.8 percentage points over the best competitors. Comprehensive user and ablation studies, along with quantitative and qualitative evaluations, confirm their superiority. Visual results are available at https://youtu.be/cOYqumJQOY and https://youtu.be/u6ODTv7-9C4.
U2 - 10.5753/sibgrapi.est.2024.31642
DO - 10.5753/sibgrapi.est.2024.31642
M3 - Conference contribution/Paper
SP - 35
EP - 41
BT - 37th Conference on Graphics, Patterns and Images (SIBGRAPI)
T2 - 2024 37th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI)
Y2 - 30 September 2024 through 3 October 2024
ER -