TY - GEN
T1 - Generating Factually Consistent Sport Highlights Narrations
AU - Sarfati, Noah
AU - Yerushalmy, Ido
AU - Chertok, Michael
AU - Keller, Yosi
N1 - Publisher Copyright: © 2023 ACM.
PY - 2023/10/29
Y1 - 2023/10/29
N2 - Sports highlights are an important form of media for fans worldwide, as they provide short videos that capture key moments from games, often accompanied by the original commentaries of the game's announcers. However, traditional forms of presenting sports highlights have limitations in conveying the complexity and nuance of the game. In recent years, the use of Large Language Models (LLMs) for natural language generation has emerged and is a promising approach for generating narratives that can provide a more compelling and accessible viewing experience. In this paper, we propose an end-to-end solution to enhance the experience of watching sports highlights by automatically generating factually consistent narrations using LLMs and crowd noise extraction. Our solution involves several steps, including extracting the source of information from the live broadcast using a transcription model, prompt engineering, and comparing out-of-the-box models for consistency evaluation. We also propose a new dataset annotated on generated narratives from 143 Premier League plays and fine-tune a Natural Language Inference (NLI) model on it, achieving 92% precision. Furthermore, we extract crowd noise from the original video to create a more immersive and realistic viewing experience for sports fans by adapting speech enhancement SOTA models on a brand new dataset created from 155 Ligue 1 games.
KW - factual consistency evaluation
KW - hallucinations
KW - large language models (llms)
KW - natural language inference
KW - prompt engineering
KW - speech enhancing
UR - http://www.scopus.com/inward/record.url?scp=85178270305&partnerID=8YFLogxK
U2 - https://doi.org/10.1145/3606038.3616157
DO - https://doi.org/10.1145/3606038.3616157
M3 - Conference contribution
T3 - MMSports 2023 - Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports, Co-located with: MM 2023
SP - 15
EP - 22
BT - MMSports 2023 - Proceedings of the 6th International Workshop on Multimedia Content Analysis in Sports, Co-located with: MM 2023
T2 - 6th ACM International Workshop on Multimedia Content Analysis in Sports, MMSports 2023, co-located with ACM Multimedia 2023
Y2 - 29 October 2023
ER -