Use Gemini to cut 100min+ soccer recordings to scoring clips
Spain vs England Combine Gen AI with SQL to cut 100 minutes+ of soccer videos from Fox Entertainment into 28 short clips of scoring moments, each around 5-10 seconds long
When my cofounder Jason Wang and I were watching the Euro2024 semi-final, one idea came to us: why not use AI to find the exact scoring moments and cut it into clips? So, we hacked this out!
1. Collect videos and upload to Roe AI.
2. Create a video extraction agent, and configure the output schema. The output schema is our prompt to the Gemini model. Therefore, we can configure the start and end timestamp via natural language.
Example JSON output_schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "array",
"items": {
"type": "object",
"properties": {
"scorer": {
"type": "string",
"description": "Player name who scored the goal"
},
"scored_start_timestamp": {
"type": "string",
"description": "Timestamp in the video before the player scored the goal. The timestamp should be in the format of mm:ss"
},
"scored_end_timestamp": {
"type": "string",
"description": "Timestamp in the video after the player scored the goal. The timestamp should be in the format of mm:ss"
},
"duration_in_seconds": {
"type": "string",
"description": "Duration of the clip in seconds"
}
},
"description": "6-10 seconds clip highlighting when the player scored the goal."
},
"description": "The highlight video clips about the scorer"
}
3. Then create a video trimming agent, the agent simply takes a video, a starting time and duration, to cut the video into a clip.
4. Finally, let's assemble this together into a SQL query
SELECT name, file AS original_video,
JSONExtractString(ai_detection, 'scorer') as scorer,
JSONExtractString(ai_detection, 'scored_start_timestamp') AS start_timestamp,
JSONExtractString(ai_detection, 'scored_end_timestamp') AS end_timestamp,
JSONExtractString(ai_detection, 'duration_in_seconds') AS duration,
run_agent_sync('709be2bb-e8f2-4a3f-9592-58e90256d313', -- this is the video trimming agent
MAP('video', file, 'start_second', start_timestamp, 'duration_seconds', duration)) AS clip -- pass in argument of video trimming agent
FROM (SELECT *, arrayJoin(JSONExtractArrayRaw(clips)) AS ai_detection, -- explode JSON array of clip timestamp into multiple rows
FROM (SELECT *, run_agent_sync('b3a6a5a1-d4d7-4c32-af4f-7e1858fad13b',
MAP('instruction', 'Extract clip start and end time when a team shots on the goal with description and player name', 'video', file)) -- this is the video AI extraction agent that gives the timestamps of scoring moments
AS clips FROM soccer_games));