Video Analytics AI Agents at GTC 2026

At GTC 2026, I was excited to support the NVIDIA booth demo “Turn Video Into Insights With Video Analytics AI Agents”, highlighting the NVIDIA AI Blueprint for Video Search and Summarization (VSS).

The demo showed how VSS can help developers build visual AI agents that review events, understand physical context, and streamline decisions across large volumes of live and recorded video. The workflow brought together search, summarization, Q&A, active alerts, VLM and CV model implementation, and event review, with examples spanning smart cities, warehouses, and factories.

My main contribution focused on the video embedding search stack behind the demo. I helped prepare search models and datasets, finalized embedding models and person attribute search checkpoints, generated example queries and captions for warehouse and other video scenarios, and supported evaluation data preparation for demo clips and KPI validation. I also helped debug and validate key pieces of the real-time search path, including text embedding behavior, model packaging, and search-profile integration.

This work connected directly to the broader VSS search roadmap. In the VSS 3.1.0 release, the Search Agent Workflow added attribute search, multi-embedding fusion search, and a critic agent for reviewing search results. The release also updated the RT-CV microservice to support embedding generation for detected objects, including RADIO-CLIP and SigLIP2 embedding models.
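
To make the idea of multi-embedding fusion search more concrete, here is a minimal sketch of one common approach: score each clip in several embedding spaces and combine the cosine similarities with a weighted average. This is an illustration only, not the VSS implementation. The function names, the "scene" and "attribute" space names, the weights, and the toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fused_search(query_vecs, index, weights, top_k=3):
    """Rank clips by a weighted average of per-space cosine similarities.

    query_vecs: {space_name: vector} for the query
    index:      {clip_id: {space_name: vector}} for the stored clips
    weights:    {space_name: float}, the hypothetical fusion weights
    """
    total = sum(weights.values())
    scored = []
    for clip_id, clip_vecs in index.items():
        score = sum(
            weights[space] * cosine(query_vecs[space], clip_vecs[space])
            for space in weights
        ) / total
        scored.append((clip_id, score))
    # Highest fused score first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy index: two hypothetical embedding spaces per clip.
index = {
    "clip_forklift": {"scene": [1.0, 0.0], "attribute": [0.0, 1.0]},
    "clip_pallet": {"scene": [0.0, 1.0], "attribute": [0.0, 1.0]},
    "clip_worker": {"scene": [1.0, 0.0], "attribute": [1.0, 0.0]},
}
query = {"scene": [1.0, 0.0], "attribute": [1.0, 0.0]}

results = fused_search(query, index, {"scene": 0.5, "attribute": 0.5}, top_k=2)
for clip_id, score in results:
    print(f"{clip_id}: {score:.2f}")
```

In this toy setup, a clip that matches the query in only one space still surfaces, but ranks below a clip that matches in both. A production system would more likely use an approximate nearest-neighbor index per space and tune the weights against evaluation data.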

For me, the most rewarding part was seeing several threads come together in one visible experience: video embeddings for event and activity search, object/person attribute search through vision-language embeddings, model packaging for deployment, and practical demo workflows that make large video archives feel searchable and useful.

GTC 2026 VSS booth demo: Turn Video Into Insights With Video Analytics AI Agents