Video Analytics AI Agents at GTC 2026

1 minute read

Published: March 19, 2026

At GTC 2026, I was excited to support the NVIDIA booth demo “Turn Video Into Insights With Video Analytics AI Agents”, highlighting the NVIDIA AI Blueprint for Video Search and Summarization (VSS).

The demo showed how VSS can help developers build visual AI agents that review events, understand physical context, and streamline decisions across large volumes of live and recorded video. The workflow brought together search, summarization, Q&A, active alerts, VLM and CV model implementation, and event review, with examples spanning smart cities, warehouses, and factories.

My main contribution focused on the video embedding search stack behind the demo. I helped prepare search models and datasets, finalized embedding models and person attribute search checkpoints, generated example queries and captions for warehouse and other video scenarios, and supported evaluation data preparation for demo clips and KPI validation. I also helped debug and validate key pieces of the real-time search path, including text embedding behavior, model packaging, and search-profile integration.

This work connected directly to the broader VSS search roadmap. In the VSS 3.1.0 release, the Search Agent Workflow added attribute search, multi-embedding fusion search, and a critic agent for reviewing search results. The release also updated the RT-CV microservice to support embedding generation for detected objects, including RADIO-CLIP and SigLIP2 embedding models.
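To give a feel for what multi-embedding fusion search means in practice, here is a minimal sketch. It is not the VSS implementation or its API. It assumes a simple late-fusion scheme in which each clip has one embedding per model, the query is embedded by the same models, and the final score is a weighted sum of per-model cosine similarities. The model names (`model_a`, `model_b`), the weights, the clip IDs, and the toy 2-D vectors are all hypothetical placeholders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fused_search(query_embs, index, weights, top_k=3):
    """Rank clips by a weighted sum of per-model cosine similarities.

    query_embs: {model_name: query vector}
    index:      {clip_id: {model_name: clip vector}}
    weights:    {model_name: fusion weight} (hypothetical values)
    """
    scores = {}
    for clip_id, clip_embs in index.items():
        scores[clip_id] = sum(
            weights[m] * cosine(query_embs[m], clip_embs[m]) for m in weights
        )
    # Highest fused score first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Toy 2-D vectors standing in for real embeddings (assumption for illustration)
index = {
    "clip_forklift": {"model_a": [1.0, 0.0], "model_b": [0.9, 0.1]},
    "clip_walkway": {"model_a": [0.0, 1.0], "model_b": [0.2, 0.8]},
}
query = {"model_a": [1.0, 0.1], "model_b": [1.0, 0.0]}

results = fused_search(query, index, weights={"model_a": 0.5, "model_b": 0.5})
for clip_id, score in results:
    print(f"{clip_id}: {score:.3f}")
```

In this toy setup the query points in nearly the same direction as both `clip_forklift` vectors, so that clip ranks first. A real system would likely use an approximate nearest-neighbor index rather than a full scan, and the fusion weights would be tuned on evaluation data.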

For me, the most rewarding part was seeing several threads come together in one visible experience: video embeddings for event and activity search, object/person attribute search through vision-language embeddings, model packaging for deployment, and practical demo workflows that make large video archives feel searchable and useful.

GTC 2026 VSS booth demo: Turn Video Into Insights With Video Analytics AI Agents


Cosmos 3, VANTAGE-Bench, and TAR at Computex 2026

1 minute read

Published: June 01, 2026

NVIDIA’s Cosmos 3 release at Computex 2026 highlighted physical AI benchmarks including VANTAGE-Bench and Traffic Anomaly Reasoning (TAR). I supported VANTAGE-Bench preparation as a NeurIPS 2026 competition effort and TAR launch alignment for AI City Challenge 2026 Track 3.

UrbanAI at NeurIPS 2025: Advancing Multi-Camera Tracking & Multimodal Spatial AI with NVIDIA Metropolis

1 minute read

Published: December 07, 2025

At the NeurIPS 2025 UrbanAI Workshop, we presented nine years of progress from the AI City Challenge and introduced NVIDIA’s latest advancements in multi-camera 3D perception, cloud-native tracking workflows, and end-to-end spatial AI models like MCBLT and Sparse4D—supporting next-generation smart city applications.

Dr. Zheng (Thomas) Tang
