Relying only on nvidia-smi is like measuring highway usage by checking if any car is present, not how many lanes are full.
This talk reveals the metrics nvidia-smi doesn't show and introduces open-source tools that expose actual GPU efficiency.
We'll cover:
- Why GPU utilization is not the same as GPU efficiency.
- A deep dive into the key metrics: SM, Tensor Core, and memory.
- Practical GPU profiling and monitoring setup.
- Identifying bottlenecks in inference workloads.
Attendees will leave understanding how to identify underutilized GPUs and discover real optimization opportunities across inference workloads.
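
To make the highway analogy concrete, here is a minimal toy sketch in Python. It assumes the nvidia-smi definition of GPU utilization (the percent of time over the sample period during which one or more kernels was executing) and compares it with the average share of SMs that were actually busy. The sample data, the 100-SM GPU, and the function names are hypothetical, chosen only for illustration; no real GPU is queried.

```python
# Toy model: each sample records how many SMs were busy during one time slice.
# All numbers are made up for illustration; nothing here reads a real GPU.

def utilization_pct(busy_sms_per_sample):
    """nvidia-smi-style utilization: share of samples with at least one busy SM."""
    active = sum(1 for busy in busy_sms_per_sample if busy > 0)
    return 100.0 * active / len(busy_sms_per_sample)

def sm_active_pct(busy_sms_per_sample, total_sms):
    """Average share of SMs that were busy across all samples."""
    return 100.0 * sum(busy_sms_per_sample) / (total_sms * len(busy_sms_per_sample))

TOTAL_SMS = 100  # hypothetical GPU with 100 SMs
samples = [10, 10, 10, 10, 0, 10, 10, 10, 10, 10]  # small kernels, one idle slice

util = utilization_pct(samples)
sm_active = sm_active_pct(samples, TOTAL_SMS)
print(f"utilization: {util:.0f}%  SM active: {sm_active:.0f}%")
# prints "utilization: 90%  SM active: 9%"
```

In this made-up trace the GPU looks 90% utilized even though only about a tenth of its SMs ever do work, which is the kind of gap the talk is about.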