Standard Kernel Raises $20M to Let AI Rewrite the Software That Runs AI — Up to 4x GPU Speedup

Standard Kernel, a startup that uses AI to automatically optimize GPU software, has raised a $20 million seed round led by Jump Capital, with participation from General Catalyst, Felicis, Cowboy Ventures, and strategic angels including Google’s Jeff Dean.
The Problem: Billions in GPUs Running Below Peak
Here’s an uncomfortable truth about the AI boom: companies are spending hundreds of billions of dollars on GPU clusters, but much of that hardware isn’t running at peak performance. Extracting maximum efficiency from modern accelerators like NVIDIA’s H100 requires deep expertise in hardware architecture, compiler behavior, and low-level systems optimization, and only a very small pool of engineers worldwide has all three.
Standard Kernel’s pitch is simple: let AI do it instead. Their system automatically generates ultra-optimized GPU software (called “kernels”) that can extract significantly more performance from existing hardware.
The Results: Outperforming NVIDIA’s Own Software
In partner testing, Standard Kernel has demonstrated speedups ranging from 80% (roughly 1.8x) to 4x on end-to-end workloads running on NVIDIA H100 GPUs. Perhaps most impressively, their AI-generated kernels have outperformed NVIDIA’s own highly optimized cuDNN library in certain scenarios.
Think about what that means: in at least some cases, an AI system is writing GPU code that runs faster than what NVIDIA’s own engineers produce. And NVIDIA is the company that designed the hardware.
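The reported range mixes a percentage with a multiplier, so it helps to put both ends on the same scale. The short Python sketch below does that arithmetic. It assumes "80%" means 80% more throughput than the baseline and "4x" means four times the baseline throughput; the article does not spell out either definition. The function names and the 1,000-GPU fleet are hypothetical, chosen only for illustration, and the fleet estimate assumes perfect scaling, which real clusters rarely achieve.

```python
import math


def speedup_from_percent(improvement_pct: float) -> float:
    """Convert 'N% faster' (assumed to mean N% more throughput) to a multiplier."""
    return 1.0 + improvement_pct / 100.0


def time_saved_fraction(speedup: float) -> float:
    """Fraction of the original runtime eliminated at a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


def gpus_for_same_work(baseline_gpus: int, speedup: float) -> int:
    """GPUs needed to match the baseline fleet's throughput.

    Assumes perfect linear scaling (an idealization, not a claim
    made by the article).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return math.ceil(baseline_gpus / speedup)


if __name__ == "__main__":
    low = speedup_from_percent(80.0)  # low end of the reported range
    high = 4.0                        # high end of the reported range
    for s in (low, high):
        saved = time_saved_fraction(s)
        fleet = gpus_for_same_work(1000, s)  # 1000 is a hypothetical fleet size
        print(f"{s:.1f}x speedup: {saved:.0%} less runtime, "
              f"{fleet} GPUs instead of 1000")
```

Under these assumptions, the low end of the range cuts runtime by a bit less than half and the high end cuts it by three quarters, which is why the same claim can sound modest or dramatic depending on which number is quoted.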
The Team and Backers
The founding team includes alumni from MIT, Stanford, UIUC, and SJTU who have contributed widely used open-source research including KernelBench and Kernel Tree Search. The investor list reads like a who’s who of AI: Jeff Dean (Google’s chief scientist), Jonathan Frankle (of lottery ticket hypothesis fame), CoreWeave, and Ericsson Ventures.
The Bottom Line
Standard Kernel is essentially betting that the bottleneck in AI isn’t hardware but software. With $20 million in hand and AI-generated kernels that have beaten NVIDIA’s own code in some scenarios, they might be right. If they can deliver on the promise of day-one peak performance on new hardware platforms, they could save the AI industry billions in unnecessary GPU purchases. The irony? AI optimizing the software that runs AI. That starts to look a lot like recursive self-improvement.