Next-Gen AI Compute: Silicon to Unlock the Next Era of AI Scaling (Web Summit Lisbon 2025)
(This article was generated with AI and is based on an AI-generated transcription of a real talk on stage. While we strive for accuracy, we encourage readers to verify important information.)
At Web Summit Lisbon 2025, Ms. Sally Ward-Foxton of EE Times interviewed Mr. Walter Goodwin, CEO of Fractile, about AI hardware. He noted that the industry’s central challenge has shifted from training to inference, demanding new infrastructure to keep pace with AI’s rapid advances.
Mr. Goodwin explained that AI capability scales with both model size and how long a model is allowed to run: letting models process longer, or applying more compute, unlocks greater intelligence, such as advanced coding ability. This “wall clock time” is therefore critical to achieving significant economic transformation.
The main bottleneck for AI inference speed, especially LLM text generation, is memory bandwidth: current GPU architectures separate compute from memory, limiting how fast data can flow between them. Fractile predicts that future models will generate hundreds of millions of tokens, requiring a 100x or greater increase in output speed.
Nvidia links multiple chips together to aggregate bandwidth, but Mr. Goodwin noted that this creates a cost-speed trade-off: systems grow expensive while throughput stays limited. Fractile instead re-architects the chip itself for fundamentally better memory access.
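To make the memory-bandwidth argument concrete, here is a rough back-of-envelope sketch; the model size, precision, and bandwidth figures below are illustrative assumptions, not Fractile’s or Nvidia’s numbers. During LLM decoding, each generated token requires streaming the model’s weights from memory, so bandwidth divided by weight bytes gives a hard ceiling on tokens per second, and linking chips raises that ceiling only in proportion to the hardware added.

```python
# Back-of-envelope ceiling on LLM decode speed when memory-bandwidth bound.
# All figures are illustrative assumptions, not vendor specifications.

def max_tokens_per_sec(params_billion: float, bytes_per_param: float,
                       bandwidth_tb_per_s: float) -> float:
    """Upper bound on tokens/s if each token must stream all weights once."""
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return bandwidth_tb_per_s * 1e12 / weight_bytes

# A hypothetical 70B-parameter model in 8-bit weights (1 byte/param)
# on a single accelerator with ~3 TB/s of memory bandwidth:
one_chip = max_tokens_per_sec(70, 1.0, 3.0)         # ~43 tokens/s

# Sharding the same weights across 8 such chips (tensor parallelism)
# scales aggregate bandwidth -- and hardware cost -- roughly 8x:
eight_chips = max_tokens_per_sec(70, 1.0, 8 * 3.0)  # ~343 tokens/s

print(f"1 chip : ~{one_chip:.0f} tokens/s ceiling")
print(f"8 chips: ~{eight_chips:.0f} tokens/s ceiling")
```

Under these assumptions, the only route to more speed is proportionally more silicon, which is exactly the cost-speed trade-off Mr. Goodwin describes.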
This issue stems from a historical imbalance: over 25 years, compute power increased roughly a million-fold while memory access improved only about 40-fold. That divergence undermines both cost efficiency and inference speed, and Fractile’s innovation spans the entire stack to rectify it.
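A quick compound-growth calculation, using the round figures quoted above, shows how stark that divergence is on an annual basis:

```python
# Implied annual growth rates behind the 25-year figures quoted above.
years = 25
compute_gain = 1e6  # roughly million-fold compute improvement
memory_gain = 40    # roughly 40x memory-access improvement

compute_per_year = compute_gain ** (1 / years)  # ~1.74x per year
memory_per_year = memory_gain ** (1 / years)    # ~1.16x per year

print(f"compute: ~{compute_per_year:.2f}x/year")
print(f"memory : ~{memory_per_year:.2f}x/year")
# Compute outpaces memory by roughly 1.5x every year, so the gap compounds.
```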
On the question of data-center heterogeneity, Ms. Ward-Foxton asked whether specialized hardware is the answer. Mr. Goodwin cautioned that hardware’s long lifespan conflicts with how quickly AI workloads evolve, making narrowly specialized chips risky and prone to rapid obsolescence.
Fractile aims instead to deliver a single, adaptable platform offering radical speed improvements without sacrificing versatility, so the silicon remains useful as AI applications change rather than being designed around any one workload.
Mr. Goodwin highlighted AI’s latent capabilities, citing Google’s “AI co-scientist” project, which achieved research breakthroughs by scaling inference compute and running AI agents for weeks. That result, he argued, shows the immense untapped potential faster hardware could unlock in today’s models.
Ultimately, the speed of AI’s “internal flywheel,” its ability to reason, iterate, and debate with itself, is paramount: inference speed is the lifeblood of applying AI to problems humanity has faced for millennia. Fractile’s integrated team aims to deliver this crucial silicon innovation.