From POC to Production: The Infrastructure for Gen AI at Scale
As enterprises move GenAI from proof-of-concept to production, a new set of infrastructure challenges is emerging. The IDC report, “Planning for GenAI Inferencing Impact on Infrastructure Investment Decisions,” reveals that organizations are grappling with forecasting capacity, managing costs, and ensuring performance, especially with distributed workloads. Join IDC Research Vice President, Nancy Gohring, and Red Hat Chief Technology Officer, Americas, María Bracho, as they unpack the report’s key findings. This discussion will go beyond the hype to explore practical strategies for optimizing infrastructure for GenAI inferencing. You’ll learn how to navigate the shift to production-scale AI, balance on-premises and cloud deployments, and understand the critical role of technologies like vLLM and model compression in controlling costs and latency. We will also discuss the importance of Retrieval-Augmented Generation (RAG) and how smarter infrastructure decisions can help you scale your AI initiatives while maintaining security and performance.
Speakers:
María Bracho, Red Hat Chief Technology Officer, Americas
Nancy Gohring, IDC Senior Research Director, AI
Tom Schmidt, CIO Marketing Services Contributing Editor
