From POC to Production: The Infrastructure for Gen AI at Scale

As enterprises move GenAI from proof of concept to production, a new set of infrastructure challenges is emerging. The IDC report "Planning for GenAI Inferencing Impact on Infrastructure Investment Decisions" reveals that organizations are grappling with forecasting capacity, managing costs, and ensuring performance, especially for distributed workloads. Join Nancy Gohring of IDC and María Bracho, Red Hat Chief Technology Officer, Americas, as they unpack the report's key findings. This discussion goes beyond the hype to explore practical strategies for optimizing infrastructure for GenAI inferencing. You'll learn how to navigate the shift to production-scale AI, balance on-premises and cloud deployments, and understand the role of technologies like vLLM and model compression in controlling costs and latency. The speakers will also discuss Retrieval-Augmented Generation (RAG) and how smarter infrastructure decisions can help you scale your AI initiatives while maintaining security and performance.

Speakers:
María Bracho, Red Hat Chief Technology Officer, Americas
Nancy Gohring, IDC Senior Research Director, AI
Tom Schmidt, CIO Marketing Services Contributing Editor

White paper from Red Hat
