- Red Hat and AMD are extending their partnership to boost AI performance.
- The partnership covers LLM support, vLLM contributions, and a new AI Inference Server.
Red Hat and AMD have strengthened their partnership to better support AI workloads and modernise virtual machines (VMs) in hybrid cloud setups. This joint effort brings together AMD’s hardware, including Instinct GPUs and EPYC CPUs, with Red Hat’s open source platforms.
The goal is to make it easier for businesses to manage resource-heavy AI tasks and existing virtual infrastructure without needing major upgrades. Both companies are working to give IT teams the tools to scale up AI deployments while reducing system costs. This is becoming more important as organisations seek to use large language models, data pipelines, and inference engines within environments that may not have been designed with AI in mind.
Model deployment tests with AMD Instinct MI300X
Red Hat and AMD tested the performance of AI models using AMD Instinct MI300X GPUs on Microsoft Azure’s ND MI300X v5 virtual machines. These tests showed that both small and large language models could run across multiple GPUs within a single VM. This avoids the need to split workloads across many virtual machines, which often adds cost and complexity.
This setup could make AI inference more efficient. It’s meant to support teams that are training or deploying models at scale without needing to redesign their infrastructure. The fact that a single VM can support multi-GPU operations simplifies deployment and can help organisations avoid scaling challenges. With fewer systems to manage, teams can spend less time on operational tasks and more on development.
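To illustrate, a single-VM, multi-GPU deployment can be expressed in a few lines with vLLM, the runtime discussed later in this article. This is a minimal sketch, assuming a VM with eight GPUs such as an ND MI300X v5 instance; the model name is an illustrative choice, not one taken from the tests:

```python
# Minimal sketch: multi-GPU inference inside a single VM with vLLM.
# The model name and GPU count are illustrative assumptions.
from vllm import LLM, SamplingParams

# tensor_parallel_size shards the model across the GPUs visible to this
# one VM, so the workload never has to be split across multiple VMs.
llm = LLM(
    model="meta-llama/Llama-3.1-70B-Instruct",  # hypothetical model choice
    tensor_parallel_size=8,                     # all eight GPUs in the VM
)

params = SamplingParams(temperature=0.7, max_tokens=128)
outputs = llm.generate(["Why run inference in a single VM?"], params)
print(outputs[0].outputs[0].text)
```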
Red Hat AI Inference Server gains AMD GPU support
Building on this, the Red Hat AI Inference Server now supports AMD Instinct GPUs. This tool helps users run open source AI models in enterprise environments, giving them a tested path to deploy models on AMD hardware without extra tuning or setup. The combination is aimed at reducing the friction many teams face when moving AI projects from test environments into production.
The AI Inference Server is based on the vLLM open source project. Red Hat and AMD are both contributing to it. Their work includes performance tuning for dense and quantised models, kernel-level updates, and better support for collective GPU operations. These efforts allow organisations to use existing AI models more effectively and improve overall performance.
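As a concrete example of the quantised-model path, vLLM can load a pre-quantised checkpoint directly. This is a sketch only; the model and quantisation scheme below are assumptions for illustration, not details from the announcement:

```python
# Sketch: running a quantised model with vLLM, the project underlying
# the Red Hat AI Inference Server. Model and scheme are illustrative.
from vllm import LLM, SamplingParams

llm = LLM(
    model="TheBloke/Llama-2-13B-chat-AWQ",  # hypothetical pre-quantised checkpoint
    quantization="awq",                     # vLLM also accepts e.g. "gptq"
)

outputs = llm.generate(
    ["What does quantisation trade off?"],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```

Quantised weights take up less memory per GPU, which is one reason the partners’ tuning work targets quantised as well as dense models.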
Community work around vLLM
Support for multi-GPU workloads has become more important as AI use grows. Red Hat and AMD are working to help users get the most out of their current GPUs. Their changes to vLLM aim to reduce the delays associated with managing distributed GPU workloads.
They are also working with IBM and others to grow the vLLM ecosystem. The shared goal is to make it easier to run large models on open source tools, without forcing users into one cloud or vendor setup. Open contribution models like this one are helping shape a future where AI development is not tied to any single provider.
The work on vLLM also helps smaller organisations that may lack the scale of large tech firms. By providing tested tools and code optimisations, Red Hat and AMD are helping reduce the barrier to entry for teams that want to work with LLMs but don’t have extensive infrastructure.
Red Hat launches llm-d to improve model inference at scale
Red Hat recently introduced a new open source project called llm-d. This tool is designed for teams running generative AI models across large, distributed systems. It is built on Kubernetes and works with the vLLM runtime.
AMD, NVIDIA, IBM Research, and Google Cloud are all backing the project. It aims to help developers and researchers build large-scale model inference workflows. According to Red Hat, llm-d is designed to work with different model types and can be adapted to run in public, private, or hybrid environments.
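Since llm-d builds on the vLLM runtime, a deployment can typically be queried through vLLM’s OpenAI-compatible HTTP API. The sketch below assumes a hypothetical gateway URL and model name; neither comes from the announcement:

```python
# Hedged client-side sketch: querying an llm-d deployment through
# vLLM's OpenAI-compatible API. URL and model name are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://llm-d-gateway.example.internal/v1",  # hypothetical endpoint
    api_key="unused",  # many self-hosted gateways do not check this
)

response = client.chat.completions.create(
    model="example-model",  # placeholder identifier
    messages=[{"role": "user", "content": "Hello from a hybrid cloud deployment."}],
)
print(response.choices[0].message.content)
```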
Red Hat’s decision to offer llm-d as an open source resource demonstrates the company’s commitment to providing accessible and transparent tools. With several contributors, the project may benefit from faster iteration and broader compatibility.
AMD EPYC CPUs help modernise virtual machines
While much of the focus has been on AI, the Red Hat and AMD partnership also covers more traditional IT systems. Red Hat OpenShift Virtualization runs on AMD EPYC processors, letting teams manage both VMs and containers from a single platform.
This can help reduce the number of systems needed to run both new and legacy applications. It also supports better hardware use, which can lower power and licensing costs. Managing VMs and containers together lets organisations extend the life of existing workloads while building for future needs.
Red Hat says OpenShift Virtualization works well with AMD EPYC CPUs on popular servers from Dell, HPE, and Lenovo. This gives businesses a clearer path to simplify their data centres while keeping workloads in place. As more enterprises mix containerised applications with traditional systems, this support becomes more useful.
RHEL 10 gains support from broader partner network
Red Hat Enterprise Linux 10, the latest version of the company’s operating system, has gained support from AMD and other hardware and cloud vendors. The update includes tools for AI work, stronger security features, and better performance in hybrid cloud setups.
Red Hat says the wider ecosystem support means users can expect more stable setups when combining RHEL 10 with their choice of hardware. This version of RHEL aims to be ready for both cloud-native deployments and long-standing workloads that still run on physical servers.
Red Hat and SiFive explore RISC-V support
In a separate partnership, Red Hat is working with SiFive to test RHEL 10 on the open RISC-V architecture. The developer preview gives early access to a version of the OS built for open hardware.
This effort supports a long-term goal of giving users more choice in hardware. It also shows Red Hat’s continued focus on keeping its tools open and portable. The RISC-V effort reflects a growing interest in alternative processor architectures and has the potential to provide more options for edge computing and research use cases.
The work between Red Hat and AMD reflects a push to support both AI development and ongoing IT operations. While newer tools like AI inference servers and llm-d get most of the attention, support for virtual machines and open hardware shows the effort to balance future needs with present systems.
This approach may help businesses take on AI projects without having to replace their existing infrastructure. It also gives developers more control over how they run their workloads—whether that’s in the cloud, in the data centre, or somewhere in between.