Red Hat OpenShift Service on AWS (ROSA) is a fully managed application platform that streamlines building, deploying, and scaling applications. For machine learning (ML) workloads, ROSA now supports On-Demand Capacity Reservations (ODCR) and Capacity Blocks for ML, allowing cloud architects and platform administrators to strategically use their existing AWS purchases to help deliver uninterrupted access to essential compute infrastructure.

Today, ROSA is available in over 30 regions and supports over 600 instance types, allowing customers to run diverse workloads according to their business needs. However, maintaining guaranteed or uninterrupted access to a specific infrastructure type in a particular availability zone (AZ) is important for several critical scenarios:

  • GPU-based accelerated computing workloads: Gaining uninterrupted access to accelerated computing (GPU) instances is vital for AI/ML teams conducting training, fine-tuning, or inference workloads. Capacity reservation helps eliminate the risk of compute unavailability for these time-sensitive, resource-intensive tasks.
  • Planned scaling events: Scaling infrastructure confidently for planned business events—such as peak traffic seasons, major product launches, or scheduled batch processing—without provisioning delays.
  • High availability and disaster recovery: Enhancing resiliency by guaranteeing capacity when deploying workloads across multiple AZs or executing disaster recovery protocols across regions.

Amazon EC2 Capacity Reservations allow you to reserve compute capacity for your Amazon EC2 instances in a specific AZ for any duration. Capacity Blocks for ML allow you to reserve GPU-based accelerated computing instances starting on a future date to support your short-duration ML workloads. With support for Capacity Reservations in clusters with hosted control planes (HCP), platform administrators can now create ROSA machine pools in their cluster that directly consume the capacity already reserved with AWS.
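
As a rough illustration of the AWS side of this workflow, the following minimal sketch uses boto3 (the AWS SDK for Python) to create a targeted ODCR. The region, instance type, availability zone, and instance count are placeholder assumptions and should match the attributes of the machine pool you plan to create.

    # Minimal sketch: create a targeted On-Demand Capacity Reservation with boto3.
    # Instance type, AZ, and count are placeholders; align them with the ROSA
    # machine pool (subnet AZ, node replicas, instance type) you intend to create.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

    response = ec2.create_capacity_reservation(
        InstanceType="p4d.24xlarge",        # assumed GPU instance type
        InstancePlatform="Linux/UNIX",
        AvailabilityZone="us-east-1a",      # must match the machine pool's subnet AZ
        InstanceCount=2,                    # cover the planned number of node replicas
        InstanceMatchCriteria="targeted",   # only workloads that target this reservation consume it
        EndDateType="unlimited",            # billed at on-demand rates until cancelled
    )

    print(response["CapacityReservation"]["CapacityReservationId"])

Capacity Blocks for ML are purchased differently (you select an offering for a future date range), but AWS delivers them as Capacity Reservations, so the resulting reservation is consumed by a machine pool in the same way.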

Key best practices for effectively leveraging Capacity Reservations with ROSA:

  1. Pre-planning of AZs, instance types, and capacity: Before creating a machine pool, ensure a precise match between the reserved capacity and the ROSA machine pool attributes, including the VPC subnets, the number of node replicas, and the instance type. When reserving capacity for a future date, carefully balance the relative costs of purchasing capacity across different AZs against technical considerations such as VPC subnet size, available IPs, and node replica requirements. You must wait until the AWS Capacity Reservation status is active before provisioning ROSA machine pools that use it (see the first sketch after this list for a simple status check).
  2. Informed decision on instance matching criteria: AWS provides two types of instance matching criteria for ODCRs: open and targeted. Choose a strategy based on your workload distribution. If you run multiple workloads across different services and intend to reserve capacity exclusively for your ROSA clusters, the targeted matching criteria is strongly recommended (the reservation sketch earlier in this post uses it). Remember that ODCRs operate on a ‘use it or lose it’ principle: they are billed at on-demand rates regardless of utilization.
  3. Precise control over reserved capacity consumption: ROSA offers flexible controls to define how workloads utilize EC2 instances across on-demand capacity and Capacity Reservations. For example, you can decide whether a machine pool should fall back to on-demand instances or fail when the configured Capacity Reservation is exhausted.
  4. Centralized management and allocation of purchases: For organizations managing multiple AWS accounts, the ability to centralize the purchase of ODCRs and allocate them across member accounts with AWS Resource Access Manager is a significant benefit. ROSA fully supports Capacity Reservations that are shared with the AWS account where the cluster is created, simplifying financial management and ensuring all teams benefit from reserved capacity.
  5. Proactive monitoring of Capacity Reservation utilization: Given that multiple workloads or accounts may share reservations, it's crucial to monitor Capacity Reservation utilization continuously. Cluster-specific utilization can fluctuate widely over time. Proactively planning for conditions such as the exhaustion of reserved capacity can prevent a ROSA cluster node from becoming unavailable for critical workloads (a monitoring sketch follows this list).
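
As mentioned in the first practice above, a reservation should be in the active state before any machine pool targets it. The following minimal sketch, assuming boto3 and a placeholder reservation ID, shows one way to verify that.

    # Minimal sketch: verify a Capacity Reservation is active before creating a
    # machine pool that targets it. The reservation ID below is a placeholder.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region
    reservation_id = "cr-0123456789abcdef0"             # placeholder ID

    resp = ec2.describe_capacity_reservations(CapacityReservationIds=[reservation_id])
    state = resp["CapacityReservations"][0]["State"]

    if state != "active":
        raise RuntimeError(f"{reservation_id} is not active yet (state: {state})")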
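
For the monitoring practice, a lightweight approach is to compare each reservation's total and available instance counts, which the EC2 API reports directly. This sketch is illustrative only; in practice you would feed these numbers into your existing monitoring or alerting stack.

    # Minimal sketch: report utilization for each Capacity Reservation in a region
    # so exhaustion can be spotted before a machine pool fails to add nodes.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")  # assumed region

    for cr in ec2.describe_capacity_reservations()["CapacityReservations"]:
        total = cr["TotalInstanceCount"]
        available = cr["AvailableInstanceCount"]
        used = total - available
        print(f"{cr['CapacityReservationId']} ({cr['InstanceType']}): {used}/{total} in use")
        if available == 0:
            print("  WARNING: reservation exhausted; new nodes may fall back to on-demand or fail")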

To learn more about how to purchase Capacity Reservations and Capacity Blocks for ML, read the AWS documentation. To learn more about managing machine pools and setting capacity preferences in your ROSA cluster, read the Managing Nodes chapter in the ROSA documentation.

To get started with ROSA, visit the ROSA product page.

About the authors

Bala Chandrasekaran is a Product Manager for Managed OpenShift Cloud Services. He has over 20 years of experience across cloud-native technologies, infrastructure, and data systems.

Brae Troutman is a Software Engineer supporting the ROSA HCP commercial and FedRAMP offerings. He is in his first five years of working with cloud platforms as a service, with a particular focus on declarative configuration management, durable microservice approaches to cloud services, and continuous learning in his field.
