Kubernetes Scheduling with Checkpoint/Restore: Challenges and Open Problems

Warning

This publication does not fall under the Institute of Computer Science but under the Faculty of Informatics. The official publication page is on the muni.cz website.
Authors

SPIŠAKOVÁ Viktória, STOYANOV Radostin, HEJTMÁNEK Lukáš, KLUSÁČEK Dalibor, REBER Adrian, BRUNO Rodrigo

Year of publication 2026
Type Article in proceedings
Conference Job Scheduling Strategies for Parallel Processing
MU Faculty or unit

Faculty of Informatics

Citation
DOI https://doi.org/10.1007/978-3-032-10507-3_3
Keywords Checkpoint and Restore; Kubernetes; Containers; Resource Management; Scheduling
Description Efficient resource management and scheduling have been persistent challenges since the early days of computing and remain critical to this day. The widespread adoption of containers managed by orchestrators like Kubernetes has introduced new dimensions to this challenge. Despite the lightweight nature and minimal overhead of containers, they still suffer from utilization inefficiencies due to overprovisioning. Existing scheduling techniques are not enough to meet these demands, and there is a growing need for orchestration and scheduling policies that support advanced preemption, migration, and fault tolerance. Well-established container checkpoint/restore (C/R) mechanisms, implemented through tools like CRIU, offer a promising solution for improving resource scheduling efficiency. However, these mechanisms remain only partially integrated with platforms like Kubernetes. In this paper, we explore the use cases for general C/R, examine the current state, and delve into the open problems and challenges associated with native integration into Kubernetes. We propose potential solutions to these challenges, offering a pathway towards more efficient resource management to better meet the needs of today's computational landscape. While scheduling efficiency is considered critical in HPC clusters, serverless and deep learning platforms also benefit directly from these optimizations.
