April 21, 2006
Checkpointing, CHPOX, and Linux
Saving program state--even with power on/offs!
Mulyadi Santosa and Eugeniy Meshcheryakov
Checkpointing is a mechanism that lets you grab a snapshot of the current state of a running program and save it to disk. Later, the program can be resumed from that saved state, even after the machine has been rebooted or turned off.
Why do you need checkpointing? Consider these scenarios:
All of these scenarios (and there are many more) lead to one thing--you need a checkpointing solution so you won't lose a task's progress. Even if the task has a native save/resume feature, checkpointing still helps as an alternative in case a rollback using the native feature fails. Keep in mind that neither feature guarantees a flawless resume or supports every type of task.
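To make the idea of a "native save/resume feature" concrete, here is a minimal Python sketch of a task that checkpoints its own progress. It is an illustration only: the names save_checkpoint, load_checkpoint, and run_task, the JSON file format, and the stop_after parameter (which simulates an interruption) are all hypothetical and are not part of CHPOX. A transparent checkpointer such as CHPOX works differently, saving the process from outside without the program's cooperation.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write the state dictionary to `path` as JSON, replacing any old copy."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so a crash mid-write cannot
    # leave a half-written checkpoint behind.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as tmp:
        json.dump(state, tmp)
        tmp.flush()
        os.fsync(tmp.fileno())
    os.replace(tmp_path, path)  # atomic rename over the old checkpoint


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_i": 0, "total": 0}


def run_task(path, limit, stop_after=None):
    """Sum the integers 0..limit-1, checkpointing after every step.

    `stop_after` (hypothetical, for demonstration) simulates an
    interruption after that many steps in this run; the function then
    returns None instead of the final total.
    """
    state = load_checkpoint(path)
    steps = 0
    while state["next_i"] < limit:
        if stop_after is not None and steps >= stop_after:
            return None  # simulated power-off or crash
        state["total"] += state["next_i"]
        state["next_i"] += 1
        save_checkpoint(path, state)
        steps += 1
    return state["total"]
```

Calling run_task a second time with the same checkpoint file picks up where the interrupted run stopped rather than starting from zero. The cost of this approach is that every program must implement it for itself, which is the gap a transparent checkpointer aims to fill.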
In this article, we examine an open-source checkpointing project called CHPOX, short for "CHeckPOinter for Linux". CHPOX is a Linux kernel module for transparently dumping specified processes to a disk file, then restarting them. CHPOX was created by Olexander Sudakov and Eugeniy Meshcheryakov at the National Taras Shevchenko University, Kiev (Ukraine). Because CHPOX is implemented as a kernel module, it works with a supported kernel version (2.4.x) and can be dynamically inserted into or removed from kernel space, so it is loaded only when you need it.
For more information on checkpointing tools, go to www.checkpointing.org.