Commit 2e00d019, authored Nov 22, 2011 by Cyrill Gorcunov
docs: Add internals details
Signed-off-by: Cyrill Gorcunov <gorcunov@gmail.com>
parent 60c9235f
Showing 1 changed file with 145 additions and 0 deletions
INTERNALS
0 → 100644
crtools internals
=================
What CRtools is
---------------
In short -- crtools is a utility to checkpoint/restore (CR) processes. Unlike CR
implemented completely in kernel space, it tries to achieve the same goal by operating
in user space.
Since this tool (and the overall concept) is under heavy development, there are
some known limitations
- Only a pure x86-64 environment is supported, no IA32 emulation.
- There is no way to use the cgroups freezer facility.
- No network or IPC CR is supported.
At the moment CR of the following resources is supported
- Process tree
- Files (with some limitations)
- Pipes
- Memory
Basic design
------------
Checkpoint
~~~~~~~~~~
The checkpoint procedure relies on the /proc file system (the general place
where crtools gets all the information it needs), which includes
- File descriptors (via /proc/$pid/fd and /proc/$pid/fdinfo).
- Pipe parameters.
- Memory maps (via /proc/$pid/maps).
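As an illustration of the memory-map item above, here is a minimal sketch of parsing /proc/$pid/maps. It is written in Python for brevity and is not taken from crtools; the names `Vma`, `parse_maps_line` and `read_maps` are hypothetical. It assumes the usual line layout `start-end perms offset dev inode [path]`.

```python
from typing import List, NamedTuple


class Vma(NamedTuple):
    """One virtual memory area (hypothetical record type, not a crtools structure)."""
    start: int
    end: int
    perms: str
    offset: int
    path: str


def parse_maps_line(line: str) -> Vma:
    """Parse one line of /proc/$pid/maps into a Vma record."""
    # Assumed layout: start-end perms offset dev inode [path]
    fields = line.split(None, 5)
    start_s, end_s = fields[0].split("-")
    # Anonymous mappings have no path column.
    path = fields[5].strip() if len(fields) > 5 else ""
    return Vma(int(start_s, 16), int(end_s, 16), fields[1], int(fields[2], 16), path)


def read_maps(pid) -> List[Vma]:
    """Read and parse all VMAs of a process (pid may be an int or "self")."""
    with open(f"/proc/{pid}/maps") as f:
        return [parse_maps_line(line) for line in f if line.strip()]
```

Executable areas, which matter later for parasite injection, could then be selected with something like `[v for v in read_maps(pid) if "x" in v.perms]`.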
The process dumper (let's call it "dumper") performs the following steps during
the checkpoint stage
- A $pid of a process group leader is obtained from the command line.
- Using this $pid, the dumper walks through /proc/$pid/status and gathers
  children $pid's recursively. At the end we have a complete process tree.
- Then it takes every $pid from the process tree, sends SIGSTOP to every process
  found and performs the following steps on each $pid
  - Collects VMA areas by parsing /proc/$pid/maps.
  - Seizes the task via the relatively new ptrace interface. Seizing a task means
    putting it into a special state in which the task has no idea that it is being
    operated on via ptrace.
  - Core parameters of the task (such as registers and friends) are dumped via
    the ptrace interface and by parsing the /proc/$pid/stat entry.
  - The dumper injects parasite code into the task via the ptrace interface. This
    allows us to dump pages of the task right from within the task's address space.
    The injection procedure is pretty simple
    - The dumper scans executable VMA areas of the task (which were previously
      collected) and tests if there is a place for a few instructions.
    - Then (by ptrace as well) it substitutes the original code with new instructions
      and creates a new VMA area inside the process address space.
    - Finally the parasite code gets copied into the new VMA, and the original code
      that was modified during the parasite bootstrap procedure is restored.
  - Then the dumper flushes the contents of the task's pages to a file and drops
    the parasite code block completely, since we don't need it anymore.
  - Once the parasite code is removed, the task is unseized via a ptrace call but
    still remains stopped.
  - The dumper writes out parameters of opened files and pipes (flushing data to disk
    if needed).
- SIGCONT is sent to every task in the process tree (to continue execution).
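The recursive gathering of children in the steps above can be sketched as follows. This is an illustrative Python sketch, not the crtools implementation. It assumes that the "Children:" line carries a whitespace-separated list of pids, which is a guess at the format. The helper names `parse_children` and `collect_tree` are hypothetical, and `read_status` is an injected callable so the walk can be tried without a patched kernel.

```python
from typing import Callable, List


def parse_children(status_text: str) -> List[int]:
    """Return child pids from a "Children:" line, or [] if the line is absent."""
    for line in status_text.splitlines():
        if line.startswith("Children:"):
            # Assumed format: "Children:" followed by whitespace-separated pids.
            return [int(tok) for tok in line.split(":", 1)[1].split()]
    return []


def collect_tree(root_pid: int, read_status: Callable[[int], str]) -> List[int]:
    """Walk the process tree depth-first, starting from root_pid.

    read_status(pid) must return the text of that pid's status file.
    """
    tree = []
    stack = [root_pid]
    while stack:
        pid = stack.pop()
        tree.append(pid)
        # Push in reverse so children are visited in the order they are listed.
        stack.extend(reversed(parse_children(read_status(pid))))
    return tree
```

On a live system `read_status` could be `lambda pid: open(f"/proc/{pid}/status").read()`. On a kernel that does not export the "Children:" line the walk simply returns the root pid alone.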
Restore
~~~~~~~
The restore procedure (aka restorer) proceeds by the following steps
- The process tree is read from a file.
- To restore the process tree the restorer executes the clone(CLONE_CHILD_USEPID)
  syscall, which creates a process with the $pid specified. Note that if for some
  reason a process with the same $pid is already up and running, the restoration
  procedure will refuse to proceed.
- Files and pipes are restored, i.e. opened with the file descriptors they had at
  checkpoint time and positioned exactly as they were before. If a pipe had some
  data buffered before checkpoint, the data is sent back to the pipe.
- Restoration of virtual memory (and memory pages) is a bit tricky and is implemented
  by the following steps
  - The restorer analyzes the current VMA map by parsing the /proc/$pid/maps file.
  - Since we are to create a completely new memory map, the restorer enumerates
    all VMA entries and looks for a place (or hole) between VMAs that is big
    enough to hold all code and parameters needed for the rest of the restore
    procedure.
  - Once such an area is found, the restorer copies its own code and data there.
  - Then the restorer passes execution there, where the relocated code in turn
    - Unmaps the currently active VMAs and maps the areas the process had at
      checkpoint time.
    - Reads page contents back into the newly mapped memory.
    - Prepares an rt-sigreturn frame on the stack and issues the __NR_rt_sigreturn
      syscall, so that the process resumes execution from the IP it had at
      checkpoint time.
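The search for a hole between VMAs described above can be sketched as a first-fit scan over the address ranges. This is a minimal Python illustration under stated assumptions, not the restorer's actual code. The function name `find_hole` and the default `low`/`high` address bounds are hypothetical. The sketch only considers the VMAs it is given, whereas a real restorer may need to respect further constraints.

```python
from typing import Iterable, Optional, Tuple


def find_hole(vmas: Iterable[Tuple[int, int]], size: int,
              low: int = 0x10000, high: int = 0x7FFFFFFFF000) -> Optional[int]:
    """Return the start of the first gap of at least `size` bytes, or None.

    `vmas` holds (start, end) address pairs with `end` exclusive.
    `low` and `high` bound the search range (assumed values, for illustration).
    """
    prev_end = low
    for start, end in sorted(vmas):
        # The gap between the previous area and this one is [prev_end, start).
        if start - prev_end >= size:
            return prev_end
        prev_end = max(prev_end, end)
    # Finally consider the space after the last area.
    if high - prev_end >= size:
        return prev_end
    return None
```

First fit keeps the sketch short. Any gap that is large enough would serve, since the relocated code only has to stay out of the way while the old mappings are replaced.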
Kernel area
-----------
While CR is implemented in user space, some help from the Linux kernel
is still needed, so the following patches are required
- New directory /proc/$pid/map_files, which allows the CR to find and restore
anonymous shared memory areas.
- An explicit "Children:" line added to the /proc/$pid/status file. This simplifies
  the code significantly (the kernel already has this information, it is simply not
  exported yet).
- The ability to call clone() with a specified $pid.
- start_data, end_data and a few more members of mm_struct.
  - Export added to /proc/$pid/stat.
  - Import implemented via new prctl codes.
- The ability to map the vDSO at a predefined address (implemented via
  a new prctl code as well).
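To illustrate the export side of the start_data/end_data item, the following Python sketch extracts both values from the text of /proc/$pid/stat. It assumes the field positions documented in proc(5) for mainline kernels (fields 45 and 46); the patches described here may have placed them differently. The function name `parse_stat_data_range` is hypothetical.

```python
from typing import Tuple

# Field numbers as documented in proc(5) for mainline kernels (an assumption here).
START_DATA_FIELD = 45
END_DATA_FIELD = 46


def parse_stat_data_range(stat_text: str) -> Tuple[int, int]:
    """Return (start_data, end_data) from the text of /proc/$pid/stat."""
    # Field 2 (comm) may contain spaces and parentheses, so split after the last ')'.
    rest = stat_text[stat_text.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field N lives at rest[N - 3].
    if len(rest) <= END_DATA_FIELD - 3:
        raise ValueError("stat line has no start_data/end_data fields")
    return int(rest[START_DATA_FIELD - 3]), int(rest[END_DATA_FIELD - 3])
```

On a kernel that exports these fields, `parse_stat_data_range(open("/proc/self/stat").read())` would return the bounds of the calling process's data segment.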