- 19 May, 2017 5 commits
-
-
Pavel Emelyanov authored
When running 'crit x dir rss' one will see the way pagemap chunks are scattered across the VMs of processes.

Sample output from the env00 zdtm test is

22
    400000 / 1           00400000 / 5      /root/criu/test/zdtm/static/env00
    604000 / 2           00604000 / 1      /root/criu/test/zdtm/static/env00
                         00605000 / 1      /root/criu/test/zdtm/static/env00
    853000 / 1           00853000 / 33
    7faba2d4b000 / 6     7faba2d4b000 / 4  /usr/lib64/libc-2.22.so
                         7faba2d4f000 / 2  /usr/lib64/libc-2.22.so
    7faba2d51000 / 2     7faba2d51000 / 4
    7faba2d54000 / 1     ~
    7faba2f64000 / 3     7faba2f64000 / 3
    7faba2f74000 / 1     7faba2f74000 / 1
    7faba2f75000 / 2     7faba2f75000 / 1  /usr/lib64/ld-2.22.so
                         7faba2f76000 / 1  /usr/lib64/ld-2.22.so
    7faba2f77000 / 1     7faba2f77000 / 1
    7fffb4de3000 / 3     7fffb4de2000 / 70
    7fffb4e24000 / 2     ~
    7fffb4e27000 / 1     ~
    7fffb4f6a000 / 2     7fffb4f6a000 / 2

Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
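The matching behind such a listing can be sketched in C. This is only an illustration, assuming each row of the sample pairs a pagemap chunk (address / pages) with the VMAs it touches, and assuming 4 KiB pages; `range_sketch` and `vmas_for_chunk()` are hypothetical names, not crit or CRIU code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12 /* assumption: 4 KiB pages */

/* Hypothetical, simplified stand-in for a VMA or a pagemap entry. */
struct range_sketch {
	uint64_t start;    /* first address of the range */
	uint64_t nr_pages; /* length in pages */
};

static uint64_t range_end(const struct range_sketch *r)
{
	return r->start + (r->nr_pages << PAGE_SHIFT_SKETCH);
}

/*
 * Count the VMAs that overlap one pagemap chunk. If at least one
 * overlaps, the index of the first such VMA is stored in *first.
 */
static size_t vmas_for_chunk(const struct range_sketch *vmas, size_t nr_vmas,
			     const struct range_sketch *chunk, size_t *first)
{
	size_t i, count = 0;

	assert(vmas && chunk && first);

	for (i = 0; i < nr_vmas; i++) {
		/* Two half-open ranges overlap iff each starts before the other ends. */
		if (vmas[i].start < range_end(chunk) &&
		    chunk->start < range_end(&vmas[i])) {
			if (count == 0)
				*first = i;
			count++;
		}
	}
	return count;
}
```

With the first VMAs of the sample (00400000 / 5, 00604000 / 1, 00605000 / 1, 00853000 / 33), the chunk 604000 / 2 overlaps two VMAs starting at index 1, which matches the two VMA rows printed for it.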
-
Dmitry Safonov authored
The same as for checkpointing: we need to call the 32-bit syscall for compatible tasks here to correctly restore compat_robust_list, not robust_list. Note: I check the restorer's *task* arg for compatible mode here, not the restorer's *thread* arg. Changing an application's mode at runtime is very rare in itself, so an application that runs threads of different bitness most likely does not exist at all. If we ever meet such an application, this could be improved. Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
The kernel keeps two different pointers for 32-bit and 64-bit futex lists in task_struct: robust_list and compat_robust_list. So, dump compat_robust_list for ia32 tasks. Note: this means that one can set *both* the compat_robust_list and robust_list pointers by using both the 32-bit and the 64-bit syscalls. That's one of the mixed-bitness application questions. For simplicity (and to avoid extra syscalls), we dump only one of the pointers here. Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
The check for the presence of a robust futex list is in reality (futex_rla_len != 0). That is what the dumping code does, in get_task_futex_robust_list():

> ret = syscall(SYS_get_robust_list, pid, &head, &len);
> if (ret < 0 && errno == ENOSYS) {
[..]
>         len = 0;
[..]
> }
[..]
> info->futex_rla_len = (u32)len;

And in images, futex_rla_len == 0 means that the futex list is not present. So we don't need the additional restorer parameter `has_futex', which is always true; remove it.

Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
Cleanup: use nr, provided by compel. Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
- 10 May, 2017 35 commits
-
-
Pavel Emelyanov authored
Module pre-loading is also slow, but guarding this code with the presence of the criu.kdat cache file seems reasonable. Of course, one may unload the needed modules by hand, but such a smart user may as well remove the /run/criu.kdat file :) Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
Doing kerndat checks on every criu start is way too slow. We need some way to speed these checks up on a particular box. As suggested by Andre, Dima and Mike, let's try to keep the collected kdat bits in some tmpfs file. Keeping it on tmpfs would invalidate this cache on every machine reboot.

There have been many suggestions on how to generate this file; my proposal is to create it once the kdat object is filled by criu, w/o any explicit command. Optionally we can add 'criu kdat --save|--drop' actions to manage this file.

v2:
 * don't ignore return code of write() (some glibcs complain)
 * unlink tmp file in case rename failed

v3:
 * add one more magic into kerndat_s which is the 'date +%s'
 * ignore any errors opening or saving cache. Only size/magic mismatch matters (and results in dropping the cache)
 * cache file path is Makefile-configurable (RUNDIR)
 * don't save cache if kerndat auto-detection failed

v4:
 * Use ?= for RUNDIR definition.

Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
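The caching rules listed in the v2 and v3 notes can be sketched roughly as follows. This is a minimal illustration and not the CRIU implementation: the struct layout, both magic values, and the `kdat_save()`/`kdat_load()` names are assumptions made up for the example.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical cache layout; the fields and magic values are made up. */
#define KDAT_MAGIC_SKETCH  0x57023458u
#define KDAT_MAGIC2_SKETCH 1494374400u /* stands in for a build-time `date +%s` */

struct kdat_cache_sketch {
	uint32_t magic;
	uint32_t magic2;
	uint32_t has_uffd; /* example payload */
};

/* Save via a temp file plus rename(), so readers never see a partial cache. */
static int kdat_save(const char *path, const struct kdat_cache_sketch *kd)
{
	char tmp[4096];
	ssize_t ret;
	int fd, n;

	assert(path && kd);

	n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
	if (n < 0 || (size_t)n >= sizeof(tmp))
		return -1;

	fd = open(tmp, O_CREAT | O_TRUNC | O_WRONLY, 0600);
	if (fd < 0)
		return -1;

	ret = write(fd, kd, sizeof(*kd)); /* the return code is checked */
	close(fd);

	if (ret != (ssize_t)sizeof(*kd) || rename(tmp, path) < 0) {
		unlink(tmp); /* don't leave the temp file behind on failure */
		return -1;
	}
	return 0;
}

/* Returns 1 if a valid cache was loaded, 0 if the caller must re-detect. */
static int kdat_load(const char *path, struct kdat_cache_sketch *kd)
{
	char buf[sizeof(*kd) + 1]; /* one spare byte exposes oversized files */
	struct kdat_cache_sketch tmp;
	ssize_t ret;
	int fd;

	assert(path && kd);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return 0; /* open errors are ignored */

	ret = read(fd, buf, sizeof(buf));
	close(fd);
	if (ret < 0)
		return 0; /* read errors are ignored too */

	if (ret == (ssize_t)sizeof(tmp)) {
		memcpy(&tmp, buf, sizeof(tmp));
		if (tmp.magic == KDAT_MAGIC_SKETCH &&
		    tmp.magic2 == KDAT_MAGIC2_SKETCH) {
			*kd = tmp;
			return 1;
		}
	}

	unlink(path); /* size or magic mismatch: drop the stale cache */
	return 0;
}
```

A caller would try `kdat_load()` first and fall back to full detection followed by `kdat_save()` when it returns 0, skipping the save if detection itself failed.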
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
v2: When uffd is present, the reported features may still be 0, so we need one more bool for uffd syscall itself. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
Introduce 3-state mode and check them always. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Kirill Tkhai authored
Some get_status() methods may allocate data, because not all fields in the /proc/[pid]/status file have a fixed size. One example is NSpid, whose size may vary. Introduce a new free_status() method as a counterpart for such get_status() methods. It will be called when we go to try_again and need to free the allocated data. Also, introduce a data parameter for future use. Signed-off-by:
Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
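The pairing of get_status() with free_status() on the retry path might look like the sketch below. Everything in it is hypothetical: the struct, the callback signatures, the `stable_after` retry condition, and the counters passed through `data` are invented for illustration and do not mirror the real CRIU interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status record with one variable-size field, like NSpid. */
struct status_sketch {
	int nr_nspid;
	int *nspid; /* heap-allocated because its size varies */
};

/* Hypothetical bookkeeping passed through the opaque data parameter. */
struct counters_sketch {
	int gets;
	int frees;
};

typedef int (*get_status_fn)(int pid, struct status_sketch *st, void *data);
typedef void (*free_status_fn)(int pid, struct status_sketch *st, void *data);

static int get_status_sketch(int pid, struct status_sketch *st, void *data)
{
	struct counters_sketch *c = data;

	st->nspid = calloc(2, sizeof(*st->nspid));
	if (!st->nspid)
		return -1;
	st->nr_nspid = 2;
	st->nspid[0] = pid;
	st->nspid[1] = 1;
	c->gets++;
	return 0;
}

static void free_status_sketch(int pid, struct status_sketch *st, void *data)
{
	struct counters_sketch *c = data;

	(void)pid;
	free(st->nspid);
	st->nspid = NULL;
	st->nr_nspid = 0;
	c->frees++;
}

/*
 * Read the status until the (made-up) stability condition holds.
 * Before every retry, the previous allocation is released through
 * the free callback, so repeated attempts do not leak.
 */
static int read_status_with_retry(int pid, get_status_fn get,
				  free_status_fn put, void *data,
				  int stable_after, struct status_sketch *st)
{
	int attempt;

	assert(get && st && stable_after > 0);

	for (attempt = 1;; attempt++) {
		if (get(pid, st, data) < 0)
			return -1;
		if (attempt >= stable_after)
			return 0;
		if (put)
			put(pid, st, data); /* the "try again" path */
	}
}
```

On success the caller owns the last allocation and releases it with the same free callback.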
-
Kirill Tkhai authored
The goal of this function is to compare everything except caps, but the size of caps is taken as the comparison length. That's wrong; offsetof(struct proc_status_creds, cap_inh) must be used instead. Also, sigpnd may differ too. v3: Move excluding sigpnd from the comparison into this patch (it was in another patch). Reorder fields in seize_task_status(). Signed-off-by:
Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
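The offsetof() idea can be illustrated with a small C sketch. The struct below is a hypothetical stand-in, not the real proc_status_creds layout; only the position of cap_inh after the compared fields matters here, and the sigpnd exclusion is not modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; the real struct has different fields. */
struct creds_sketch {
	unsigned int uids[4];
	unsigned int gids[4];
	/* Everything from cap_inh on is excluded from the comparison. */
	uint32_t cap_inh[2];
	uint32_t cap_prm[2];
	uint32_t cap_eff[2];
	uint32_t cap_bnd[2];
};

/*
 * Compare only the bytes that precede cap_inh. Using the size of the
 * caps as the length would compare the wrong number of bytes.
 * Caveat: memcmp() also compares padding, so this is only safe when
 * the compared prefix has none (true for this sketch).
 */
static int creds_equal_except_caps(const struct creds_sketch *a,
				   const struct creds_sketch *b)
{
	assert(a && b);
	return memcmp(a, b, offsetof(struct creds_sketch, cap_inh)) == 0;
}
```

Two records that differ only in a capability field compare equal, while a difference in uids or gids is still detected.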
-
Kir Kolyshkin authored
Having a "header library" is nice if it's small and clean, but
 - we compile its code a few times;
 - there is no distinction between internal and external functions.

Let's separate functions out of header and into a .c file. Signed-off-by:
Kir Kolyshkin <kir@openvz.org> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Kir Kolyshkin authored
The function is not included into the library, so having its prototype there was a shortcut. Move it to a separate include file. Signed-off-by:
Kir Kolyshkin <kir@openvz.org> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Kir Kolyshkin authored
This is an auxiliary source file. The corresponding object file was cleaned, but .d was not. Add it to SRC/OBJ/DEP so the appropriate files will be cleaned automatically. Signed-off-by:
Kir Kolyshkin <kir@openvz.org> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Kir Kolyshkin authored
MAKEFLAGS += -r only works for sub-makes; it does not apply to the current instance. Since the previous commit, make no longer re-runs itself (after re-reading deps files), so the MAKEFLAGS approach no longer works. Use one more way to disable the built-in rules that stand in our way. Signed-off-by:
Kir Kolyshkin <kir@openvz.org> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Kir Kolyshkin authored
As it was pointed out by our esteemed maintainer (let his light shine), after my recent changes to test/zdtm Makefiles all dependencies are regenerated even if we only need to build a single test (for example, cd test/zdtm/static && make env00). This was caused by the "-include $(DEP)" statement. Make sees that these files need to be included but are missing, and since it knows how to generate them, it goes on to do so. The solution is to use the $(wildcard) function, which returns the list of _existing_ files, so include will only receive the files that exist. Reported-by:
Andrei Vagin <avagin@virtuozzo.com> Signed-off-by:
Kir Kolyshkin <kir@openvz.org> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Kir Kolyshkin authored
inotify_system_nodel.c is supposed to be a symlink to inotify_system.c, but somehow the file was committed. This, together with the statement in Makefile to recreate the file, led to replacing the file with a symlink during make. Remove the file, add the symlink, and remove the Makefile rule. PS yes, I have checked the files are identical. Signed-off-by:
Kir Kolyshkin <kir@openvz.org> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
When we don't restore any namespaces, criu forces tasks to wake it up twice, only for it to do nothing and wake the tasks back up. This can be optimized by simply omitting the unneeded wakeups. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
Here's why:

This stage is needed to make sure all tasks have appeared and did some actions (those called before restore_finish_stage()). Given this description, there's no need to involve the criu process in it; this stage is purely an inter-task sync point.

Taking into account that we already make the root task wait for others to complete forking (it calls restore_wait_other_tasks()), we may rework this stage so that it does not involve the criu process.

Here's how:

So the criu task starts the forking stage, then goes waiting for "inprogress tasks". The latter wait is purely about the nr_in_progress counter, thus there's no strict requirement that the stage remains the same by the time criu is woken up.

That said, the root task waits for other tasks to finish forking, does fini_restore_mntns() (already in the code), then switches the stage to the next one (RESTORE). Other tasks do the normal staging barrier. The criu task is not woken up, as nr_in_progress always remains >= 1.

The result is -2 context switches (from root task to criu and back), which gives us a good boost when restoring a single-task app.

Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
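The counter argument above can be modelled in a few lines of C. This is a deliberately simplified, single-threaded model that assumes C11 atomics in place of CRIU's futex helpers; `sync_sketch`, `finish_stage()` and `root_switch_stage()` are hypothetical names, and no real sleeping or waking happens.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum stage_sketch { ST_FORKING, ST_RESTORE };

/* Hypothetical shared state: the current stage and a countdown. */
struct sync_sketch {
	atomic_int stage;
	atomic_int nr_in_progress;
};

static void sync_init(struct sync_sketch *s, int stage, int participants)
{
	atomic_init(&s->stage, stage);
	atomic_init(&s->nr_in_progress, participants);
}

/*
 * A task reports that it is done with the current stage. Returns true
 * only when the counter reaches zero, which is the moment the waiting
 * criu process would be woken up.
 */
static bool finish_stage(struct sync_sketch *s)
{
	return atomic_fetch_sub(&s->nr_in_progress, 1) == 1;
}

/*
 * The root task does not decrement for the forking stage. It re-arms
 * the counter for the next stage and only then publishes the stage,
 * so the counter never drops to zero in between.
 */
static void root_switch_stage(struct sync_sketch *s, int next, int participants)
{
	assert(participants > 0);
	atomic_store(&s->nr_in_progress, participants);
	atomic_store(&s->stage, next);
}
```

With three tasks, the two non-root tasks bring the counter from 3 to 1 during forking, the root task switches to the next stage without touching zero, and criu is woken only after all three finish that stage.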
-
Pavel Emelyanov authored
Describe what the tasks do during restore and what the expectations at the sync points are. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
Looks like this separate stage is not needed. The scripts involved in ns restore are synchronized with existing stages like this:

    criu:                      root task:
    ROOT_TASK stage
                               <appear>
    "setup-ns" script
    PREPARE_NAMESPACES
                               prepare_namespace()
    "post-setup-ns" script
    FORKING
                               restore_task_mnt_ns()
                               <everything else>

which seems to be OK.

Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
This is what the root task does: it calls prepare_namespace(). Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
The stage name is what tasks do, not what criu waits for. When the first stage is started we want the root task to come up, rather than namespaces to get restored. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
Don't set futexes by hand; use the restore_switch_stage helpers explicitly (for code readability). Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
restore_switch_stage() already waits for tasks at the end, so there's no need for one more wait. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Andrei Vagin authored
It is used now to close descriptors of mount namespaces and will be used for network namespaces too. Acked-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Andrei Vagin authored
It is required for cases when we inject a fault into criu restore. In this case we execute "criu restore" and check that it fails, then we execute "criu restore" without a fault and check that it passes. If the first "criu restore" restores only some of the processes, the second criu can get the PID of one of the restored processes.

https://github.com/xemul/criu/issues/282

Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
We need a compat implementation of the restorer unmap, because after rt_sigreturn() the task is stopped in 32-bit code and the ptrace API doesn't allow setting the full x86_64 register set on an ia32 task. The generic restorer now has the x86-specific __export_unmap_compat() function, which isn't right. Clean the x86-related implementation out of the restorer. Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
We don't need __export_unmap_compat() in the restorer blob for !CONFIG_COMPAT. This preparation will allow moving the compatible unmap, which is written in x86 asm, from the generic restorer blob to arch/x86. Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
Each `make tags` resulted in running feature-tests. That's not needed for tags generation. Don't waste time on tests as we don't compile anything. Reported-by:
Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
Let's pretend that we're doing something ;-D FWIW: cleaning two lines in the test output:

    ========================== Run zdtm/static/tty03 in h ==========================
    make[2]: Nothing to be done for `default'.
    Start test
    make[2]: Nothing to be done for `default'.
    ./tty03 --pidfile=tty03.pid --outfile=tty03.out
    Run criu dump
    Run criu restore
    Send the 15 signal to 24
    Wait for zdtm/static/tty03(24) to die for 0.100000
    Removing dump/zdtm/static/tty03/24
    ========================= Test zdtm/static/tty03 PASS ==========================

Acked-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
Right now they all sit in a separate file. Since we don't support CLONE_SIGHAND (and don't plan to), it's much better to have them in core, all the more so because by the time we dump/restore sigacts, the core entry is already at hand. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Andrei Vagin authored
Some sysctl-s are optional, so criu has to skip them silently. Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
HOSTCFLAGS are populated with CFLAGS if they are set from the environment:

    CFLAGS=-O1 make

But it turns out that the ?= operator, which sets a variable iff it was previously unset, is a recursively expanded operator. This means that the value of HOSTCFLAGS is evaluated every time it's used, which goes wrong with the current flow in the Makefile:
 1. it assigns HOSTCFLAGS with CFLAGS from the environment, as recursive;
 2. it assigns target-related options, such as -march, to CFLAGS;
 3. HOSTCFLAGS are used (with the current code, this is where they are expanded from CFLAGS).

This results in target-related options being supplied when building host objects, which breaks cross-compilation. Fix by omitting recursive expansion for HOSTCFLAGS. We still need to keep $(WARNINGS) and $(DEFINES) in HOSTCFLAGS.

Link: https://lists.openvz.org/pipermail/criu/2017-April/037109.html
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Reported-by:
"Brinkmann, Harald" <Harald.Brinkmann@bst-international.com> Reviewed-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
DEFINES, LDARCH, VDSO in one place - visually simpler, more terse. Reviewed-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
Arch-specific options will be clearer without support checks. Reviewed-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Dmitry Safonov authored
It's defined in NMK, so don't redefine it. While at it, remove the duplicated exports and the duplicated definition of $(LDARCH). Call 64-bit ARM aarch64. Reviewed-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Dmitry Safonov <dsafonov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-