- 30 Oct, 2018 40 commits
-
-
Adrian Reber authored
Running 'criu dump -t <PID>' with a configuration file under valgrind, where <PID> does not exist, gives:

    ==14336== 600 bytes in 5 blocks are definitely lost in loss record 5 of 5
    ==14336==    at 0x4C29BC3: malloc (vg_replace_malloc.c:299)
    ==14336==    by 0x5D387A4: getdelim (in /usr/lib64/libc-2.17.so)
    ==14336==    by 0x439829: getline (stdio.h:117)
    ==14336==    by 0x439829: parse_config (config.c:69)
    ==14336==    by 0x439CB2: init_configuration.isra.1 (config.c:159)
    ==14336==    by 0x439F75: init_config (config.c:212)
    ==14336==    by 0x439F75: parse_options (config.c:487)
    ==14336==    by 0x42499F: main (crtools.c:140)
    ==14336== LEAK SUMMARY:
    ==14336==    definitely lost: 600 bytes in 5 blocks

With this patch:

    ==17892== LEAK SUMMARY:
    ==17892==    definitely lost: 0 bytes in 0 blocks

Signed-off-by:
Adrian Reber <areber@redhat.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
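
The valgrind trace above points at a buffer allocated by getline(). As a minimal sketch of the general pattern (this is not CRIU's parse_config(); the helper name count_config_lines is hypothetical), the buffer that getline() allocates has to be freed by the caller once the loop is done:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not CRIU's parse_config(): counts the non-empty
 * lines of an already opened stream.
 */
int count_config_lines(FILE *f)
{
	char *line = NULL;	/* getline() allocates and grows this buffer */
	size_t cap = 0;
	ssize_t len;
	int count = 0;

	assert(f != NULL);

	while ((len = getline(&line, &cap, f)) != -1) {
		if (len > 0 && line[0] != '\n')
			count++;
	}

	/* The buffer belongs to the caller, even when getline() failed. */
	free(line);
	return count;
}
```

Without the final free(), every call leaks one buffer, which matches the "definitely lost" pattern in the report.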
-
Andrei Vagin authored
Acked-by:
Adrian Reber <areber@redhat.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Andrei Vagin authored
It works like other external resources. A user specifies which namespaces are external and must not be dumped. On restore, the user passes file descriptors for the preconfigured namespaces.

How to use:
    dump:    --external net[INO]:KEY
    restore: --inherit-fd fd[NSFD]:KEY

The test script contains more details on how to use this: test/others/netns_ext/run.sh

Acked-by:
Adrian Reber <areber@redhat.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Andrei Vagin authored
We need to know which namespaces are external to restore them properly. Acked-by:
Adrian Reber <areber@redhat.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Tikhomirov authored
    1150 1371 0:169 / /zdtm/static/private_bind_propagation.test rw,relatime shared:920 - tmpfs zdtm_fs rw
    1151 1150 0:170 / /zdtm/static/private_bind_propagation.test/share1 rw,relatime shared:921 - tmpfs share rw
    1152 1150 0:170 / /zdtm/static/private_bind_propagation.test/share2 rw,relatime shared:921 - tmpfs share rw
    1153 1151 0:169 /source /zdtm/static/private_bind_propagation.test/share1/child rw,relatime - tmpfs zdtm_fs rw
    1154 1152 0:169 /source /zdtm/static/private_bind_propagation.test/share2/child rw,relatime - tmpfs zdtm_fs rw

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
-
Pavel Tikhomirov authored
We already check that (root, mountpoint) pairs are preserved; do the same for (root, mountpoint, shared, slave) tuples. Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Radostin Stoyanov authored
When CRIU is called for the first time and the /run/criu.kdat file does not exist, the following warning is shown:

    Warn (criu/kerndat.c:847): Can't load /run/criu.kdat

This patch replaces the warning with a more appropriate debug message:

    File /run/criu.kdat does not exist

Signed-off-by:
Radostin Stoyanov <rstoyanov1@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
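
The idea of treating a missing cache file as an expected condition can be sketched as follows. This is only an illustration, not the code in criu/kerndat.c: the helper name try_open_cache and the message prefixes are assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if the cache file could be opened,
 * 1 if it simply does not exist yet, and -1 on any other error.
 */
int try_open_cache(const char *path)
{
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0) {
		if (errno == ENOENT) {
			/* Expected on the first run, so report it quietly. */
			fprintf(stderr, "debug: File %s does not exist\n", path);
			return 1;
		}
		/* Anything else is still worth a warning. */
		fprintf(stderr, "warn: Can't load %s\n", path);
		return -1;
	}

	close(fd);
	return 0;
}
```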
-
Pavel Tikhomirov authored
If we fail to create a temporary directory for doing a clean mount, we can make the mount clean by reusing the code which enters a new mountns to umount overmounts. When the last process exits a mntns, all mounts are implicitly cleaned from children; see in the kernel source: sys_exit->do_exit->exit_task_namespaces->switch_task_namespaces->free_nsproxy->put_mnt_ns->umount_tree->drop_collected_mounts->umount_tree:

    /* Hide the mounts from mnt_mounts */
    list_for_each_entry(p, &tmp_list, mnt_list) {
        list_del_init(&p->mnt_child);
    }

Fixes commit b6cfb1ce2948 ("mount: make open_mountpoint handle overmouts properly")

https://github.com/checkpoint-restore/criu/issues/520

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Acked-by:
Adrian Reber <areber@redhat.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Tikhomirov authored
Create a tree of shared mounts where shared mounts have different sets of children (while having the same root):

First, share1 is mounted as a shared tmpfs. Second, share1/child1 is mounted inside. Third, share1 is bind-mounted to share2 (now share1 and share2 have the same shared id, but share2 has no child). Fourth, share1/child2 is bind-mounted from share1 and also propagated to share2/child2 (now all except share1/child1 have the same shared id). Fifth, share1/child3 is mounted and propagates inside the share.

Finally, we have four mounts shared between each other with different sets of children mounts, and moreover two of them are children of the other two:

    495 494 0:62 / /zdtm/static/non_uniform_share_propagation.test/share1 rw,relatime shared:235 - tmpfs share rw
    496 495 0:63 / /zdtm/static/non_uniform_share_propagation.test/share1/child1 rw,relatime shared:236 - tmpfs child1 rw
    497 494 0:62 / /zdtm/static/non_uniform_share_propagation.test/share2 rw,relatime shared:235 - tmpfs share rw
    498 495 0:62 / /zdtm/static/non_uniform_share_propagation.test/share1/child2 rw,relatime shared:235 - tmpfs share rw
    499 497 0:62 / /zdtm/static/non_uniform_share_propagation.test/share2/child2 rw,relatime shared:235 - tmpfs share rw
    500 495 0:64 / /zdtm/static/non_uniform_share_propagation.test/share1/child3 rw,relatime shared:237 - tmpfs child3 rw
    503 497 0:64 / /zdtm/static/non_uniform_share_propagation.test/share2/child3 rw,relatime shared:237 - tmpfs child3 rw
    502 499 0:64 / /zdtm/static/non_uniform_share_propagation.test/share2/child2/child3 rw,relatime shared:237 - tmpfs child3 rw
    501 498 0:64 / /zdtm/static/non_uniform_share_propagation.test/share1/child2/child3 rw,relatime shared:237 - tmpfs child3 rw

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
-
Pavel Tikhomirov authored
Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
-
Pavel Tikhomirov authored
This also fixes the false-propagation problem of a mount to itself if it is in the parent's share. Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
-
Pavel Tikhomirov authored
1) Redo waiting for parents of a propagation group to be mounted, using pre-found propagation groups.
2) For a shared mount, wait for children of that shared group which have no propagation in our shared mount.

(2) is effectively support for non-uniform shares: two mounts of a shared group can now have different sets of children, and we will mount them in the right order. However, propagate_mount and validate_shared still prevent c/r-ing such shares; the former will be fixed and the latter removed in separate (next) patches.

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
-
Pavel Tikhomirov authored
This information will help improve the restore of tricky mount configurations.

Function same_propagation_group checks if two mounts were created simultaneously through shared mount propagation; the main part of this check is that they should be in exactly the same place inside the share of their parents.

Function root_path_from_parent prints the mountpoint path relative to the root of the parent's share, by first subtracting the parent's mountpoint from our mountpoint and second prepending the parent's root path (relative to the root of its file system), e.g.:

    id  parent_id  root  mountpoint
    1   0          /     /
    2   1          /     /parent_a
    3   1          /dir  /parent_b
    4   2          /     /parent_a/dir/a
    5   3          /     /parent_b/a

(Let 2 and 3 be a shared group)

For mount 4 root_path_from_parent gives:
    "/parent_a/dir/a" - "/parent_a" == "/dir/a"
    "/" + "/dir/a" == "/dir/a"
For mount 5:
    "/parent_b/a" - "/parent_b" == "/a"
    "/dir" + "/a" == "/dir/a"

So mounts 4 and 5 are a propagation group.

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
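
The path arithmetic described in this commit message can be sketched in a few lines. This is a hedged illustration only, not the function from criu/mount.c: the name root_path_from_parent_sketch, its signature, and the assumption that paths are absolute without a trailing slash (except "/") are all made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for root_path_from_parent(): strip the parent's
 * mountpoint from ours, then prepend the parent's root.
 * Returns 0 on success, -1 if mp is not under parent_mp or out is too small.
 */
int root_path_from_parent_sketch(const char *mp, const char *parent_mp,
				 const char *parent_root,
				 char *out, size_t size)
{
	size_t plen;
	const char *rel;
	int n;

	assert(mp && parent_mp && parent_root && out);

	/* "/" is a prefix of every absolute path, so nothing is stripped. */
	plen = strcmp(parent_mp, "/") == 0 ? 0 : strlen(parent_mp);
	if (strncmp(mp, parent_mp, plen) != 0)
		return -1;

	rel = mp + plen;
	if (rel[0] != '/' && rel[0] != '\0')
		return -1;	/* "/parent_ab" is not under "/parent_a" */

	/* Avoid a doubled slash when the parent's root is "/". */
	if (strcmp(parent_root, "/") == 0)
		parent_root = "";

	n = snprintf(out, size, "%s%s", parent_root, rel);
	if (n < 0 || (size_t)n >= size)
		return -1;
	if (n == 0) {
		/* Both parts were empty: the result is the root itself. */
		if (size < 2)
			return -1;
		strcpy(out, "/");
	}
	return 0;
}
```

With the table from the message, mounts 4 and 5 both yield "/dir/a", which is the condition for treating them as one propagation group.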
-
Pavel Tikhomirov authored
    495 494 0:62 / /zdtm/static/shared_slave_mount_children.test/share rw,relatime shared:235 - tmpfs share rw
    496 494 0:62 / /zdtm/static/shared_slave_mount_children.test/slave1 rw,relatime shared:236 master:235 - tmpfs share rw
    497 494 0:62 / /zdtm/static/shared_slave_mount_children.test/slave2 rw,relatime shared:236 master:235 - tmpfs share rw
    498 496 0:63 / /zdtm/static/shared_slave_mount_children.test/slave1/child rw,relatime shared:237 - tmpfs child rw
    499 497 0:63 / /zdtm/static/shared_slave_mount_children.test/slave2/child rw,relatime shared:237 - tmpfs child rw

Before the fix we had:

    (00.167574) 1: Error (criu/mount.c:1769): mnt: A few mount points can't be mounted
    (00.167577) 1: Error (criu/mount.c:1773): mnt: 498:496 / /tmp/.criu.mntns.o2Op5j/9-0000000000/zdtm/static/shared_slave_mount_children.test/slave1/child child
    (00.167580) 1: Error (criu/mount.c:1773): mnt: 497:494 / /tmp/.criu.mntns.o2Op5j/9-0000000000/zdtm/static/shared_slave_mount_children.test/slave2 share

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
-
Pavel Tikhomirov authored
We should not use the ->bind link for checking the master's children: if we have two slaves shared between each other, the one mounted first will replace the ->bind link for the other, and that will break restore.

While at it: if we do not want doubled mounts and want to prohibit propagation to slaves on restore, we likely want all children of the whole master's share mounted before the slave.

JFYI: Actually this restriction is very strict and some cases will fail to restore, for instance (hope nobody does so):

    mkdir /test
    mount -t tmpfs test /test
    mount --make-private /test
    mkdir /test/{share,slave}
    mount -t tmpfs share /test/share --make-shared
    mount --bind /test/share/ /test/slave/
    mount --make-slave /test/slave
    mount --make-shared /test/slave
    mkdir /test/share/slave
    mount --bind /test/slave/ /test/share/slave/

    cat /proc/self/mountinfo | grep test
    524 612 0:69 / /test rw,relatime - tmpfs test rw
    570 524 0:73 / /test/share rw,relatime shared:879 - tmpfs share rw
    571 524 0:73 / /test/slave rw,relatime shared:942 master:879 - tmpfs share rw
    602 570 0:73 / /test/share/slave rw,relatime shared:942 master:879 - tmpfs share rw
    603 571 0:73 / /test/slave/slave rw,relatime shared:943 master:942 - tmpfs share rw

Here 603 is a propagation of 602 from master 570 to slave 571, and it is the only way to get such a mount, as 571 and 602 are in one shared group now and all later mounts to them will propagate between them and create duplicated mounts. So to create a real 603 without dups we need to have /test/slave mounted before /test/share/slave, which contradicts the current assumption.

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
-
Pavel Tikhomirov authored
This test is not automatic because the behaviour changes after kernel v4.11; on older kernels we get a children collision:

      817 188 0:48 / /zdtm/static/unsupported_children_collision.test/share1 rw,relatime shared:942 - tmpfs share rw
    > 818 817 0:124 / /zdtm/static/unsupported_children_collision.test/share1/child rw,relatime shared:943 - tmpfs child1 rw
      819 188 0:48 / /zdtm/static/unsupported_children_collision.test/share2 rw,relatime shared:942 - tmpfs share rw
      820 819 0:125 / /zdtm/static/unsupported_children_collision.test/share2/child rw,relatime shared:944 - tmpfs child2 rw
    > 821 817 0:125 / /zdtm/static/unsupported_children_collision.test/share1/child rw,relatime shared:944 - tmpfs child2 rw

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
-
Pavel Tikhomirov authored
See the more detailed explanation in the in-code comment.

note: Before we remove validate_mounts (later in this patchset) we likely won't get to this check and will fail earlier, as having a children collision implies shared mounts with different sets of children.

note: From v4.11 and mainstream kernel commit 1064f874abc0 ("mnt: Tuck mounts under others instead of creating shadow/side mounts.") there will be no more mount collisions.

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
-
Adrian Reber authored
Signed-off-by:
Adrian Reber <areber@redhat.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
We use fdstore intensively, for example when handling bind-mounted sockets and ghost dgram sockets. The system limit for the per-socket queue may not be enough if someone generates lots of ghost sockets (150 and more has been detected on default Fedora 27). To keep it operable, let's unlimit the fdstore queue size on startup. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Adrian Reber authored
Signed-off-by:
Adrian Reber <areber@redhat.com> Acked-by:
Radostin Stoyanov <rstoyanov1@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Andrei Vagin authored
Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
When we are dumping an epoll and one of the target fds has been duped, we can reuse the already collected fds rbtree to find the proper target. We handle it in a lazy way:

- try a plain regular bsearch first; if none of the targets are duped, we checkpoint the epoll immediately
- if bsearch failed, we put this epoll entry into a queue and run its dumping later, when all other files in the process are already dumped. At this moment the fds tree should already have all target files in the rbtree, thus we can simply look them up

Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
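
The fast path of the lazy scheme above could look roughly like the sketch below. It is a simplification under stated assumptions: the process's fds are modelled as a sorted int array, and the function name epoll_can_dump_now is hypothetical. The deferred path (the queue and the rbtree lookup) is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Three-way comparison of two ints for bsearch(). */
int cmp_int(const void *a, const void *b)
{
	int x = *(const int *)a;
	int y = *(const int *)b;

	return (x > y) - (x < y);
}

/*
 * Returns 1 if every epoll target fd is found in the sorted fd array
 * (the epoll can be dumped right away), or 0 if at least one lookup
 * fails and the epoll entry should be queued for a later pass.
 */
int epoll_can_dump_now(const int *sorted_fds, size_t nr_fds,
		       const int *tfds, size_t nr_tfds)
{
	size_t i;

	assert(sorted_fds && tfds);

	for (i = 0; i < nr_tfds; i++) {
		if (!bsearch(&tfds[i], sorted_fds, nr_fds,
			     sizeof(int), cmp_int))
			return 0;
	}
	return 1;
}
```

A caller would dump the epoll immediately on 1 and otherwise append it to a pending list that is processed after all other files are dumped.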
-
Cyrill Gorcunov authored
It is used in files tree generation, so we will need to reuse it for epoll's sake. Also use the whole 64-bit offset to shuffle bits more. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
To find target files with help of our collected rbtree. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
If we can't find the target file descriptor, we should exit the dump with an error instead of skipping it. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
We will use them for fast lookup of target files. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
To run epoll tests only where it is supported. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
For readability's sake. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
To figure out efd:tfd mapping easier by reading the logs. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
For easier fd match when reading logs Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
When a target file is obtained from epoll fdinfo (internally the kernel keeps only the file _number_), we have to check its identity to make sure it is exactly the one which has been added into the epoll engine. The only proper way is to use the kcmp syscall. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
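
For readers unfamiliar with kcmp, a minimal sketch of such an identity check follows. It calls the raw syscall with KCMP_FILE; the wrapper name same_open_file is hypothetical and this is not the helper CRIU uses. The call can fail, for example when the kernel is built without kcmp support or a sandbox blocks it, so the wrapper reports that separately.

```c
#include <assert.h>
#include <linux/kcmp.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 1 if fd1 in pid1 and fd2 in pid2 refer to the same open file,
 * 0 if they differ, and -1 if kcmp() itself failed (errno is set).
 */
int same_open_file(pid_t pid1, int fd1, pid_t pid2, int fd2)
{
	long ret;

	assert(fd1 >= 0 && fd2 >= 0);

	ret = syscall(SYS_kcmp, pid1, pid2, KCMP_FILE,
		      (unsigned long)fd1, (unsigned long)fd2);
	if (ret < 0)
		return -1;

	/* kcmp() returns 0 for "equal"; non-zero values mean "different". */
	return ret == 0;
}
```

A descriptor and its dup() compare as equal, while two independently opened files do not, even if they happen to reuse the same fd number over time.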
-
Cyrill Gorcunov authored
When we are checkpointing epoll targets, we assume that the target file belongs to the process we are currently dumping. This is of course not always true. Without kernel support the only thing we can do is compare fd numbers with the ones present in epoll fdinfo. When an fd number matches, we assume that it is indeed the file which has been added into the epoll. This won't cover the case when the file has been moved to some other number and a new one is opened in its place. Such a scenario will trigger a false positive and we can't do anything about it. In the next patches, with kernel help, we will make a precise check of file identity. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
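
The number-only comparison described above can be illustrated with a short sketch. It assumes the "tfd: ... events: ... data: ..." line layout of /proc/<pid>/fdinfo/<epoll-fd>; the function names parse_epoll_tfd and tfd_is_known are hypothetical and do not come from CRIU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Parse one "tfd:" line of an epoll fdinfo file.
 * Returns 0 on success, -1 if the line does not match.
 */
int parse_epoll_tfd(const char *line, int *tfd, unsigned int *events,
		    unsigned long long *data)
{
	assert(line && tfd && events && data);

	return sscanf(line, "tfd: %d events: %x data: %llx",
		      tfd, events, data) == 3 ? 0 : -1;
}

/* Number-only match: 1 if tfd appears among the process's fds, else 0. */
int tfd_is_known(int tfd, const int *fds, size_t nr_fds)
{
	size_t i;

	assert(fds != NULL);

	for (i = 0; i < nr_fds; i++)
		if (fds[i] == tfd)
			return 1;
	return 0;
}
```

As the message notes, a match by number alone cannot tell whether the fd still refers to the file that was originally added to the epoll.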
-
Cyrill Gorcunov authored
In epoll dumping we will need the whole set of fds to investigate the targets, so pass this parameter down to epoll code. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
We will need it to make sure the target files in epolls are present in current process. Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
Signed-off-by:
Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
- align members
- use pid_t type for PIDs

Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
- switch to uintX types (to finally drop uX; it isn't worth carrying this type)
- instead of including the huge util.h, include only the files which are really needed: log, xmalloc, compiler and bug

Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-
Andrei Vagin authored
Signed-off-by:
Andrei Vagin <avagin@virtuozzo.com>
-