- 27 May, 2016 40 commits
-
-
Andrew Vagin authored
Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Cyrill Gorcunov authored
From 6c0e1522e01e01aa89861862fbdf039a0892b89b Mon Sep 17 00:00:00 2001
From: Cyrill Gorcunov <gorcunov@openvz.org>
Date: Tue, 12 Apr 2016 20:00:24 +0300
Subject: [PATCH 1/2] tty: Write unread pty buffers on post dump stage

When unread data is present on peers we currently just ignore it, but we can try to fetch it in a (mostly) non-destructive way. To do so, at the end of the dump procedure (fetching queued data may go wrong, in which case we have to write it back, which is heavy, and we need all ttys at hand) we walk over all collected TTYs and link the PTY peers whose indices match. To avoid overloading tty_dump_info, we reuse the @list member for the new @all_ptys list.

Once the link is established we read the queued data and flush it into the new tty-data.img. If something goes wrong at this point, we stop reading queued data, walk back over the data already read, and write it back to restore the former state. The same applies if the dump was requested to leave the task alive.

On restore we link the peers back and write the queued data once the peer is alive again.

Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Tycho Andersen authored
See the comment for details, but basically tracefs is automounted by the kernel, so we can just mount debugfs with MS_REC and get the right result.

v2: rebase on criu-dev
v3: don't use a new fstype->flags, just always set MS_REC in debugfs' ->parse

Signed-off-by:
Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
(00.170031) Error (files-reg.c:515): `- XFail [.criu.mntns.3xE0jR/15var/tmp/ibmiNsaA.cr.5.ghost] ghost: No such file or directory Reported-by:
Adrian Reber <adrian@lisas.de> Cc: Adrian Reber <adrian@lisas.de> Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
Test-for: c59759345e6e ("mount: dump a file system only if a mount point isn't overmounted") Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrei Vagin authored
"clean mount" is a copy of target mount without child mounts. Currently a clean mount is created when a target mount has children. In this case a target path MAY be overmounted. In this patch, a clean mount is created only if a target path is overmounted. For that we enumerate all children and check that they are not mounted over a target path. Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
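The overmount check described in this entry can be sketched roughly as below. This is an illustration only, not CRIU's code: `struct mnt`, its fields and `is_overmounted()` are hypothetical names, and mount points are compared as plain strings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified mount record (not CRIU's own mount structure). */
struct mnt {
	const char *mountpoint;       /* absolute path of the mount point */
	const struct mnt *children;   /* array of child mounts */
	size_t nr_children;
};

/* The target path is overmounted if some child is mounted exactly on it. */
static bool is_overmounted(const struct mnt *m)
{
	assert(m != NULL);

	for (size_t i = 0; i < m->nr_children; i++)
		if (strcmp(m->children[i].mountpoint, m->mountpoint) == 0)
			return true;

	return false;
}
```

Under this sketch, a mount with children only below its mount point (say /tmp/a under /tmp) would not need a clean mount, while a child mounted on /tmp itself would.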
-
Andrew Vagin authored
    static char *cut_root_for_bind(char *target_root, char *source_root)

Currently it returns a relative path if source_root is '/' and an absolute path in all other cases. Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
If a mount point has to be bind-mounted somewhere, we need access to it. This patch solves the case where the mount point is overmounted: we can open the target mount point and then use /proc/pid/FD to access it.

v2: make a bind-mount from an underlying mount via a file descriptor
v3: add a separate buffer to generate the path to a file descriptor
v4: use the PSFDS constant

Reported-by:
Stanislav Kinsburskiy <skinsbursky@virtuozzo.com> Cc: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com> Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
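A minimal sketch of the path-building step, assuming the mount point is already held open as a file descriptor. The helper name `fd_proc_path()` is hypothetical and this is not the code from the patch; it only shows how a descriptor can be turned into a path under /proc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Write "/proc/self/fd/<fd>" into buf.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int fd_proc_path(int fd, char *buf, size_t len)
{
	int n;

	assert(buf != NULL && fd >= 0);

	n = snprintf(buf, len, "/proc/self/fd/%d", fd);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A caller would then pass the resulting path as the source of a bind mount(2); the kernel resolves it to the directory the descriptor refers to, even if another mount now covers the original path.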
-
Pavel Emelyanov authored
When we didn't have a tree with pids, the search for the parent item was optimized. Nowadays we can just use one rbtree lookup. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Acked-by:
Andrew Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
This code is for older kernels that don't have locks info in fdinfo files. So don't keep global pstree helper for this. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
stable_secret is always unset in a new netns, and we cannot restore it to that state once it is set, so just skip the save/restore steps. Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
mldv1_unsolicited_report_interval and mldv2_unsolicited_report_interval are rounded up to a multiple of (MSEC_PER_SEC / HZ) when set, because the kernel stores them in jiffies. As the minimum HZ is 100, MAX_MSEC_GRANULARITY is 10. Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
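The rounding can be illustrated with the sketch below. It assumes HZ divides MSEC_PER_SEC evenly; the function names are hypothetical and the comparison helper is only one possible way a test could tolerate the rounding, not necessarily what the test in this commit does.

```c
#include <assert.h>
#include <stdbool.h>

#define MSEC_PER_SEC         1000
#define MAX_MSEC_GRANULARITY 10   /* MSEC_PER_SEC / 100, i.e. the smallest HZ */

/* Round msec up to a whole number of jiffies and convert back to msec. */
static int round_up_to_jiffy_msec(int msec, int hz)
{
	int step;

	/* Assumption of this sketch: one jiffy is a whole number of msec. */
	assert(msec >= 0 && hz > 0 && MSEC_PER_SEC % hz == 0);

	step = MSEC_PER_SEC / hz;
	return (msec + step - 1) / step * step;
}

/* Accept a read-back value that is less than one granule above the written one. */
static bool msec_matches(int written, int read_back)
{
	return read_back >= written &&
	       read_back - written < MAX_MSEC_GRANULARITY;
}
```

For example, writing 1234 with HZ=100 would read back as 1240, and with HZ=250 as 1236; both are within the 10 ms granularity.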
-
Pavel Tikhomirov authored
As changing the disable_ipv6 sysctl for some device may change its mtu sysctl, we need to check mtu first and only then set disable_ipv6. That can be done by splitting our functions into two. Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
Some sysctls have a minimal value != 0, so we also need to set their lower limit, e.g. /proc/sys/net/ipv6/conf/all/mtu.

name                 possible states    range
accept_source_route  "<0", ">=0"        {-1, 0}
medium_id            "-1", "0", ">0"    {-1, INT_MAX}
src_valid_mark       true, false        {0, 1}
tag                  any                {INT_MIN, INT_MAX}

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
Move the arrays out of save_and_set and check_and_restore: passing the whole array to a function that works with only one element of it is overkill. This also makes them reusable in a future test of the ipv6 config. Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
First restore "all" and then "default" (and "dev"), since for some ipv6 sysctls (forwarding, disable_ipv6) setting "all" can change the latter, e.g.:

    echo 1 > /proc/sys/net/ipv6/conf/default/disable_ipv6
    echo 0 > /proc/sys/net/ipv6/conf/all/disable_ipv6
    cat /proc/sys/net/ipv6/conf/default/disable_ipv6
    0

As changing the disable_ipv6 sysctl for some device may change its mtu sysctl, do not optimize the mtu restore.

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
* The old MAX_CONF_OPT_PATH should be enough for the ipv6 path array, as len('net/ipv6/conf//mldv2_unsolicited_report_interval') is 48, but the length and the buffer size are almost back to back, so increase it a bit so that we don't have to worry about it.
* Reading stable_secret before it is initialized fails with EIO; we can safely skip dumping it in that case, as in a new netns it will be uninitialized by default.

v6: use __CTL_STR without len in image type
v9: set flag CTL_FLAGS_READ_EIO_SKIP for stable_secret dumping

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
* Do not c/r the mc_forwarding option: it depends on the existence of a multicast management socket and is read-only.

In addrconf_sysctl_disable->addrconf_disable_ipv6->dev_disable_change: on addrconf_notify + NETDEV_UP, if idev->cnf.mtu6 differs from dev->mtu, the mtu6 sysctl is overwritten. So changing the disable_ipv6 sysctl for some device may change its mtu sysctl, and we need to restore disable_ipv6 first and only then mtu.

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
Use the new SysctlEntry; leave the old ipv4_conf_op as ipv4_conf_op_old for forward compatibility.

v4: use CTL_FLAGS_HAS instead of req[].has
v5: use CTL_TYPE in sysctl_entries_equal
v6: fix net_conf_op for string sysctls, add rconf to have requests conf at hand

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
An int32 holding a boolean value has the same size in protobuf as a bool. Many sysctls are boolean, but we don't lose anything by storing them as int32, so add only int32 and string fields.

We will need the string field for the stable_secret ipv6 sysctl.

This format also lets us easily handle non-present int sysctls: we can check whether we have one using has_*arg.

v3: rebase images/Makefile to criu-dev branch
v4: use enum for type

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Tikhomirov authored
v4: replace the separate has pointer with the CTL_FLAGS_HAS flag; the second part is in the patch "net/ipv4: add net_conf_op to reuse for ipv6"
v6: define CTL_FLAGS_HAS
v7: also allow EIO on do_sysctl_op for optional sysctls like stable_secret, and close the sysctl file in the error path
v9: add CTL_FLAGS_READ_EIO_SKIP to skip dumping stable_secret on EIO

Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Reviewed-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Kirill Tkhai authored
MAP_FAILED is the return value of libc's mmap(). A direct syscall instead returns a value for which IS_ERR() is true in case of error. Signed-off-by:
Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
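The difference can be shown with a small sketch. The helper `raw_ret_is_err()` is a hypothetical stand-in modelled on the Linux kernel's IS_ERR_VALUE() check (values in the last MAX_ERRNO slots of the address range encode -errno); it is not CRIU's own macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/mman.h>

#define MAX_ERRNO 4095   /* same bound the Linux kernel uses for error values */

/*
 * A raw syscall hands back -errno in its return value, so an error
 * shows up as a "pointer" in the top MAX_ERRNO values of the range.
 */
static bool raw_ret_is_err(const void *ret)
{
	return (uintptr_t)ret >= (uintptr_t)-MAX_ERRNO;
}
```

A raw mmap failing with ENOMEM would return (void *)-12, which is not equal to MAP_FAILED ((void *)-1), so a comparison against MAP_FAILED would miss it while the range check catches it.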
-
Andrew Vagin authored
criu dump should return an error in this case Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
Here we enumerate all mount points and try to find one that allows us to dump the content of a file system. It should be a root mount and its mount point should not be overmounted.

We don't have a separate callback to dump the content of a file system. fstype->dump() doesn't always require access to a mount point (e.g. autofs), so we check overmounts in open_mountpoint().

    $ cat /proc/61693/root/etc/redhat-release
    Fedora release 23 (Twenty Three)
    $ cat /proc/61692/mountinfo | grep '\s/tmp'
    234 199 0:57 / /tmp rw shared:97 master:76 - tmpfs tmpfs rw,size=131072k,nr_inodes=32768
    235 234 0:57 /systemd-private-dd74de99e1104383aa7cd6e27d3d0b8a-httpd.service-uFqNHk/tmp /tmp rw,relatime shared:98 master:76 - tmpfs tmpfs rw,size=131072k,nr_inodes=32768

v2: return an error if we can't dump a file system
v3: try to find a mount point which allows us to dump a file system
v4: check that children are not mounted over the target mount instead of getting a mnt_id for a file descriptor
v5: add a special error code for unreachable mount points

Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Emelyanov authored
The next patch will put a non-root task's PID into it, so this one is preparatory. But, as a bonus, we remove the need to unlink the pid file in case of error :) Risk -- scripts might want to have a pidfile, but we already have the CRTOOLS_ROOT_PID environment variable in them for such cases. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Tycho Andersen authored
Signed-off-by:
Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Tycho Andersen authored
Signed-off-by:
Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
gcc v6.0 and clang think that &next->pid.node can't be NULL.

Here is an explanation from a kernel log (v3.12-5097-g1310a5a):
"""
the result of this expression is not defined by a C standard and some gcc versions (e.g. 4.3.4) assume the above expression can never be equal to NULL. The net result is an oops because the iteration is not properly terminated.
"""

    $ gcc -v
    gcc version 6.0.0 20160406 (Red Hat 6.0.0-0.20) (GCC)

    $ python test/zdtm.py run -t zdtm/static/session00
    ...
    $ gdb -c /tmp/core.61 criu/criu
    Program terminated with signal SIGSEGV, Segmentation fault.
    598 if (&next->pid.node == NULL || next->pid.virt > pid)

    $ make CC=clang
    pstree.c:598:18: error: comparison of address of 'next->pid.node' equal to a null pointer is always false [-Werror,-Wtautological-pointer-compare]
    if (&next->pid.node == NULL || next->pid.virt > pid)

Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Acked-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
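One way to avoid the problem is to test the node pointer itself before converting it to the containing structure. The sketch below uses hypothetical types (`struct node`, `struct item`, `item_of()`) rather than CRIU's pstree code and is not the actual fix from this commit.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	struct node *next;
};

struct item {
	int pid;
	struct node node;   /* embedded at a non-zero offset */
};

/* Convert a pointer to the embedded node back to its containing item. */
#define item_of(n) ((struct item *)((char *)(n) - offsetof(struct item, node)))

/*
 * Return the item that follows 'it', or NULL at the end of the chain.
 * The NULL test is done on the node pointer, never on &item->node.
 */
static struct item *next_item(struct item *it)
{
	struct node *n;

	assert(it != NULL);

	n = it->node.next;
	return n ? item_of(n) : NULL;
}
```

Because the check happens before the pointer arithmetic, the compiler has no member-address expression that it could assume to be non-NULL.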
-
Andrew Vagin authored
Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
All tasks are collected in an rbtree, which allows us to look up any task by its virtual pid in O(log n). Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
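To illustrate a logarithmic lookup by virtual pid, the sketch below uses a sorted array with the standard bsearch() as a stand-in for the rbtree; it is not CRIU's implementation, and `struct task`, `cmp_vpid()` and `task_by_vpid()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct task {
	int vpid;           /* virtual pid, the search key */
	const char *comm;
};

/* Compare a pid key against the vpid of an array element. */
static int cmp_vpid(const void *key, const void *elem)
{
	int k = *(const int *)key;
	int v = ((const struct task *)elem)->vpid;

	return (k > v) - (k < v);
}

/* Look up a task by vpid in an array sorted by vpid; NULL if absent. */
static struct task *task_by_vpid(struct task *tasks, size_t n, int vpid)
{
	assert(tasks != NULL);

	return bsearch(&vpid, tasks, n, sizeof(*tasks), cmp_vpid);
}
```

A balanced tree gives the same logarithmic lookup cost while also supporting cheap insertion, which a sorted array does not.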
-