- 24 Jun, 2014 11 commits
-
-
Cyrill Gorcunov authored
Kernel 3.16 splits the old vDSO zone into two VMAs: one for the vDSO code itself and a second one, named vvar, for the data referenced from the vDSO code.

Because the 'dump' and 'restore' parts of the code can't be changed separately (otherwise the tests would fail), this commit is a pretty big one and hard to read, so here is a detailed explanation of what's going on.

1) When dumping starts we detect the vvar zone by reading /proc/pid/smaps and looking for the "[vvar]" token. Note that the vvar zone is mapped by the kernel with PF/IO flags, so we should not fail here. It is also assumed that, at least for now, the kernel won't change much and the [vvar] zone always follows the [vdso] zone; otherwise criu prints an error.

2) Previous commits disabled dumping of the vvar area contents, so the restorer code never tries to read vvar data. We still need to map the vvar zone, though, so the vma entry remains in the image.

3) As with the previous vdso format there are two possible cases:

   a) Dump and restore happen on the same kernel
   b) Dump and restore happen on different kernels

To detect which case we have, we parse the vdso data from the image, find the symbol offsets, and compare their values with the runtime symbols provided by the kernel. If they match and (!!!) the size of the vvar zone is the same, we simply remap both zones from the runtime kernel into the positions the dumpee had at checkpoint time. This is the so-called "inplace" remap (a).

If this happens, the vdso_proxify() routine drops VMA_AREA_REGULAR from the vvar area provided by the caller, and the restorer won't try to handle this vma. It looks somewhat strange and probably should be reworked, but for now I left it as is to minimize the patch.

In case (b) we need to generate a proxy. We do that the same way as before, except that we include the vvar zone in the proxy and save the vvar proxy address inside the vdso mark injected into the vdso area. Thus on a subsequent checkpoint we can detect the proxy vvar zone and rip it off the list of vmas to handle.
Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
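Step 1 of the message above (spotting the "[vvar]" token while reading smaps) can be illustrated with a small sketch. This is not CRIU's parser: the `zone_kind` enum and `classify_smaps_line()` are hypothetical names, and the code only assumes the usual `start-end perms offset dev inode name` layout of a mapping header line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of a mapping; not CRIU's real types. */
enum zone_kind { ZONE_OTHER, ZONE_VDSO, ZONE_VVAR };

/*
 * Parse the header line of one smaps entry, e.g.
 *   "7fff1d5fc000-7fff1d5fe000 r--p 00000000 00:00 0   [vvar]"
 * and report whether it names the vdso or the vvar zone.
 * When the line is a mapping header, *start and *end receive its range.
 */
static enum zone_kind classify_smaps_line(const char *line,
					  unsigned long *start,
					  unsigned long *end)
{
	char name[64] = "";
	int n;

	assert(line && start && end);

	/* Skip perms, offset, device and inode; the name field is optional. */
	n = sscanf(line, "%lx-%lx %*s %*s %*s %*s %63s", start, end, name);
	if (n < 3)
		return ZONE_OTHER;	/* not a header line, or an unnamed mapping */

	if (!strcmp(name, "[vdso]"))
		return ZONE_VDSO;
	if (!strcmp(name, "[vvar]"))
		return ZONE_VVAR;
	return ZONE_OTHER;
}
```

A caller would feed every line of the smaps file through this helper and remember the ranges reported for the two zones.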
-
Cyrill Gorcunov authored
Because of the new vvar area we need to carry the address of the vvar proxy inside the mark. Add the members needed and update the routines accordingly. Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
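To make the idea of a mark that carries both proxy addresses concrete, here is a minimal sketch. The struct layout, the field names, the signature value and the helper names are all assumptions made for illustration; they are not taken from the CRIU sources.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative magic value; the real signature may differ. */
#define SKETCH_MARK_SIGNATURE 0x6f73647675697263ULL

/* Hypothetical mark written at the start of a proxified vdso area. */
struct mark_sketch {
	uint64_t	signature;
	unsigned long	proxy_vdso_addr;
	unsigned long	proxy_vvar_addr;	/* the newly carried address */
};

/* Write a mark into the memory at 'where'. */
static void put_mark(void *where, unsigned long vdso_addr,
		     unsigned long vvar_addr)
{
	struct mark_sketch m = {
		.signature	 = SKETCH_MARK_SIGNATURE,
		.proxy_vdso_addr = vdso_addr,
		.proxy_vvar_addr = vvar_addr,
	};

	memcpy(where, &m, sizeof(m));
}

/* Read the memory at 'where' into *out; returns 1 if it holds a mark. */
static int is_marked(const void *where, struct mark_sketch *out)
{
	memcpy(out, where, sizeof(*out));
	return out->signature == SKETCH_MARK_SIGNATURE;
}
```

On a later checkpoint, finding such a mark would tell the dumper where the proxy vvar zone lives so it can be skipped.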
-
Cyrill Gorcunov authored
This is mostly for debugging purposes. Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Cyrill Gorcunov authored
The vvar zone is mapped by the kernel and must never be dumped into the image; the data present there is valid on the running kernel only. Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Cyrill Gorcunov authored
Will need it to handle vvar zones in a special way. Because VMA_UNSUPP never goes into the image file, let's reuse bit 12 for VVAR. Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Filipe Brandenburger authored
The /dev directory is also created by zdtm when running ns/ enabled tests. Add it to the list, together with entries such as /bin and /lib. Signed-off-by:
Filipe Brandenburger <filbranden@google.com> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Filipe Brandenburger authored
This adds new tests "cgroup00" and "clean_mntns" to the .gitignore file. Signed-off-by:
Filipe Brandenburger <filbranden@google.com> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Filipe Brandenburger authored
This confirms that the fix to handle the dumpable flag set to 2 still works after restore.

To force the dumpable flag to 0 or 2 (whatever fs.suid_dumpable is set to), chmod the test binary to 0111 (executable, but not readable) and execv() it while running as non-root. The kernel will unset the dumpable flag to prevent a core dump or ptrace from giving the user access to the pages of the binary (which are supposedly not readable by that user).

Tested:
- # test/zdtm.sh static/dumpable02
  Test: zdtm/live/static/dumpable02, Result: PASS
- # test/zdtm.sh ns/static/dumpable02
  Test: zdtm/live/static/dumpable02, Result: PASS
- Used -DDEBUG to confirm the value of the dumpable flag was 0 or 2 to match the fs.suid_dumpable sysctl in the tests (both in and out of namespaces).
- Confirmed that the test fails if the commit that fixes handling of the dumpable flag with value 2 is reverted and the fs.suid_dumpable sysctl is set to 2.
Signed-off-by:
Filipe Brandenburger <filbranden@google.com> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
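The value such a test inspects can be read with `prctl(PR_GET_DUMPABLE)`. The following is a small sketch, not the zdtm test itself; the helper names `get_dumpable()` and `dumpable_is_restricted()` are hypothetical.

```c
#include <assert.h>
#include <sys/prctl.h>

/* Return the calling process's dumpable flag (0, 1 or 2), or -1 on error. */
static int get_dumpable(void)
{
	return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}

/*
 * After execv() of an unreadable binary as non-root the flag is expected
 * to be 0 or 2, depending on the fs.suid_dumpable sysctl.
 */
static int dumpable_is_restricted(int value)
{
	return value == 0 || value == 2;
}
```

A test along these lines would record `get_dumpable()` before the dump and compare it with the value seen after restore.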
-
Filipe Brandenburger authored
Commit d5bb7e97 started to preserve the dumpable flag across migration by using prctl to get the value on dump and set it back on restore.

In some situations the dumpable flag can be set to 2. This happens when it is not reset (with prctl) after using setuid() or after using execv() on a binary that has executable but not read permissions, when the fs.suid_dumpable sysctl is also set to 2.

However, it is not possible to set it to 2 using prctl, which would make criu restore fail. Fix this by checking the value before passing it to prctl.

If the value of the dumpable flag was 2 at the source, check whether it is already 2 at the destination, which is likely to happen if the fs.suid_dumpable sysctl is also set to 2 where restore is running. In that case, preserve the value; otherwise reset it to 0, which is the most secure fallback.

Fixes: d5bb7e97

Tested:
- Using the dumpable02 zdtm test after setting fs.suid_dumpable to 2.
  # sysctl -w fs.suid_dumpable=2
  # test/zdtm.sh ns/static/dumpable02
  4: DEBUG: before dump: dumpable=2
  4: DEBUG: after restore: dumpable=2
  4: PASS
  Test: zdtm/live/static/dumpable02, Result: PASS
Signed-off-by:
Filipe Brandenburger <filbranden@google.com> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
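The decision described in this message can be sketched as a small pure function plus a wrapper that applies it. This is an illustration under assumptions, not the code from the commit: `dumpable_to_set()` and `restore_dumpable()` are hypothetical names, and only the `PR_GET_DUMPABLE` and `PR_SET_DUMPABLE` prctl operations are real.

```c
#include <assert.h>
#include <sys/prctl.h>

/*
 * Decide which value to pass to PR_SET_DUMPABLE on restore.
 * Returns -1 when nothing should be set because 2 is already in place.
 */
static int dumpable_to_set(int saved, int current)
{
	if (saved != 2)
		return saved;	/* 0 and 1 can be set directly */
	if (current == 2)
		return -1;	/* already 2 at the destination: preserve it */
	return 0;		/* 2 cannot be set via prctl: secure fallback */
}

/* Apply the decision to the calling process; returns 0 on success. */
static int restore_dumpable(int saved)
{
	int current = prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
	int value;

	if (current < 0)
		return -1;

	value = dumpable_to_set(saved, current);
	if (value < 0)
		return 0;

	return prctl(PR_SET_DUMPABLE, value, 0, 0, 0);
}
```

Keeping the decision separate from the prctl calls makes the three outcomes (set as saved, keep 2, fall back to 0) easy to check in isolation.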
-
Filipe Brandenburger authored
This reverts commit 8870aa1e. Signed-off-by:
Filipe Brandenburger <filbranden@google.com> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Filipe Brandenburger authored
This confirms that the fix in commit d5bb7e97 to preserve the dumpable flag after migration is working as expected.

In this test case, the dumpable flag is expected to always be set to 1, as test_init will use prctl to reset it to 1 after using setuid and setgid.

Tested:
- # test/zdtm.sh static/dumpable01
  Test: zdtm/live/static/dumpable01, Result: PASS
- # test/zdtm.sh ns/static/dumpable01
  Test: zdtm/live/static/dumpable01, Result: PASS
- Confirmed that the test fails after reverting commit d5bb7e97.
Signed-off-by:
Filipe Brandenburger <filbranden@google.com> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
- 23 Jun, 2014 1 commit
-
-
Cyrill Gorcunov authored
Otherwise I see on 3.16-rc1 and higher

 | [ 100.851730] futex wrote to ns_last_pid when file position was not 0!
 | This will not be supported in the future. To silence this
 | warning, set kernel.sysctl_writes_strict = -1

Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
- 20 Jun, 2014 5 commits
-
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Andrey Vagin authored
A proxy vdso is removed from the vma_area_list list, so vma_area_list->nr must be decremented. Cc: Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Andrey Vagin <avagin@openvz.org> Acked-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
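The bug class fixed here, an element unlinked from a list while the list's counter is left untouched, can be shown with a simplified sketch. The types below are hypothetical stand-ins and not CRIU's list implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a vma entry and its list. */
struct vma_sketch {
	unsigned long		start, end;
	struct vma_sketch	*next;
};

struct vma_list_sketch {
	struct vma_sketch	*head;
	unsigned int		nr;	/* number of entries on the list */
};

/* Add an entry at the head of the list. */
static void list_push(struct vma_list_sketch *l, struct vma_sketch *v)
{
	v->next = l->head;
	l->head = v;
	l->nr++;
}

/* Unlink 'v' from the list; returns 1 if it was found, 0 otherwise. */
static int list_remove(struct vma_list_sketch *l, struct vma_sketch *v)
{
	struct vma_sketch **pp;

	for (pp = &l->head; *pp; pp = &(*pp)->next) {
		if (*pp == v) {
			*pp = v->next;
			v->next = NULL;
			l->nr--;	/* keep the counter in sync */
			return 1;
		}
	}
	return 0;
}
```

Any code that later trusts `nr` (for example to size an array of entries) would otherwise see one entry more than the list really holds.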
-
- 18 Jun, 2014 1 commit
-
-
Pavel Emelyanov authored
Next achievement -- external bind mounts and tasks-to-cgroups bindings. Plus many bugfixes in memory restore and mountpoints dump, many thanks to Google guys for reports and patches! We have quite a few things left to make workable LXC and Docker support, hopefully the next tag will be the 1.3 one :) Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
- 17 Jun, 2014 7 commits
-
-
Saied Kazemi authored
[ xemul: It's a temporary workaround not to lock the -rc2 release. Once we have some better solution, this will be rolled back. ] Signed-off-by:
Saied Kazemi <saied@google.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Andrey Vagin authored
Signed-off-by:
Andrey Vagin <avagin@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Andrey Vagin authored
A file system can be bind-mounted a few times, and some of these mounts can be non-root. We need to find one of the root mounts and dump it.

v2: don't forget to check pm->dumped and pm->parent
    don't dump a root file system, it's always external for now.
Reported-by:
Saied Kazemi <saied@google.com> Signed-off-by:
Andrey Vagin <avagin@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Andrey Vagin authored
One file system can be mounted a few times, so mnt_id isn't unique for it. Signed-off-by:
Andrey Vagin <avagin@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Pavel Emelyanov authored
On dump one uses one or more --ext-mount-map options with A:B arguments. A denotes a mountpoint (as seen from the target mount namespace) that criu dumps, and B is the string that will be written into the image file instead of the mountpoint's root.

On restore one uses the same --ext-mount-map option(s) with similar A:B arguments, but this time criu treats A as the string from the image's root field (foobar in the example above) and B as the path in criu's mount namespace that should be bind mounted into the mountpoint.

v3:
* Added documentation
* Added RPC bits
* Changed option name into --ext-mount-map
* Use colon as key and value separator
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
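Splitting an A:B argument on the colon could look roughly like the sketch below. It is not CRIU's option parser: the struct and function names are hypothetical, and splitting at the first colon and rejecting empty halves are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for one --ext-mount-map argument. */
struct ext_mount_sketch {
	char *key;	/* the A part */
	char *val;	/* the B part */
};

/*
 * Split "A:B" at the first colon into freshly allocated strings.
 * Returns 0 on success, -1 on a malformed argument or allocation failure.
 */
static int parse_ext_mount_map(const char *arg, struct ext_mount_sketch *out)
{
	const char *colon = strchr(arg, ':');
	size_t klen;

	if (!colon || colon == arg || colon[1] == '\0')
		return -1;

	klen = (size_t)(colon - arg);
	out->key = malloc(klen + 1);
	out->val = malloc(strlen(colon + 1) + 1);
	if (!out->key || !out->val) {
		free(out->key);
		free(out->val);
		return -1;
	}

	memcpy(out->key, arg, klen);
	out->key[klen] = '\0';
	strcpy(out->val, colon + 1);
	return 0;
}
```

On dump the key would be looked up among the mountpoints and the value stored in the image; on restore the roles are as described in the message above.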
-
Pavel Emelyanov authored
Just for simpler further patching. Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
- 11 Jun, 2014 1 commit
-
-
Tycho Andersen authored
These are mounted by default in Ubuntu containers, so criu should know about them and remount them on restore. Signed-off-by:
Tycho Andersen <tycho.andersen@canonical.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
- 10 Jun, 2014 2 commits
-
-
Pavel Emelyanov authored
It uses absolute file names, so any open-s should happen _before_ we change tasks' root. Signed-off-by:
Pavel Emelyanov <xemul@parallels.com> Acked-by:
Andrew Vagin <avagin@parallels.com>
-
Pavel Emelyanov authored
If fchroot() succeeds, any further failures are not noticed by the caller. Signed-off-by:
Pavel Emelyanov <xemul@parallels.com> Acked-by:
Andrew Vagin <avagin@parallels.com>
-
- 09 Jun, 2014 6 commits
-
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com> Acked-by:
Andrew Vagin <avagin@parallels.com>
-
Pavel Emelyanov authored
There's no such thing as fchroot() in Linux, but we need to chroot() into an existing file descriptor. Before this patch we did this by chroot()-ing into /proc/self/fd/$fd. Without proc mounted this is no longer possible, so do it like this:

    fchdir(proc_service_fd);
    chroot("./self/fd/$root_fd");
    fchdir($cwd_fd);

Thanks to Andrey Vagin for this trick ;)
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com> Acked-by:
Andrew Vagin <avagin@parallels.com>
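The three-call trick from the message above can be expanded into a compilable sketch. This is an illustration, not the function from the patch: `fd_path()` and `fchroot_sketch()` are hypothetical names, `proc_fd` is assumed to be a descriptor open on a procfs root, and the caller needs the privilege to chroot().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Build the path of 'fd' relative to a procfs root; 0 on success. */
static int fd_path(char *buf, size_t size, int fd)
{
	int n = snprintf(buf, size, "./self/fd/%d", fd);

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* chroot() into the directory behind root_fd without a mounted /proc. */
static int fchroot_sketch(int proc_fd, int root_fd)
{
	char path[64];
	int cwd_fd, ret = -1;

	if (fd_path(path, sizeof(path), root_fd))
		return -1;

	/* Remember the current directory so it can be restored afterwards. */
	cwd_fd = open(".", O_RDONLY | O_DIRECTORY);
	if (cwd_fd < 0)
		return -1;

	if (fchdir(proc_fd) == 0) {
		ret = chroot(path);
		if (fchdir(cwd_fd) < 0)
			ret = -1;	/* report failures after the chroot too */
	}

	close(cwd_fd);
	return ret;
}
```

The relative path is what makes this work: it is resolved against the proc descriptor rather than against an absolute /proc mount.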
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com> Acked-by:
Andrew Vagin <avagin@parallels.com>
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com> Acked-by:
Andrew Vagin <avagin@parallels.com>
-
Pavel Emelyanov authored
We have a set of routines that open /proc/$pid files via the proc service descriptor. Teach them to accept non-pids as pids to open /proc/self/* and /proc/* files via the same engine. Signed-off-by:
Pavel Emelyanov <xemul@parallels.com> Acked-by:
Andrew Vagin <avagin@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
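One way to let a single open routine serve /proc/$pid, /proc/self and plain /proc files is to reserve special "pid" values, as in the sketch below. The constants `SK_PROC_SELF` and `SK_PROC_GEN`, their values, and both function names are assumptions for illustration; only `openat()` is a real call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical "non-pid" values accepted in place of a real pid. */
#define SK_PROC_SELF	0	/* open /proc/self/<name> */
#define SK_PROC_GEN	(-1)	/* open /proc/<name> */

/* Build the path of a proc file relative to the procfs root; 0 on success. */
static int proc_rel_path(char *buf, size_t size, int pid, const char *name)
{
	int n;

	if (pid == SK_PROC_SELF)
		n = snprintf(buf, size, "self/%s", name);
	else if (pid == SK_PROC_GEN)
		n = snprintf(buf, size, "%s", name);
	else if (pid > 0)
		n = snprintf(buf, size, "%d/%s", pid, name);
	else
		return -1;

	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Open a proc file through a descriptor that refers to the procfs root. */
static int open_proc_sketch(int proc_fd, int pid, const char *name, int flags)
{
	char path[128];

	if (proc_rel_path(path, sizeof(path), pid, name))
		return -1;
	return openat(proc_fd, path, flags);
}
```

With this shape, callers that used to open absolute /proc paths can go through the same descriptor-relative engine.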
-
Pavel Emelyanov authored
When running a test with the ns/ prefix, zdtm.sh does complex preparations. Make it possible to perform them and leave the started process ready for manual investigation. Signed-off-by:
Pavel Emelyanov <xemul@parallels.com> Acked-by:
Andrew Vagin <avagin@parallels.com>
-
- 06 Jun, 2014 5 commits
-
-
Cyrill Gorcunov authored
New vDSOs are in stripped format, so use dynamic symbols instead of sectioned ones. Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Cyrill Gorcunov authored
We're not sharing the code anymore so drop it. Signed-off-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Pavel Emelyanov authored
This fixes the support for fifo-s in mount namespaces and makes it easier to control the correct open_path() usage in the future. Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Pavel Emelyanov authored
Not all filesystems like it. Other than this, options in the image just look cleaner. Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
Pavel Emelyanov authored
I will need to make cgroup test behave slightly differently when it's in and out of ns/ run. To do so it's handy to use the ZDTM_NEWNS variable set by zdtm.sh Signed-off-by:
Pavel Emelyanov <xemul@parallels.com>
-
- 04 Jun, 2014 1 commit
-
-
Pavel Emelyanov authored
This patch consists of 3 unsplittable (from my POV) fixes.

1. Remove the messy check from dump_one_mountpoint() -- we have validate_mounts to check whether we can dump the tree or not.
2. Other than being in the wrong place, the mentioned check is itself wrong. Comparing the lengths of the mp->source-s makes no sense -- it should be mp->root, but even this would be wrong...
3. ... instead, we should check that the bind mount root path is accessible from the target mount root path, i.e. the bind->root should start with src->root.
Signed-off-by:
Pavel Emelyanov <xemul@parallels.com> Acked-by:
Andrew Vagin <avagin@parallels.com>
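The "bind->root should start with src->root" check from point 3 amounts to a path-prefix test. Below is a minimal sketch with a hypothetical helper name; the extra requirement that the match ends on a path-component boundary is an assumption added here and is not stated in the commit message.

```c
#include <assert.h>
#include <string.h>

/*
 * Return 1 if 'path' equals 'root' or lies beneath it, 0 otherwise.
 * A plain string prefix is not enough on its own: "/ab" starts with
 * "/a" but is not inside it, so the match must end at a '/' or at the
 * end of the string.
 */
static int is_sub_path(const char *path, const char *root)
{
	size_t len = strlen(root);

	if (strncmp(path, root, len))
		return 0;
	if (len > 0 && root[len - 1] == '/')
		return 1;	/* root is "/" or ends with a slash */
	return path[len] == '\0' || path[len] == '/';
}
```

For a bind mount, `path` would be the bind's root and `root` the root of the candidate source mount.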
-