Commits · 6cf2906b0a73cc55c2ac8c4d94e8120723456bbc · zhul / criu

24 Jun, 2014 4 commits

zdtm: add new dumpable02 test to check that dumpable flag set to 0 or 2 works · 6cf2906b

Filipe Brandenburger authored Jun 18, 2014

This confirms that the fix to handle dumpable flag set to 2 still works after
restore.

To force dumpable flag set to 0 or 2 (whatever the fs.suid_dumpable is set to),
chmod the test binary to 0111 (executable, but not readable) and execv() it
while running as non-root.  The kernel will unset the dumpable flag to prevent
a core dump or ptrace from giving the user access to the pages of the binary
(which are supposedly not readable by that user.)
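The trick described above can be sketched in C as follows. This is an illustration only, not the code of the dumpable02 test: the helper names `get_dumpable()` and `reexec_unreadable()` are hypothetical, and the sketch assumes the caller owns the binary and runs as non-root.

```c
#include <sys/prctl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Return the current dumpable flag (0, 1 or 2), or -1 on error. */
int get_dumpable(void)
{
	return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}

/*
 * Hypothetical helper: make the binary at `path` executable but not
 * readable (mode 0111) and re-execute it. When run as non-root, the new
 * program image is expected to start with the dumpable flag taken from
 * fs.suid_dumpable (0 or 2) rather than 1.
 */
int reexec_unreadable(const char *path, char *const argv[])
{
	if (chmod(path, 0111) < 0)
		return -1;
	execv(path, argv);
	return -1; /* execv() only returns on failure */
}
```

After the re-exec, the program would call `get_dumpable()` before dump and again after restore and compare the two values.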

Tested:
- # test/zdtm.sh static/dumpable02
  Test: zdtm/live/static/dumpable02, Result: PASS
- # test/zdtm.sh ns/static/dumpable02
  Test: zdtm/live/static/dumpable02, Result: PASS
- Used -DDEBUG to confirm the value of the dumpable flag was 0 or 2 to match
  the fs.suid_dumpable sysctl in the tests (both in and out of namespaces.)
- Confirmed that the test fails if the commit that fixes handling of dumpable
  flag with value 2 is reverted and the fs.suid_dumpable sysctl is set to 2.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

6cf2906b

restore: preserve dumpable flag when it is set to 2 · f662df45

Filipe Brandenburger authored Jun 18, 2014

Commit d5bb7e97 started to preserve the dumpable flag across migration by
using prctl to get the value on dump and set it back on restore.

In some situations, the dumpable flag can be set to 2.  This happens when it is
not reset (with prctl) after using setuid() or after using execv() on a binary
that has executable but not read permissions, when the fs.suid_dumpable sysctl
is also set to 2.  However, it is not possible to set it to 2 using prctl,
which would make criu restore fail.

Fix this by checking for the value before passing it to prctl.  In case the
value of the dumpable flag was 2 at the source, check whether it is already 2
at the destination, which is likely to happen if the fs.suid_dumpable sysctl is
also set to 2 where restore is running.  In that case, preserve the value,
otherwise reset it to 0 which is the most secure fallback.
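The decision described above can be sketched as follows. This is a simplified illustration, not the actual criu restorer code; `pick_dumpable()`, `restore_dumpable()` and `DUMPABLE_KEEP` are hypothetical names.

```c
#include <sys/prctl.h>

#define DUMPABLE_KEEP (-1) /* hypothetical marker: leave the flag untouched */

/*
 * Decide what to pass to PR_SET_DUMPABLE given the value saved at dump
 * time and the value currently seen on the restore side.
 */
int pick_dumpable(int saved, int current)
{
	if (saved == 0 || saved == 1)
		return saved;          /* prctl accepts 0 and 1 directly */
	if (saved == 2 && current == 2)
		return DUMPABLE_KEEP;  /* already 2, nothing to set */
	return 0;                      /* most secure fallback */
}

/* Apply the decision to the calling process; returns 0 on success. */
int restore_dumpable(int saved)
{
	int current = prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
	int want;

	if (current < 0)
		return -1;
	want = pick_dumpable(saved, current);
	if (want == DUMPABLE_KEEP)
		return 0;
	return prctl(PR_SET_DUMPABLE, want, 0, 0, 0);
}
```

Keeping the decision in a pure function makes the three cases (0 or 1, 2 preserved, 2 downgraded to 0) easy to check without touching the process state.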

Fixes: d5bb7e97

Tested:
- Using dumpable02 zdtm test after setting fs.suid_dumpable to 2.
  # sysctl -w fs.suid_dumpable=2
  # test/zdtm.sh ns/static/dumpable02
  4: DEBUG: before dump: dumpable=2
  4: DEBUG: after restore: dumpable=2
  4: PASS
  Test: zdtm/live/static/dumpable02, Result: PASS
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

f662df45

Revert "pie: A quick workaround for PR_SET_DUMPABLE == 2 restore error." · 9f30b9e7

Filipe Brandenburger authored Jun 18, 2014

This reverts commit 8870aa1e.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

9f30b9e7

zdtm: add new dumpable01 test to check that dumpable flag is preserved · 1176081e

Filipe Brandenburger authored Jun 18, 2014

This confirms that the fix in commit d5bb7e97 to preserve the dumpable flag
after migration is working as expected.

In this test case, the dumpable flag is expected to always be set to 1, as
test_init will use prctl to reset it to 1 after using setuid and setgid.

Tested:
- # test/zdtm.sh static/dumpable01
  Test: zdtm/live/static/dumpable01, Result: PASS
- # test/zdtm.sh ns/static/dumpable01
  Test: zdtm/live/static/dumpable01, Result: PASS
- Confirmed that the test fails after reverting commit d5bb7e97.
Signed-off-by: Filipe Brandenburger <filbranden@google.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

1176081e

23 Jun, 2014 1 commit

restore: Make sure the last_pid is written with zero offset · 7f3de288

Cyrill Gorcunov authored Jun 23, 2014

Otherwise I see on 3.16-rc1 and higher

| [  100.851730] futex wrote to ns_last_pid when file position was not 0!
| This will not be supported in the future. To silence this
| warning, set kernel.sysctl_writes_strict = -1
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
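One way to guarantee a zero offset is to rewind the descriptor before every write. The sketch below assumes this approach; the commit may do it differently, and `write_from_start()` is a hypothetical name.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Rewind `fd` to offset 0 and write `len` bytes from `buf`.
 * Returns 0 on success, -1 on a seek error or short write.
 */
int write_from_start(int fd, const char *buf, size_t len)
{
	if (lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	if (write(fd, buf, len) != (ssize_t)len)
		return -1;
	return 0;
}
```

With a descriptor for ns_last_pid that is kept open across several forks, each write would then start at position 0 regardless of earlier writes on the same descriptor.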

7f3de288

20 Jun, 2014 5 commits

iov: Add page_server_iov to iov and back helpers · 687c3894

Pavel Emelyanov authored Jun 19, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

687c3894

iov: Add iovec2pagemap() helper · 3b995f1a

Pavel Emelyanov authored Jun 19, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

3b995f1a

iov: Add iov_init() helper · cd347240

Pavel Emelyanov authored Jun 19, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

cd347240

iov: Add iov_grow_page() helper · bb7ac03a

Pavel Emelyanov authored Jun 19, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

bb7ac03a

vdso: don't forget to adjust vma_area_list->nr · 997f08ea

Andrey Vagin authored Jun 20, 2014

A proxy vdso is removed from the vma_area_list list,
so vma_area_list->nr must be decremented.

Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
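The invariant behind this fix is that the element counter must follow every list removal. A minimal sketch with a simplified singly linked list (the `vma_node` and `vma_list` types here are hypothetical stand-ins, not criu's structures):

```c
#include <stddef.h>

struct vma_node {
	struct vma_node *next;
	unsigned long start;
};

struct vma_list {
	struct vma_node *head;
	unsigned int nr; /* number of nodes currently on the list */
};

/* Push a node at the head and account for it. */
void vma_list_add(struct vma_list *l, struct vma_node *n)
{
	n->next = l->head;
	l->head = n;
	l->nr++;
}

/* Unlink a node and keep nr in sync; returns -1 if it is not on the list. */
int vma_list_del(struct vma_list *l, struct vma_node *n)
{
	struct vma_node **pp;

	for (pp = &l->head; *pp; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			n->next = NULL;
			l->nr--;
			return 0;
		}
	}
	return -1;
}
```

Doing the decrement inside the removal helper avoids the class of bug the commit fixes, where a caller unlinks an entry by hand and forgets the counter.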

997f08ea

18 Jun, 2014 1 commit

criu: Version 1.3-rc2 · 7edf0994

Pavel Emelyanov authored Jun 18, 2014

Next achievement -- external bind mounts and tasks-to-cgroups
bindings. Plus many bugfixes in memory restore and mountpoints
dump, many thanks to Google guys for reports and patches!

We have quite a few things left to make workable LXC and Docker
support, hopefully the next tag will be the 1.3 one :)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

7edf0994

17 Jun, 2014 7 commits

pie: A quick workaround for PR_SET_DUMPABLE == 2 restore error. · 8870aa1e

Saied Kazemi authored Jun 17, 2014

[ xemul: It's a temporary workaround not to lock the -rc2 release.
  Once we have some better solution, this will be rolled back. ]
Signed-off-by: Saied Kazemi <saied@google.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

8870aa1e

zdtm: check bind-mounted files in static/mountpoints · 2ad1ba72

Andrey Vagin authored Jun 16, 2014

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

2ad1ba72

mount: dump one file system only once (v2) · 494c0443

Andrey Vagin authored Jun 16, 2014

A file system can be bind-mounted a few times and some of these mounts
can be non-root. We need to find one of root mounts and dump it.

v2: don't forget to check pm->dumped and pm->parent
    don't dump a root file system, it's always external for now.
Reported-by: Saied Kazemi <saied@google.com>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

494c0443

tmpfs: use device number instead of mnt_id in image names · 69721190

Andrey Vagin authored Jun 16, 2014

One file system can be mounted a few times, so mnt_id isn't unique for it.
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

69721190

mnt: Handle external bind mounts according to --ext-mount option (v3) · 061d6cfa

Pavel Emelyanov authored Jun 09, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

061d6cfa

crtools: Introduce the --ext-mount-map option (v3) · c7e00429

Pavel Emelyanov authored Jun 09, 2014

On dump one uses one or more --ext-mount-map options with A:B arguments.
A denotes a mountpoint (as seen from the target mount namespace) criu
dumps and B is the string that will be written into the image file
instead of the mountpoint's root.

On restore one uses the same --ext-mount-map option(s) with similar
A:B arguments, but this time criu treats A as string from the image's
root field (foobar in the example above) and B as the path in criu's
mount namespace that should be bind mounted into the mountpoint.
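Splitting an A:B argument at the colon could look like the sketch below. It is an illustration under assumptions, not criu's option parser: the `ext_mount_arg` struct and `parse_ext_mount_arg()` are hypothetical, and the split is assumed to happen at the first colon.

```c
#include <stdlib.h>
#include <string.h>

struct ext_mount_arg {
	char *key; /* the A part */
	char *val; /* the B part */
};

/*
 * Split "A:B" at the first colon into freshly allocated strings.
 * Returns 0 on success, -1 if either part is missing or on allocation
 * failure. The caller frees out->key and out->val.
 */
int parse_ext_mount_arg(const char *arg, struct ext_mount_arg *out)
{
	const char *colon = strchr(arg, ':');
	size_t klen, vlen;

	if (!colon || colon == arg || colon[1] == '\0')
		return -1;

	klen = (size_t)(colon - arg);
	vlen = strlen(colon + 1);

	out->key = malloc(klen + 1);
	out->val = malloc(vlen + 1);
	if (!out->key || !out->val) {
		free(out->key);
		free(out->val);
		return -1;
	}
	memcpy(out->key, arg, klen);
	out->key[klen] = '\0';
	memcpy(out->val, colon + 1, vlen + 1); /* includes the terminator */
	return 0;
}
```

On dump the key would be the mountpoint and the value the string stored in the image; on restore the key would be that stored string and the value the path to bind mount.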

v3:
* Added documentation
* Added RPC bits
* Changed option name into --ext-mount-map
* Use colon as key and value separator
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

c7e00429

mnt: Tossing bits around in validate_mounts · c3ea0ba0

Pavel Emelyanov authored Jun 09, 2014

Just for simpler further patching.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

c3ea0ba0

11 Jun, 2014 1 commit

Allow dumping of pstore, securityfs, fusectl, debugfs · 43c96be7

Tycho Andersen authored Jun 10, 2014

These are mounted by default in ubuntu containers, so criu should know about
them and remount them on restore.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

43c96be7

10 Jun, 2014 2 commits

fs: Opening FE-s after fchdir doesn't work · 72a9372a

Pavel Emelyanov authored Jun 09, 2014

It uses absolute file names, so any open-s should happen _before_
we change tasks' root.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>

72a9372a

fs: Don't hide error from prepare_fs · 7aa7e95f

Pavel Emelyanov authored Jun 09, 2014

If fchroot() succeeds, further failures are not
noticed by the caller.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>

7aa7e95f

09 Jun, 2014 6 commits

zdtm: Add test for mount namespace w/o mountpoints · bd7bddb8

Pavel Emelyanov authored Jun 05, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>

bd7bddb8

restore: Do fchroot() via proc helpers · 701f8837

Pavel Emelyanov authored Jun 05, 2014

There's no such thing as fchroot() in Linux, but we need to do
chroot() into existing file descriptor. Before this patch we did
this by chroot()-ing into /proc/self/fd/$fd. W/o proc mounted it's
no longer possible, so do this like

fchdir(proc_service_fd);
chroot("./self/fd/$root_fd");
fchdir($cwd_fd);

Thanks to Andrey Vagin for this trick ;)
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>

701f8837

restore: Open /proc/sys/kernel/ns_last_pid via helpers · 3659d60a

Pavel Emelyanov authored Jun 05, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>

3659d60a

restore: Open /proc/self/maps via helpers · 0066d5e8

Pavel Emelyanov authored Jun 05, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>

0066d5e8

util: Prepare proc opening helpers to open any files · 8644ce96

Pavel Emelyanov authored Jun 05, 2014

We have a set of routines that open /proc/$pid files via proc service
descriptor. Teach them to accept non-pids as pids to open /proc/self/*
and /proc/* files via the same engine.
Acked-by: Andrew Vagin <avagin@parallels.com>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
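The idea of accepting non-pids as pids can be sketched with openat() relative to a procfs directory descriptor. The constants `SKETCH_PROC_SELF` and `SKETCH_PROC_GEN` and the function `open_proc_rel()` are hypothetical names for this illustration and do not reproduce criu's helpers.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define SKETCH_PROC_SELF (-1) /* hypothetical: open /proc/self/<name> */
#define SKETCH_PROC_GEN  (-2) /* hypothetical: open /proc/<name> */

/*
 * Open a procfs file relative to proc_fd, a descriptor for the procfs
 * root. A real pid selects <pid>/<name>; the special values above
 * select self/<name> or a top-level file.
 */
int open_proc_rel(int proc_fd, int pid, const char *name, int flags)
{
	char path[128];
	int n;

	if (pid == SKETCH_PROC_SELF)
		n = snprintf(path, sizeof(path), "self/%s", name);
	else if (pid == SKETCH_PROC_GEN)
		n = snprintf(path, sizeof(path), "%s", name);
	else
		n = snprintf(path, sizeof(path), "%d/%s", pid, name);

	if (n < 0 || (size_t)n >= sizeof(path))
		return -1; /* formatting error or truncated path */

	return openat(proc_fd, path, flags);
}
```

Because every lookup goes through the same descriptor, the caller does not depend on procfs being mounted at /proc in its current mount namespace.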

8644ce96

zdtm: Add ability just to start the test · d9e7a5f1

Pavel Emelyanov authored Jun 05, 2014

When running a test with the ns/ prefix, zdtm.sh does complex preparations.
Make it possible to perform them and leave the started process ready for
manual investigation.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>

d9e7a5f1

06 Jun, 2014 5 commits

vdso: x86 -- Use dynamic symbols for parsing · c09b7c2f

Cyrill Gorcunov authored May 28, 2014

New vDSO are in stripped format so use dynamic
symbols instead of sectioned ones.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

c09b7c2f

vdso: x86 -- Drop DECLARE_VDSO macro · 3ca8b12e

Cyrill Gorcunov authored May 28, 2014

We're not sharing the code anymore so drop it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

3ca8b12e

files: Fix open_path() to provide mntns root fd to callbacks · 8a073493

Pavel Emelyanov authored Jun 06, 2014

This fixes the support for fifo-s in mount namespaces and
makes it easier to control the correct open_path() usage in
the future.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

8a073493

mnt: Strip commas from options string · b9c6cf3d

Pavel Emelyanov authored Jun 06, 2014

Not all filesystems like it. Other than this options in the
image just look cleaner.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

b9c6cf3d

zdtm: Make it possible for test to get ZDTM_NEWNS variable · 0457c94c

Pavel Emelyanov authored Jun 06, 2014

I will need to make cgroup test behave slightly differently
when it's in and out of ns/ run. To do so it's handy to use
the ZDTM_NEWNS variable set by zdtm.sh
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

0457c94c

04 Jun, 2014 2 commits

mnt: Fix validation of dumpable mountpoints · 30e95be2

Pavel Emelyanov authored May 30, 2014

This patch consists of 3 unsplittable (from my POV) fixes.

1. Remove messy check from dump_one_mountpoint() -- we have
   validate_mounts to check whether we can dump the tree
   or not.

2. Other than being in the wrong place the mentioned check
   is wrong. Comparing of the length of the mp->source-s
   makes no sense -- it should be mp->root, but even this
   would be wrong...

3. ... instead, we should check for bind mount root path
   being accessible from the target mount root path, i.e.
   the bind->root should start with src->root.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>
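The check from point 3 amounts to a path prefix test. The sketch below uses a hypothetical function, `root_is_under()`, and additionally requires the match to end on a path component boundary, which is an assumption added here so that "/ab" is not treated as living under "/a".

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if bind_root equals src_root or lies below it, i.e.
 * bind_root starts with src_root and the match ends at a '/' or at the
 * end of the string.
 */
bool root_is_under(const char *bind_root, const char *src_root)
{
	size_t len = strlen(src_root);

	if (strncmp(bind_root, src_root, len) != 0)
		return false;
	if (len > 0 && src_root[len - 1] == '/')
		return true; /* src_root such as "/" already ends a component */
	return bind_root[len] == '\0' || bind_root[len] == '/';
}
```

Comparing string lengths alone, as the removed check did, says nothing about whether one root is reachable from the other, which is why a prefix comparison is used instead.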

30e95be2

mnt: Relax checks for top-mount in validate_mounts · 3635f2c4

Pavel Emelyanov authored May 30, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Andrew Vagin <avagin@parallels.com>

3635f2c4

02 Jun, 2014 1 commit

mnt: Devpts options get corrupted on dump (v2) · c75b7ab6

Pavel Emelyanov authored May 30, 2014

The memcpy() in devpts_dump() just overwrites part of them.
Fix this and move the whole code into sub-routine for future.

v2: Fix off-by-one error spotted by Filipe.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Filipe Brandenburger <filbranden@google.com>

c75b7ab6

27 May, 2014 5 commits

vdso: make -- Arch targets depends on config · 06f559fc

Cyrill Gorcunov authored May 26, 2014

We use config.h in vDSO handling code so arch
targets should depend on it.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

06f559fc

zdtm: Stupid test for task-in-cgroup · 441b9b9e

Pavel Emelyanov authored May 08, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

441b9b9e

cg: Restore tasks into proper cgroups · 203c2914

Pavel Emelyanov authored May 08, 2014

On restore find out which cgroup sets tasks live in and move
them there.

Optimization note -- move tasks into cgroups _before_ fork
kids to make them inherit cgroups if required. This saves
a lot of time.

Accessibility note -- when moving tasks into cgroups don't
search for existing host mounts (they may be not available)
and don't mount temporary ones (may be impossible due to
user namespaces). Instead introduce service fd with a yard
of mounts.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

203c2914

cg: Dump cgroups tasks live in · 1ba9d2ca

Pavel Emelyanov authored May 08, 2014

Each task points to a single ID of the cgroup set it lives in. This
is done to save some space in the image, as tasks likely
live in the same set of cgroups.

Other than this we keep track of what cgroup set we dump the
subtree from. If the root task happens to live in the same
cgroup set as criu does, we don't allow any other sub-cgroups,
which makes restore (next patch) much simpler and faster.
Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

1ba9d2ca

cg: Skeleton for cgroup code · 8b8eb53a

Pavel Emelyanov authored May 08, 2014

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>

8b8eb53a