Commits · d7325fc6b7edbe742e4e21e12a38a840200b506c · zhul / criu

11 Aug, 2016 2 commits

COPYING: fix a typo in a preamble · d7325fc6

Kir Kolyshkin authored Aug 04, 2016

Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

d7325fc6

prepare_pstree: fixup reading kernel pid_max · cb58aa84

Kir Kolyshkin authored Aug 04, 2016

Two fixes (reported by coverity) and a minor nitpick:

1. Fix checking error from open_proc().

2. Fix buffer overflow. ULONG_MAX can be 20 characters long, so
ret = read() can return 20 and buf[ret] = 0 will overrun buf.
Make buf one character longer (an extra byte for \0) and pass
sizeof(buf) - 1 to read() to fix it.

3. Call close() right after read().

This is a fixup to commit e68bded.

Reported by Coverity, CID 168505, 168504.
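
The three fixes above can be sketched roughly as follows. This is a minimal illustration, not the actual prepare_pstree code: the helper name `read_ulong_file` and the 32-byte buffer are assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read an unsigned decimal value from a small text file. */
static int read_ulong_file(const char *path, unsigned long *val)
{
	char buf[32];	/* larger than 20 digits plus the trailing '\0' */
	char *end;
	ssize_t ret;
	int fd;

	assert(path && val);

	fd = open(path, O_RDONLY);
	if (fd < 0)	/* a valid descriptor is >= 0, so test for "< 0" */
		return -1;

	/* Leave one byte free so that buf[ret] is always inside the array. */
	ret = read(fd, buf, sizeof(buf) - 1);
	close(fd);	/* close right after read(), on every path */
	if (ret <= 0)
		return -1;
	buf[ret] = '\0';

	errno = 0;
	*val = strtoul(buf, &end, 10);
	if (errno || end == buf)
		return -1;
	return 0;
}
```

Something like `read_ulong_file("/proc/sys/kernel/pid_max", &pid_max)` would be a typical call on Linux.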

Cc: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Kir Kolyshkin <kir@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

cb58aa84

08 Aug, 2016 11 commits

seize: Wait the freezer to complete before processing tags · c44683c1

Cyrill Gorcunov authored Aug 02, 2016

Currently, when we use the cgroup freezer to seize tasks, we start the
freezer and then, without waiting for the transition to complete, seize
the tasks read from the freezer's @tasks file using fgets.

This is a fragile construction: fgets uses an internal buffer, and the
tasks we've read might be exiting at the same time as we're freezing them.
The kernel won't freeze these exiting tasks because they are dying
anyway, and I fear we might read a pid here which is no longer in
our cgroup but has been reused by another task outside the cgroup.

Thus let's do the following: iterate, waiting for the freezer to change
its state, and only then collect/seize all tasks in one pass.

For example, on the container I'm playing with it takes just one iteration:

 | (00.013690) cg: Set 1 is criu one
 | (00.013705) freezing processes: 1800000 attempst with 100 ms steps
 | (00.013720) freezer.state=THAWED
 | (00.013795) freezer.state=FREEZING
 | (00.113962) freezer.state=FROZEN
 | (00.113990) freezing processes: 1 attempts done
 | (00.114073) SEIZE 240893 (comm systemd): success
 | (00.114110) Warn  (ptrace.c:121): Unable to interrupt task: 240905 (comm kthreadd/1) (Operation not permitted)
 | (00.114136) Warn  (ptrace.c:121): Unable to interrupt task: 240906 (comm khelper) (Operation not permitted)
 | (00.114155) SEIZE 240969 (comm screen): success
 | (00.114166) SEIZE 240970 (comm sendmail): success
 | (00.114179) SEIZE 240971 (comm sendmail): success
 | (00.114189) SEIZE 240972 (comm saslauthd): success
 | (00.114202) SEIZE 240973 (comm crond): success
 | (00.114211) SEIZE 240974 (comm agetty): success
 | (00.114221) SEIZE 240975 (comm agetty): success
 | ...
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Acked-by: Andrew Vagin <avagin@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

c44683c1

seize: Take --timeout option into account when freezing processes · 9fae23fb

Cyrill Gorcunov authored Aug 02, 2016

When we're freezing processes we don't take any setting into account
but simply make 5 attempts with a constantly increasing delay.

Let's rather do the following:

 - take the --timeout option into account (which is 5 seconds
   by default) and split it into 100 ms steps;

 - when freezing processes, check the freezer status every 100 ms.

At the same time, it looks like 5 seconds by default is too small
for highly loaded containers. Let's increase it to 10 seconds
by default.
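
A minimal sketch of the fixed-step polling described above, assuming a readable state file whose first line is THAWED, FREEZING or FROZEN. The helper name `wait_frozen` and its parameters are hypothetical; this is not the code from seize.c, and it does not model how criu itself interprets --timeout 0.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Hypothetical helper: poll a freezer.state-like file until it reads
 * "FROZEN". Returns the number of checks performed, or -1 on timeout
 * or if the file cannot be opened.
 */
static int wait_frozen(const char *state_path, unsigned timeout_ms, unsigned step_ms)
{
	struct timespec step = {
		.tv_sec = step_ms / 1000,
		.tv_nsec = (long)(step_ms % 1000) * 1000000L,
	};
	unsigned attempts, i;
	char state[32];

	assert(state_path && step_ms > 0);
	attempts = timeout_ms / step_ms;	/* the timeout split into fixed steps */

	/* One check up front, plus one check after each sleep step. */
	for (i = 0; i <= attempts; i++) {
		FILE *f = fopen(state_path, "r");

		if (!f)
			return -1;
		if (!fgets(state, sizeof(state), f))
			state[0] = '\0';
		fclose(f);

		if (strncmp(state, "FROZEN", 6) == 0)
			return (int)i + 1;
		if (i < attempts)
			nanosleep(&step, NULL);
	}
	return -1;
}
```

With a constant step, the time lost after the state flips is bounded by one step, whereas a growing step can overshoot by the size of the last (largest) interval.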

[ skinsbursky@:
Frankly speaking, in this particular case increasing intervals are not nice.
This is not a network issue or something.
Usually freezing takes less than a second, but more than, say, 200 ms.
Otherwise it takes quite a lot of time.
If the step size is growing all the time, in most cases criu will
waste hundreds of milliseconds between iterX (say, the third) and (iterX+1)
because of the growing step size.
A 100 ms step looks solid enough: not so small as to produce a lot of
syscalls and not so large as to waste a lot of time.
With the previous scheme freezing was usually taking half a second more than
it should because of this growing step. ]

[ gorcunov@:
You won't believe it, but being able to specify --timeout 0 here allowed
me and Stas to catch a serious problem in the freezer code.

https://lkml.org/lkml/2016/8/3/317

Without this feature we would have had to patch criu instead. So you know,
it would be great to keep it for catching more kernel bugs ;) ]
Reported-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

9fae23fb

log: Print version on startup · 5a43e55e

Cyrill Gorcunov authored Aug 02, 2016

For debug sake.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

5a43e55e

files: don't create a transport socket for each file · e46ba886

Andrew Vagin authored Jul 29, 2016

This is a unix dgram socket which doesn't have an address and
isn't connected anywhere, so we can use one socket for all processes.
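
As a rough illustration of the idea (one unbound, unconnected socket reused by every caller), assuming a hypothetical accessor `get_transport_sk`; the real code in criu is organized differently:

```c
#include <assert.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: lazily create one unbound, unconnected unix dgram
 * socket and hand out the same descriptor on every call.
 */
static int get_transport_sk(void)
{
	static int sk = -1;

	if (sk < 0)
		sk = socket(AF_UNIX, SOCK_DGRAM, 0);
	return sk;	/* -1 if socket() failed; the next call retries */
}
```

Because the socket has no address and no peer, nothing about it is specific to a particular file, which is why sharing it is safe.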

v2: return non-zero code in error cases
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

e46ba886

zdtm.py: check for link remap files presence on test end · 3e840917

Stanislav Kinsburskiy authored Aug 02, 2016

These files have to be removed after successful restore.

v2:
Check link remap files only for tests with "--link-remap" option in
descriptor.
Signed-off-by: Stanislav Kinsburskiy <skinsbursky@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

3e840917

namespaces: use fstatat instead of readlink to get a namespace kernel id · 2eb1dde6

Andrew Vagin authored Jul 30, 2016

It should be faster and we don't need to parse a string.
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

2eb1dde6

parasite: handle errors while a transport socket is being created · 64e74fab

Andrew Vagin authored Jul 28, 2016

Currently, if the socket() or connect() syscalls fail, the parasite cures
itself, but criu gets no notification and keeps waiting on accept().

This patch adds a futex to synchronize the parasite and criu. The server
socket is created with SOCK_NONBLOCK, and criu waits on the futex until the
parasite connects to it; only then does criu call accept(), which returns
immediately.
Reported-by: Yohei Kamitsukasa <uhoidx@gmail.com>
Cc: Yohei Kamitsukasa <uhoidx@gmail.com>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Reviewed-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

64e74fab

tcp: dump and restore window parameters · 7d95face

Andrew Vagin authored Jul 13, 2016

We found that sometimes a restored tcp socket doesn't work.

The cause of this bug is incorrect window parameters: in this case
tcp_acceptable_seq() returns tcp_wnd_end(tp) instead of tp->snd_nxt. The
other side drops packets with this seq, because seq is less than
tp->rcv_nxt ( tcp_sequence() ).

We need to restore window parameters to avoid such side effects.

https://github.com/xemul/criu/issues/168
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

7d95face

kerndat: check the TCP_REPAIR_WINDOW option · f3b730d0

Andrew Vagin authored Jul 13, 2016

It's a new option to get/set window parameters.

v2: don't do this check for unprivileged users, because TCP_REPAIR
    is protected by CAP_NET_ADMIN.
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

f3b730d0

mount: resolve parent mount of symbolic link correctly · 488fc072

Katerina Koukiou authored Jul 25, 2016

When the --root option is used in criu dump and the mountpoint passed
contains a symbolic link, criu does not resolve its parent correctly.
For example, when --root /run/rootfs is passed, dump finishes successfully,
but when /var/run/rootfs is passed, where /var/run is a symbolic link to
/run, it exits with the error "New root and old root are the same".
Signed-off-by: Katerina Koukiou <k.koukiou@gmail.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

488fc072

proc_parce: Fix assignment of ns_mountpoint · 227ffd3e

Kirill Tkhai authored Jul 20, 2016

realloc() may move a memory chunk in case of shrink.
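
The usual safe pattern is to always take the pointer returned by realloc(), even when shrinking. A minimal sketch with a hypothetical helper `resize_buf`:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: resize *buf and keep the caller's pointer valid. */
static int resize_buf(char **buf, size_t new_size)
{
	char *tmp;

	assert(buf && new_size > 0);
	tmp = realloc(*buf, new_size);
	if (!tmp)
		return -1;	/* *buf is untouched and still owned by the caller */
	*buf = tmp;		/* the block may have moved, even when shrinking */
	return 0;
}
```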

v4: New
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

227ffd3e

05 Aug, 2016 1 commit

travis: Don't run --unshare tests · 9c7a234b

Pavel Emelyanov authored Aug 05, 2016

They are only available on dev branch
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

9c7a234b

02 Aug, 2016 1 commit

cr-exec: initialize kdat.{task_size, has_compat_sigreturn} on criu exec · da9315d8

Dmitry Safonov authored Jul 07, 2016

For `criu exec` we are searching for a place for syscall injection.
While searching for a VMA with PROT_EXEC and with needed size,
we check that VMA is lower than task_size.
The callpath for it is:
cr_exec => parasite_prep_ctl => get_vma_by_ip

At first I thought of skipping the kdat.task_size check if it's not initialized:
> if (vma_area->e->start >= kdat.task_size && kdat.task_size)
but I think that's more of a hack than a proper solution.
Besides, this code could still choose a VMA above task_size on ARM
and try to inject the syscall there (IIRC, ARM has a kernel-mapped
VMA in that area).

So, let's init kdat.task_size for `criu exec`.
Also let's init kdat.has_compat_sigreturn so we can exec into
compatible applications.

Cc: Christopher Covington <cov@codeaurora.org>
Cc: Andrew Vagin <avagin@virtuozzo.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Reviewed-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

da9315d8

01 Aug, 2016 25 commits

build/nmk: declare build-as as a recursive · 3693c5e6

Dmitry Safonov authored Jul 18, 2016

So, how it used to work:
1. build-as was declared with $$(1) and $$(2), which were expanded
on entering the submake;
2. the function $(call build-as,...) performed the second expansion of
build-as.

Cons: build-as works only in a sub-makefile, not in a sub-sub-makefile or
the upper/top makefile.

Simplify this by using a single $(1).
Then the build-as variable will be used _only_ in the makefile, not in
sub-makefiles.

This is fine for now, as each file that calls $(MAKE) with
$(build)=dir or $(call build-as,makefile,dir) will include main.mk
from NMK, which has the build-as definition (from include.mk).

In the future, we'll get rid of the $(build) and $(build-as) workarounds
as we finally switch to building from a global makefile.

Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

3693c5e6

build/make: return to make from top directory · 7a360484

Dmitry Safonov authored Jul 18, 2016

It looks like there is not much that needs to be fixed for
building criu from the top directory.
After the patch it's possible to do e.g. `make criu/mount.o`.
It will build protobuf and compel as dependencies (if they are not built),
but no other criu objects. If something breaks, you can
run make from vim and jump to the error. Nice.
Mostly the patch corrects paths to objects - I tried to make them
depend on $(obj) or $(SRC_DIR)/criu where possible.

Tested afterwards:
`make -j 10`, `make criu/log.o`, `make clean`, `make mrproper`,
`make install DESTDIR=/tmp/criu`, `make uninstall DESTDIR=/tmp/criu`

Note: I improperly called v1 of this patch "return to make from
top Makefile" -- but I didn't mean that (and it was Friday ;)

This patch doesn't yet switch to top-Makefile building, but it's
a step in that direction (building from a top Makefile needs correct paths
in makefiles), and it also adds the ability to build objects in
subdirectories, etc.

Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

7a360484

restore: handle the case where zombies are reparented · a9a62403

Tycho Andersen authored Jul 22, 2016

For example, if a zombie has a helper that sets up its session id, the
zombie will be reparented to the init task, which will then potentially get
a SIGCHLD for a task which isn't its direct child zombie, which we didn't
handle. Instead, let's find all the zombies for the init task, in case they
get reparented this way.

v2: only the zombies need to be recursively collected, helpers wait on
    their children before they exit and will never be reparented
v4: the root task waits all zombies
Reported-by: Tycho Andersen <tycho.andersen@canonical.com>
Cc: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

a9a62403

mount: sysfs -- Use slave mounting for the root · 1727742f

Cyrill Gorcunov authored Jul 25, 2016

It seems this snippet escaped from commit
84bf1ad4
so we may get -EBUSY in open_detach_mount.
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andrey Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

1727742f

util-vdso: correct vdso symbol's length · ceddac39

Dmitry Safonov authored Jul 22, 2016

VDSO_SYMBOL_MAX is max number of symbols, not their max length.
Fixes my buggy commit: 4c69339c ("string.h/pie: use builtin strncmp
instead of strcmp"). Sorry for that bogus misprinting.

Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

ceddac39

pstree: bump kernel pid_max value if needed · 42cc04fc

Laurent Dufour authored Jul 22, 2016

When restoring on a different node, it may happen that pid_max is
below one of the pids we want to recreate.
This leads to a restore error when cloning the restarted process:

(00.011172) Forking task with 44794 pid (flags 0x0)
(00.011205) Error (cr-restore.c:1008): 44794: Write 44793 to sys/kernel/ns_last_pid: Invalid argument

This patch computes the largest pid value and sets the kernel pid_max if
necessary.

If the user doesn't have permission to do so, the restart
fails, reporting that we can't raise the pid_max limit.
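
The computation can be sketched as below. The helper name `needed_pid_max` and the flat pid array are assumptions for illustration; the actual patch walks criu's process tree. Valid pids are strictly below pid_max, hence the "+ 1".

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * Hypothetical helper: return the pid_max value needed to recreate all
 * pids, or 0 if cur_pid_max is already large enough.
 */
static long needed_pid_max(const pid_t *pids, size_t nr, long cur_pid_max)
{
	long max_pid = 0;
	size_t i;

	assert(pids || nr == 0);
	for (i = 0; i < nr; i++)
		if (pids[i] > max_pid)
			max_pid = pids[i];

	return max_pid >= cur_pid_max ? max_pid + 1 : 0;
}
```

For the log above, a task with pid 44794 on a node whose pid_max is 32768 would require raising pid_max to at least 44795.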
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

42cc04fc

criu: page-xfer: add helper function for dumping holes · e673c7d6

Mike Rapoport authored Jul 14, 2016

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

e673c7d6

cgroup: drop useless name variable in collect_cgroups · 64725769

Dmitry Safonov authored Jul 15, 2016

It looks like it's not needed here at all.

criu/cgroup.c:582:4: warning: Value stored to 'name' is never read
                        name = cc->name + 5;
                        ^      ~~~~~~~~~~~~
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

64725769

unix: handle unlink failure correctly · 40a37ae7

Tycho Andersen authored Jul 19, 2016

Instead of returning, we should revert the cwd as in all the other error
paths.

*** CID 164720:  Resource leaks  (RESOURCE_LEAK)
/criu/sk-unix.c: 1030 in bind_unix_sk()
1024                                    goto done;
1025                            }
1026                    }
1027
1028                    if (ui->ue->deleted && unlink((char *)ui->ue->name.data) < 0) {
1029                            pr_perror("failed to unlink %s\n", ui->ue->name.data);
>>> >>>     CID 164720:  Resource leaks  (RESOURCE_LEAK)
>>> >>>     Handle variable "cwd_fd" going out of scope leaks the handle.
1030                            return -1;
1031                    }
1032            }
1033
1034            if (ui->ue->state != TCP_LISTEN)
1035                    futex_set_and_wake(&ui->prepared, 1);
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
CC: Andrew Vagin <avagin@virtuozzo.com>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

40a37ae7

Fix minor printf format · 69c69601

Laurent Dufour authored Jul 19, 2016

In cr-restore printf() format is mixing "%p" and the prefix "0x" which is
already managed by "%p". This leads to log lines like:

(00.053282)  38744: Found bootstrap VMA hint at: 0x0x100000 (needs ~576K)
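
A small sketch of the corrected formatting, with a hypothetical helper `format_addr`. The output of "%p" is implementation-defined; on glibc it already includes the "0x" prefix for non-null pointers, so the format string should not add another.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: format an address for a log line. */
static int format_addr(char *buf, size_t len, void *addr)
{
	assert(buf && len > 0);
	/* No literal "0x" in the format: %p supplies its own prefix on glibc. */
	return snprintf(buf, len, "%p", addr);
}
```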
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

69c69601

namespaces: don't fork to skip cgroup namespace · 66f7d2f7

Tycho Andersen authored Jul 19, 2016

Instead, let's skip it before we fork.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

66f7d2f7

restore: fix race with helpers' kids dying too early · ced9c529

Tycho Andersen authored Jul 13, 2016

We masked off SIGCHLD in wait_on_helpers_zombies(), but in fact this is too
late: zombies can die any time after CR_STATE_RESTORE before this function
is called, which leads to us getting "unexpected" deaths. Instead, we should
mask off SIGCHLD before the helpers finish CR_STATE_RESTORE, since they're
explicitly going to wait on all their kids to make sure they don't die
anyway.
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

ced9c529

tests: add a test for the case when there is a helper with a zombie child · dfe5f4e5

Tycho Andersen authored Jul 13, 2016

v2: drop /bin/ps from test deps
v3: wait for the zombie to make sure it exits
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

dfe5f4e5

tests: only wait for the pid we spawned · d5bee200

Tycho Andersen authored Jul 13, 2016

In the next patch, we'll introduce an option to allow for leaving zombie
processes in the pid ns for the test so that we can test the behavior of
zombies. Let's not reap everything after restore, since we'll reap the
restored zombies as well.

v2: restore the old behavior when in reap mode

CC: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

d5bee200

tests: add a ZDTM_NOREAP variable · b3c42b28

Tycho Andersen authored Jul 13, 2016

We'll use this variable in the next test to make sure the test suite
doesn't accidentally reap the zombie we want to leave around for the actual
test.

This is kind of ugly and there might be a better way to pass information to
the test's init, I'm open for suggestions :)

CC: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Tycho Andersen <tycho.andersen@canonical.com>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

b3c42b28

zdtm.py: check permissions for memory mappings · 67960bbd

Andrew Vagin authored Jun 27, 2016

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

67960bbd

sysvshm: Don't mprotect segments with PROT_EXEC · 794ad7bc

Pavel Emelyanov authored Jul 16, 2016

When fixing mprotected (ro) sysvshmems I used the PROT_EXEC flag
to keep the information about whether the segment itself should
be rw or ro. This flag leaked to sys_mprotect and some attachments
of the segment became executable after restore.

Fix this by dropping the EXEC flag.

https://github.com/xemul/criu/issues/180
Reported-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Andrew Vagin <avagin@virtuozzo.com>

794ad7bc

restore: don't check tcore->thread_core · e6cf4061

Andrew Vagin authored Jul 18, 2016

It is never NULL in sigreturn_restore().

CID 164716 (#1 of 1): Dereference after null check (FORWARD_NULL)
64. var_deref_model: Passing tcore to construct_sigframe, which dereferences null tcore->thread_core.
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

e6cf4061

make: do not rm version header in criu make · dde7fae6

Dmitry Safonov authored Jul 18, 2016

It's generated and cleaned in the top Makefile.
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

dde7fae6

make: add Makefile.packages and simplify it · 8c3da46a

Dmitry Safonov authored Jul 18, 2016

I think we can simplify criu's makefile by moving the package
checks out into a separate makefile.
Now we only need to make criu's target depend on 'check-packages'.

Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

8c3da46a

make: simplify checking of installed libraries · 98b136f6

Dmitry Safonov authored Jul 18, 2016

Impact: use /dev/null as $(CC) output, drop temporary file.

Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Dmitry Safonov <dsafonov@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

98b136f6

seize: fix memory corruption · d13be2f8

Andrew Vagin authored Jul 18, 2016

277                     }
>>> >>>     CID 164718:  Memory - corruptions  (OVERRUN)
>>> >>>     Overrunning array "stackbuf" of 2048 bytes at byte offset 2048 using index "ret" (which evaluates to 2048).
278                     stackbuf[ret] = '\0';
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

d13be2f8

seize: don't leak a file descriptor · 3de0f49a

Andrew Vagin authored Jul 18, 2016

267                     if (stack < 0) {
268                             pr_perror("couldn't log %d's stack", pid);
>>> >>>     CID 164721:  Resource leaks  (RESOURCE_LEAK)
>>> >>>     Variable "f" going out of scope leaks the storage it points to.
269                             return -1;
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

3de0f49a

criu: check that ghost files are cleaned up in error cases · 93e92223

Andrew Vagin authored Jul 15, 2016

Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

93e92223

mnt: clean up a root yard after opening all files · 10af342f

Andrew Vagin authored Jul 15, 2016

The root yard is used to clean up ghost files.

Now try_clean_remaps() is called from depopulate_roots_yard(), so
the code about switching mount namespaces was moved to
depopulate_roots_yard().

v2: call clean_remaps() when processes are restored in
    the host mount namespace.

Now depopulate_roots_yard() is called from the root task before
finishing CR_STATE_FORKING.

I moved it to the criu process and do it after clean_remaps(), because
clean_remaps() uses the roots yard.

It's called after opening all files, because only at this moment can we
be sure that all link remap files can be removed.

restore_task_with_children()		| restore_root_task()
-----------------------------------------------------------------------
depopulate_roots_yard()			|
restore_finish_stage(CR_STATE_FORKING)	|
prepare_fds()				|
open_vmas()				|
					| restore_switch_stage(CR_STATE_RESTORE_SIGCHLD)
					| clean_remaps = 0;

If something fails between CR_STATE_FORKING and CR_STATE_RESTORE_SIGCHLD,
try_clean_remaps() will be called.

try_clean_remaps()
  try_clean_ghost()
    rst_get_mnt_root()
      print_ns_root()
	snprintf(buf, bs, "%s/%d", mnt_roots, ns->id);

It uses mnt_roots, which is what we call the roots yard.
Signed-off-by: Andrew Vagin <avagin@virtuozzo.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>

10af342f