- 16 Sep, 2017 40 commits
-
-
Mike Rapoport authored
Restore of a zombie process does not call setup_uffd, which causes the lazy-pages daemon to get stuck forever waiting for a (pid, uffd) pair to arrive. Let's extend the protocol between restore and lazy-pages so that for a zombie process a (0, -1) pair will be sent instead of the actual (uffd, pid). travis-ci: success for lazy-pages: misc fixes (rev4) Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
travis-ci: success for lazy-pages: misc fixes (rev4) Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
To properly handle zombie processes we will need to distinguish failures coming from socket communication from an absent userfault file descriptor. travis-ci: success for lazy-pages: misc fixes (rev4) Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
When a VMA is mapped with MAP_LOCKED, its address space is populated with pages, which causes UFFDIO_COPY to return -EEXIST. Until we can find a better solution, let's avoid marking VMAs with MAP_LOCKED as lazy. Fixes: #238 travis-ci: success for lazy-pages: misc fixes (rev3) Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
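
The rule in the commit above can be illustrated with a small predicate. This is a sketch only: the function name is hypothetical, the flag values are the ones used on common Linux architectures, and the extra requirement that a lazy VMA be private and anonymous is an assumption that the message itself does not state.

```python
# Linux mmap flag values on common architectures (assumed for illustration).
MAP_PRIVATE = 0x02
MAP_ANONYMOUS = 0x20
MAP_LOCKED = 0x2000


def vma_can_be_lazy(flags: int) -> bool:
    """Hypothetical check: may a VMA with these mmap flags be restored lazily?"""
    if flags & MAP_LOCKED:
        # A locked VMA is populated up front, so a later UFFDIO_COPY
        # into it would fail because the pages already exist.
        return False
    # Assumption: only private anonymous memory is a lazy candidate.
    return bool(flags & MAP_PRIVATE) and bool(flags & MAP_ANONYMOUS)
```

With this predicate a private anonymous mapping is lazy, and the same mapping with MAP_LOCKED added is not.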
-
Mike Rapoport authored
Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Adrian Reber authored
# criu dump --display-stats -D /tmp/cp -t <PID>
Displaying dump stats:
...
Zero memory pages: 0 (0x0)
Lazy memory pages: 0 (0x0)
travis-ci: success for Added option to display dump/restore stats (rev2) Signed-off-by:
Adrian Reber <areber@redhat.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Adrian Reber authored
Only the UFFD daemon is aware of whether pages are in the parent or not. The restore will continue to work like any lazy restore, except that pages from parent checkpoints will be pre-populated by the restorer. The restorer will still register the whole memory region as being handled by userfaultfd even if it contains pages from parent checkpoints. Userfaultfd page faults will only happen on pages which contain no data. This means that pages pre-populated from the parent will not trigger a userfaultfd message even if they are marked as being handled by userfaultfd. The UFFD daemon knows about pages which are available in the parent checkpoints and will not push those pages unnecessarily to userfaultfd.

The following steps to migrate a process are now possible:

Source system:
* criu pre-dump -D /tmp/cp/1 -t <PID>
* rsync -a /tmp/cp <destination>:/tmp
* criu dump -D /tmp/cp/2 -t <PID> --port 27 --lazy-pages \
  --prev-images-dir ../1/ --track-mem

Destination system:
* rsync -a <source>:/tmp/cp /tmp/
* criu lazy-pages --page-server --address <source> --port 27 \
  -D /tmp/cp/2 &
* criu restore --lazy-pages -D /tmp/cp/2

This will now restore all pages from the parent checkpoint if they are not marked as lazy in the second checkpoint.

v2:
- changed parent detection to use pagemap_in_parent()
v3:
- unfortunately this reverts c11cf95afbe023a2816a3afaecb65cc4fee670d7 "criu: mem: skip lazy pages during restore based on pagemap info". To be able to split the VMA-s in the right chunks for the restorer it is necessary to make the decision lazy or not on the VmaEntry level.
v4:
- everything has changed thanks to Mike Rapoport's suggestion
- the VMA-s are no longer touched or split
- instead of over 100 lines of changes this is now a two-line patch

Signed-off-by:
Adrian Reber <areber@redhat.com> Acked-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Adrian Reber authored
Combining pre-copy (pre-dump) and post-copy (lazy-pages) mode showed a problem in the function page_pipe_split_ppb(). The function is used to split the page-pipe-buffer so that it only contains the IOVs requested by the restore side during lazy restore. Unfortunately it only splits the leading IOVs out of the page-pipe-buffer and not the trailing ones:

Before split for requested address 0x7f27284d1000:
page-pipe: ppb->iov 0x7f0f74d93040
page-pipe: 0x7f27282bb000 1
page-pipe: 0x7f27284d1000 1
page-pipe: 0x7f27284dd000 2

After split:
page-pipe: ppb->iov 0x7f0f74d93050
page-pipe: 0x7f27284d1000 1
page-pipe: 0x7f27284dd000 2
and:
page-pipe: ppb->iov 0x7f0f74d93040
page-pipe: 0x7f27282bb000 1

This patch keeps on splitting the page-pipe-buffer until it contains only the requested address with the requested length.

After split (still trying to load 0x7f27284d1000):
page-pipe: ppb->iov 0x7f0f74d93050
page-pipe: 0x7f27284d1000 1
and:
page-pipe: ppb->iov 0x7f0f74d93040
page-pipe: 0x7f27282bb000 1
and:
page-pipe: ppb->iov 0x7f0f74d93060
page-pipe: 0x7f27284dd000 2

v2:
- moved declarations to the declaration block

Signed-off-by:
Adrian Reber <areber@redhat.com> Acked-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
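
The splitting behaviour described in the commit above can be modelled on plain (address, pages) tuples. This is a rough sketch, not the page_pipe_split_ppb() implementation: the real code works on pipe buffers and iovec arrays, while the function name, the tuple representation and the 4 KiB page size here are assumptions made for illustration.

```python
PAGE_SIZE = 4096  # assumed page size


def split_iovs(iovs, addr, nr_pages):
    """Split (start, pages) IOVs into (leading, requested, trailing) lists.

    Only the IOV parts inside [addr, addr + nr_pages * PAGE_SIZE) end up
    in `requested`; everything before goes to `leading`, everything after
    goes to `trailing`.
    """
    end = addr + nr_pages * PAGE_SIZE
    leading, requested, trailing = [], [], []
    for start, pages in iovs:
        iov_end = start + pages * PAGE_SIZE
        # Part of this IOV that lies before the requested range.
        if start < addr:
            cut = min(iov_end, addr)
            leading.append((start, (cut - start) // PAGE_SIZE))
        # Part of this IOV that overlaps the requested range.
        lo, hi = max(start, addr), min(iov_end, end)
        if lo < hi:
            requested.append((lo, (hi - lo) // PAGE_SIZE))
        # Part of this IOV that lies after the requested range.
        if iov_end > end:
            cut = max(start, end)
            trailing.append((cut, (iov_end - cut) // PAGE_SIZE))
    return leading, requested, trailing


# The three IOVs from the commit message, requesting one page at 0x7f27284d1000.
iovs = [(0x7F27282BB000, 1), (0x7F27284D1000, 1), (0x7F27284DD000, 2)]
leading, requested, trailing = split_iovs(iovs, 0x7F27284D1000, 1)
```

For the addresses in the message this yields one leading IOV, the single requested page, and one trailing IOV of two pages, matching the three buffers shown after the fixed split.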
-
Mike Rapoport authored
Currently, potentially lazy pages are not counted as written even if they are dumped into pages*img. Count these pages as "pages_written" when the dump is not going to skip writing lazy pages to disk. travis-ci: success for criu: mem: count all pages actually written to image as "pages_written" Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Adrian Reber authored
This translates pagemap flags into strings for easier readability. Signed-off-by:
Adrian Reber <areber@redhat.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
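
A flag-to-string translation of the kind the commit above describes might look like the following sketch. PE_LAZY and PE_PRESENT are named elsewhere in this log; the PE_PARENT and PE_ZERO names, all bit values and the function name are assumptions chosen for illustration and may differ from the real definitions.

```python
# Bit values are illustrative assumptions, not taken from the CRIU headers.
PE_PARENT = 1 << 0   # assumed name: pages live in the parent snapshot
PE_ZERO = 1 << 1     # assumed name: range is backed by the zero page
PE_LAZY = 1 << 2     # range may be restored lazily
PE_PRESENT = 1 << 3  # pages are present in pages*img

_FLAG_NAMES = [
    (PE_PARENT, "PE_PARENT"),
    (PE_ZERO, "PE_ZERO"),
    (PE_LAZY, "PE_LAZY"),
    (PE_PRESENT, "PE_PRESENT"),
]


def pagemap_flags_to_str(flags: int) -> str:
    """Render a pagemap flags word as 'NAME | NAME', keeping unknown bits as hex."""
    names = [name for bit, name in _FLAG_NAMES if flags & bit]
    known = 0
    for bit, _ in _FLAG_NAMES:
        known |= bit
    unknown = flags & ~known
    if unknown:
        names.append(hex(unknown))
    return " | ".join(names) if names else "0"
```

Keeping unknown bits visible as hex avoids silently hiding flags that a newer image format might add.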
-
Pavel Emelyanov authored
Do the same here; the flags field is now enough to tell a hole from a pagemap. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Acked-by:
Mike Rapoport <rppt@linux.vnet.ibm.com>
-
Pavel Emelyanov authored
They are now the same, and the PE_PRESENT bit helps us distinguish holes from pagemaps having pages inside. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Acked-by:
Mike Rapoport <rppt@linux.vnet.ibm.com>
-
Pavel Emelyanov authored
The ->write_hole and ->write_pagemap now look very much alike, so let's merge them. This is a preparatory patch that makes the hole type decision based on PE_FOO flags. Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Acked-by:
Mike Rapoport <rppt@linux.vnet.ibm.com>
-
Mike Rapoport authored
Instead of checking for each page whether the VMA containing it can be lazy, skip entire parts of the pagemap that have the PE_LAZY flag set. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
The PE_PRESENT flag is always set for pagemap entries that have corresponding pages in the pages*img. Pagemap entries describing a hole, either with the zero page or with pages in the parent snapshot, will not have the PE_PRESENT flag set. A pagemap entry that may be lazily restored is a special case. For the lazy restore from disk case, both PE_LAZY and PE_PRESENT will be set in the pagemap, but for the remote lazy pages case only PE_LAZY will be set. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
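
The flag combinations described in the commit above can be summarised as a small decision function. This is only a sketch: the bit values and the function name are assumptions for illustration, and only the PE_LAZY and PE_PRESENT names come from the message.

```python
# Bit values are illustrative assumptions, not taken from the CRIU headers.
PE_LAZY = 1 << 1
PE_PRESENT = 1 << 2


def classify_pagemap_entry(flags: int) -> str:
    """Map the PE_LAZY / PE_PRESENT combination to the case it describes."""
    lazy = bool(flags & PE_LAZY)
    present = bool(flags & PE_PRESENT)
    if lazy and present:
        # Lazy restore from disk: the pages are in pages*img.
        return "lazy-from-disk"
    if lazy:
        # Remote lazy pages: the pages stay on the dump side.
        return "lazy-remote"
    if present:
        # Ordinary entry with pages in pages*img.
        return "present"
    # Neither flag: a hole (zero page or pages in the parent snapshot).
    return "hole"
```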
-
Mike Rapoport authored
With the 'zero' and 'lazy' booleans replaced by the flags field in PagemapEntry, page-xfer must be made aware of the change. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
Having three booleans in a pagemap entry calls for the use of good old flags. Replace the 'zero' and 'lazy' booleans with flags and use flags for internal tracking of the in_parent value. Eventually, in_parent may be deprecated. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
Create the socket early so that it will be available after restoring the namespaces. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
Very minimalistic at the moment, no remote pages and namespaces. Still better than nothing :) Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
Cc: Adrian Reber <areber@redhat.com> Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Eugene Batalov authored
We'll use it in anon shmem dedup so we need to have access to it in shmem.c Signed-off-by:
Eugene Batalov <eabatalov89@gmail.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
UNIX sockets do not like relative paths. Assuming both the lazy-pages daemon and restore use the same opts.work_dir, the full path of their working directory will be the same. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
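
The idea in the commit above, deriving one absolute socket path from a shared working directory, can be sketched as follows. The socket file name, the function name and the explicit length check are assumptions for illustration; 108 is the size of sun_path on Linux.

```python
import os

UNIX_PATH_MAX = 108                         # size of sun_path on Linux, incl. NUL
LAZY_PAGES_SOCK_NAME = "lazy-pages.socket"  # assumed socket file name


def lazy_pages_sock_path(work_dir: str) -> str:
    """Build an absolute socket path that daemon and restore can both derive."""
    # abspath() resolves relative components against the current directory,
    # so both sides get the same string when they share the work directory.
    path = os.path.join(os.path.abspath(work_dir), LAZY_PAGES_SOCK_NAME)
    if len(os.fsencode(path)) >= UNIX_PATH_MAX:
        raise ValueError("socket path too long for sockaddr_un: " + path)
    return path
```

Both processes can call this with the same work directory and arrive at the same address without exchanging the path.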
-
Pavel Tikhomirov authored
https://github.com/xemul/criu/issues/187 Signed-off-by:
Pavel Tikhomirov <ptikhomirov@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
The remote lazy pages variant can be run as follows:

src# criu dump -t <pid> --lazy-pages --port 9876 -D /tmp/1 &
src# while ! sudo fuser 9876/tcp ; do sleep 1; done
src# scp -r /tmp/1/ dst:/tmp/
dst# criu lazy-pages --page-server --address src --port 9876 -D /tmp/1 &
dst# criu restore --lazy-pages -D /tmp/1

In a nutshell, this implementation of remote lazy pages does the following:
- dump collects the process memory into the pipes and transfers non-lazy pages to the images or to the page-server on the restore side. The lazy pages are kept in pipes for later transfer
- when the dump creates the page_pipe_bufs, it marks the buffers containing potentially lazy pages with PPB_LAZY
- at the dump_finish stage, the dump side starts a TCP server that will handle page requests from the restore side
- the checkpoint directory is transferred to the restore side
- on the restore side the lazy-pages daemon is started; it creates a UNIX socket to receive uffd's from the restore and a TCP socket to forward page requests to the dump side
- restore creates memory mappings and fills the VMAs that cannot be handled by uffd with the contents of the pages*img
- restore registers lazy VMAs with uffd and sends the userfault file descriptors to the lazy-pages daemon
- when a #PF occurs, the lazy-pages daemon sends a PS_IOV_GET command to the dump side; the command contains the PID, the faulting address and the number of pages (always 1 at the moment)
- the dump side extracts the requested pages from the pipe and splices them into the TCP socket
- the lazy-pages daemon copies the received pages into the restored process address space

Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
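
The page request in the steps above carries a PID, a faulting address and a page count. The sketch below shows one way such a request could be encoded and decoded. It is not the actual wire format: the field order, the field widths, the numeric value given to PS_IOV_GET and both function names are assumptions made purely for illustration.

```python
import struct

PS_IOV_GET = 6  # placeholder value, assumed for illustration only

# Assumed layout: u32 command, u32 page count, u64 address, u64 PID.
_REQUEST = struct.Struct("<IIQQ")


def pack_get_request(pid: int, vaddr: int, nr_pages: int = 1) -> bytes:
    """Encode a request for nr_pages pages at vaddr in process pid."""
    return _REQUEST.pack(PS_IOV_GET, nr_pages, vaddr, pid)


def unpack_request(buf: bytes) -> dict:
    """Decode a request produced by pack_get_request()."""
    cmd, nr_pages, vaddr, pid = _REQUEST.unpack(buf)
    return {"cmd": cmd, "pid": pid, "vaddr": vaddr, "nr_pages": nr_pages}


# One page at the faulting address, as in the "always 1" case above.
request = pack_get_request(pid=6858, vaddr=0x7F27284D1000)
```

A fixed-size header like this lets the receiving side read exactly one request from the TCP stream before splicing the page data back.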
-
Mike Rapoport authored
Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
When appropriate, the lazy pages will not be written to the destination. Instead, a pagemap entry for the range of such pages will be marked with the 'lazy' flag. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
The pages that are mapped to zero_page_pfn are not dumped, but information about where they are located is required for lazy restore. Note that get_pagemap users presumed that zero pages are not a part of the pagemap, and these pages were just silently skipped during memory restore. At the moment I preserve these semantics and force get_pagemap to skip zero pages. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
Pagemap is now friendlier to random access; enable use of the new APIs. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by:
Adrian Reber <areber@redhat.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by:
Cyrill Gorcunov <gorcunov@openvz.org> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
Fix CID 163485 (#2 of 2): Dereference null return value (NULL_RETURNS) 7. dereference: Dereferencing a pointer that might be null dest when calling handle_user_fault. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
This allows splitting a ppb so that data residing at the specified address will be immediately available. Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
for buffers that contain potentially lazy pages Signed-off-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Pavel Emelyanov authored
Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com> Reviewed-by:
Cyrill Gorcunov <gorcunov@openvz.org>
-
Andrew Vagin authored
>>> >>> CID 161312: Error handling issues (NEGATIVE_RETURNS) >>> >>> "task_args->uffd" is passed to a parameter that cannot be negative. Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
>>> >>> CID 161322: API usage errors (USE_AFTER_FREE) >>> >>> Calling "close(int)" closes handle "client" which has already been closed. Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Adrian Reber authored
Now that userfaultfd/lazy-pages support is enabled all the time, this adds runtime detection of userfaultfd. On a system without the userfaultfd syscall the following is printed:

uffd daemon:
(00.000004) Error (uffd.c:176): lazy-pages: Runtime detection of userfaultfd failed on this system.
(00.000024) Error (uffd.c:177): lazy-pages: Processes cannot be lazy-restored on this system.

or criu restore:
(00.457047) 6858: Error (uffd.c:176): lazy-pages: Runtime detection of userfaultfd failed on this system.
(00.457049) 6858: Error (uffd.c:177): lazy-pages: Processes cannot be lazy-restored on this system.

Signed-off-by:
Adrian Reber <areber@redhat.com> Acked-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-
Andrew Vagin authored
Add linux/userfaultfd.h to criu sources. This header is a part of the kernel API and I see nothing wrong with having it in the repo.

Why we want to do this:
* to check that criu works correctly if a kernel doesn't support userfaultfd.
* to check compilation of the userfaultfd part in travis-ci.

v2: remove UFFD from FEATURES_LIST

Acked-by:
Mike Rapoport <rppt@linux.vnet.ibm.com> Acked-by:
Adrian Reber <areber@redhat.com> Cc: Mike Rapoport <rppt@linux.vnet.ibm.com> Cc: Adrian Reber <areber@redhat.com> Signed-off-by:
Andrew Vagin <avagin@virtuozzo.com> Signed-off-by:
Pavel Emelyanov <xemul@virtuozzo.com>
-