1. 16 Sep, 2017 18 commits
    • lazy-pages: refactor unix socket initialization · 1c54c003
      Mike Rapoport authored
      so that the listening file descriptor can be used in select/poll
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
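A minimal sketch of the idea behind this commit, not the actual criu code: once the listening unix socket is set up in its own helper, the returned descriptor can be handed to poll() (or select()) with a timeout. The function names `listen_unix` and `wait_for_client` are hypothetical and chosen only for illustration.

```c
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: create a listening unix socket bound to 'path'.
 * Returns the listening fd, or -1 on error. */
int listen_unix(const char *path)
{
	struct sockaddr_un addr;
	int fd;

	if (strlen(path) >= sizeof(addr.sun_path))
		return -1;

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	/* Remove a stale socket file left over from an earlier run. */
	unlink(path);

	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(fd, 1) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: wait up to 'timeout_ms' for a pending connection.
 * Returns 1 if accept() would not block, 0 on timeout, -1 on error. */
int wait_for_client(int listen_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = listen_fd, .events = POLLIN };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;
	return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

Keeping socket creation separate from accept() is what lets the caller multiplex the listening descriptor with other descriptors in a single poll() call.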
    • lazy-pages: always compile uffd.c · d08ea98b
      Mike Rapoport authored
      If CONFIG_HAS_UFFD is not defined, an attempt to run the lazy-pages daemon
      will result in an error message.
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
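The pattern described here can be sketched roughly as follows. This is an illustration under assumptions, not the contents of uffd.c: the function name `run_lazy_pages_daemon` and the message text are hypothetical, and the real daemon body is replaced by a placeholder.

```c
#include <stdio.h>

#ifdef CONFIG_HAS_UFFD
/* Placeholder for the real daemon entry point (hypothetical name). */
int run_lazy_pages_daemon(void)
{
	return 0;
}
#else
/* Built without userfaultfd support: the file still compiles, and the
 * entry point only reports that the feature is unavailable. */
int run_lazy_pages_daemon(void)
{
	fprintf(stderr, "lazy-pages: built without userfaultfd support\n");
	return -1;
}
#endif
```

With this layout the callers need no #ifdef of their own; they always link against the same symbol and get a runtime error when the feature is missing.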
    • 34693cb4
      Mike Rapoport authored
    • ppc64le: fix build with UFFD · 8337e998
      Mike Rapoport authored
      The __u64 type is 'unsigned long' on Power and 'unsigned long long' on x86_64.
      Using PRI?64 does not help because, for instance, PRIu64 is 'lu'.

      According to [1], the solution is to define __SANE_USERSPACE_TYPES__ for
      Power builds.

      [1] http://thread.gmane.org/gmane.linux.kernel/1425475/focus=1427433

      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
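A short sketch of what the define buys, assuming kernel headers that honour __SANE_USERSPACE_TYPES__ (as the Power headers do): with the macro defined before the first kernel header is included, __u64 is 'unsigned long long', so a single "%llu" format works on both architectures. The helper name `format_u64` is hypothetical.

```c
/* Must come before any kernel header is included. In a real build this
 * would more likely be passed on the compiler command line. */
#define __SANE_USERSPACE_TYPES__
#include <linux/types.h>
#include <stdio.h>

/* Hypothetical helper: format a __u64 with one portable format string.
 * Returns the number of characters snprintf() would have written. */
int format_u64(char *buf, size_t len, __u64 value)
{
	return snprintf(buf, len, "%llu", value);
}
```

On x86_64 the macro changes nothing, since __u64 is already 'unsigned long long' there.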
    • uffd: add handling of zero pages · 6273fc97
      Mike Rapoport authored
      When get_page returns 0, it means that a page is mapped by a vma but is
      not found in the pagemap. This happens when a page is a zero page and
      was therefore skipped by dump.
      Use UFFDIO_ZEROPAGE to create a zero page in the restored process address
      space.
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
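For orientation, a zero page can be installed through a userfaultfd descriptor roughly like this. It is a sketch using the kernel's UFFDIO_ZEROPAGE ioctl and struct uffdio_zeropage from linux/userfaultfd.h; the wrapper name `uffd_zero_page` and its parameters are hypothetical and do not reflect how uffd.c structures the call.

```c
#include <linux/userfaultfd.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical wrapper: map one zero-filled page at the page containing
 * 'addr' in the address space registered with 'uffd'.
 * 'page_size' must be a power of two. Returns 0 on success, -1 on error. */
int uffd_zero_page(int uffd, unsigned long addr, unsigned long page_size)
{
	struct uffdio_zeropage zp;

	memset(&zp, 0, sizeof(zp));
	zp.range.start = addr & ~(page_size - 1); /* align down to page start */
	zp.range.len = page_size;
	zp.mode = 0;

	if (ioctl(uffd, UFFDIO_ZEROPAGE, &zp) < 0)
		return -1;

	/* On success the kernel reports the number of bytes zeroed. */
	return zp.zeropage == (__s64)page_size ? 0 : -1;
}
```

A caller would use this instead of UFFDIO_COPY when the page lookup reports that the faulting page has no data in the images.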
    • uffd: introduce uffd_handle_page · 58edba63
      Mike Rapoport authored
      so that it'll be able to handle both UFFDIO_COPY and UFFDIO_ZEROPAGE
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
    • uffd: increment uffd_copied_pages only in one place · e3f05ea0
      Mike Rapoport authored
      uffd_copied_pages can be incremented in the uffd_copy_page function rather
      than in its callers.
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
    • uffd.c: move the code out of the 'main' function · a7004002
      Adrian Reber authored
      Most of the UFFD logic was in the function uffd_listen(), which was
      called directly from crtools.c. In preparation for the remote lazy
      restore, most of the code has been moved to separate functions for better
      integration of the network functionality.
      Signed-off-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
    • uffd.c: make some variables static global · b3ae1cc2
      Adrian Reber authored
      To better track how many pages have been handled by UFFD, a few variables
      have been made static global. This makes them easier to access and
      reduces the number of parameters passed around.
      Signed-off-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
    • uffd.c: move code into subfunctions · 048b31b2
      Adrian Reber authored
      uffd_listen() is a rather large function; this commit starts to move
      its code into subfunctions.
      Signed-off-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
    • uffd.c: remove unused variable vma_size · c3abfff0
      Adrian Reber authored
      The variable vma_size was used for early debugging of lazy restore and
      has no significance now.
      Signed-off-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
    • uffd: remove handling of VDSO pages · fcaf36f5
      Mike Rapoport authored
      Since VDSO pages cannot be lazy, there is no need to handle them in the
      lazy-pages daemon.
      Signed-off-by: Mike Rapoport <rapoport@il.ibm.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
    • uffd: do not treat VDSO pages as lazy · 9c7970c2
      Mike Rapoport authored
      The VDSO is just a few pages, so they can be loaded directly rather than
      going through userfaultfd. This saves some complexity on the lazy-pages
      daemon side.
      Signed-off-by: Mike Rapoport <rapoport@il.ibm.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
    • uffd.c: do not call unneeded functions · 33696ceb
      Adrian Reber authored
      For a lazy restore via userfaultfd, the lazy-pages daemon needs to
      know which pages exist so that it can tell when all pages have been
      migrated and the restored process has all of its memory. To find out,
      it has to parse the files in the dump result directory.

      The existing criu functions are designed for a 'normal' restore and
      make many assumptions about what has to be set up.

      The lazy-pages restore does not need the complete 'restore'
      initialization, so this commit minimizes its dependencies on the
      criu common code.
      Signed-off-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
    • b8f46c36
      Adrian Reber authored
    • Try to include userfaultfd with criu (part 2) · 57891afc
      Adrian Reber authored
      This is a first try to include userfaultfd with criu. Right now it
      still requires a "normal" checkpoint. After checkpointing the
      application it can be restored with the help of userfaultfd.
      
      All restored pages with MAP_ANONYMOUS and MAP_PRIVATE set are marked as
      being handled by userfaultfd.
      
      As soon as the process is restored it blocks on the first memory access
      and waits for pages being transferred by userfaultfd.
      
      To handle the required pages a new criu command has been added. For a
      userfaultfd supported restore the first step is to start the
      'lazy-pages' server:
      
        criu lazy-pages -v4 -D /tmp/3/ --address /tmp/userfault.socket
      
      This waits on a unix domain socket (defined using the --address option)
      to receive a userfaultfd file descriptor from a '--lazy-pages' enabled
      'criu restore':
      
        criu restore -D /tmp/3 -j -v4 --lazy-pages \
        --address /tmp/userfault.socket
      
      In the first step the VDSO pages are pushed from the lazy-pages server
      into the restored process. After that the lazy-pages server waits on the
      UFFD FD for a UFFD requested page. If there are no requests received
      during a period of 5 seconds the lazy-pages server switches into a mode
      where the remaining, non-transferred pages are copied into the
      destination process. After all remaining pages have been copied the
      lazy-pages server exits.
      
      The first page that is usually requested is a VDSO page. The process
      currently used for restoring has two VDSO pages, but only one is
      requested via userfaultfd. In the second part, where the remaining
      pages are copied into the process, the second VDSO page is also copied
      as it has not been requested previously. Unfortunately, even though
      this page has not been requested before, it is not accepted by
      userfaultfd: EINVAL is returned. The reason for EINVAL is not
      understood, and therefore the VDSO pages are copied into the process
      first, before switching to request mode and copying the pages which
      are requested via userfaultfd. To decide at which point the VDSO pages
      can be copied into the process, the lazy-pages server currently waits
      for the first page requested via userfaultfd. This is one of the VDSO
      pages. To avoid copying a page a second time, which is unnecessary and
      not possible, there is now a check to see if the page has been
      transferred previously.
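One way such an "already transferred" check could look is a per-page bitmap. The commit message does not say how criu implements the check, so the bitmap, the struct `page_tracker`, and the function names below are assumptions made purely for illustration.

```c
#include <limits.h>
#include <stdlib.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical tracker: one bit per page of a contiguous address range. */
struct page_tracker {
	unsigned long base;      /* start address of the tracked range */
	unsigned long page_size; /* size of one page in bytes */
	unsigned long nr_pages;  /* number of pages in the range */
	unsigned long *bits;     /* bit set means page already transferred */
};

int tracker_init(struct page_tracker *t, unsigned long base,
		 unsigned long page_size, unsigned long nr_pages)
{
	unsigned long words = (nr_pages + BITS_PER_WORD - 1) / BITS_PER_WORD;

	t->base = base;
	t->page_size = page_size;
	t->nr_pages = nr_pages;
	t->bits = calloc(words ? words : 1, sizeof(unsigned long));
	return t->bits ? 0 : -1;
}

/* Returns 1 if the page containing 'addr' was not transferred yet (it is
 * now marked, so the caller should copy it), 0 if it was already
 * transferred, and -1 if 'addr' lies outside the tracked range. */
int tracker_claim(struct page_tracker *t, unsigned long addr)
{
	unsigned long idx, mask;

	if (addr < t->base)
		return -1;
	idx = (addr - t->base) / t->page_size;
	if (idx >= t->nr_pages)
		return -1;

	mask = 1UL << (idx % BITS_PER_WORD);
	if (t->bits[idx / BITS_PER_WORD] & mask)
		return 0;
	t->bits[idx / BITS_PER_WORD] |= mask;
	return 1;
}

void tracker_free(struct page_tracker *t)
{
	free(t->bits);
	t->bits = NULL;
}
```

Both the request-driven path and the final "copy everything that is left" pass would call `tracker_claim()` before copying, so a page is never pushed twice.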
      
      The use case of using userfaultfd with a checkpointed process on a
      remote machine will probably benefit from the current work related to
      image-cache and image-proxy.
      
      For the final implementation it would be nice to have a restore running
      in uffd mode on one system which requests the memory pages over the
      network from another system which is running 'criu checkpoint' also in
      uffd mode. This way the pages need to be copied only 'once' from the
      checkpoint process to the uffd restore process.
      
      TODO:
          * Still contains many debug outputs which need to be cleaned up.
          * Maybe transfer the dump directory FD also via unix domain sockets
            so that the 'uffd'/'lazy-pages' server can keep running without
            the need to specify the dump directory with '-D'
          * Keep the lazy-pages server running after all pages have been
            transferred and start waiting for new connections to serve.
          * Resurrect the non-cooperative patch set, as once the restored task
            fork()'s or calls mremap() the whole thing becomes broken.
          * Figure out if current VDSO handling is correct.
          * Figure out when and how zero pages need to be inserted via uffd.
      
      v2:
          * provide option '--lazy-pages' to enable uffd style restore
          * use send_fd()/recv_fd() provided by criu (instead of own
            implementation)
          * do not install the uffd as service_fd
          * use named constants for MAP_ANONYMOUS
          * do not restore memory pages and then later mark them as uffd
            handled
          * remove function find_pages() to search in pages-<id>.img;
            now using criu functions to find the necessary pages;
            for each new page search the pages-<id>.img file is opened
          * only check the UFFDIO_API once
          * trying to protect uffd code by CONFIG_UFFD;
            use make UFFD=1 to compile criu with this patch
      
      v3:
         * renamed the server mode from 'uffd' -> 'lazy-pages'
         * switched client and server roles transferring the UFFD FD
           * the criu part running in lazy-pages server mode is now
             waiting for connections
           * the criu restore process connects to the lazy-pages server
             to pass the UFFD FD
         * before UFFD copies anything else, the VDSO pages are copied,
           as copying unused VDSO pages fails once the process is running;
           this was necessary to be able to copy all pages.
         * if there are no more UFFD messages for 5 seconds, the lazy-pages
           server switches to copy mode to copy all remaining pages, which
           have not been requested yet, into the restored process
         * check the UFFDIO_API at the correct place
         * close UFFD FD in the restorer to remove open UFFD FD in the
           restored process
      
      v4:
          * removed unnecessary madvise() calls ; it seemed necessary when
            first running tests with uffd; it actually is not necessary
          * auto-detect if build-system provides linux/userfaultfd.h
            header.
          * simplify unix domain socket setup and communication.
          * use --address to specify the location of the used
            unix domain socket.
      
      v5:
          * split the userfaultfd patch in multiple smaller patches
          * introduced vma_can_be_lazy() function to check if a page
            can be handled by uffd
          * moved uffd related code from cr-restore.c to uffd.c
          * handle failure to register a memory page of the restored process
            with userfaultfd
      
      v6:
          * get PID of to be restored process from the 'criu restore' process;
            first the PID is transferred and then the UFFD
      Signed-off-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
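The descriptor hand-off described in this commit relies on passing a file descriptor over a unix domain socket. criu has its own send_fd()/recv_fd() helpers for this; the sketch below is a standalone illustration of the underlying SCM_RIGHTS mechanism, with hypothetical names `send_one_fd` and `recv_one_fd`, and is not criu's implementation.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
	char buf[CMSG_SPACE(sizeof(int))];
	struct cmsghdr align;
};

/* Send 'fd' over the connected unix socket 'sock'. Returns 0 or -1. */
int send_one_fd(int sock, int fd)
{
	char dummy = 'F'; /* at least one byte of real data must be sent */
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from 'sock'. Returns the new fd or -1. */
int recv_one_fd(int sock)
{
	char dummy;
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union fd_cmsg u;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	int fd;

	memset(&u, 0, sizeof(u));
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof(u.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the flow described above, 'criu restore' would be the sender of the userfaultfd descriptor and the lazy-pages server the receiver; the received descriptor is a new number in the receiving process that refers to the same open file.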
    • Try to include userfaultfd with criu (part 1) · e2268aa3
      Adrian Reber authored
      This is a first try to include userfaultfd with criu. Right now it
      still requires a "normal" checkpoint. After checkpointing the
      application it can be restored with the help of userfaultfd.
      
      All restored pages with MAP_ANONYMOUS and MAP_PRIVATE set are marked as
      being handled by userfaultfd.
      
      As soon as the process is restored it blocks on the first memory access
      and waits for pages being transferred by userfaultfd.
      
      To handle the required pages a new criu command has been added. For a
      userfaultfd supported restore the first step is to start the
      'lazy-pages' server:
      
        criu lazy-pages -v4 -D /tmp/3/ --address /tmp/userfault.socket
      
      This is part 1 of the userfaultfd integration which provides the
      'lazy-pages' server implementation.
      
      v2:
          * provide option '--lazy-pages' to enable uffd style restore
          * use send_fd()/recv_fd() provided by criu (instead of own
            implementation)
          * do not install the uffd as service_fd
          * use named constants for MAP_ANONYMOUS
          * do not restore memory pages and then later mark them as uffd
            handled
          * remove function find_pages() to search in pages-<id>.img;
            now using criu functions to find the necessary pages;
            for each new page search the pages-<id>.img file is opened
          * only check the UFFDIO_API once
          * trying to protect uffd code by CONFIG_UFFD;
            use make UFFD=1 to compile criu with this patch
      
      v3:
         * renamed the server mode from 'uffd' -> 'lazy-pages'
         * switched client and server roles transferring the UFFD FD
           * the criu part running in lazy-pages server mode is now
             waiting for connections
           * the criu restore process connects to the lazy-pages server
             to pass the UFFD FD
         * before UFFD copies anything else, the VDSO pages are copied,
           as copying unused VDSO pages fails once the process is running;
           this was necessary to be able to copy all pages.
         * if there are no more UFFD messages for 5 seconds, the lazy-pages
           server switches to copy mode to copy all remaining pages, which
           have not been requested yet, into the restored process
         * check the UFFDIO_API at the correct place
         * close UFFD FD in the restorer to remove open UFFD FD in the
           restored process
      
      v4:
          * removed unnecessary madvise() calls ; it seemed necessary when
            first running tests with uffd; it actually is not necessary
          * auto-detect if build-system provides linux/userfaultfd.h
            header
          * simplify unix domain socket setup and communication.
          * use --address to specify the location of the used
            unix domain socket
      
      v5:
          * split the userfaultfd patch in multiple smaller patches
          * introduced vma_can_be_lazy() function to check if a page
            can be handled by uffd
          * moved uffd related code from cr-restore.c to uffd.c
          * handle failure to register a memory page of the restored process
            with userfaultfd
      
      v6:
          * get PID of to be restored process from the 'criu restore' process;
            first the PID is transferred and then the UFFD
          * code has been re-ordered to be better prepared for lazy-restore
            from remote host
          * compile test for UFFD availability only once
      Signed-off-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
    • Remove static from prepare_task_entries function · 27e60179
      Adrian Reber authored
      For the upcoming userfaultfd integration, the lazy-pages mode needs to
      set up the criu infrastructure to read the pages files.
      Signed-off-by: Adrian Reber <areber@redhat.com>
      Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
  2. 30 Aug, 2017 2 commits
  3. 21 Aug, 2017 1 commit
  4. 17 Aug, 2017 1 commit
  5. 16 Aug, 2017 6 commits
  6. 15 Aug, 2017 12 commits