- 16 Sep, 2017 40 commits
-
Mike Rapoport authored
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
The PR_SET_THP_DISABLE prctl allows control of transparent huge pages on a per-process basis. It has been available since Linux 3.15, but until recently it set VM_NOHUGEPAGE for all VMAs created after the prctl() call, which prevents proper restore for a combination of pre- and post-copy. A recent change to prctl(PR_SET_THP_DISABLE) behavior eliminates the use of per-VMA flags, so we can use the new version of the prctl() to disable THP.
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
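As an illustration of the prctl() described above, here is a minimal sketch of toggling and querying the per-process THP flag. The wrapper names `thp_set_disabled` and `thp_get_disabled` are hypothetical and are not taken from the commit; only the `PR_SET_THP_DISABLE` and `PR_GET_THP_DISABLE` prctl options are real kernel interfaces.

```c
#include <errno.h>
#include <sys/prctl.h>

/* Fallback values for old headers; these match the kernel UAPI. */
#ifndef PR_SET_THP_DISABLE
#define PR_SET_THP_DISABLE 41
#endif
#ifndef PR_GET_THP_DISABLE
#define PR_GET_THP_DISABLE 42
#endif

/* Hypothetical wrapper: set (1) or clear (0) the per-process
 * "THP disabled" flag. Returns 0 on success, -errno on failure. */
static int thp_set_disabled(int disable)
{
	if (prctl(PR_SET_THP_DISABLE, disable ? 1UL : 0UL, 0UL, 0UL, 0UL) < 0)
		return -errno;
	return 0;
}

/* Hypothetical wrapper: query the flag. Returns the flag value
 * (non-zero when THP is disabled) or -errno on failure. */
static int thp_get_disabled(void)
{
	int ret = prctl(PR_GET_THP_DISABLE, 0UL, 0UL, 0UL, 0UL);

	return ret < 0 ? -errno : ret;
}
```

A caller would typically disable THP before populating memory and restore the saved value afterwards. Whether the flag is applied per process or per VMA depends on the kernel version, as the commit message explains.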
-
Mike Rapoport authored
The is_vma_range_fmt and parse_vmflags will be required to detect availability of the PR_SET_THP_DISABLE prctl.
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
Now we have two separate recv-calling routines that receive the header and the pages from the page-server. These two can finally be unified. After this, the sync-read code looks like: start an async one and wait for it to finish right away.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
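The idea of expressing a synchronous read as "start an async one and wait for it" can be sketched as follows. This is not the page-server code: `struct async_read`, `async_read_step` and `sync_read` are hypothetical names, and a plain file descriptor with poll() stands in for the page-server socket.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical state of one in-progress read. */
struct async_read {
	int fd;
	char *buf;
	size_t len;	/* bytes wanted */
	size_t done;	/* bytes received so far */
};

/* One step of the async read.
 * Returns 1 when complete, 0 when more data is needed, -1 on error/EOF. */
static int async_read_step(struct async_read *ar)
{
	ssize_t n = read(ar->fd, ar->buf + ar->done, ar->len - ar->done);

	if (n < 0)
		return (errno == EAGAIN || errno == EINTR) ? 0 : -1;
	if (n == 0)
		return -1;	/* peer closed before we got everything */
	ar->done += (size_t)n;
	return ar->done == ar->len;
}

/* The sync variant: start the async read and wait for it to finish. */
static int sync_read(int fd, void *buf, size_t len)
{
	struct async_read ar = { .fd = fd, .buf = buf, .len = len, .done = 0 };
	int ret = 0;

	if (len == 0)
		return 0;
	while (ret == 0) {
		struct pollfd p = { .fd = fd, .events = POLLIN };

		if (poll(&p, 1, -1) < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		ret = async_read_step(&ar);
	}
	return ret < 0 ? -1 : 0;
}
```

With this shape both paths share one receive routine, and the sync path is only a wait loop around it.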
-
Pavel Emelyanov authored
This is a prerequisite for the next patch.
v2: spellchecks, code reshuffle
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
Now these two look exactly the same, so we can have a single call with an additional sync/async (flags) argument.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
The newly introduced sync-read call may look exactly the same as its async pair by using the respective complete callback.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Acked-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Pavel Emelyanov authored
There's no need for two API calls to read the xfer header and the pages themselves, so merge them into a single call.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
* drop --keep-going etc from --lazy-pages pass
* add --remote-lazy-pages pass
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
* select excluded tests based on the kernel version
* test local and remote lazy-pages with and without pre-dump
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
The page-read for a child process is a shallow copy of the parent process page-read. They share the open file descriptors and the pagemap. The lpi_fini of the child processes should not release any resources; they will all be released during lpi_fini of the parent process.
Fixes: #325
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
For the remote lazy pages case, to access pages in the middle of a pipe we are splitting the page_pipe_buffers and iovecs and using splice() to move the data between the underlying pipes. After the splits we get a page_pipe_buffer with a single iovec that can be used to splice() the data further into the socket. This patch replaces the splitting and splicing with the use of a helper pipe and tee(). We tee() the pages from the beginning of the pipe up to the last requested page into a helper pipe, sink the unneeded head part into /dev/null, and get the requested pages ready for splice() into the socket. This allows the lazy-pages daemon to request the same page several times, which is required to properly support fork() after the restore. As an added bonus, we simplify the code and reduce the number of pipes that live in the system.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
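The tee()-based approach described above can be sketched as follows. `stage_range` and its parameters are hypothetical, byte counts stand in for page-sized units, and the sketch assumes the helper pipe is empty and large enough to hold `skip + want` bytes.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: duplicate the first (skip + want) bytes of the
 * source pipe into the helper pipe with tee(), which leaves the source
 * pipe untouched, then sink the unneeded `skip` head bytes into /dev/null.
 * On success the helper pipe holds exactly the `want` requested bytes,
 * ready to be splice()d onwards. Returns `want`, or -1 on failure.
 */
static ssize_t stage_range(int src_rd, int helper_rd, int helper_wr,
			   int devnull_fd, size_t skip, size_t want)
{
	size_t total = skip + want;
	ssize_t n;

	/* tee() always copies from the head of the source pipe, so it has
	 * to succeed in one call; a short copy is treated as an error. */
	n = tee(src_rd, helper_wr, total, 0);
	if (n < 0 || (size_t)n != total)
		return -1;

	while (skip > 0) {
		n = splice(helper_rd, NULL, devnull_fd, NULL, skip, 0);
		if (n <= 0)
			return -1;
		skip -= (size_t)n;
	}
	return (ssize_t)want;
}
```

Because the source pipe is never consumed, the same range can be staged again later, which is the property the commit relies on for repeated requests of the same page.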
-
Mike Rapoport authored
Until now, once we started to fetch an iovec, we waited until it was completely copied before returning to the event processing loop. Now we can have several requests for remote pages in flight.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
There could be several outstanding requests for the same page, either from the page fault handler or from handle_remaining_pages. Verifying that the faulting address is already requested is not enough. We need to check whether there is any request in flight that covers the faulting address.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
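The difference between "this exact address was requested" and "some in-flight request covers this address" can be illustrated with a small range check. `struct inflight_req` and `addr_in_flight` are hypothetical, and an array stands in for whatever list the daemon actually keeps.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor of one outstanding request. */
struct inflight_req {
	uint64_t start;	/* first address covered */
	uint64_t len;	/* length in bytes */
};

/* True if any outstanding request covers addr, not only those
 * that start exactly at addr. */
static bool addr_in_flight(const struct inflight_req *reqs, size_t nr,
			   uint64_t addr)
{
	for (size_t i = 0; i < nr; i++) {
		if (addr >= reqs[i].start &&
		    addr - reqs[i].start < reqs[i].len)
			return true;
	}
	return false;
}
```

Comparing `addr - start` against `len` avoids computing `start + len`, which could overflow for ranges near the top of the address space.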
-
Pavel Emelyanov authored
v2: When uffd is present, the reported features may still be 0, so we need one more bool for the uffd syscall itself.
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
They will still fail with --remote-lazy-pages, so mark them as 'noremotelazy'.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
This allows skipping tests that are not yet run with --remote-lazy-pages, but can be run with --lazy-pages.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
When running with --lazy-pages or --remote-lazy-pages, the daemons should run in the background, rather than complete before t.stop() is called. Many tests try to verify things are ok after test_waitsig() and that's exactly the place where they access memory and cause page faults.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Most zdtm tests should pass with --lazy-pages on kernels newer than 4.11. Some tests excluded for older kernels surprisingly pass even now, mainly because they do not actually stress userfaultfd, which will be fixed in the upcoming commits :) The cmdlinenv00 test fails even with kernel 4.11 because of a race between uffd and gup in the case where an external process reads /proc/<pid>/cmdline before the memory containing the command line is populated.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
This is the version from v4.11-rc5. Apparently, that will be the userfault ABI for the next few months.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
The UFFDIO_EVENT_EXIT didn't make it upstream because of possible races in the exit() syscall [1]. The only way to detect that the monitored process has exited is to check for the ENOSPC errno value set by uffdio_copy.
[1] http://www.spinics.net/lists/linux-mm/msg122467.html
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
We only use one epoll instance to manage lazy-pages related I/O. Making epollfd file-visible will allow a cleaner implementation of tracking the restored process's exit() calls.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Both lazy_iov and lp_req have two fields for address/start: the run-time address that tracks remaps, and the "dump time" address, which is required for pagemap accesses.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Acked-by: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
The lazy-pages daemon has to properly track changes to the virtual memory layout of the restored process. The test verifies that the lazy-pages daemon properly reacts to fork(), exit(), madvise(MADV_DONTNEED) and mremap() events. Currently, no zdtm tests would generate UFFD_EVENT_{REMAP,REMOVE}.
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
The dup_page_read performs a shallow copy of a page_read object. It is required for the implementation of the fork event in the lazy-pages daemon. When a restored process fork()s a child, the lazy-pages daemon will handle page faults of the child process, and it will use the parent process memory dump for that.
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
Replace "pr<id>" with "pr<pid>-<id>" when printing information about a particular page-read.
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
When the restored process calls mremap(), we will see #PF's on the new addresses and we have to create a correspondence between the addresses found in the dump and the actual addresses the process uses. For this purpose we distinguish the "live" address and the "image" address in the lazy IOVs and outstanding requests. The "live" address is used to find the appropriate IOV and in uffd_copy, and the "image" address is used to request pages from the page-read. If the mremap() call causes the mapping to grow, the additional part will receive zero pages, as expected. For shrinking remaps, we will get UFFD_EVENT_UNMAP for the dropped part of the mapping.
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
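The "live" versus "image" address split described above amounts to keeping two start addresses per range and translating by offset. In this sketch `struct lazy_range`, `live_to_image` and `lazy_range_remap` are hypothetical names, not the daemon's actual lazy_iov or lp_req definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical range descriptor with both address views. */
struct lazy_range {
	uint64_t live_start;	/* where the process sees the range now */
	uint64_t img_start;	/* where the range was at dump time */
	uint64_t len;		/* length of the dumped range in bytes */
};

/* Translate a faulting (live) address into the address to request
 * from the dump. Returns false if the range does not cover it. */
static bool live_to_image(const struct lazy_range *r, uint64_t live,
			  uint64_t *img)
{
	if (live < r->live_start || live - r->live_start >= r->len)
		return false;
	*img = r->img_start + (live - r->live_start);
	return true;
}

/* On a remap only the live view moves; the image address stays
 * fixed because the dump itself does not change. */
static void lazy_range_remap(struct lazy_range *r, uint64_t new_live)
{
	r->live_start = new_live;
}
```

The sketch deliberately leaves `len` alone on a remap: per the commit message, a grown mapping gets zero pages for the extra part, and a shrunk one is reported separately through the unmap event.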
-
Mike Rapoport authored
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
The UNMAP event is generated by userfaultfd when a VMA (or a part of it) is unmapped from the process address space. We won't receive #PF's in the unmapped range, but we need to make sure we are not trying to fill that range in handle_remaining_pages. Note that the VMA is gone, so there is no point in unregistering uffd from it.
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
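Dropping an unmapped range from the set of ranges still to be filled is essentially interval subtraction, which may leave zero, one, or two pieces. The following sketch shows that arithmetic only; `struct range` and `range_subtract` are hypothetical and are not the daemon's data structures.

```c
#include <stdint.h>

/* Hypothetical half-open address range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Remove `cut` from `r`. Writes the surviving pieces to out[] and
 * returns how many there are (0, 1 or 2). */
static int range_subtract(struct range r, struct range cut,
			  struct range out[2])
{
	int n = 0;

	/* No overlap: the range survives unchanged. */
	if (cut.end <= r.start || cut.start >= r.end) {
		out[n++] = r;
		return n;
	}
	/* Part below the cut survives. */
	if (cut.start > r.start)
		out[n++] = (struct range){ r.start, cut.start };
	/* Part above the cut survives. */
	if (cut.end < r.end)
		out[n++] = (struct range){ cut.end, r.end };
	return n;
}
```

An unmap in the middle of a range yields two pieces, which is why a single descriptor per original VMA is not enough once such events are handled.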
-
Mike Rapoport authored
When the restored process calls madvise(MADV_DONTNEED) or madvise(MADV_REMOVE), the memory range specified by the madvise() call should be remapped to the zero pfn, and we should stop monitoring this range in order to avoid polluting it with data the process does not expect. All we need to do here is unregister the memory range from userfaultfd, and the kernel will take care of the rest.
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
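Unregistering a range from userfaultfd is done with the UFFDIO_UNREGISTER ioctl on a `struct uffdio_range`. A minimal sketch, where the wrapper name `uffd_stop_monitoring` is hypothetical and `uffd` is assumed to be a userfaultfd descriptor that the range was previously registered with:

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/* Hypothetical wrapper: stop monitoring [start, start + len) on the
 * given userfaultfd. Returns 0 on success, -errno on failure. */
static int uffd_stop_monitoring(int uffd, uint64_t start, uint64_t len)
{
	struct uffdio_range range = {
		.start = start,
		.len = len,
	};

	if (ioctl(uffd, UFFDIO_UNREGISTER, &range) < 0)
		return -errno;
	return 0;
}
```

Both `start` and `len` are expected to be page-aligned; the kernel rejects unaligned ranges.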
-
Mike Rapoport authored
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
This is the version from linux-next at the moment.
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-
Mike Rapoport authored
With address space manipulations, the number of pages that the lazy-pages daemon will copy might differ from the number of pages we had in the dumps. Disable the warning and error retval for now; we can restore the accounting once uffd event handling stabilizes a bit.
travis-ci: success for lazy-pages: add non-#PF events handling (rev2)
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
-