- 12 May, 2018 40 commits
-
Mike Rapoport authored
Make sure we handle various corner cases:
* we received fewer pages than requested
* the request was capped because of unmap/remap, etc.
* the process has exited underneath us
Currently we free the request once we've found the address to use with uffd_copy(). Instead, keep the request object around, use it to properly calculate the number of pages we pass to uffd_copy(), and then re-add the trailing range (if any) to the IOVs list.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
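A minimal, self-contained C sketch of the "re-add the trailing range" idea; this is not the CRIU code, and struct range, pending[] and requeue_tail() are made-up names for illustration:

    #include <stdio.h>

    #define PAGE_SIZE 4096

    struct range { unsigned long start, len; };

    static struct range pending[16];
    static int npending;

    /* copied may be smaller than requested (partial copy, unmap, exited process) */
    static void requeue_tail(struct range *req, unsigned long copied)
    {
        if (copied >= req->len)
            return;                       /* fully serviced, nothing to re-add */
        pending[npending].start = req->start + copied;
        pending[npending].len = req->len - copied;
        npending++;
    }

    int main(void)
    {
        struct range req = { 0x100000, 8 * PAGE_SIZE };

        requeue_tail(&req, 3 * PAGE_SIZE);    /* pretend only 3 pages were copied */
        printf("re-queued %lx..%lx\n", pending[0].start,
               pending[0].start + pending[0].len);
        return 0;
    }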
-
Mike Rapoport authored
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Instead of merging unfinished requests with the child's IOVs, we queued them into the parent's IOV list. Fix it.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Commit 9cb20327aa4 ("return to epoll_wait after completing forks") was only halfway there. Add the other half.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
It is possible that, when pages requested from the remote source arrive, part of the memory range covered by the request is already gone because of madvise(MADV_DONTNEED), mremap(), etc. Ensure we do not try to uffd_copy more than we are allowed to.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
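A hedged sketch of the capping idea: UFFDIO_COPY and struct uffdio_copy are the real userfaultfd API, but copy_capped() and its parameters are illustrative, not CRIU's implementation:

    #include <sys/ioctl.h>
    #include <linux/userfaultfd.h>

    /* Copy at most up to iov_end, even if the faulting request asked for more. */
    long copy_capped(int uffd, unsigned long dst, unsigned long src,
                     unsigned long len, unsigned long iov_end)
    {
        struct uffdio_copy copy = {
            .dst = dst,
            .src = src,
            /* never copy past the end of the range we still own */
            .len = dst + len > iov_end ? iov_end - dst : len,
            .mode = 0,
        };

        if (ioctl(uffd, UFFDIO_COPY, &copy))
            return -1;
        return copy.copy;    /* bytes actually copied */
    }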
-
Mike Rapoport authored
If we get a fork() event just before transferring the last IOV of the parent process, continuing the background fetch after completing the fork event handling will cause the lazy-pages daemon to exit, and nothing will monitor the child process memory.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Since the memory mapping is now split between the ->iovs and ->reqs lists, any update to the memory layout should take both lists into account.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Instead of recalculating the size required for lazy_pages_info->buf when copying IOVs at fork() time, keep the size of the buffer in the lazy_pages_info struct.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
When we return from epoll_run_rfds() with a positive return value, it means that the event handling loop was interrupted because the event should be handled outside of that loop. This is always the case with UFFD_EVENT_FORK. It may happen that the event occurred after we have completed the memory transfer and we are on the way to a successful return from handle_requests(), but instead of returning 0 we will return the positive value we got from epoll_run_rfds(). Explicitly assigning the return value of complete_forks() fixes this issue.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
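For clarity, a tiny self-contained sketch of the control flow being fixed; the *_stub() functions only mimic the return-value convention described above and are not CRIU code:

    #include <stdio.h>

    static int epoll_run_rfds_stub(void) { return 1; }   /* 1: a fork event is pending */
    static int complete_forks_stub(void) { return 0; }   /* 0: forks handled successfully */

    static int handle_requests_stub(void)
    {
        int ret = epoll_run_rfds_stub();

        if (ret > 0)
            ret = complete_forks_stub();  /* take this status; don't leak the positive value */
        return ret;
    }

    int main(void)
    {
        printf("handle_requests() -> %d\n", handle_requests_stub());
        return 0;
    }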
-
Mike Rapoport authored
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
With userfaultfd we cannot reliably service process_vm_readv() calls. The maps007 test, which uses these calls, previously passed by sheer luck.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
In the current model we do not start the background page transfer until POLL_TIMEOUT has elapsed since the last uffd or socket event. If the restored process accesses memory once every (POLL_TIMEOUT - epsilon), filling its memory can take ages. This patch changes the model in the following way:
* poll for events indefinitely until the restore is complete
* the restore completion event resets the poll timeout to zero and starts the background transfers
* after each transfer we return to check if there are any uffd events to handle

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
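A rough, runnable sketch of the polling model described above, using an eventfd to stand in for the "restore complete" notification; all names and the chunk accounting are illustrative, not CRIU's implementation:

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/epoll.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    #define MAX_EVENTS 8

    static bool restore_finished;   /* flipped when "restore complete" arrives */
    static int chunks_left = 3;     /* stand-in for the remaining lazy memory */

    static int handle_event(struct epoll_event *ev)
    {
        uint64_t v;

        if (read(ev->data.fd, &v, sizeof(v)) != sizeof(v))
            return -1;
        restore_finished = true;    /* treat any event as "restore complete" here */
        return 0;
    }

    static int xfer_background_chunk(void)
    {
        printf("transferring a background chunk (%d to go)\n", chunks_left);
        return --chunks_left;       /* <= 0 once everything has been transferred */
    }

    static int event_loop(int epollfd)
    {
        struct epoll_event events[MAX_EVENTS];
        int timeout = -1;           /* poll indefinitely until restore completes */

        for (;;) {
            int i, n = epoll_wait(epollfd, events, MAX_EVENTS, timeout);

            if (n < 0)
                return -1;
            for (i = 0; i < n; i++)
                if (handle_event(&events[i]))
                    return -1;
            if (!restore_finished)
                continue;
            timeout = 0;            /* from now on only drain pending events */
            if (xfer_background_chunk() <= 0)
                return 0;           /* all memory transferred */
        }
    }

    int main(void)
    {
        int epollfd = epoll_create1(0);
        int efd = eventfd(0, 0);
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = efd };
        uint64_t one = 1;

        epoll_ctl(epollfd, EPOLL_CTL_ADD, efd, &ev);
        if (write(efd, &one, sizeof(one)) != sizeof(one))   /* fake "restore complete" */
            return 1;
        return event_loop(epollfd);
    }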
-
Mike Rapoport authored
Currently, once we get to transfer pages in the "background", we try to fetch the entire IOV at once. For large IOVs this may impact #PF latency for the #PF events that occur during the transfer. Let's add a simple heuristic for controlling the size of the background transfers. Initially, the transfer is limited to some default value. Every time we transfer a chunk we increase the transfer size until it reaches a pre-defined maximum size. A page fault event resets the background transfer size to its initial value.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
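A minimal sketch of such a grow-and-reset heuristic; the constants and function names are made up for illustration and are not CRIU's actual values:

    #include <stdio.h>

    #define PAGE_SIZE        4096
    #define XFER_LEN_DEFAULT (64 * PAGE_SIZE)
    #define XFER_LEN_MAX     (1024 * PAGE_SIZE)

    static unsigned long xfer_len = XFER_LEN_DEFAULT;

    /* called after every background chunk: grow towards the maximum */
    static void xfer_len_grow(void)
    {
        if (xfer_len < XFER_LEN_MAX)
            xfer_len *= 2;
        if (xfer_len > XFER_LEN_MAX)
            xfer_len = XFER_LEN_MAX;
    }

    /* called on every page-fault event: latency matters again, start small */
    static void xfer_len_reset(void)
    {
        xfer_len = XFER_LEN_DEFAULT;
    }

    int main(void)
    {
        int i;

        for (i = 0; i < 5; i++) {
            printf("chunk %d: %lu pages\n", i, xfer_len / PAGE_SIZE);
            xfer_len_grow();
        }
        xfer_len_reset();    /* a #PF arrived: back to the default size */
        printf("after #PF: %lu pages\n", xfer_len / PAGE_SIZE);
        return 0;
    }

Doubling keeps the number of growth steps small while ensuring the first chunk after every fault stays cheap.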
-
Mike Rapoport authored
The complete_forks function presumes that it always has work to do, because we assume that a fork event is the only case where we drop out of epoll_run_rfds() with a positive return value. Teach complete_forks to bail out when there are no pending forks to process, to allow exiting epoll_run_rfds() for other reasons.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
First check if there are pages we need to transfer and only afterwards check if there are outstanding requests. Also, instead of checking 'bool remaining' to see if there is more work to do, we can simply check whether all the lpi's have already been serviced.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
The intention is to use this function for transferring all the pages that didn't cause a #PF.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
The function picks the next page range to transfer anyway; it just does it in a very simple FIFO manner.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
We already have a queue of the requested memory ranges, which contains 'lp_req' objects. These objects hold the same information as lazy_iov: the start address of the range, the end address, and the address the range had at dump time. Rather than keeping this information twice and doing double bookkeeping, we can extract the requested range from lpi->iovs and move it to lpi->reqs.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
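A toy sketch of extracting a requested sub-range from one list and moving it to another, splitting the covering IOV when the request falls in its middle; the data structures here (sys/queue.h TAILQ lists, struct range) are illustrative and not CRIU's lazy_iov/lp_req model:

    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/queue.h>

    struct range {
        unsigned long start, end;
        TAILQ_ENTRY(range) l;
    };

    TAILQ_HEAD(rlist, range);

    static struct range *new_range(unsigned long start, unsigned long end)
    {
        struct range *r = malloc(sizeof(*r));   /* error handling omitted */

        r->start = start;
        r->end = end;
        return r;
    }

    /* Move [start, end) out of the IOV that covers it and onto the reqs list,
     * leaving any head/tail remainder on the iovs list. */
    static void extract_range(struct rlist *iovs, struct rlist *reqs,
                              unsigned long start, unsigned long end)
    {
        struct range *iov;

        TAILQ_FOREACH(iov, iovs, l) {
            if (start < iov->start || end > iov->end)
                continue;
            if (end < iov->end) {           /* keep the tail in iovs */
                struct range *tail = new_range(end, iov->end);
                TAILQ_INSERT_AFTER(iovs, iov, tail, l);
            }
            if (start > iov->start) {       /* keep the head in iovs */
                struct range *head = new_range(iov->start, start);
                TAILQ_INSERT_BEFORE(iov, head, l);
            }
            TAILQ_REMOVE(iovs, iov, l);
            iov->start = start;
            iov->end = end;
            TAILQ_INSERT_TAIL(reqs, iov, l);
            return;
        }
    }

    int main(void)
    {
        struct rlist iovs = TAILQ_HEAD_INITIALIZER(iovs);
        struct rlist reqs = TAILQ_HEAD_INITIALIZER(reqs);
        struct range *iov0 = new_range(0x1000, 0x9000);
        struct range *r;

        TAILQ_INSERT_TAIL(&iovs, iov0, l);
        extract_range(&iovs, &reqs, 0x3000, 0x5000);    /* fault in the middle */
        TAILQ_FOREACH(r, &iovs, l)
            printf("iov %lx-%lx\n", r->start, r->end);
        TAILQ_FOREACH(r, &reqs, l)
            printf("req %lx-%lx\n", r->start, r->end);
        return 0;
    }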
-
Mike Rapoport authored
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Mike Rapoport authored
Instead of relying on the length of various lists, add a boolean variable to lazy_pages_info to make it clear when the process has exited.

Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Andrey Vagin authored
Currently zdtm doesn't detect when restore has failed if it is executed with strace. With this patch, fake-restore.sh creates a test file, and zdtm is able to detect when restore failed.

Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Andrei Vagin authored
The get() method requires a key, but we are currently passing an index. That will never work correctly as it is now.

Acked-by: Adrian Reber <adrian@lisas.de>
Reported-by: Adrian Reber <adrian@lisas.de>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Andrey Vagin authored
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Andrey Vagin authored
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Andrey Vagin authored
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Andrey Vagin authored
Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Andrey Vagin authored
Currently we restore all sockets in the root mount namespace, because we were not able to get any information about the mount point where a socket is bound. This is obviously incorrect in some cases. In the 4.10 kernel, we added the SIOCUNIXFILE ioctl for unix sockets. This ioctl opens the file to which a socket is bound and returns a file descriptor. The new ioctl allows us to get mnt_id by reading fdinfo, and mnt_id is enough to find the proper mount point and mount namespace. The logic of this patch is straightforward: on dump, we save the mnt_id for sockets; on restore, we find the mount namespace by mnt_id and restore the socket in its mount namespace.

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
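The kernel side of this is real: SIOCUNIXFILE (Linux >= 4.10, from linux/sockios.h) opens the file a unix socket is bound to and returns a descriptor whose fdinfo exposes mnt_id. A minimal demo with only rudimentary error handling; the socket path is arbitrary:

    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <linux/sockios.h>
    #include <unistd.h>

    int main(void)
    {
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        char path[64], line[256];
        int sk, fd;
        FILE *f;

        strcpy(addr.sun_path, "/tmp/siocunixfile.sock");
        unlink(addr.sun_path);

        sk = socket(AF_UNIX, SOCK_STREAM, 0);
        if (bind(sk, (struct sockaddr *)&addr, sizeof(addr)) < 0)
            return 1;

        fd = ioctl(sk, SIOCUNIXFILE, 0);    /* fd referring to the bound file */
        if (fd < 0)
            return 1;

        snprintf(path, sizeof(path), "/proc/self/fdinfo/%d", fd);
        f = fopen(path, "r");
        if (!f)
            return 1;
        while (fgets(line, sizeof(line), f))
            if (strncmp(line, "mnt_id:", 7) == 0)
                printf("%s", line);         /* mount id of the socket's path */
        fclose(f);
        return 0;
    }

On kernels that lack the ioctl the call simply fails, so any user of this mechanism needs a fallback path.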
-
Andrey Vagin authored
unix_process_name() is called when sockets are being collected, but at that moment we don't have socket descriptors. A socket descriptor is required to get mnt_id, which will allow us to resolve a socket path in its mount namespace.

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Andrey Vagin authored
This ioctl opens the file to which a socket is bound and returns a file descriptor. The file descriptor can be used to get the mnt_id and the file path.

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Andrey Vagin authored
The USK_CALLBACK flag means that a socket is external and will be restored by a plugin. open_unixsk_standalone() should not be called for these sockets.

$ make -C test/others/unix-callback/ run
...
(00.109338) 7471: sk unix: Opening standalone socket (id 0xd ino 0 peer 0x63b)
(00.109376) 7471: Error (criu/sk-unix.c:1128): sk unix: BUG at criu/sk-unix.c:1128

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Andrey Vagin authored
Unix file sockets have to be restored in their proper mount namespaces.

Acked-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrei Vagin <avagin@virtuozzo.com>
-
Cyrill Gorcunov authored
When walking over unix sockets, make sure the queuer is present before accessing it.

https://jira.sw.ru/browse/PSBM-82796

Reported-by: Vitaly Ostrosablin <vostrosablin@virtuozzo.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@virtuozzo.com>
Reviewed-by: Kirill Tkhai <ktkhai@virtuozzo.com>
-
Kir Kolyshkin authored
There was a "; done" leftover here, somehow ignored by dash but not bash. Remove it.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
-
Kir Kolyshkin authored
It is not used; it was probably committed by mistake.

Fixes: 2d093a17 ("travis: add a job to test on the fedora rawhide")

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
-
Kir Kolyshkin authored
Fix Fedora rawhide CI failure caused by coreutils-single and our way of running under QEMU.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
-
Kir Kolyshkin authored
1. Sort lists of packages to be installed, unify indentation.
2. Merge "ccache -s" and "ccache -z".

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
-
Kir Kolyshkin authored
In Ubuntu Bionic for armhf, clang is compiled for armv8l rather than armv7l (as it was, and still is, for gcc), and so it uses armv8 by default. This breaks compilation of tests using smp_mb():

> error: instruction requires: data-barriers

The fix is to add "-march=armv7-a" to CFLAGS, which we already do, except not for the tests.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
-