    shmem: use mincore page status bit with anon shared mem dumping · 6b062106
    Eugene Batalov authored
    It turned out that anon shmem can have pages with non-zero content
    whose PME_PRESENT and PME_SWAP bits are both unset in every vma
    that maps them, across the whole ps tree.
    This case is reproduced in issue #209:
    1. Dump a ps tree with anon shmem filled using datagen.
    2. Restore the ps tree. The anon shmem content is restored
       in open_shmem(): an fd is created for it and it is
       unmapped from the restorer process.
    3. The anon shmem vma is mapped in restore_mapping() in the pie
       restorer context. The anon shmem already holds non-zero content,
       but the restored process doesn't touch its newly mapped vma.
    4. Run CRIU dump again. All pages of the anon shmem vmas have
       the PME_PRESENT and PME_SWAP bits unset, so we don't put the
       vma pages into the dump.
    
    So if we filter anon shmem pages by the PME_PRESENT and PME_SWAP bits
    the same way as we do for anon private mem, we have a bug.
    The PME_PRESENT and PME_SWAP bits work for anon private mem because
    at least one process restores the content of a private anon vma
    in its own address space, so the PME bits are set and the pages
    are dumped.
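    The PME-based filter described above can be sketched as follows.
    This is a minimal illustration, not CRIU's code: the bit positions
    are the ones given in the kernel's pagemap documentation, while
    pme_page_in_use() and read_self_pme() are hypothetical helper names
    introduced here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit positions of a 64-bit /proc/<pid>/pagemap entry. */
#define PME_PRESENT (1ULL << 63) /* page is present in RAM */
#define PME_SWAP    (1ULL << 62) /* page is in swap */

/*
 * Hypothetical helper: nonzero if the page table entry says the page
 * is backed by RAM or swap. A page that was populated through another
 * mapping but never touched through this one yields 0 here.
 */
int pme_page_in_use(uint64_t pme)
{
	return (pme & (PME_PRESENT | PME_SWAP)) != 0;
}

/*
 * Hypothetical helper: read the pagemap entry that describes addr in
 * the calling process. Returns 0 on success, -1 on failure.
 */
int read_self_pme(const void *addr, uint64_t *pme)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t off;
	ssize_t n;
	int fd;

	assert(page_size > 0);

	/* One 8-byte entry per virtual page, indexed by page number. */
	off = (off_t)((uintptr_t)addr / (uintptr_t)page_size) *
	      (off_t)sizeof(*pme);

	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;

	n = pread(fd, pme, sizeof(*pme), off);
	close(fd);

	return n == (ssize_t)sizeof(*pme) ? 0 : -1;
}
```

    A filter built only on pme_page_in_use() skips every page whose
    entry has both bits clear, which is exactly the situation of the
    untouched anon shmem vma in step 4.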
    
    We can't just stop using the PME_PRESENT and PME_SWAP bits and dump all
    non-soft-dirty and non-zero-pfn pages. In that case every 1 GB of
    mapped but unused anon shmem vma would go to the dump, which is
    too expensive.
    
    To fix the bug, this patch uses the mincore bits to make the final
    decision on whether to dump a page. The mincore bits reflect page
    usage status better because mincore checks the internal in-kernel
    state more deeply, while the PME bits are filled in from the
    process page table only.
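    The difference can be sketched with a small standalone program.
    This is an illustration under stated assumptions, not the patch
    itself: memfd_create() stands in for the shared memory object (the
    commit message does not say how CRIU obtains the fd), and
    untouched_remap_is_resident() is a hypothetical name. The page is
    filled through one mapping, that mapping is dropped, and mincore()
    is then asked about a second mapping that is never touched.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the mincore() residency bit (0 or 1)
 * for a shared page seen through a fresh, untouched mapping,
 * or -1 on error.
 */
int untouched_remap_is_resident(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char vec = 0;
	char *first, *second;
	int fd, ret = -1;

	assert(page_size > 0);

	/* Assumption: a memfd plays the role of the anon shmem object. */
	fd = memfd_create("shmem-sketch", 0);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, page_size) < 0)
		goto out;

	/* Fill the page through a first mapping, then drop that mapping. */
	first = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		     MAP_SHARED, fd, 0);
	if (first == MAP_FAILED)
		goto out;
	first[0] = 42;
	munmap(first, (size_t)page_size);

	/* Map the same object again and do not touch it. */
	second = mmap(NULL, (size_t)page_size, PROT_READ, MAP_SHARED, fd, 0);
	if (second == MAP_FAILED)
		goto out;

	/* mincore() reports residency even though no PTE was set up here. */
	if (mincore(second, (size_t)page_size, &vec) == 0)
		ret = vec & 1;

	munmap(second, (size_t)page_size);
out:
	close(fd);
	return ret;
}
```

    On Linux the second mapping has no page table entries yet, so a
    check based on PME_PRESENT and PME_SWAP alone would skip the page,
    while the mincore bit still marks it as in use.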
    
    Using mincore has a drawback: it doesn't work when a page is in swap.
    But that's ok for now because mincore was used before we started using
    the PME bits. Also, mincore doesn't break the page change tracking
    functionality that we now have for anon shmem.
    
    This bug can be fixed in another way. For example, we can make anon shmem
    restoration work like anon private mem restoration.
    But that fix looks much harder to implement.
    Signed-off-by: Eugene Batalov <eabatalov89@gmail.com>
    Signed-off-by: Pavel Emelyanov <xemul@virtuozzo.com>
Name
Documentation
contrib
coredump
crit
criu
images
lib
scripts
test
.gitignore
.mailmap
.travis.yml
COPYING
CREDITS
INSTALL.md
Makefile
Makefile.install
Makefile.versions
README.md