RFC: reviving mlock isolation dead code

* RFC: reviving mlock isolation dead code
@ 2010-10-30 10:16 Michel Lespinasse
  2010-10-30 12:48 ` Michel Lespinasse
  2010-11-01  7:05 ` KOSAKI Motohiro
  0 siblings, 2 replies; 10+ messages in thread
From: Michel Lespinasse @ 2010-10-30 10:16 UTC (permalink / raw)
  To: linux-mm

Hi,

The following code at the bottom of try_to_unmap_one appears to be dead:

 out_mlock:
        pte_unmap_unlock(pte, ptl);

        /*
         * We need mmap_sem locking, Otherwise VM_LOCKED check makes
         * unstable result and race. Plus, We can't wait here because
         * we now hold anon_vma->lock or mapping->i_mmap_lock.
         * if trylock failed, the page remain in evictable lru and later
         * vmscan could retry to move the page to unevictable lru if the
         * page is actually mlocked.
         */
        if (down_read_trylock(&vma->vm_mm->mmap_sem)) {
                if (vma->vm_flags & VM_LOCKED) {
                        mlock_vma_page(page);
                        ret = SWAP_MLOCK;
                }
                up_read(&vma->vm_mm->mmap_sem);
        }
        return ret;

The mmap_sem read acquire always fais here, because the mmap_sem is
held exclusively by __mlock_vma_pages_range(). By the time
__mlock_vma_pages_range() terminates (so that its caller can release
mmap_sem), all mlocked pages have been isolated already so that LRU
eviction algorithms should not encounter them (and if they do, the
pages should at least be already marked as mlocked).

I would like to resurect this, as I am seeing problems during a large
mlock (many GB). The mlock takes a long time to complete
(__mlock_vma_pages_range() is loading pages from disk), there is
memory pressure as some pages have to be evicted to make room for the
large mlock, and the LRU algorithm performs badly with the high amount
of pages still on LRU list - PageMlocked has not been set yet - while
their VMA is already VM_LOCKED.

One approach I am considering would be to modify
__mlock_vma_pages_range() and it call sites so the mmap sem is only
read-owned while __mlock_vma_pages_range() runs. The mlock handling
code in try_to_unmap_one() would then be able to acquire the
mmap_sem() and help, as it is designed to do.

Please comment if you have any concerns about this.

Thanks,

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 10+ messages in thread