* [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex @ 2012-08-18 1:06 Tim Chen 2012-08-20 16:43 ` Tim Chen 0 siblings, 1 reply; 9+ messages in thread From: Tim Chen @ 2012-08-18 1:06 UTC (permalink / raw) To: Mel Gorman, Andrew Morton, Minchan Kim, Johannes Weiner, KAMEZAWA Hiroyuki Cc: Kirill A. Shutemov, Matthew Wilcox, Andi Kleen, linux-mm, linux-kernel, Alex Shi In shrink_page_list, the call to page_referenced_file causes the acquisition/release of mapping->i_mmap_mutex for each page in the page list. However, it is very likely that successive pages in the list share the same mapping, and we can reduce the frequency of i_mmap_mutex acquisition by holding the mutex in shrink_page_list. This improves performance when the system does a lot of page reclaim on file-mapped pages, as happens when workloads use a lot of memory for page cache. Tim --- Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> --- diff --git a/include/linux/rmap.h b/include/linux/rmap.h index 4b974ae..f0174ae 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -156,8 +156,8 @@ static inline void page_dup_rmap(struct page *page) /* * Called from mm/vmscan.c to handle paging out */ -int page_referenced(struct page *, int is_locked, - struct mem_cgroup *memcg, unsigned long *vm_flags); +int page_referenced(struct page *, int is_locked, struct mem_cgroup *memcg, + unsigned long *vm_flags, int mmap_mutex_locked); int page_referenced_one(struct page *, struct vm_area_struct *, unsigned long address, unsigned int *mapcount, unsigned long *vm_flags); @@ -175,7 +175,7 @@ enum ttu_flags { bool is_vma_temporary_stack(struct vm_area_struct *vma); -int try_to_unmap(struct page *, enum ttu_flags flags, int file_mapped); +int try_to_unmap(struct page *, enum ttu_flags flags, int mmap_mutex_locked); int try_to_unmap_one(struct page *, struct vm_area_struct *, unsigned long address, enum ttu_flags flags); diff --git a/mm/migrate.c b/mm/migrate.c index 102a5b1..1aa37b1 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -918,7 +918,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, if (PageAnon(hpage)) anon_vma = page_get_anon_vma(hpage); - try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS, 0); + try_to_unmap(hpage, + TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS, 0); if (!page_mapped(hpage)) rc = move_to_new_page(new_hpage, hpage, 1, mode); diff --git a/mm/rmap.c b/mm/rmap.c index cb7de48..63fc84b 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -820,7 +820,7 @@ static int page_referenced_anon(struct page *page, */ static int page_referenced_file(struct page *page, struct mem_cgroup *memcg, - unsigned long *vm_flags) + unsigned long *vm_flags, int mmap_mutex_locked) { unsigned int mapcount; struct address_space *mapping = page->mapping; @@ -839,11 +839,14 @@ static int page_referenced_file(struct page *page, /* * The page lock not only makes sure that page->mapping cannot * suddenly be NULLified by truncation, it makes sure that the - * structure at mapping cannot be freed and reused yet. - * We should have taken mapping->i_mmap_mutex. + * structure at mapping cannot be freed and reused yet, + * so we can safely take mapping->i_mmap_mutex. */ BUG_ON(!PageLocked(page)); + if (!mmap_mutex_locked) + mutex_lock(&mapping->i_mmap_mutex); + /* * i_mmap_mutex does not stabilize mapcount at all, but mapcount * is more likely to be accurate if we note it after spinning.
@@ -867,6 +870,8 @@ static int page_referenced_file(struct page *page, break; } + if (!mmap_mutex_locked) + mutex_unlock(&mapping->i_mmap_mutex); return referenced; } @@ -883,7 +888,7 @@ static int page_referenced_file(struct page *page, int page_referenced(struct page *page, int is_locked, struct mem_cgroup *memcg, - unsigned long *vm_flags) + unsigned long *vm_flags, int mmap_mutex_locked) { int referenced = 0; int we_locked = 0; @@ -903,14 +908,9 @@ int page_referenced(struct page *page, else if (PageAnon(page)) referenced += page_referenced_anon(page, memcg, vm_flags); - else if (page->mapping) { - if (!is_locked) - mutex_lock(&page->mapping->i_mmap_mutex); + else if (page->mapping) referenced += page_referenced_file(page, memcg, - vm_flags); - if (!is_locked) - mutex_unlock(&page->mapping->i_mmap_mutex); - } + vm_flags, mmap_mutex_locked); if (we_locked) unlock_page(page); @@ -1550,7 +1550,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags) * vm_flags for that VMA. That should be OK, because that vma shouldn't be * 'LOCKED. */ -static int try_to_unmap_file(struct page *page, enum ttu_flags flags, int filemap_locked) +static int try_to_unmap_file(struct page *page, enum ttu_flags flags, + int mmap_mutex_locked) { struct address_space *mapping = page->mapping; pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); @@ -1562,7 +1563,7 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags, int filema unsigned long max_nl_size = 0; unsigned int mapcount; - if (!filemap_locked) + if (!mmap_mutex_locked) mutex_lock(&mapping->i_mmap_mutex); vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) { unsigned long address = vma_address(page, vma); @@ -1643,7 +1644,7 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags, int filema list_for_each_entry(vma, &mapping->i_mmap_nonlinear, shared.vm_set.list) vma->vm_private_data = NULL; out: - if (!filemap_locked) + if (!mmap_mutex_locked) mutex_unlock(&mapping->i_mmap_mutex); return ret; } @@ -1662,7 +1663,7 @@ out: * SWAP_FAIL - the page is unswappable * SWAP_MLOCK - page is mlocked. 
*/ -int try_to_unmap(struct page *page, enum ttu_flags flags, int filemap_locked) +int try_to_unmap(struct page *page, enum ttu_flags flags, int mmap_mutex_locked) { int ret; @@ -1674,7 +1675,7 @@ int try_to_unmap(struct page *page, enum ttu_flags flags, int filemap_locked) else if (PageAnon(page)) ret = try_to_unmap_anon(page, flags); else - ret = try_to_unmap_file(page, flags, filemap_locked); + ret = try_to_unmap_file(page, flags, mmap_mutex_locked); if (ret != SWAP_MLOCK && !page_mapped(page)) ret = SWAP_SUCCESS; return ret; diff --git a/mm/vmscan.c b/mm/vmscan.c index 83fb7f3..1af1f1c 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -27,6 +27,7 @@ buffer_heads_over_limit */ #include <linux/mm_inline.h> #include <linux/backing-dev.h> +#include <linux/ksm.h> #include <linux/rmap.h> #include <linux/topology.h> #include <linux/cpu.h> @@ -781,12 +782,14 @@ enum page_references { static enum page_references page_check_references(struct page *page, struct mem_cgroup_zone *mz, - struct scan_control *sc) + struct scan_control *sc, + int mmap_mutex_locked) { int referenced_ptes, referenced_page; unsigned long vm_flags; - referenced_ptes = page_referenced(page, 1, mz->mem_cgroup, &vm_flags); + referenced_ptes = page_referenced(page, 1, mz->mem_cgroup, &vm_flags, + mmap_mutex_locked); referenced_page = TestClearPageReferenced(page); /* Lumpy reclaim - ignore references */ @@ -856,7 +859,7 @@ static unsigned long shrink_page_list(struct list_head *page_list, unsigned long nr_congested = 0; unsigned long nr_reclaimed = 0; unsigned long nr_writeback = 0; - struct mutex *i_mmap_mutex=NULL; + struct mutex *i_mmap_mutex = NULL; cond_resched(); @@ -866,12 +869,14 @@ static unsigned long shrink_page_list(struct list_head *page_list, struct address_space *mapping; struct page *page; int may_enter_fs; + int mmap_mutex_locked; cond_resched(); page = lru_to_page(page_list); list_del(&page->lru); + mmap_mutex_locked = 0; if (!trylock_page(page)) goto keep; @@ -911,15 +916,22 @@ static unsigned long shrink_page_list(struct list_head *page_list, } if (page->mapping) { - if (i_mmap_mutex != &page->mapping->i_mmap_mutex) { - if (i_mmap_mutex) - mutex_unlock(i_mmap_mutex); - i_mmap_mutex = &page->mapping->i_mmap_mutex; - mutex_lock(i_mmap_mutex); + if (i_mmap_mutex == &page->mapping->i_mmap_mutex) { + mmap_mutex_locked = 1; + } else { + if (page_mapped(page) && page_rmapping(page) + && !PageKsm(page) && !PageAnon(page)) { + if (i_mmap_mutex) + mutex_unlock(i_mmap_mutex); + i_mmap_mutex = &page->mapping->i_mmap_mutex; + mutex_lock(i_mmap_mutex); + mmap_mutex_locked = 1; + } } } - references = page_check_references(page, mz, sc); + references = page_check_references(page, mz, sc, + mmap_mutex_locked); switch (references) { case PAGEREF_ACTIVATE: goto activate_locked; @@ -949,7 +961,8 @@ static unsigned long shrink_page_list(struct list_head *page_list, * processes. Try to unmap it here. */ if (page_mapped(page) && mapping) { - switch (try_to_unmap(page, TTU_UNMAP, 1)) { + switch (try_to_unmap(page, TTU_UNMAP, + mmap_mutex_locked)) { case SWAP_FAIL: goto activate_locked; case SWAP_AGAIN: @@ -1830,7 +1843,7 @@ static void shrink_active_list(unsigned long nr_to_scan, } } - if (page_referenced(page, 0, mz->mem_cgroup, &vm_flags)) { + if (page_referenced(page, 0, mz->mem_cgroup, &vm_flags, 0)) { nr_rotated += hpage_nr_pages(page); /* * Identify referenced, file-backed active pages and ^ permalink raw reply related [flat|nested] 9+ messages in thread
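At its core the patch is a lock-caching idiom: remember which i_mmap_mutex is currently held across iterations of the reclaim loop, and only unlock/relock when the mapping changes. The following is a minimal sketch of that idiom only, not the patch itself -- it leaves out the page-state checks the real code performs before deciding the mutex is needed, and the variable names are illustrative:

	struct mutex *held = NULL;
	struct page *page, *next;

	list_for_each_entry_safe(page, next, page_list, lru) {
		struct mutex *m = page->mapping ?
				&page->mapping->i_mmap_mutex : NULL;

		if (m != held) {
			/* mapping changed: cycle to the new mutex */
			if (held)
				mutex_unlock(held);
			held = m;
			if (held)
				mutex_lock(held);
		}
		/* reference checks and unmapping run with 'held' taken */
	}
	if (held)
		mutex_unlock(held);

Consecutive pages sharing a mapping, the common case when a large file is being reclaimed, then cost one lock/unlock pair instead of one pair per page.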
* Re: [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex 2012-08-18 1:06 [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex Tim Chen @ 2012-08-20 16:43 ` Tim Chen 2012-08-21 13:21 ` Matthew Wilcox 0 siblings, 1 reply; 9+ messages in thread From: Tim Chen @ 2012-08-20 16:43 UTC (permalink / raw) To: Mel Gorman, Andrew Morton, Minchan Kim, Johannes Weiner, KAMEZAWA Hiroyuki Cc: Matthew Wilcox, Kirill A. Shutemov, Andi Kleen, linux-mm, linux-kernel, Alex Shi On Fri, 2012-08-17 at 18:06 -0700, Tim Chen wrote: > In shrink_page_list, call to page_referenced_file will causes the > acquisition/release of mapping->i_mmap_mutex for each page in the page > list. However, it is very likely that successive pages in the list > share the same mapping and we can reduce the frequency of i_mmap_mutex > acquisition by holding the mutex in shrink_page_list. This improves the > performance when the system has a lot page reclamations for file mapped > pages if workloads are using a lot of memory for page cache. > > Tim > > --- The patch I sent previously was generated against an intermediate tree. Here's the correct version. Thanks. Tim --- Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> --- diff --git a/include/linux/rmap.h b/include/linux/rmap.h index fd07c45..f0174ae 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -156,8 +156,8 @@ static inline void page_dup_rmap(struct page *page) /* * Called from mm/vmscan.c to handle paging out */ -int page_referenced(struct page *, int is_locked, - struct mem_cgroup *memcg, unsigned long *vm_flags); +int page_referenced(struct page *, int is_locked, struct mem_cgroup *memcg, + unsigned long *vm_flags, int mmap_mutex_locked); int page_referenced_one(struct page *, struct vm_area_struct *, unsigned long address, unsigned int *mapcount, unsigned long *vm_flags); @@ -175,7 +175,7 @@ enum ttu_flags { bool is_vma_temporary_stack(struct vm_area_struct *vma); -int try_to_unmap(struct page *, enum ttu_flags flags); +int try_to_unmap(struct page *, enum ttu_flags flags, int mmap_mutex_locked); int try_to_unmap_one(struct page *, struct vm_area_struct *, unsigned long address, enum ttu_flags flags); diff --git a/mm/memory-failure.c b/mm/memory-failure.c index 97cc273..5f162c7 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -953,7 +953,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn, if (hpage != ppage) lock_page(ppage); - ret = try_to_unmap(ppage, ttu); + ret = try_to_unmap(ppage, ttu, 0); if (ret != SWAP_SUCCESS) printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n", pfn, page_mapcount(ppage)); diff --git a/mm/migrate.c b/mm/migrate.c index 1107238..1aa37b1 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -802,7 +802,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage, } /* Establish migration ptes or remove ptes */ - try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS); + try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS, 0); skip_unmap: if (!page_mapped(page)) @@ -918,7 +918,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, if (PageAnon(hpage)) anon_vma = page_get_anon_vma(hpage); - try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS); + try_to_unmap(hpage, + TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS, 0); if (!page_mapped(hpage)) rc = move_to_new_page(new_hpage, hpage, 1, mode); diff --git a/mm/rmap.c b/mm/rmap.c index 
5b5ad58..63fc84b 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -820,7 +820,7 @@ static int page_referenced_anon(struct page *page, */ static int page_referenced_file(struct page *page, struct mem_cgroup *memcg, - unsigned long *vm_flags) + unsigned long *vm_flags, int mmap_mutex_locked) { unsigned int mapcount; struct address_space *mapping = page->mapping; @@ -844,7 +844,8 @@ static int page_referenced_file(struct page *page, */ BUG_ON(!PageLocked(page)); - mutex_lock(&mapping->i_mmap_mutex); + if (!mmap_mutex_locked) + mutex_lock(&mapping->i_mmap_mutex); /* * i_mmap_mutex does not stabilize mapcount at all, but mapcount @@ -869,7 +870,8 @@ static int page_referenced_file(struct page *page, break; } - mutex_unlock(&mapping->i_mmap_mutex); + if (!mmap_mutex_locked) + mutex_unlock(&mapping->i_mmap_mutex); return referenced; } @@ -886,7 +888,7 @@ static int page_referenced_file(struct page *page, int page_referenced(struct page *page, int is_locked, struct mem_cgroup *memcg, - unsigned long *vm_flags) + unsigned long *vm_flags, int mmap_mutex_locked) { int referenced = 0; int we_locked = 0; @@ -908,7 +910,7 @@ int page_referenced(struct page *page, vm_flags); else if (page->mapping) referenced += page_referenced_file(page, memcg, - vm_flags); + vm_flags, mmap_mutex_locked); if (we_locked) unlock_page(page); @@ -1548,7 +1550,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags) * vm_flags for that VMA. That should be OK, because that vma shouldn't be * 'LOCKED. */ -static int try_to_unmap_file(struct page *page, enum ttu_flags flags) +static int try_to_unmap_file(struct page *page, enum ttu_flags flags, + int mmap_mutex_locked) { struct address_space *mapping = page->mapping; pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); @@ -1560,7 +1563,8 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags) unsigned long max_nl_size = 0; unsigned int mapcount; - mutex_lock(&mapping->i_mmap_mutex); + if (!mmap_mutex_locked) + mutex_lock(&mapping->i_mmap_mutex); vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) { unsigned long address = vma_address(page, vma); if (address == -EFAULT) @@ -1640,7 +1644,8 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags) list_for_each_entry(vma, &mapping->i_mmap_nonlinear, shared.vm_set.list) vma->vm_private_data = NULL; out: - mutex_unlock(&mapping->i_mmap_mutex); + if (!mmap_mutex_locked) + mutex_unlock(&mapping->i_mmap_mutex); return ret; } @@ -1658,7 +1663,7 @@ out: * SWAP_FAIL - the page is unswappable * SWAP_MLOCK - page is mlocked. 
*/ -int try_to_unmap(struct page *page, enum ttu_flags flags) +int try_to_unmap(struct page *page, enum ttu_flags flags, int mmap_mutex_locked) { int ret; @@ -1670,7 +1675,7 @@ int try_to_unmap(struct page *page, enum ttu_flags flags) else if (PageAnon(page)) ret = try_to_unmap_anon(page, flags); else - ret = try_to_unmap_file(page, flags); + ret = try_to_unmap_file(page, flags, mmap_mutex_locked); if (ret != SWAP_MLOCK && !page_mapped(page)) ret = SWAP_SUCCESS; return ret; @@ -1700,7 +1705,7 @@ int try_to_munlock(struct page *page) else if (PageAnon(page)) return try_to_unmap_anon(page, TTU_MUNLOCK); else - return try_to_unmap_file(page, TTU_MUNLOCK); + return try_to_unmap_file(page, TTU_MUNLOCK, 0); } void __put_anon_vma(struct anon_vma *anon_vma) diff --git a/mm/vmscan.c b/mm/vmscan.c index a538565..1af1f1c 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -27,6 +27,7 @@ buffer_heads_over_limit */ #include <linux/mm_inline.h> #include <linux/backing-dev.h> +#include <linux/ksm.h> #include <linux/rmap.h> #include <linux/topology.h> #include <linux/cpu.h> @@ -781,12 +782,14 @@ enum page_references { static enum page_references page_check_references(struct page *page, struct mem_cgroup_zone *mz, - struct scan_control *sc) + struct scan_control *sc, + int mmap_mutex_locked) { int referenced_ptes, referenced_page; unsigned long vm_flags; - referenced_ptes = page_referenced(page, 1, mz->mem_cgroup, &vm_flags); + referenced_ptes = page_referenced(page, 1, mz->mem_cgroup, &vm_flags, + mmap_mutex_locked); referenced_page = TestClearPageReferenced(page); /* Lumpy reclaim - ignore references */ @@ -856,6 +859,7 @@ static unsigned long shrink_page_list(struct list_head *page_list, unsigned long nr_congested = 0; unsigned long nr_reclaimed = 0; unsigned long nr_writeback = 0; + struct mutex *i_mmap_mutex = NULL; cond_resched(); @@ -865,12 +869,14 @@ static unsigned long shrink_page_list(struct list_head *page_list, struct address_space *mapping; struct page *page; int may_enter_fs; + int mmap_mutex_locked; cond_resched(); page = lru_to_page(page_list); list_del(&page->lru); + mmap_mutex_locked = 0; if (!trylock_page(page)) goto keep; @@ -909,7 +915,23 @@ static unsigned long shrink_page_list(struct list_head *page_list, } } - references = page_check_references(page, mz, sc); + if (page->mapping) { + if (i_mmap_mutex == &page->mapping->i_mmap_mutex) { + mmap_mutex_locked = 1; + } else { + if (page_mapped(page) && page_rmapping(page) + && !PageKsm(page) && !PageAnon(page)) { + if (i_mmap_mutex) + mutex_unlock(i_mmap_mutex); + i_mmap_mutex = &page->mapping->i_mmap_mutex; + mutex_lock(i_mmap_mutex); + mmap_mutex_locked = 1; + } + } + } + + references = page_check_references(page, mz, sc, + mmap_mutex_locked); switch (references) { case PAGEREF_ACTIVATE: goto activate_locked; @@ -939,7 +961,8 @@ static unsigned long shrink_page_list(struct list_head *page_list, * processes. Try to unmap it here. 
*/ if (page_mapped(page) && mapping) { - switch (try_to_unmap(page, TTU_UNMAP)) { + switch (try_to_unmap(page, TTU_UNMAP, + mmap_mutex_locked)) { case SWAP_FAIL: goto activate_locked; case SWAP_AGAIN: @@ -1090,6 +1113,8 @@ keep_lumpy: VM_BUG_ON(PageLRU(page) || PageUnevictable(page)); } + if (i_mmap_mutex) + mutex_unlock(i_mmap_mutex); nr_reclaimed += __remove_mapping_batch(&unmap_pages, &ret_pages, &free_pages); @@ -1818,7 +1843,7 @@ static void shrink_active_list(unsigned long nr_to_scan, } } - if (page_referenced(page, 0, mz->mem_cgroup, &vm_flags)) { + if (page_referenced(page, 0, mz->mem_cgroup, &vm_flags, 0)) { nr_rotated += hpage_nr_pages(page); /* * Identify referenced, file-backed active pages and ^ permalink raw reply related [flat|nested] 9+ messages in thread
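To put a rough number on the intended saving: shrink_page_list() is typically handed a batch of SWAP_CLUSTER_MAX pages (32 in kernels of this vintage -- stated here as an assumption), and under heavy file reclaim those pages frequently all belong to one mapping. A back-of-the-envelope comparison:

	/*
	 * 32 pages on the list, all file-backed by the same mapping:
	 *
	 *   before: page_referenced_file() and try_to_unmap_file()
	 *           each take and drop i_mmap_mutex per page
	 *           => up to 2 * 32 = 64 lock/unlock pairs
	 *
	 *   after:  shrink_page_list() holds the mutex across the
	 *           whole run of pages => 1 lock/unlock pair
	 */

The win grows with contention, since every acquisition of a busy mutex is a potential cache-line bounce and sleep/wakeup.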
* Re: [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex 2012-08-20 16:43 ` Tim Chen @ 2012-08-21 13:21 ` Matthew Wilcox 2012-08-21 17:51 ` Tim Chen ` (2 more replies) 0 siblings, 3 replies; 9+ messages in thread From: Matthew Wilcox @ 2012-08-21 13:21 UTC (permalink / raw) To: Tim Chen Cc: Mel Gorman, Andrew Morton, Minchan Kim, Johannes Weiner, KAMEZAWA Hiroyuki, Kirill A. Shutemov, Andi Kleen, linux-mm, linux-kernel, Alex Shi On Mon, Aug 20, 2012 at 09:43:02AM -0700, Tim Chen wrote: > > In shrink_page_list, call to page_referenced_file will causes the > > acquisition/release of mapping->i_mmap_mutex for each page in the page > > list. However, it is very likely that successive pages in the list > > share the same mapping and we can reduce the frequency of i_mmap_mutex > > acquisition by holding the mutex in shrink_page_list. This improves the > > performance when the system has a lot page reclamations for file mapped > > pages if workloads are using a lot of memory for page cache. Is there a (performant) way to avoid passing around the 'mmap_mutex_locked' state? For example, does it hurt to have all the callers hold i_mmap_mutex over the entire call, or do we rely on being able to execute large chunks of this in parallel? Here's what I'm thinking: 1. Rename the existing page_referenced implementation to __page_referenced(). 2. Add: int needs_page_mmap_mutex(struct page *page) { return page->mapping && page_mapped(page) && page_rmapping(page) && !PageKsm(page) && !PageAnon(page); } int page_referenced(struct page *page, int is_locked, struct mem_cgroup *memcg, unsigned long *vm_flags) { int result, needs_lock; needs_lock = needs_page_mmap_mutex(page); if (needs_lock) mutex_lock(&page->mapping->i_mmap_mutex); result = __page_referenced(page, is_locked, memcg, vm_flags); if (needs_lock) mutex_unlock(&page->mapping->i_mmap_mutex); return result; } 3. Rename the existing try_to_unmap() to __try_to_unmap(). 4. Add: int try_to_unmap(struct page *page, enum ttu_flags flags) { int result, needs_lock; needs_lock = needs_page_mmap_mutex(page); if (needs_lock) mutex_lock(&page->mapping->i_mmap_mutex); result = __try_to_unmap(page, flags); if (needs_lock) mutex_unlock(&page->mapping->i_mmap_mutex); return result; } 5. Change page_check_references to always call __page_referenced (since it now always holds the mutex) 6. Replace the mutex_lock() calls in page_referenced_file() and try_to_unmap_file() with BUG_ON(!mutex_is_locked(&mapping->i_mmap_mutex)); 7.
I think you can simplify this: > - references = page_check_references(page, mz, sc); > + if (page->mapping) { > + if (i_mmap_mutex == &page->mapping->i_mmap_mutex) { > + mmap_mutex_locked = 1; > + } else { > + if (page_mapped(page) && page_rmapping(page) > + && !PageKsm(page) && !PageAnon(page)) { > + if (i_mmap_mutex) > + mutex_unlock(i_mmap_mutex); > + i_mmap_mutex = &page->mapping->i_mmap_mutex; > + mutex_lock(i_mmap_mutex); > + mmap_mutex_locked = 1; > + } > + } > + } > + > + references = page_check_references(page, mz, sc, > + mmap_mutex_locked); to just: + if (needs_page_mmap_mutex(page) && + i_mmap_mutex != &page->mapping->i_mmap_mutex) { + if (i_mmap_mutex) + mutex_unlock(i_mmap_mutex); + i_mmap_mutex = &page->mapping->i_mmap_mutex; + mutex_lock(i_mmap_mutex); + } references = page_check_references(page, mz, sc); The only clunky bit would seem to be this bit: > if (page_mapped(page) && mapping) { > - switch (try_to_unmap(page, TTU_UNMAP)) { > + switch (try_to_unmap(page, TTU_UNMAP, > + mmap_mutex_locked)) { Which I think has to look like this: if (page_mapped(page) && mapping) { - switch (try_to_unmap(page, TTU_UNMAP)) { + int result; + if (i_mmap_mutex) + result = __try_to_unmap(page, TTU_UNMAP); + else + result = try_to_unmap(page, TTU_UNMAP); + switch (result) { Either that, or use a function pointer. But ... bleugh. > diff --git a/include/linux/rmap.h b/include/linux/rmap.h > index fd07c45..f0174ae 100644 > --- a/include/linux/rmap.h > +++ b/include/linux/rmap.h > @@ -156,8 +156,8 @@ static inline void page_dup_rmap(struct page *page) > /* > * Called from mm/vmscan.c to handle paging out > */ > -int page_referenced(struct page *, int is_locked, > - struct mem_cgroup *memcg, unsigned long *vm_flags); > +int page_referenced(struct page *, int is_locked, struct mem_cgroup *memcg, > + unsigned long *vm_flags, int mmap_mutex_locked); > int page_referenced_one(struct page *, struct vm_area_struct *, > unsigned long address, unsigned int *mapcount, unsigned long *vm_flags); > > @@ -175,7 +175,7 @@ enum ttu_flags { > > bool is_vma_temporary_stack(struct vm_area_struct *vma); > > -int try_to_unmap(struct page *, enum ttu_flags flags); > +int try_to_unmap(struct page *, enum ttu_flags flags, int mmap_mutex_locked); > int try_to_unmap_one(struct page *, struct vm_area_struct *, > unsigned long address, enum ttu_flags flags); > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c > index 97cc273..5f162c7 100644 > --- a/mm/memory-failure.c > +++ b/mm/memory-failure.c > @@ -953,7 +953,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn, > if (hpage != ppage) > lock_page(ppage); > > - ret = try_to_unmap(ppage, ttu); > + ret = try_to_unmap(ppage, ttu, 0); > if (ret != SWAP_SUCCESS) > printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n", > pfn, page_mapcount(ppage)); > diff --git a/mm/migrate.c b/mm/migrate.c > index 1107238..1aa37b1 100644 > --- a/mm/migrate.c > +++ b/mm/migrate.c > @@ -802,7 +802,7 @@ static int __unmap_and_move(struct page *page, struct page *newpage, > } > > /* Establish migration ptes or remove ptes */ > - try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS); > + try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS, 0); > > skip_unmap: > if (!page_mapped(page)) > @@ -918,7 +918,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page, > if (PageAnon(hpage)) > anon_vma = page_get_anon_vma(hpage); > > - try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS); > + try_to_unmap(hpage, > + 
TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS, 0); > > if (!page_mapped(hpage)) > rc = move_to_new_page(new_hpage, hpage, 1, mode); > diff --git a/mm/rmap.c b/mm/rmap.c > index 5b5ad58..63fc84b 100644 > --- a/mm/rmap.c > +++ b/mm/rmap.c > @@ -820,7 +820,7 @@ static int page_referenced_anon(struct page *page, > */ > static int page_referenced_file(struct page *page, > struct mem_cgroup *memcg, > - unsigned long *vm_flags) > + unsigned long *vm_flags, int mmap_mutex_locked) > { > unsigned int mapcount; > struct address_space *mapping = page->mapping; > @@ -844,7 +844,8 @@ static int page_referenced_file(struct page *page, > */ > BUG_ON(!PageLocked(page)); > > - mutex_lock(&mapping->i_mmap_mutex); > + if (!mmap_mutex_locked) > + mutex_lock(&mapping->i_mmap_mutex); > > /* > * i_mmap_mutex does not stabilize mapcount at all, but mapcount > @@ -869,7 +870,8 @@ static int page_referenced_file(struct page *page, > break; > } > > - mutex_unlock(&mapping->i_mmap_mutex); > + if (!mmap_mutex_locked) > + mutex_unlock(&mapping->i_mmap_mutex); > return referenced; > } > > @@ -886,7 +888,7 @@ static int page_referenced_file(struct page *page, > int page_referenced(struct page *page, > int is_locked, > struct mem_cgroup *memcg, > - unsigned long *vm_flags) > + unsigned long *vm_flags, int mmap_mutex_locked) > { > int referenced = 0; > int we_locked = 0; > @@ -908,7 +910,7 @@ int page_referenced(struct page *page, > vm_flags); > else if (page->mapping) > referenced += page_referenced_file(page, memcg, > - vm_flags); > + vm_flags, mmap_mutex_locked); > if (we_locked) > unlock_page(page); > > @@ -1548,7 +1550,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags) > * vm_flags for that VMA. That should be OK, because that vma shouldn't be > * 'LOCKED. > */ > -static int try_to_unmap_file(struct page *page, enum ttu_flags flags) > +static int try_to_unmap_file(struct page *page, enum ttu_flags flags, > + int mmap_mutex_locked) > { > struct address_space *mapping = page->mapping; > pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT); > @@ -1560,7 +1563,8 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags) > unsigned long max_nl_size = 0; > unsigned int mapcount; > > - mutex_lock(&mapping->i_mmap_mutex); > + if (!mmap_mutex_locked) > + mutex_lock(&mapping->i_mmap_mutex); > vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) { > unsigned long address = vma_address(page, vma); > if (address == -EFAULT) > @@ -1640,7 +1644,8 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags) > list_for_each_entry(vma, &mapping->i_mmap_nonlinear, shared.vm_set.list) > vma->vm_private_data = NULL; > out: > - mutex_unlock(&mapping->i_mmap_mutex); > + if (!mmap_mutex_locked) > + mutex_unlock(&mapping->i_mmap_mutex); > return ret; > } > > @@ -1658,7 +1663,7 @@ out: > * SWAP_FAIL - the page is unswappable > * SWAP_MLOCK - page is mlocked. 
> */ > -int try_to_unmap(struct page *page, enum ttu_flags flags) > +int try_to_unmap(struct page *page, enum ttu_flags flags, int mmap_mutex_locked) > { > int ret; > > @@ -1670,7 +1675,7 @@ int try_to_unmap(struct page *page, enum ttu_flags flags) > else if (PageAnon(page)) > ret = try_to_unmap_anon(page, flags); > else > - ret = try_to_unmap_file(page, flags); > + ret = try_to_unmap_file(page, flags, mmap_mutex_locked); > if (ret != SWAP_MLOCK && !page_mapped(page)) > ret = SWAP_SUCCESS; > return ret; > @@ -1700,7 +1705,7 @@ int try_to_munlock(struct page *page) > else if (PageAnon(page)) > return try_to_unmap_anon(page, TTU_MUNLOCK); > else > - return try_to_unmap_file(page, TTU_MUNLOCK); > + return try_to_unmap_file(page, TTU_MUNLOCK, 0); > } > > void __put_anon_vma(struct anon_vma *anon_vma) > diff --git a/mm/vmscan.c b/mm/vmscan.c > index a538565..1af1f1c 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -27,6 +27,7 @@ > buffer_heads_over_limit */ > #include <linux/mm_inline.h> > #include <linux/backing-dev.h> > +#include <linux/ksm.h> > #include <linux/rmap.h> > #include <linux/topology.h> > #include <linux/cpu.h> > @@ -781,12 +782,14 @@ enum page_references { > > static enum page_references page_check_references(struct page *page, > struct mem_cgroup_zone *mz, > - struct scan_control *sc) > + struct scan_control *sc, > + int mmap_mutex_locked) > { > int referenced_ptes, referenced_page; > unsigned long vm_flags; > > - referenced_ptes = page_referenced(page, 1, mz->mem_cgroup, &vm_flags); > + referenced_ptes = page_referenced(page, 1, mz->mem_cgroup, &vm_flags, > + mmap_mutex_locked); > referenced_page = TestClearPageReferenced(page); > > /* Lumpy reclaim - ignore references */ > @@ -856,6 +859,7 @@ static unsigned long shrink_page_list(struct list_head *page_list, > unsigned long nr_congested = 0; > unsigned long nr_reclaimed = 0; > unsigned long nr_writeback = 0; > + struct mutex *i_mmap_mutex = NULL; > > cond_resched(); > > @@ -865,12 +869,14 @@ static unsigned long shrink_page_list(struct list_head *page_list, > struct address_space *mapping; > struct page *page; > int may_enter_fs; > + int mmap_mutex_locked; > > cond_resched(); > > page = lru_to_page(page_list); > list_del(&page->lru); > > + mmap_mutex_locked = 0; > if (!trylock_page(page)) > goto keep; > > @@ -909,7 +915,23 @@ static unsigned long shrink_page_list(struct list_head *page_list, > } > } > > - references = page_check_references(page, mz, sc); > + if (page->mapping) { > + if (i_mmap_mutex == &page->mapping->i_mmap_mutex) { > + mmap_mutex_locked = 1; > + } else { > + if (page_mapped(page) && page_rmapping(page) > + && !PageKsm(page) && !PageAnon(page)) { > + if (i_mmap_mutex) > + mutex_unlock(i_mmap_mutex); > + i_mmap_mutex = &page->mapping->i_mmap_mutex; > + mutex_lock(i_mmap_mutex); > + mmap_mutex_locked = 1; > + } > + } > + } > + > + references = page_check_references(page, mz, sc, > + mmap_mutex_locked); > switch (references) { > case PAGEREF_ACTIVATE: > goto activate_locked; > @@ -939,7 +961,8 @@ static unsigned long shrink_page_list(struct list_head *page_list, > * processes. Try to unmap it here. 
> */ > if (page_mapped(page) && mapping) { > - switch (try_to_unmap(page, TTU_UNMAP)) { > + switch (try_to_unmap(page, TTU_UNMAP, > + mmap_mutex_locked)) { > case SWAP_FAIL: > goto activate_locked; > case SWAP_AGAIN: > @@ -1090,6 +1113,8 @@ keep_lumpy: > VM_BUG_ON(PageLRU(page) || PageUnevictable(page)); > } > > + if (i_mmap_mutex) > + mutex_unlock(i_mmap_mutex); > nr_reclaimed += __remove_mapping_batch(&unmap_pages, &ret_pages, > &free_pages); > > @@ -1818,7 +1843,7 @@ static void shrink_active_list(unsigned long nr_to_scan, > } > } > > - if (page_referenced(page, 0, mz->mem_cgroup, &vm_flags)) { > + if (page_referenced(page, 0, mz->mem_cgroup, &vm_flags, 0)) { > nr_rotated += hpage_nr_pages(page); > /* > * Identify referenced, file-backed active pages and > ^ permalink raw reply [flat|nested] 9+ messages in thread
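For reference, this is how the proposed split would look from the two kinds of call sites. The snippet is hypothetical caller code illustrating the shape of the API, not something taken from a patch in this thread:

	/* Ordinary call site: the wrapper takes the mutex itself. */
	referenced = page_referenced(page, 0, memcg, &vm_flags);

	/*
	 * Batched call site (shrink_page_list): the loop already holds
	 * the right i_mmap_mutex, so the lock-free variants are called
	 * directly; the BUG_ON() from step 6 documents the contract.
	 */
	referenced = __page_referenced(page, 1, memcg, &vm_flags);
	ret = __try_to_unmap(page, TTU_UNMAP);

The attraction of the scheme is that the locking rule becomes a property of the function name rather than a flag threaded through every caller.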
* Re: [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex 2012-08-21 13:21 ` Matthew Wilcox @ 2012-08-21 17:51 ` Tim Chen 2012-08-21 22:53 ` Tim Chen 2012-08-22 0:48 ` Tim Chen 2 siblings, 0 replies; 9+ messages in thread From: Tim Chen @ 2012-08-21 17:51 UTC (permalink / raw) To: Matthew Wilcox Cc: Mel Gorman, Andrew Morton, Minchan Kim, Johannes Weiner, KAMEZAWA Hiroyuki, Kirill A. Shutemov, Andi Kleen, linux-mm, linux-kernel, Alex Shi On Tue, 2012-08-21 at 09:21 -0400, Matthew Wilcox wrote: > Is there a (performant) way to avoid passing around the > 'mmap_mutex_locked' state? > > For example, does it hurt to have all the callers hold the i_mmap_mutex() > over the entire call, or do we rely on being able to execute large chunks > of this in parallel? > > Here's what I'm thinking: > > 1. Rename the existing page_referenced implementation to __page_referenced(). > 2. Add: > > int needs_page_mmap_mutex(struct page *page) > { > return page->mapping && page_mapped(page) && page_rmapping(page) && > !PageKsm(page) && !PageAnon(page); > } > > int page_referenced(struct page *page, int is_locked, struct mem_cgroup *memcg, > unsigned long *vm_flags) > { > int result, needs_lock; > > needs_lock = needs_page_mmap_mutex(page); > if (needs_lock) > mutex_lock(&page->mapping->i_mmap_mutex); > result = __page_referenced(page, is_locked, memcg, vm_flags); > if (needs_lock) > mutex_unlock(&page->mapping->i_mmap_mutex); > return result; > } > > 3. Rename the existing try_to_unmap() to __try_to_unmap() > 4. Add: > > int try_to_unmap(struct page *page, enum ttu_flags flags) > { > int result, needs_lock; > > needs_lock = needs_page_mmap_mutex(page); > if (needs_lock) > mutex_lock(&page->mapping->i_mmap_mutex); > result = __try_to_unmap(page, is_locked, memcg, vm_flags); > if (needs_lock) > mutex_unlock(&page->mapping->i_mmap_mutex); > return result; > } > > 5. Change page_check_references to always call __page_referenced (since it > now always holds the mutex) > 6. Replace the mutex_lock() calls in page_referenced_file() and > try_to_unmap_file() with > BUG_ON(!mutex_is_locked(&mapping->i_mmap_mutex)); > 7. I think you can simplify this: I like your proposal and will try to test with a new patch along your suggestions. Though I will be out the rest of the week and may be delayed a bit getting the testing completed. Tim ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex 2012-08-21 13:21 ` Matthew Wilcox 2012-08-21 17:51 ` Tim Chen @ 2012-08-21 22:53 ` Tim Chen 2012-08-22 0:48 ` Tim Chen 2 siblings, 0 replies; 9+ messages in thread From: Tim Chen @ 2012-08-21 22:53 UTC (permalink / raw) To: Matthew Wilcox Cc: Mel Gorman, Andrew Morton, Minchan Kim, Johannes Weiner, KAMEZAWA Hiroyuki, Kirill A. Shutemov, Andi Kleen, linux-mm, linux-kernel, Alex Shi On Tue, 2012-08-21 at 09:21 -0400, Matthew Wilcox wrote: > > The only clunky bit would seem to be this bit: > > > if (page_mapped(page) && mapping) { > > - switch (try_to_unmap(page, TTU_UNMAP)) { > > + switch (try_to_unmap(page, TTU_UNMAP, > > + mmap_mutex_locked)) { > > Which I think has to look like this: > > if (page_mapped(page) && mapping) { > - switch (try_to_unmap(page, TTU_UNMAP)) { > + int result; > + if (i_mmap_mutex) > + result = __try_to_unmap(page, TTU_UNMAP); > + else > + result = try_to_unmap(page, TTU_UNMAP); > + switch (result) { > I think - switch (try_to_unmap(page, TTU_UNMAP)) { + switch (__try_to_unmap(page, TTU_UNMAP)) { should be enough when your changes are adopted. Because if the page mmap mutex needs to be locked, we will have locked it here before __try_to_unmap gets used. + if (needs_page_mmap_mutex(page) && + i_mmap_mutex != &page->mapping->i_mmap_mutex) { + if (i_mmap_mutex) + mutex_unlock(i_mmap_mutex); + i_mmap_mutex = &page->mapping->i_mmap_mutex; + mutex_lock(i_mmap_mutex); + } Tim ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex 2012-08-21 13:21 ` Matthew Wilcox 2012-08-21 17:51 ` Tim Chen 2012-08-21 22:53 ` Tim Chen @ 2012-08-22 0:48 ` Tim Chen 2012-08-23 18:33 ` Matthew Wilcox 2012-09-04 15:21 ` Tim Chen 2 siblings, 2 replies; 9+ messages in thread From: Tim Chen @ 2012-08-22 0:48 UTC (permalink / raw) To: Matthew Wilcox Cc: Mel Gorman, Andrew Morton, Minchan Kim, Johannes Weiner, KAMEZAWA Hiroyuki, Kirill A. Shutemov, Andi Kleen, linux-mm, linux-kernel, Alex Shi On Tue, 2012-08-21 at 09:21 -0400, Matthew Wilcox wrote: > On Mon, Aug 20, 2012 at 09:43:02AM -0700, Tim Chen wrote: > > > In shrink_page_list, call to page_referenced_file will causes the > > > acquisition/release of mapping->i_mmap_mutex for each page in the page > > > list. However, it is very likely that successive pages in the list > > > share the same mapping and we can reduce the frequency of i_mmap_mutex > > > acquisition by holding the mutex in shrink_page_list. This improves the > > > performance when the system has a lot page reclamations for file mapped > > > pages if workloads are using a lot of memory for page cache. > > Is there a (performant) way to avoid passing around the > 'mmap_mutex_locked' state? > > For example, does it hurt to have all the callers hold the i_mmap_mutex() > over the entire call, or do we rely on being able to execute large chunks > of this in parallel? > > Here's what I'm thinking: > > 1. Rename the existing page_referenced implementation to __page_referenced(). > 2. Add: > > int needs_page_mmap_mutex(struct page *page) > { > return page->mapping && page_mapped(page) && page_rmapping(page) && > !PageKsm(page) && !PageAnon(page); > } > > int page_referenced(struct page *page, int is_locked, struct mem_cgroup *memcg, > unsigned long *vm_flags) > { > int result, needs_lock; > > needs_lock = needs_page_mmap_mutex(page); > if (needs_lock) > mutex_lock(&page->mapping->i_mmap_mutex); > result = __page_referenced(page, is_locked, memcg, vm_flags); > if (needs_lock) > mutex_unlock(&page->mapping->i_mmap_mutex); > return result; > } > > 3. Rename the existing try_to_unmap() to __try_to_unmap() > 4. Add: > > int try_to_unmap(struct page *page, enum ttu_flags flags) > { > int result, needs_lock; > > needs_lock = needs_page_mmap_mutex(page); > if (needs_lock) > mutex_lock(&page->mapping->i_mmap_mutex); > result = __try_to_unmap(page, is_locked, memcg, vm_flags); > if (needs_lock) > mutex_unlock(&page->mapping->i_mmap_mutex); > return result; > } > > 5. Change page_check_references to always call __page_referenced (since it > now always holds the mutex) > 6. Replace the mutex_lock() calls in page_referenced_file() and > try_to_unmap_file() with > BUG_ON(!mutex_is_locked(&mapping->i_mmap_mutex)); > 7. I think you can simplify this: Thanks to Matthew's suggestions on improving the patch. Here's the updated version. It seems to be sane when I booted my machine up. I will put it through more testing when I get a chance. 
Tim --- Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> --- diff --git a/include/linux/rmap.h b/include/linux/rmap.h index fd07c45..f1320b1 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -156,8 +156,11 @@ static inline void page_dup_rmap(struct page *page) /* * Called from mm/vmscan.c to handle paging out */ -int page_referenced(struct page *, int is_locked, - struct mem_cgroup *memcg, unsigned long *vm_flags); +int needs_page_mmap_mutex(struct page *page); +int page_referenced(struct page *, int is_locked, struct mem_cgroup *memcg, + unsigned long *vm_flags); +int __page_referenced(struct page *, int is_locked, struct mem_cgroup *memcg, + unsigned long *vm_flags); int page_referenced_one(struct page *, struct vm_area_struct *, unsigned long address, unsigned int *mapcount, unsigned long *vm_flags); @@ -176,6 +179,7 @@ enum ttu_flags { bool is_vma_temporary_stack(struct vm_area_struct *vma); int try_to_unmap(struct page *, enum ttu_flags flags); +int __try_to_unmap(struct page *, enum ttu_flags flags); int try_to_unmap_one(struct page *, struct vm_area_struct *, unsigned long address, enum ttu_flags flags); diff --git a/mm/rmap.c b/mm/rmap.c index 5b5ad58..ca8cd21 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -843,8 +843,7 @@ static int page_referenced_file(struct page *page, * so we can safely take mapping->i_mmap_mutex. */ BUG_ON(!PageLocked(page)); - - mutex_lock(&mapping->i_mmap_mutex); + BUG_ON(!mutex_is_locked(&mapping->i_mmap_mutex)); /* * i_mmap_mutex does not stabilize mapcount at all, but mapcount @@ -869,21 +868,16 @@ static int page_referenced_file(struct page *page, break; } - mutex_unlock(&mapping->i_mmap_mutex); return referenced; } -/** - * page_referenced - test if the page was referenced - * @page: the page to test - * @is_locked: caller holds lock on the page - * @memcg: target memory cgroup - * @vm_flags: collect encountered vma->vm_flags who actually referenced the page - * - * Quick test_and_clear_referenced for all mappings to a page, - * returns the number of ptes which referenced the page. - */ -int page_referenced(struct page *page, +int needs_page_mmap_mutex(struct page *page) +{ + return page->mapping && page_mapped(page) && page_rmapping(page) && + !PageKsm(page) && !PageAnon(page); +} + +int __page_referenced(struct page *page, int is_locked, struct mem_cgroup *memcg, unsigned long *vm_flags) @@ -919,6 +913,32 @@ out: return referenced; } +/** + * page_referenced - test if the page was referenced + * @page: the page to test + * @is_locked: caller holds lock on the page + * @memcg: target memory cgroup + * @vm_flags: collect encountered vma->vm_flags who actually referenced the page + * + * Quick test_and_clear_referenced for all mappings to a page, + * returns the number of ptes which referenced the page. 
+ */ +int page_referenced(struct page *page, + int is_locked, + struct mem_cgroup *memcg, + unsigned long *vm_flags) +{ + int result, needs_lock; + + needs_lock = needs_page_mmap_mutex(page); + if (needs_lock) + mutex_lock(&page->mapping->i_mmap_mutex); + result = __page_referenced(page, is_locked, memcg, vm_flags); + if (needs_lock) + mutex_unlock(&page->mapping->i_mmap_mutex); + return result; +} + static int page_mkclean_one(struct page *page, struct vm_area_struct *vma, unsigned long address) { @@ -1560,7 +1580,7 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags) unsigned long max_nl_size = 0; unsigned int mapcount; - mutex_lock(&mapping->i_mmap_mutex); + BUG_ON(!mutex_is_locked(&mapping->i_mmap_mutex)); vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) { unsigned long address = vma_address(page, vma); if (address == -EFAULT) @@ -1640,7 +1660,24 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags) list_for_each_entry(vma, &mapping->i_mmap_nonlinear, shared.vm_set.list) vma->vm_private_data = NULL; out: - mutex_unlock(&mapping->i_mmap_mutex); + return ret; +} + +int __try_to_unmap(struct page *page, enum ttu_flags flags) +{ + int ret; + + BUG_ON(!PageLocked(page)); + VM_BUG_ON(!PageHuge(page) && PageTransHuge(page)); + + if (unlikely(PageKsm(page))) + ret = try_to_unmap_ksm(page, flags); + else if (PageAnon(page)) + ret = try_to_unmap_anon(page, flags); + else + ret = try_to_unmap_file(page, flags); + if (ret != SWAP_MLOCK && !page_mapped(page)) + ret = SWAP_SUCCESS; return ret; } @@ -1660,20 +1697,27 @@ out: */ int try_to_unmap(struct page *page, enum ttu_flags flags) { - int ret; + int result, needs_lock; + + needs_lock = needs_page_mmap_mutex(page); + if (needs_lock) + mutex_lock(&page->mapping->i_mmap_mutex); + result = __try_to_unmap(page, flags); + if (needs_lock) + mutex_unlock(&page->mapping->i_mmap_mutex); + return result; +} - BUG_ON(!PageLocked(page)); - VM_BUG_ON(!PageHuge(page) && PageTransHuge(page)); +int __try_to_munlock(struct page *page) +{ + VM_BUG_ON(!PageLocked(page) || PageLRU(page)); if (unlikely(PageKsm(page))) - ret = try_to_unmap_ksm(page, flags); + return try_to_unmap_ksm(page, TTU_MUNLOCK); else if (PageAnon(page)) - ret = try_to_unmap_anon(page, flags); + return try_to_unmap_anon(page, TTU_MUNLOCK); else - ret = try_to_unmap_file(page, flags); - if (ret != SWAP_MLOCK && !page_mapped(page)) - ret = SWAP_SUCCESS; - return ret; + return try_to_unmap_file(page, TTU_MUNLOCK); } /** @@ -1693,14 +1737,15 @@ int try_to_unmap(struct page *page, enum ttu_flags flags) */ int try_to_munlock(struct page *page) { - VM_BUG_ON(!PageLocked(page) || PageLRU(page)); - - if (unlikely(PageKsm(page))) - return try_to_unmap_ksm(page, TTU_MUNLOCK); - else if (PageAnon(page)) - return try_to_unmap_anon(page, TTU_MUNLOCK); - else - return try_to_unmap_file(page, TTU_MUNLOCK); + int result, needs_lock; + + needs_lock = needs_page_mmap_mutex(page); + if (needs_lock) + mutex_lock(&page->mapping->i_mmap_mutex); + result = __try_to_munlock(page); + if (needs_lock) + mutex_unlock(&page->mapping->i_mmap_mutex); + return result; } void __put_anon_vma(struct anon_vma *anon_vma) diff --git a/mm/vmscan.c b/mm/vmscan.c index d4ab646..74a5fd0 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -27,6 +27,7 @@ buffer_heads_over_limit */ #include <linux/mm_inline.h> #include <linux/backing-dev.h> +#include <linux/ksm.h> #include <linux/rmap.h> #include <linux/topology.h> #include <linux/cpu.h> @@ -786,7 +787,7 @@ static enum page_references 
page_check_references(struct page *page, int referenced_ptes, referenced_page; unsigned long vm_flags; - referenced_ptes = page_referenced(page, 1, mz->mem_cgroup, &vm_flags); + referenced_ptes = __page_referenced(page, 1, mz->mem_cgroup, &vm_flags); referenced_page = TestClearPageReferenced(page); /* Lumpy reclaim - ignore references */ @@ -856,6 +857,7 @@ static unsigned long shrink_page_list(struct list_head *page_list, unsigned long nr_congested = 0; unsigned long nr_reclaimed = 0; unsigned long nr_writeback = 0; + struct mutex *i_mmap_mutex = NULL; cond_resched(); @@ -909,7 +911,15 @@ static unsigned long shrink_page_list(struct list_head *page_list, } } + if (needs_page_mmap_mutex(page) && + i_mmap_mutex != &page->mapping->i_mmap_mutex) { + if (i_mmap_mutex) + mutex_unlock(i_mmap_mutex); + i_mmap_mutex = &page->mapping->i_mmap_mutex; + mutex_lock(i_mmap_mutex); + } references = page_check_references(page, mz, sc); + switch (references) { case PAGEREF_ACTIVATE: goto activate_locked; @@ -939,7 +949,7 @@ static unsigned long shrink_page_list(struct list_head *page_list, * processes. Try to unmap it here. */ if (page_mapped(page) && mapping) { - switch (try_to_unmap(page, TTU_UNMAP)) { + switch (__try_to_unmap(page, TTU_UNMAP)) { case SWAP_FAIL: goto activate_locked; case SWAP_AGAIN: @@ -1090,6 +1100,8 @@ keep_lumpy: VM_BUG_ON(PageLRU(page) || PageUnevictable(page)); } + if (i_mmap_mutex) + mutex_unlock(i_mmap_mutex); nr_reclaimed += __remove_mapping_batch(&unmap_pages, &ret_pages, &free_pages); ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex 2012-08-22 0:48 ` Tim Chen @ 2012-08-23 18:33 ` Matthew Wilcox 2012-09-04 15:21 ` Tim Chen 1 sibling, 0 replies; 9+ messages in thread From: Matthew Wilcox @ 2012-08-23 18:33 UTC (permalink / raw) To: Tim Chen Cc: Mel Gorman, Andrew Morton, Minchan Kim, Johannes Weiner, KAMEZAWA Hiroyuki, Kirill A. Shutemov, Andi Kleen, linux-mm, linux-kernel, Alex Shi On Tue, Aug 21, 2012 at 05:48:20PM -0700, Tim Chen wrote: > Thanks to Matthew's suggestions on improving the patch. Here's the > updated version. It seems to be sane when I booted my machine up. I > will put it through more testing when I get a chance. Looks good. > +int __try_to_munlock(struct page *page) Nit: I think this can be static. There aren't any users of it other than try_to_munlock() itself. ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex 2012-08-22 0:48 ` Tim Chen 2012-08-23 18:33 ` Matthew Wilcox @ 2012-09-04 15:21 ` Tim Chen 2012-09-04 22:54 ` Tim Chen 1 sibling, 1 reply; 9+ messages in thread From: Tim Chen @ 2012-09-04 15:21 UTC (permalink / raw) To: Matthew Wilcox Cc: Mel Gorman, Andrew Morton, Minchan Kim, Johannes Weiner, KAMEZAWA Hiroyuki, Kirill A. Shutemov, Andi Kleen, linux-mm, linux-kernel, Alex Shi On Tue, 2012-08-21 at 17:48 -0700, Tim Chen wrote: > > Thanks to Matthew's suggestions on improving the patch. Here's the > updated version. It seems to be sane when I booted my machine up. I > will put it through more testing when I get a chance. > > Tim > Matthew, The new patch seems to be causing some of the workloads with mmaped file read to seg fault. Will need to dig further to find out why. Tim > --- > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com> > --- > diff --git a/include/linux/rmap.h b/include/linux/rmap.h > index fd07c45..f1320b1 100644 > --- a/include/linux/rmap.h > +++ b/include/linux/rmap.h > @@ -156,8 +156,11 @@ static inline void page_dup_rmap(struct page *page) > /* > * Called from mm/vmscan.c to handle paging out > */ > -int page_referenced(struct page *, int is_locked, > - struct mem_cgroup *memcg, unsigned long *vm_flags); > +int needs_page_mmap_mutex(struct page *page); > +int page_referenced(struct page *, int is_locked, struct mem_cgroup *memcg, > + unsigned long *vm_flags); > +int __page_referenced(struct page *, int is_locked, struct mem_cgroup *memcg, > + unsigned long *vm_flags); > int page_referenced_one(struct page *, struct vm_area_struct *, > unsigned long address, unsigned int *mapcount, unsigned long *vm_flags); > > @@ -176,6 +179,7 @@ enum ttu_flags { > bool is_vma_temporary_stack(struct vm_area_struct *vma); > > int try_to_unmap(struct page *, enum ttu_flags flags); > +int __try_to_unmap(struct page *, enum ttu_flags flags); > int try_to_unmap_one(struct page *, struct vm_area_struct *, > unsigned long address, enum ttu_flags flags); > > diff --git a/mm/rmap.c b/mm/rmap.c > index 5b5ad58..ca8cd21 100644 > --- a/mm/rmap.c > +++ b/mm/rmap.c > @@ -843,8 +843,7 @@ static int page_referenced_file(struct page *page, > * so we can safely take mapping->i_mmap_mutex. > */ > BUG_ON(!PageLocked(page)); > - > - mutex_lock(&mapping->i_mmap_mutex); > + BUG_ON(!mutex_is_locked(&mapping->i_mmap_mutex)); > > /* > * i_mmap_mutex does not stabilize mapcount at all, but mapcount > @@ -869,21 +868,16 @@ static int page_referenced_file(struct page *page, > break; > } > > - mutex_unlock(&mapping->i_mmap_mutex); > return referenced; > } > > -/** > - * page_referenced - test if the page was referenced > - * @page: the page to test > - * @is_locked: caller holds lock on the page > - * @memcg: target memory cgroup > - * @vm_flags: collect encountered vma->vm_flags who actually referenced the page > - * > - * Quick test_and_clear_referenced for all mappings to a page, > - * returns the number of ptes which referenced the page. 
> - */ > -int page_referenced(struct page *page, > +int needs_page_mmap_mutex(struct page *page) > +{ > + return page->mapping && page_mapped(page) && page_rmapping(page) && > + !PageKsm(page) && !PageAnon(page); > +} > + > +int __page_referenced(struct page *page, > int is_locked, > struct mem_cgroup *memcg, > unsigned long *vm_flags) > @@ -919,6 +913,32 @@ out: > return referenced; > } > > +/** > + * page_referenced - test if the page was referenced > + * @page: the page to test > + * @is_locked: caller holds lock on the page > + * @memcg: target memory cgroup > + * @vm_flags: collect encountered vma->vm_flags who actually referenced the page > + * > + * Quick test_and_clear_referenced for all mappings to a page, > + * returns the number of ptes which referenced the page. > + */ > +int page_referenced(struct page *page, > + int is_locked, > + struct mem_cgroup *memcg, > + unsigned long *vm_flags) > +{ > + int result, needs_lock; > + > + needs_lock = needs_page_mmap_mutex(page); > + if (needs_lock) > + mutex_lock(&page->mapping->i_mmap_mutex); > + result = __page_referenced(page, is_locked, memcg, vm_flags); > + if (needs_lock) > + mutex_unlock(&page->mapping->i_mmap_mutex); > + return result; > +} > + > static int page_mkclean_one(struct page *page, struct vm_area_struct *vma, > unsigned long address) > { > @@ -1560,7 +1580,7 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags) > unsigned long max_nl_size = 0; > unsigned int mapcount; > > - mutex_lock(&mapping->i_mmap_mutex); > + BUG_ON(!mutex_is_locked(&mapping->i_mmap_mutex)); > vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, pgoff) { > unsigned long address = vma_address(page, vma); > if (address == -EFAULT) > @@ -1640,7 +1660,24 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags) > list_for_each_entry(vma, &mapping->i_mmap_nonlinear, shared.vm_set.list) > vma->vm_private_data = NULL; > out: > - mutex_unlock(&mapping->i_mmap_mutex); > + return ret; > +} > + > +int __try_to_unmap(struct page *page, enum ttu_flags flags) > +{ > + int ret; > + > + BUG_ON(!PageLocked(page)); > + VM_BUG_ON(!PageHuge(page) && PageTransHuge(page)); > + > + if (unlikely(PageKsm(page))) > + ret = try_to_unmap_ksm(page, flags); > + else if (PageAnon(page)) > + ret = try_to_unmap_anon(page, flags); > + else > + ret = try_to_unmap_file(page, flags); > + if (ret != SWAP_MLOCK && !page_mapped(page)) > + ret = SWAP_SUCCESS; > return ret; > } > > @@ -1660,20 +1697,27 @@ out: > */ > int try_to_unmap(struct page *page, enum ttu_flags flags) > { > - int ret; > + int result, needs_lock; > + > + needs_lock = needs_page_mmap_mutex(page); > + if (needs_lock) > + mutex_lock(&page->mapping->i_mmap_mutex); > + result = __try_to_unmap(page, flags); > + if (needs_lock) > + mutex_unlock(&page->mapping->i_mmap_mutex); > + return result; > +} > > - BUG_ON(!PageLocked(page)); > - VM_BUG_ON(!PageHuge(page) && PageTransHuge(page)); > +int __try_to_munlock(struct page *page) > +{ > + VM_BUG_ON(!PageLocked(page) || PageLRU(page)); > > if (unlikely(PageKsm(page))) > - ret = try_to_unmap_ksm(page, flags); > + return try_to_unmap_ksm(page, TTU_MUNLOCK); > else if (PageAnon(page)) > - ret = try_to_unmap_anon(page, flags); > + return try_to_unmap_anon(page, TTU_MUNLOCK); > else > - ret = try_to_unmap_file(page, flags); > - if (ret != SWAP_MLOCK && !page_mapped(page)) > - ret = SWAP_SUCCESS; > - return ret; > + return try_to_unmap_file(page, TTU_MUNLOCK); > } > > /** > @@ -1693,14 +1737,15 @@ int try_to_unmap(struct page *page, enum 
ttu_flags flags) > */ > int try_to_munlock(struct page *page) > { > - VM_BUG_ON(!PageLocked(page) || PageLRU(page)); > - > - if (unlikely(PageKsm(page))) > - return try_to_unmap_ksm(page, TTU_MUNLOCK); > - else if (PageAnon(page)) > - return try_to_unmap_anon(page, TTU_MUNLOCK); > - else > - return try_to_unmap_file(page, TTU_MUNLOCK); > + int result, needs_lock; > + > + needs_lock = needs_page_mmap_mutex(page); > + if (needs_lock) > + mutex_lock(&page->mapping->i_mmap_mutex); > + result = __try_to_munlock(page); > + if (needs_lock) > + mutex_unlock(&page->mapping->i_mmap_mutex); > + return result; > } > > void __put_anon_vma(struct anon_vma *anon_vma) > diff --git a/mm/vmscan.c b/mm/vmscan.c > index d4ab646..74a5fd0 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -27,6 +27,7 @@ > buffer_heads_over_limit */ > #include <linux/mm_inline.h> > #include <linux/backing-dev.h> > +#include <linux/ksm.h> > #include <linux/rmap.h> > #include <linux/topology.h> > #include <linux/cpu.h> > @@ -786,7 +787,7 @@ static enum page_references page_check_references(struct page *page, > int referenced_ptes, referenced_page; > unsigned long vm_flags; > > - referenced_ptes = page_referenced(page, 1, mz->mem_cgroup, &vm_flags); > + referenced_ptes = __page_referenced(page, 1, mz->mem_cgroup, &vm_flags); > referenced_page = TestClearPageReferenced(page); > > /* Lumpy reclaim - ignore references */ > @@ -856,6 +857,7 @@ static unsigned long shrink_page_list(struct list_head *page_list, > unsigned long nr_congested = 0; > unsigned long nr_reclaimed = 0; > unsigned long nr_writeback = 0; > + struct mutex *i_mmap_mutex = NULL; > > cond_resched(); > > @@ -909,7 +911,15 @@ static unsigned long shrink_page_list(struct list_head *page_list, > } > } > > + if (needs_page_mmap_mutex(page) && > + i_mmap_mutex != &page->mapping->i_mmap_mutex) { > + if (i_mmap_mutex) > + mutex_unlock(i_mmap_mutex); > + i_mmap_mutex = &page->mapping->i_mmap_mutex; > + mutex_lock(i_mmap_mutex); > + } > references = page_check_references(page, mz, sc); > + > switch (references) { > case PAGEREF_ACTIVATE: > goto activate_locked; > @@ -939,7 +949,7 @@ static unsigned long shrink_page_list(struct list_head *page_list, > * processes. Try to unmap it here. > */ > if (page_mapped(page) && mapping) { > - switch (try_to_unmap(page, TTU_UNMAP)) { > + switch (__try_to_unmap(page, TTU_UNMAP)) { > case SWAP_FAIL: > goto activate_locked; > case SWAP_AGAIN: > @@ -1090,6 +1100,8 @@ keep_lumpy: > VM_BUG_ON(PageLRU(page) || PageUnevictable(page)); > } > > + if (i_mmap_mutex) > + mutex_unlock(i_mmap_mutex); > nr_reclaimed += __remove_mapping_batch(&unmap_pages, &ret_pages, > &free_pages); > > > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex 2012-09-04 15:21 ` Tim Chen @ 2012-09-04 22:54 ` Tim Chen 0 siblings, 0 replies; 9+ messages in thread From: Tim Chen @ 2012-09-04 22:54 UTC (permalink / raw) To: Matthew Wilcox Cc: Mel Gorman, Andrew Morton, Minchan Kim, Johannes Weiner, KAMEZAWA Hiroyuki, Kirill A. Shutemov, Andi Kleen, linux-mm, linux-kernel, Alex Shi On Tue, 2012-09-04 at 08:21 -0700, Tim Chen wrote: > On Tue, 2012-08-21 at 17:48 -0700, Tim Chen wrote: > > > > > Thanks to Matthew's suggestions on improving the patch. Here's the > > updated version. It seems to be sane when I booted my machine up. I > > will put it through more testing when I get a chance. > > > > Tim > > > > Matthew, > > The new patch seems to be causing some of the workloads with mmaped file > read to seg fault. Will need to dig further to find out why. > > Tim > Okay, the problem seems to be the code below. It is too restrictive and causes some cases where the mutex needs to be taken in try_to_unmap_file to be missed. > > +int needs_page_mmap_mutex(struct page *page) > > +{ > > + return page->mapping && page_mapped(page) && page_rmapping(page) && > > + !PageKsm(page) && !PageAnon(page); > > +} > > + Changing the check to the following fixes the problem: @@ -873,8 +873,7 @@ static int page_referenced_file(struct page *page, int needs_page_mmap_mutex(struct page *page) { - return page->mapping && page_mapped(page) && page_rmapping(page) && - !PageKsm(page) && !PageAnon(page); + return page->mapping && !PageKsm(page) && !PageAnon(page); } I'll do more testing and generate a second version of the patch set with the fixes. Tim ^ permalink raw reply [flat|nested] 9+ messages in thread
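The subtle point behind that fix: page_mapped() and page_rmapping() describe transient state that can change between the moment shrink_page_list decides whether to take the mutex and the moment try_to_unmap_file() walks the i_mmap tree, whereas being a file-backed, non-KSM, non-anon page is a stable property of the mapping. Side by side (this reading of the mismatch is an inference from the fix, not spelled out in the thread):

	/* v1 check: depends on mapcount state, so the caller's lock
	 * decision can disagree with what the rmap walk later needs */
	return page->mapping && page_mapped(page) && page_rmapping(page) &&
	       !PageKsm(page) && !PageAnon(page);

	/* v2 check: purely a property of the mapping type, giving the
	 * same answer at lock time and at walk time */
	return page->mapping && !PageKsm(page) && !PageAnon(page);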
Thread overview: 9+ messages
2012-08-18  1:06 [RFC PATCH 2/2] mm: Batch page_check_references in shrink_page_list sharing the same i_mmap_mutex Tim Chen
2012-08-20 16:43 ` Tim Chen
2012-08-21 13:21   ` Matthew Wilcox
2012-08-21 17:51     ` Tim Chen
2012-08-21 22:53     ` Tim Chen
2012-08-22  0:48     ` Tim Chen
2012-08-23 18:33       ` Matthew Wilcox
2012-09-04 15:21       ` Tim Chen
2012-09-04 22:54         ` Tim Chen