From: Hugh Dickins <hughd@google.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
Ning Qu <quning@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
David Rientjes <rientjes@google.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 04/24] mm: make page migration's newpage handling more robust
Date: Fri, 20 Feb 2015 19:58:36 -0800 (PST) [thread overview]
Message-ID: <alpine.LSU.2.11.1502201956260.14414@eggly.anvils> (raw)
In-Reply-To: <alpine.LSU.2.11.1502201941340.14414@eggly.anvils>
I don't know of any problem in the current tree, but huge tmpfs wants
to use the custom put_new_page feature of page migration, with a pool
of its own pages: and met some surprises and difficulties in doing so.
An unused newpage is expected to be released with the put_new_page(),
but there was one MIGRATEPAGE_SUCCESS (0) path which released it with
putback_lru_page(), which was wrong for this custom pool. Fixed more
easily by resetting put_new_page once it won't be needed, than by
adding a further flag to modify the rc test.
Definitely an extension rather than a bugfix: pages of that newpage
pool might in rare cases still be speculatively accessed and locked
by another task, so relax move_to_new_page()'s !trylock_page() BUG
to an -EAGAIN, just like when old page is found locked. Do its
__ClearPageSwapBacked on failure while still holding page lock;
and don't reset old page->mapping if PageAnon, we often assume
that PageAnon is persistent once set.
Actually, move the trylock_page(newpage) and unlock_page(newpage)
up a level into __unmap_and_move(), to the same level as the trylock
and unlock of old page: though I moved them originally to suit an old
tree (one using mem_cgroup_prepare_migration()), it still seems a
better placement.
Then the remove_migration_ptes() can be done in one place,
instead of at one level on success but another level on failure.
Add some VM_BUG_ONs to enforce the new convention, and adjust
unmap_and_move_huge_page() and balloon_page_migrate() to fit too.
Finally, clean up __unmap_and_move()'s increasingly weird block
"if (anon_vma) nothing; else if (PageSwapCache) nothing; else out;"
while keeping its useful comment on unmapped swapcache.
Signed-off-by: Hugh Dickins <hughd@google.com>
---
mm/balloon_compaction.c | 10 ---
mm/migrate.c | 120 +++++++++++++++++++-------------------
2 files changed, 63 insertions(+), 67 deletions(-)
--- thpfs.orig/mm/balloon_compaction.c 2014-12-07 14:21:05.000000000 -0800
+++ thpfs/mm/balloon_compaction.c 2015-02-20 19:33:40.872062714 -0800
@@ -199,23 +199,17 @@ int balloon_page_migrate(struct page *ne
struct balloon_dev_info *balloon = balloon_page_device(page);
int rc = -EAGAIN;
- /*
- * Block others from accessing the 'newpage' when we get around to
- * establishing additional references. We should be the only one
- * holding a reference to the 'newpage' at this point.
- */
- BUG_ON(!trylock_page(newpage));
+ VM_BUG_ON_PAGE(!PageLocked(page), page);
+ VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);
if (WARN_ON(!__is_movable_balloon_page(page))) {
dump_page(page, "not movable balloon page");
- unlock_page(newpage);
return rc;
}
if (balloon && balloon->migratepage)
rc = balloon->migratepage(balloon, newpage, page, mode);
- unlock_page(newpage);
return rc;
}
#endif /* CONFIG_BALLOON_COMPACTION */
--- thpfs.orig/mm/migrate.c 2015-02-20 19:33:35.676074594 -0800
+++ thpfs/mm/migrate.c 2015-02-20 19:33:40.876062705 -0800
@@ -746,18 +746,13 @@ static int fallback_migrate_page(struct
* MIGRATEPAGE_SUCCESS - success
*/
static int move_to_new_page(struct page *newpage, struct page *page,
- int page_was_mapped, enum migrate_mode mode)
+ enum migrate_mode mode)
{
struct address_space *mapping;
int rc;
- /*
- * Block others from accessing the page when we get around to
- * establishing additional references. We are the only one
- * holding a reference to the new page at this point.
- */
- if (!trylock_page(newpage))
- BUG();
+ VM_BUG_ON_PAGE(!PageLocked(page), page);
+ VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);
/* Prepare mapping for the new page.*/
newpage->index = page->index;
@@ -781,16 +776,14 @@ static int move_to_new_page(struct page
rc = fallback_migrate_page(mapping, newpage, page, mode);
if (rc != MIGRATEPAGE_SUCCESS) {
+ __ClearPageSwapBacked(newpage);
newpage->mapping = NULL;
} else {
mem_cgroup_migrate(page, newpage, false);
- if (page_was_mapped)
- remove_migration_ptes(page, newpage);
- page->mapping = NULL;
+ if (!PageAnon(page))
+ page->mapping = NULL;
}
- unlock_page(newpage);
-
return rc;
}
@@ -839,6 +832,7 @@ static int __unmap_and_move(struct page
goto out_unlock;
wait_on_page_writeback(page);
}
+
/*
* By try_to_unmap(), page->mapcount goes down to 0 here. In this case,
* we cannot notice that anon_vma is freed while we migrates a page.
@@ -853,28 +847,29 @@ static int __unmap_and_move(struct page
* getting a hold on an anon_vma from outside one of its mms.
*/
anon_vma = page_get_anon_vma(page);
- if (anon_vma) {
- /*
- * Anon page
- */
- } else if (PageSwapCache(page)) {
- /*
- * We cannot be sure that the anon_vma of an unmapped
- * swapcache page is safe to use because we don't
- * know in advance if the VMA that this page belonged
- * to still exists. If the VMA and others sharing the
- * data have been freed, then the anon_vma could
- * already be invalid.
- *
- * To avoid this possibility, swapcache pages get
- * migrated but are not remapped when migration
- * completes
- */
- } else {
+ if (!anon_vma && !PageSwapCache(page))
goto out_unlock;
- }
+ /*
+ * We cannot be sure that the anon_vma of an unmapped swapcache
+ * page is safe to use because we don't know in advance if the
+ * VMA that this page belonged to still exists. If the VMA and
+ * others sharing it have been freed, then the anon_vma could
+ * be invalid. To avoid this possibility, swapcache pages
+ * are migrated, but not remapped when migration completes.
+ */
}
+ /*
+ * Block others from accessing the new page when we get around to
+ * establishing additional references. We are usually the only one
+ * holding a reference to newpage at this point. We used to have a BUG
+ * here if trylock_page(newpage) fails, but intend to introduce a case
+ * where there might be a race with the previous use of newpage. This
+ * is much like races on the refcount of oldpage: just don't BUG().
+ */
+ if (unlikely(!trylock_page(newpage)))
+ goto out_unlock;
+
if (unlikely(isolated_balloon_page(page))) {
/*
* A ballooned page does not need any special attention from
@@ -884,7 +879,7 @@ static int __unmap_and_move(struct page
* the page migration right away (proteced by page lock).
*/
rc = balloon_page_migrate(newpage, page, mode);
- goto out_unlock;
+ goto out_unlock_both;
}
/*
@@ -903,30 +898,28 @@ static int __unmap_and_move(struct page
VM_BUG_ON_PAGE(PageAnon(page), page);
if (page_has_private(page)) {
try_to_free_buffers(page);
- goto out_unlock;
+ goto out_unlock_both;
}
- goto skip_unmap;
- }
-
- /* Establish migration ptes or remove ptes */
- if (page_mapped(page)) {
+ } else if (page_mapped(page)) {
+ /* Establish migration ptes or remove ptes */
try_to_unmap(page,
TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
page_was_mapped = 1;
}
-skip_unmap:
if (!page_mapped(page))
- rc = move_to_new_page(newpage, page, page_was_mapped, mode);
+ rc = move_to_new_page(newpage, page, mode);
- if (rc && page_was_mapped)
- remove_migration_ptes(page, page);
+ if (page_was_mapped)
+ remove_migration_ptes(page,
+ rc == MIGRATEPAGE_SUCCESS ? newpage : page);
+out_unlock_both:
+ unlock_page(newpage);
+out_unlock:
/* Drop an anon_vma reference if we took one */
if (anon_vma)
put_anon_vma(anon_vma);
-
-out_unlock:
unlock_page(page);
out:
return rc;
@@ -940,10 +933,11 @@ static int unmap_and_move(new_page_t get
unsigned long private, struct page *page, int force,
enum migrate_mode mode)
{
- int rc = 0;
+ int rc = MIGRATEPAGE_SUCCESS;
int *result = NULL;
- struct page *newpage = get_new_page(page, private, &result);
+ struct page *newpage;
+ newpage = get_new_page(page, private, &result);
if (!newpage)
return -ENOMEM;
@@ -957,6 +951,8 @@ static int unmap_and_move(new_page_t get
goto out;
rc = __unmap_and_move(page, newpage, force, mode);
+ if (rc == MIGRATEPAGE_SUCCESS)
+ put_new_page = NULL;
out:
if (rc != -EAGAIN) {
@@ -977,10 +973,9 @@ out:
* it. Otherwise, putback_lru_page() will drop the reference grabbed
* during isolation.
*/
- if (rc != MIGRATEPAGE_SUCCESS && put_new_page) {
- __ClearPageSwapBacked(newpage);
+ if (put_new_page)
put_new_page(newpage, private);
- } else if (unlikely(__is_movable_balloon_page(newpage))) {
+ else if (unlikely(__is_movable_balloon_page(newpage))) {
/* drop our reference, page already in the balloon */
put_page(newpage);
} else
@@ -1018,7 +1013,7 @@ static int unmap_and_move_huge_page(new_
struct page *hpage, int force,
enum migrate_mode mode)
{
- int rc = 0;
+ int rc = -EAGAIN;
int *result = NULL;
int page_was_mapped = 0;
struct page *new_hpage;
@@ -1040,8 +1035,6 @@ static int unmap_and_move_huge_page(new_
if (!new_hpage)
return -ENOMEM;
- rc = -EAGAIN;
-
if (!trylock_page(hpage)) {
if (!force || mode != MIGRATE_SYNC)
goto out;
@@ -1051,6 +1044,9 @@ static int unmap_and_move_huge_page(new_
if (PageAnon(hpage))
anon_vma = page_get_anon_vma(hpage);
+ if (unlikely(!trylock_page(new_hpage)))
+ goto put_anon;
+
if (page_mapped(hpage)) {
try_to_unmap(hpage,
TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
@@ -1058,16 +1054,22 @@ static int unmap_and_move_huge_page(new_
}
if (!page_mapped(hpage))
- rc = move_to_new_page(new_hpage, hpage, page_was_mapped, mode);
+ rc = move_to_new_page(new_hpage, hpage, mode);
- if (rc != MIGRATEPAGE_SUCCESS && page_was_mapped)
- remove_migration_ptes(hpage, hpage);
+ if (page_was_mapped)
+ remove_migration_ptes(hpage,
+ rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage);
+ unlock_page(new_hpage);
+
+put_anon:
if (anon_vma)
put_anon_vma(anon_vma);
- if (rc == MIGRATEPAGE_SUCCESS)
+ if (rc == MIGRATEPAGE_SUCCESS) {
hugetlb_cgroup_migrate(hpage, new_hpage);
+ put_new_page = NULL;
+ }
unlock_page(hpage);
out:
@@ -1079,7 +1081,7 @@ out:
* it. Otherwise, put_page() will drop the reference grabbed during
* isolation.
*/
- if (rc != MIGRATEPAGE_SUCCESS && put_new_page)
+ if (put_new_page)
put_new_page(new_hpage, private);
else
put_page(new_hpage);
@@ -1109,7 +1111,7 @@ out:
*
* The function returns after 10 attempts or if no pages are movable any more
* because the list has become empty or no retryable pages exist any more.
- * The caller should call putback_lru_pages() to return pages to the LRU
+ * The caller should call putback_movable_pages() to return pages to the LRU
* or free list only if ret != 0.
*
* Returns the number of pages that were not migrated, or an error code.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2015-02-21 3:58 UTC|newest]
Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-02-21 3:49 [PATCH 00/24] huge tmpfs: an alternative approach to THPageCache Hugh Dickins
2015-02-21 3:51 ` [PATCH 01/24] mm: update_lru_size warn and reset bad lru_size Hugh Dickins
2015-02-23 9:30 ` Kirill A. Shutemov
2015-03-23 2:44 ` Hugh Dickins
2015-02-21 3:54 ` [PATCH 02/24] mm: update_lru_size do the __mod_zone_page_state Hugh Dickins
2015-02-21 3:56 ` [PATCH 03/24] mm: use __SetPageSwapBacked and don't ClearPageSwapBacked Hugh Dickins
2015-02-25 10:53 ` Mel Gorman
2015-03-23 3:01 ` Hugh Dickins
2015-02-21 3:58 ` Hugh Dickins [this message]
2015-02-21 4:00 ` [PATCH 05/24] tmpfs: preliminary minor tidyups Hugh Dickins
2015-02-21 4:01 ` [PATCH 06/24] huge tmpfs: prepare counts in meminfo, vmstat and SysRq-m Hugh Dickins
2015-02-21 4:03 ` [PATCH 07/24] huge tmpfs: include shmem freeholes in available memory counts Hugh Dickins
2015-02-21 4:05 ` [PATCH 08/24] huge tmpfs: prepare huge=N mount option and /proc/sys/vm/shmem_huge Hugh Dickins
2015-02-21 4:06 ` [PATCH 09/24] huge tmpfs: try to allocate huge pages, split into a team Hugh Dickins
2015-02-21 4:07 ` [PATCH 10/24] huge tmpfs: avoid team pages in a few places Hugh Dickins
2015-02-21 4:09 ` [PATCH 11/24] huge tmpfs: shrinker to migrate and free underused holes Hugh Dickins
2015-03-19 16:56 ` Konstantin Khlebnikov
2015-03-23 4:40 ` Hugh Dickins
2015-03-23 12:50 ` Kirill A. Shutemov
2015-03-23 13:50 ` Kirill A. Shutemov
2015-03-24 12:57 ` Kirill A. Shutemov
2015-03-25 0:41 ` Hugh Dickins
2015-02-21 4:11 ` [PATCH 12/24] huge tmpfs: get_unmapped_area align and fault supply huge page Hugh Dickins
2015-02-21 4:12 ` [PATCH 13/24] huge tmpfs: extend get_user_pages_fast to shmem pmd Hugh Dickins
2015-02-21 4:13 ` [PATCH 14/24] huge tmpfs: extend vma_adjust_trans_huge " Hugh Dickins
2015-02-21 4:15 ` [PATCH 15/24] huge tmpfs: rework page_referenced_one and try_to_unmap_one Hugh Dickins
2015-02-21 4:16 ` [PATCH 16/24] huge tmpfs: fix problems from premature exposure of pagetable Hugh Dickins
2015-07-01 10:53 ` Kirill A. Shutemov
2015-02-21 4:18 ` [PATCH 17/24] huge tmpfs: map shmem by huge page pmd or by page team ptes Hugh Dickins
2015-02-21 4:20 ` [PATCH 18/24] huge tmpfs: mmap_sem is unlocked when truncation splits huge pmd Hugh Dickins
2015-02-21 4:22 ` [PATCH 19/24] huge tmpfs: disband split huge pmds on race or memory failure Hugh Dickins
2015-02-21 4:23 ` [PATCH 20/24] huge tmpfs: use Unevictable lru with variable hpage_nr_pages() Hugh Dickins
2015-02-21 4:25 ` [PATCH 21/24] huge tmpfs: fix Mlocked meminfo, tracking huge and unhuge mlocks Hugh Dickins
2015-02-21 4:27 ` [PATCH 22/24] huge tmpfs: fix Mapped meminfo, tracking huge and unhuge mappings Hugh Dickins
2015-02-21 4:29 ` [PATCH 23/24] kvm: plumb return of hva when resolving page fault Hugh Dickins
2015-02-21 4:31 ` [PATCH 24/24] kvm: teach kvm to map page teams as huge pages Hugh Dickins
2015-02-23 13:48 ` [PATCH 00/24] huge tmpfs: an alternative approach to THPageCache Kirill A. Shutemov
2015-03-23 2:25 ` Hugh Dickins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LSU.2.11.1502201956260.14414@eggly.anvils \
--to=hughd@google.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=quning@gmail.com \
--cc=rientjes@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).