From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
To: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org
Cc: Michal Hocko <mhocko@suse.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Nicholas Piggin <npiggin@gmail.com>
Subject: [PATCH] mm/huge_memory.c: split should clone page flags before unfreezing pageref
Date: Sun, 11 Feb 2018 13:35:17 +0300
Message-ID: <151834531706.176342.14968581451762734122.stgit@buzz>

THP split makes a non-atomic change to the tail page flags. This is almost
OK because tail pages are locked and isolated, but it breaks recent changes
in page locking: the non-atomic operation could clear the PG_waiters bit.

As a result, a concurrent get_page_unless_zero() -> lock_page() sequence
might block forever, especially if the page is truncated later.

The fix is trivial: clone the flags before unfreezing the page reference
counter.

This race has existed since commit 62906027091f ("mm: add PageWaiters
indicating tasks are waiting for a page bit"), while the unsafe unfreeze
itself was added in commit 8df651c7059e ("thp: cleanup split_huge_page()").

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
---
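
The lost update described in the changelog above can be replayed in
userspace. The following is only a minimal sketch, not kernel code: it
uses C11 atomics, the demo_* names and the bit numbers are hypothetical,
and the interleaving is replayed in a single thread so that the result
is deterministic. It shows how a "load, modify, store" update of the
flags word can overwrite a PG_waiters bit that was set atomically in
between, and why the bit survives when the flags are written before any
waiter can reach the page.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers, chosen for this sketch only. */
enum { DEMO_PG_locked = 0, DEMO_PG_dirty = 4, DEMO_PG_waiters = 7 };

#define DEMO_BIT(nr) (1UL << (nr))

/*
 * Racy order: the splitter's "flags |= mask" is really a load, a modify
 * and a store, and the waiter's atomic OR lands between load and store.
 * Returns true if PG_waiters is still set at the end.
 */
static bool demo_waiters_survives_racy_update(void)
{
	_Atomic unsigned long flags = DEMO_BIT(DEMO_PG_locked);

	/* Splitter, step 1: read the old flags value. */
	unsigned long snapshot = atomic_load(&flags);
	assert(!(snapshot & DEMO_BIT(DEMO_PG_waiters)));

	/* Waiter: the page is already visible, so it sets PG_waiters. */
	atomic_fetch_or(&flags, DEMO_BIT(DEMO_PG_waiters));

	/* Splitter, steps 2 and 3: modify the stale snapshot, store it. */
	atomic_store(&flags, snapshot | DEMO_BIT(DEMO_PG_dirty));

	return (atomic_load(&flags) & DEMO_BIT(DEMO_PG_waiters)) != 0;
}

/*
 * Fixed order: the flags update completes while the page is still
 * unreachable, so the waiter's atomic OR can only come afterwards.
 */
static bool demo_waiters_survives_ordered_update(void)
{
	_Atomic unsigned long flags = DEMO_BIT(DEMO_PG_locked);

	/* Splitter: whole load/modify/store happens before publication. */
	unsigned long snapshot = atomic_load(&flags);
	atomic_store(&flags, snapshot | DEMO_BIT(DEMO_PG_dirty));

	/* Waiter: arrives only after the page became reachable. */
	atomic_fetch_or(&flags, DEMO_BIT(DEMO_PG_waiters));

	return (atomic_load(&flags) & DEMO_BIT(DEMO_PG_waiters)) != 0;
}
```

Calling demo_waiters_survives_racy_update() returns false (the bit is
overwritten by the stale store), while
demo_waiters_survives_ordered_update() returns true. In the sketch a
lost bit is just a wrong return value; in the scenario from the
changelog it means a sleeper in lock_page() that is never woken.
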
 mm/huge_memory.c |   25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 87ab9b8f56b5..2b38d9f2f262 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2357,6 +2357,19 @@ static void __split_huge_page_tail(struct page *head, int tail,
 	VM_BUG_ON_PAGE(atomic_read(&page_tail->_mapcount) != -1, page_tail);
 	VM_BUG_ON_PAGE(page_ref_count(page_tail) != 0, page_tail);
 
+	/* Clone page flags before unfreezing refcount. */
+	page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
+	page_tail->flags |= (head->flags &
+			((1L << PG_referenced) |
+			 (1L << PG_swapbacked) |
+			 (1L << PG_swapcache) |
+			 (1L << PG_mlocked) |
+			 (1L << PG_uptodate) |
+			 (1L << PG_active) |
+			 (1L << PG_locked) |
+			 (1L << PG_unevictable) |
+			 (1L << PG_dirty)));
+
 	/*
 	 * tail_page->_refcount is zero and not changing from under us. But
 	 * get_page_unless_zero() may be running from under us on the
@@ -2375,18 +2388,6 @@ static void __split_huge_page_tail(struct page *head, int tail,
 		page_ref_add(page_tail, 2);
 	}
 
-	page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
-	page_tail->flags |= (head->flags &
-			((1L << PG_referenced) |
-			 (1L << PG_swapbacked) |
-			 (1L << PG_swapcache) |
-			 (1L << PG_mlocked) |
-			 (1L << PG_uptodate) |
-			 (1L << PG_active) |
-			 (1L << PG_locked) |
-			 (1L << PG_unevictable) |
-			 (1L << PG_dirty)));
-
 	/*
 	 * After clearing PageTail the gup refcount can be released.
 	 * Page flags also must be visible before we make the page non-compound.
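
The ordering this patch establishes, flags first and refcount unfreeze
second, can be sketched in userspace as follows. This is an
illustration under stated assumptions, not the kernel implementation:
struct demo_page and both demo_* functions are hypothetical, the
refcount value stored on unfreeze is arbitrary, and the release/acquire
pairing is an assumption of the sketch (the follow-up in this thread,
"mm/page_ref: use atomic_set_release in page_ref_unfreeze", suggests
the ordering question was raised in review).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; not the kernel layout. */
struct demo_page {
	unsigned long flags;  /* plain word, updated non-atomically while frozen */
	atomic_int refcount;  /* 0 means frozen: no new references allowed */
};

/* Userspace analogue of get_page_unless_zero(): take a ref unless frozen. */
static bool demo_get_unless_zero(struct demo_page *p)
{
	int old = atomic_load_explicit(&p->refcount, memory_order_relaxed);

	while (old != 0) {
		/* Acquire pairs with the release store in demo_split_tail(). */
		if (atomic_compare_exchange_weak_explicit(&p->refcount,
							  &old, old + 1,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
		/* On failure 'old' was reloaded; retry unless it is now 0. */
	}
	return false;
}

/* Clone the inherited flag bits first, then unfreeze the refcount. */
static void demo_split_tail(const struct demo_page *head,
			    struct demo_page *tail,
			    unsigned long inherit_mask)
{
	/* The tail must still be frozen when the flags are written. */
	assert(atomic_load(&tail->refcount) == 0);

	/* A plain update is safe here: nobody can hold a reference yet. */
	tail->flags |= head->flags & inherit_mask;

	/* Publish: the flag writes above are ordered before this store. */
	atomic_store_explicit(&tail->refcount, 1, memory_order_release);
}
```

While the tail's refcount is zero, demo_get_unless_zero() fails and no
other thread gets far enough to set a waiters bit. Once
demo_split_tail() returns, a successful demo_get_unless_zero() is
guaranteed to observe the cloned flags, and any later flag changes by
that thread no longer compete with a plain read-modify-write.
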


Thread overview:
2018-02-11 10:35 [PATCH] mm/huge_memory.c: split should clone page flags before unfreezing pageref Konstantin Khlebnikov [this message]
2018-02-11 11:07 ` Kirill A. Shutemov
2018-02-11 13:13   ` Konstantin Khlebnikov
2018-02-11 14:29     ` [PATCH v2] mm/huge_memory.c: reorder operations in __split_huge_page_tail() Konstantin Khlebnikov
2018-02-11 15:14       ` Kirill A. Shutemov
2018-02-11 15:32         ` Konstantin Khlebnikov
2018-02-11 15:47           ` Kirill A. Shutemov
2018-02-11 15:55             ` Konstantin Khlebnikov
2018-02-11 20:09       ` Matthew Wilcox
2018-02-12 13:58     ` [PATCH v3 1/2] mm/page_ref: use atomic_set_release in page_ref_unfreeze Konstantin Khlebnikov
2018-02-12 14:07       ` Kirill A. Shutemov
2018-02-12 13:58     ` [PATCH v3 2/2] mm/huge_memory.c: reorder operations in __split_huge_page_tail() Konstantin Khlebnikov
2018-02-12 14:11       ` Kirill A. Shutemov
