From: "Huang, Ying" <ying.huang@intel.com>
To: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Zi Yan <zi.yan@cs.rutgers.edu>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Huang Ying <ying.huang@intel.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Mike Rapoport <rppt@linux.vnet.ibm.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Alexander Viro <viro@zeniv.linux.org.UK>
Subject: [RFC -mm] mm, userfaultfd, THP: Avoid waiting when PMD under THP migration
Date: Fri,  3 Nov 2017 15:52:31 +0800
Message-ID: <20171103075231.25416-1-ying.huang@intel.com>

From: Huang Ying <ying.huang@intel.com>

If THP migration is enabled, the following situation is possible:

- A THP is mapped at source address
- Migration is started to move the THP to another node
- Page fault occurs
- The PMD (migration entry) is copied to the destination address in mremap

That is, handle_userfault() may encounter a PMD entry that has already
been handled but is !pmd_present().  The current implementation waits
on such PMD entries, which may cause unnecessary waiting and a
potential soft lockup.

Fix this by not waiting when !pmd_present(); wait only when
pmd_none().

Question:

I found that userfaultfd_must_wait() is always called when the PMD or
PTE is none, with the mm->mmap_sem read-lock held.  mremap() takes
mm->mmap_sem for write, and UFFDIO_COPY doesn't support copying a THP
mapping.  So could the situation described above not happen in practice?

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.UK>
Cc: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
---
 fs/userfaultfd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index b5a0193e1960..0fcf66c3e439 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -294,10 +294,13 @@ static inline bool userfaultfd_must_wait(struct userfaultfd_ctx *ctx,
 	 * pmd_trans_unstable) of the pmd.
 	 */
 	_pmd = READ_ONCE(*pmd);
-	if (!pmd_present(_pmd))
+	if (pmd_none(_pmd))
 		goto out;
 
 	ret = false;
+	if (!pmd_present(_pmd))
+		goto out;
+
 	if (pmd_trans_huge(_pmd))
 		goto out;
 
-- 
2.14.2
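
As a rough illustration (not part of the patch), the before/after
decision made by the hunk above can be modelled in plain userspace C.
The enum pmd_state and the must_wait_old()/must_wait_new() helpers are
hypothetical names invented for this sketch.  The PTE-level check that
follows the quoted hunk in the kernel is reduced to a single boolean
here, which is an assumption about code the diff does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the states a PMD entry can be in. */
enum pmd_state {
	PMD_STATE_NONE,		/* pmd_none(): nothing installed yet */
	PMD_STATE_MIGRATION,	/* !pmd_none() && !pmd_present(): migration entry */
	PMD_STATE_TRANS_HUGE,	/* pmd_present() && pmd_trans_huge() */
	PMD_STATE_TABLE,	/* pmd_present(), points to a PTE table */
};

static bool model_pmd_none(enum pmd_state s)
{
	return s == PMD_STATE_NONE;
}

static bool model_pmd_present(enum pmd_state s)
{
	return s == PMD_STATE_TRANS_HUGE || s == PMD_STATE_TABLE;
}

/* Model of the logic before the patch: any non-present PMD means "wait". */
bool must_wait_old(enum pmd_state s, bool pte_is_none)
{
	bool ret = true;

	if (!model_pmd_present(s))
		goto out;

	ret = false;
	if (s == PMD_STATE_TRANS_HUGE)
		goto out;

	/* Assumed PTE-level step: wait only if the PTE is still none. */
	ret = pte_is_none;
out:
	return ret;
}

/* Model of the logic after the patch: only a none PMD means "wait". */
bool must_wait_new(enum pmd_state s, bool pte_is_none)
{
	bool ret = true;

	if (model_pmd_none(s))
		goto out;

	ret = false;
	if (!model_pmd_present(s))
		goto out;	/* migration entry: already handled, do not wait */

	if (s == PMD_STATE_TRANS_HUGE)
		goto out;

	/* Assumed PTE-level step: wait only if the PTE is still none. */
	ret = pte_is_none;
out:
	return ret;
}

/* The two models differ only for the migration-entry state. */
void check_model(void)
{
	assert(must_wait_old(PMD_STATE_MIGRATION, false));
	assert(!must_wait_new(PMD_STATE_MIGRATION, false));
	assert(must_wait_old(PMD_STATE_NONE, false) ==
	       must_wait_new(PMD_STATE_NONE, false));
}
```

In this model the only input for which the two versions disagree is
the migration-entry state: the old logic reports that the fault must
wait, the new logic reports that it need not.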


