From: Li Xinhai <lixinhai.lxh@gmail.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, Michal Hocko <mhocko@suse.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
linux-man <linux-man@vger.kernel.org>
Subject: [PATCH] mm/mempolicy: support MPOL_MF_STRICT for huge page mapping
Date: Fri, 31 Jan 2020 01:33:15 +0000 [thread overview]
Message-ID: <1580434395-9962-1-git-send-email-lixinhai.lxh@gmail.com> (raw)
MPOL_MF_STRICT is used in mbind() for purposes:
(1) MPOL_MF_STRICT is set alone without MPOL_MF_MOVE or MPOL_MF_MOVE_ALL,
to check if there is misplaced page and return -EIO;
(2) MPOL_MF_STRICT is set with MPOL_MF_MOVE or MPOL_MF_MOVE_ALL, to check
if there is misplaced page which is failed to isolate, or page is
success on isolate but failed to move, and return -EIO.
For non hugepage mapping, (1) and (2) are implemented as expectation.
For hugepage mapping, (1) is not implemented. And in (2), the part about
failed to isolate and report -EIO is not implemented.
This patch implements the missed parts for hugepage mapping. Benefits
with it applied:
- User space can apply same code logic to handle mbind() on hugepage and
non hugepage mapping;
- Reliably using MPOL_MF_STRICT alone to check whether there is misplaced
page or not when bind policy on address range, especially for address
range which contains both hugepage and non hugepage mapping.
Analysis of potential impact on existing users:
- For users who using MPOL_MF_STRICT alone, since mbind() don't report
reliable answer about misplaced page, their existing code have to
utilize other facilities, e.g. numa_maps of proc, to check misplaced
page. After this patch applied, that dedicated checking will still work
without been impacted;
- For users who using MPOL_MF_STRICT with MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL, the semantic about some pages could not be moved will
not be changed by this patch, because failed to isolate and failed to
move have same effects to users, so their existing code will not be
impacted.
In mbind man page, the note about 'MPOL_MF_STRICT is ignored on huge page
mappings' can be removed after this patch is applied.
Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: linux-man <linux-man@vger.kernel.org>
---
Link to relevant discussion:
https://lore.kernel.org/linux-mm/1578993378-10860-2-git-send-email-lixinhai.lxh@gmail.com/
mm/mempolicy.c | 37 +++++++++++++++++++++++++++++++++----
1 file changed, 33 insertions(+), 4 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b2920ae..ec897d1 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -557,9 +557,10 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
unsigned long addr, unsigned long end,
struct mm_walk *walk)
{
+ int ret = 0;
#ifdef CONFIG_HUGETLB_PAGE
struct queue_pages *qp = walk->private;
- unsigned long flags = qp->flags;
+ unsigned long flags = (qp->flags & MPOL_MF_VALID);
struct page *page;
spinlock_t *ptl;
pte_t entry;
@@ -571,16 +572,44 @@ static int queue_pages_hugetlb(pte_t *pte, unsigned long hmask,
page = pte_page(entry);
if (!queue_pages_required(page, qp))
goto unlock;
+
+ if (flags == MPOL_MF_STRICT) {
+ /*
+ * STRICT alone means only detecting misplaced page and no
+ * need to further check other vma.
+ */
+ ret = -EIO;
+ goto unlock;
+ }
+
+ if (!vma_migratable(walk->vma)) {
+ /*
+ * Must be STRICT with MOVE*, otherwise .test_walk() have
+ * stopped walking current vma.
+ * Detecting misplaced page but allow migrating pages which
+ * have been queued.
+ */
+ ret = 1;
+ goto unlock;
+ }
+
/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
if (flags & (MPOL_MF_MOVE_ALL) ||
- (flags & MPOL_MF_MOVE && page_mapcount(page) == 1))
- isolate_huge_page(page, qp->pagelist);
+ (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
+ if (!isolate_huge_page(page, qp->pagelist) &&
+ (flags & MPOL_MF_STRICT))
+ /*
+ * Failed to isolate page but allow migrating pages
+ * which have been queued.
+ */
+ ret = 1;
+ }
unlock:
spin_unlock(ptl);
#else
BUG();
#endif
- return 0;
+ return ret;
}
#ifdef CONFIG_NUMA_BALANCING
--
1.8.3.1
next reply other threads:[~2020-01-31 1:35 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-01-31 1:33 Li Xinhai [this message]
2020-02-01 3:28 ` [PATCH] mm/mempolicy: support MPOL_MF_STRICT for huge page mapping Mike Kravetz
2020-02-01 13:56 ` Li Xinhai
2020-02-10 17:19 ` Mike Kravetz
2020-02-12 3:21 ` HORIGUCHI NAOYA(堀口 直也)
2020-02-12 5:25 ` Li Xinhai
2020-02-12 23:50 ` Mike Kravetz
2020-02-13 1:50 ` Li Xinhai
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1580434395-9962-1-git-send-email-lixinhai.lxh@gmail.com \
--to=lixinhai.lxh@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=linux-man@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=mike.kravetz@oracle.com \
--cc=n-horiguchi@ah.jp.nec.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).