Linux-man Archive on lore.kernel.org
From: Li Xinhai <lixinhai.lxh@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	Hugh Dickins <hughd@google.com>,
	linux-man <linux-man@vger.kernel.org>
Subject: [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind
Date: Fri,  8 Nov 2019 21:01:44 +0800
Message-ID: <1573218104-11021-3-git-send-email-lixinhai.lxh@gmail.com>
In-Reply-To: <1573218104-11021-1-git-send-email-lixinhai.lxh@gmail.com>

mbind() is required to report EFAULT if the range specified by addr and
len contains unmapped holes. The current implementation applies the
following rules for this check:
1 Unmapped holes in any part of the specified range should be reported as
  EFAULT if mbind() is called with a policy other than MPOL_DEFAULT;
2 Unmapped holes in any part of the specified range should be ignored (do
  not report EFAULT) if mbind() is called with MPOL_DEFAULT;
3 A range that lies entirely within an unmapped hole should be reported
  as EFAULT;
Note that rule 2 does not fulfill the mbind() API definition, but since
that behavior has existed for a long time (the internal flag
MPOL_MF_DISCONTIG_OK exists for this purpose), this patch does not
change it.

With the current code, applications observe inconsistent behavior for
both rule 1 and rule 2. This patch fixes the inconsistency as detailed
below.

Cases of rule 1:
1) Hole at head side of range. Current code reports EFAULT, no change by
this patch.
[  vma  ][ hole ][  vma  ]
            [  range  ]
2) Hole in middle of range. Current code reports EFAULT, no change by
this patch.
[  vma  ][ hole ][ vma ]
   [     range      ]
3) Hole at tail side of range. Current code does not report EFAULT, this
patch fixes it.
[  vma  ][ hole ][ vma ]
   [  range  ]

Cases of rule 2:
1) Hole at head side of range. Current code reports EFAULT, this patch
fixes it.
[  vma  ][ hole ][  vma  ]
            [  range  ]
2) Hole in middle of range. Current code does not report EFAULT, no
change by this patch.
[  vma  ][ hole ][ vma]
   [     range      ]
3) Hole at tail side of range. Current code does not report EFAULT, no
change by this patch.
[  vma  ][ hole ][ vma]
   [  range  ]

This patch makes no changes to rule 3.
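
The three rules and the cases above can be sketched as a small user-space
model. This is only an illustration, not the kernel code: struct vma_model
and check_holes() are hypothetical names, the VMA list is a plain sorted
array of half-open [start, end) intervals, and discontig_ok stands in for
MPOL_MF_DISCONTIG_OK (assumed to be set for the MPOL_DEFAULT case).

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vm_area_struct: half-open [start, end). */
struct vma_model {
	unsigned long start;
	unsigned long end;
};

/*
 * Model of the hole check described above. vmas[] must be sorted by
 * address and non-overlapping. discontig_ok models MPOL_MF_DISCONTIG_OK.
 * Returns 0 when the range is acceptable, -EFAULT otherwise.
 */
static int check_holes(const struct vma_model *vmas, size_t n,
		       unsigned long start, unsigned long end,
		       int discontig_ok)
{
	const struct vma_model *first = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		const struct vma_model *vma = &vmas[i];
		const struct vma_model *next =
			(i + 1 < n) ? &vmas[i + 1] : NULL;

		/* Skip VMAs that do not intersect the range. */
		if (vma->end <= start || vma->start >= end)
			continue;

		if (!first) {
			first = vma;
			/* hole at head side of range */
			if (!discontig_ok && start < vma->start)
				return -EFAULT;
		}

		/* hole in middle or at tail of range */
		if (!discontig_ok && vma->end < end &&
		    (!next || vma->end < next->start))
			return -EFAULT;
	}

	/* rule 3: the whole range lies in a hole */
	return first ? 0 : -EFAULT;
}
```

With two VMAs at [0x1000, 0x3000) and [0x5000, 0x7000), the model returns
-EFAULT for ranges that start in, span, or end in the hole when
discontig_ok is 0, returns 0 for the same ranges when discontig_ok is 1,
and returns -EFAULT for [0x3000, 0x5000) in both modes.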

The unmapped hole check could also be handled with .pte_hole() instead
of .test_walk(). But .pte_hole() is called for holes both inside and
outside VMAs, which costs more, so this patch keeps the original design
based on .test_walk().

Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-man <linux-man@vger.kernel.org>
Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
---
 mm/mempolicy.c | 40 +++++++++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 807f06f..c697b29 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -410,7 +410,9 @@ struct queue_pages {
 	struct list_head *pagelist;
 	unsigned long flags;
 	nodemask_t *nmask;
-	struct vm_area_struct *prev;
+	unsigned long start;
+	unsigned long end;
+	struct vm_area_struct *first;
 };
 
 /*
@@ -619,14 +621,20 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
 	unsigned long flags = qp->flags;
 
 	/* range check first */
-	if (!(flags & MPOL_MF_DISCONTIG_OK)) {
-		if (!vma->vm_next && vma->vm_end < end)
-			return -EFAULT;
-		if (qp->prev && qp->prev->vm_end < vma->vm_start)
+	VM_BUG_ON((vma->vm_start > start) || (vma->vm_end < end));
+
+	if (!qp->first) {
+		qp->first = vma;
+		if (!(flags & MPOL_MF_DISCONTIG_OK) &&
+			(qp->start < vma->vm_start))
+			/* hole at head side of range */
 			return -EFAULT;
 	}
-
-	qp->prev = vma;
+	if (!(flags & MPOL_MF_DISCONTIG_OK) &&
+		((vma->vm_end < qp->end) &&
+		(!vma->vm_next || vma->vm_end < vma->vm_next->vm_start)))
+		/* hole at middle or tail of range */
+		return -EFAULT;
 
 	/*
 	 * Need check MPOL_MF_STRICT to return -EIO if possible
@@ -638,8 +646,6 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
 
 	if (endvma > end)
 		endvma = end;
-	if (vma->vm_start > start)
-		start = vma->vm_start;
 
 	if (flags & MPOL_MF_LAZY) {
 		/* Similar to task_numa_work, skip inaccessible VMAs */
@@ -680,14 +686,23 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
 		nodemask_t *nodes, unsigned long flags,
 		struct list_head *pagelist)
 {
+	int err;
 	struct queue_pages qp = {
 		.pagelist = pagelist,
 		.flags = flags,
 		.nmask = nodes,
-		.prev = NULL,
+		.start = start,
+		.end = end,
+		.first = NULL,
 	};
 
-	return walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
+	err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
+
+	if (!qp.first)
+		/* whole range in hole */
+		err = -EFAULT;
+
+	return err;
 }
 
 /*
@@ -739,8 +754,7 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
 	unsigned long vmend;
 
 	vma = find_vma(mm, start);
-	if (!vma || vma->vm_start > start)
-		return -EFAULT;
+	VM_BUG_ON(!vma);
 
 	prev = vma->vm_prev;
 	if (start > vma->vm_start)
-- 
1.8.3.1


