* [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind
@ 2019-11-15 15:28 Li Xinhai
0 siblings, 0 replies; 4+ messages in thread
From: Li Xinhai @ 2019-11-15 15:28 UTC (permalink / raw)
To: linux-mm
Cc: Andrew Morton, Naoya Horiguchi, Michal Hocko, Vlastimil Babka,
Hugh Dickins, linux-man
mbind() is required to report EFAULT if range, specified by addr and len,
contains unmapped holes. In current implementation, below rules are applied
for this checking:
1 Unmapped holes at any part of the specified range should be reported as
EFAULT if mbind() for none MPOL_DEFAULT cases;
2 Unmapped holes at any part of the specified range should be ignored (do
not reprot EFAULT) if mbind() for MPOL_DEFAULT case;
3 The whole range in an unmapped hole should be reported as EFAULT;
Note that rule 2 does not fullfill the mbind() API definition, but since
that behavior has existed for long days (the internal flag
MPOL_MF_DISCONTIG_OK is for this purpose), this patch does not plan to
change it.
In current code, application observed inconsistent behavior on rule 1 and
rule 2 respectively. That inconsistency is fixed as below details.
Cases of rule 1:
1) Hole at head side of range. Current code reprot EFAULT, no change by
this patch.
[ vma ][ hole ][ vma ]
[ range ]
2) Hole at middle of range. Current code report EFAULT, no change by
this patch.
[ vma ][ hole ][ vma ]
[ range ]
3) Hole at tail side of range. Current code do not report EFAULT, this
patch fix it.
[ vma ][ hole ][ vma ]
[ range ]
Cases of rule 2:
1) Hole at head side of range. Current code reprot EFAULT, this patch
fix it.
[ vma ][ hole ][ vma ]
[ range ]
2) Hole at middle of range. Current code do not report EFAULT, no change
by this patch.
this patch.
[ vma ][ hole ][ vma]
[ range ]
3) Hole at tail side of range. Current code do not report EFAULT, no
change by this patch.
[ vma ][ hole ][ vma]
[ range ]
This patch has no changes to rule 3.
The unmapped hole checking can also be handled by using .pte_hole(),
instead of .test_walk(). But .pte_hole() is called for holes inside and
outside vma, which causes more cost, so this patch keeps the original
design with .test_walk().
Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-man <linux-man@vger.kernel.org>
Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
---
send again for removing space
mm/mempolicy.c | 40 +++++++++++++++++++++++++++-------------
1 file changed, 27 insertions(+), 13 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 807f06f..c697b29 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -410,7 +410,9 @@ struct queue_pages {
struct list_head *pagelist;
unsigned long flags;
nodemask_t *nmask;
- struct vm_area_struct *prev;
+ unsigned long start;
+ unsigned long end;
+ struct vm_area_struct *first;
};
/*
@@ -619,14 +621,20 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
unsigned long flags = qp->flags;
/* range check first */
- if (!(flags & MPOL_MF_DISCONTIG_OK)) {
- if (!vma->vm_next && vma->vm_end < end)
- return -EFAULT;
- if (qp->prev && qp->prev->vm_end < vma->vm_start)
+ VM_BUG_ON((vma->vm_start > start) || (vma->vm_end < end));
+
+ if (!qp->first) {
+ qp->first = vma;
+ if (!(flags & MPOL_MF_DISCONTIG_OK) &&
+ (qp->start < vma->vm_start))
+ /* hole at head side of range */
return -EFAULT;
}
-
- qp->prev = vma;
+ if (!(flags & MPOL_MF_DISCONTIG_OK) &&
+ ((vma->vm_end < qp->end) &&
+ (!vma->vm_next || vma->vm_end < vma->vm_next->vm_start)))
+ /* hole at middle or tail of range */
+ return -EFAULT;
/*
* Need check MPOL_MF_STRICT to return -EIO if possible
@@ -638,8 +646,6 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
if (endvma > end)
endvma = end;
- if (vma->vm_start > start)
- start = vma->vm_start;
if (flags & MPOL_MF_LAZY) {
/* Similar to task_numa_work, skip inaccessible VMAs */
@@ -680,14 +686,23 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
nodemask_t *nodes, unsigned long flags,
struct list_head *pagelist)
{
+ int err;
struct queue_pages qp = {
.pagelist = pagelist,
.flags = flags,
.nmask = nodes,
- .prev = NULL,
+ .start = start,
+ .end = end,
+ .first = NULL,
};
- return walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
+ err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
+
+ if (!qp.first)
+ /* whole range in hole */
+ err = -EFAULT;
+
+ return err;
}
/*
@@ -739,8 +754,7 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
unsigned long vmend;
vma = find_vma(mm, start);
- if (!vma || vma->vm_start > start)
- return -EFAULT;
+ VM_BUG_ON(!vma);
prev = vma->vm_prev;
if (start > vma->vm_start)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
[parent not found: <1573218104-11021-1-git-send-email-lixinhai.lxh@gmail.com>]
* [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind
[not found] <1573218104-11021-1-git-send-email-lixinhai.lxh@gmail.com>
@ 2019-11-08 13:01 ` Li Xinhai
2019-11-14 9:30 ` Naoya Horiguchi
0 siblings, 1 reply; 4+ messages in thread
From: Li Xinhai @ 2019-11-08 13:01 UTC (permalink / raw)
To: linux-mm
Cc: Andrew Morton, Naoya Horiguchi, Michal Hocko, Vlastimil Babka,
Hugh Dickins, linux-man
mbind() is required to report EFAULT if range, specified by addr and len,
contains unmapped holes. In current implementation, below rules are applied
for this checking:
1 Unmapped holes at any part of the specified range should be reported as
EFAULT if mbind() for none MPOL_DEFAULT cases;
2 Unmapped holes at any part of the specified range should be ignored (do
not reprot EFAULT) if mbind() for MPOL_DEFAULT case;
3 The whole range in an unmapped hole should be reported as EFAULT;
Note that rule 2 does not fullfill the mbind() API definition, but since
that behavior has existed for long days (the internal flag
MPOL_MF_DISCONTIG_OK is for this purpose), this patch does not plan to
change it.
In current code, application observed inconsistent behavior on rule 1 and
rule 2 respectively. That inconsistency is fixed as below details.
Cases of rule 1:
1) Hole at head side of range. Current code reprot EFAULT, no change by
this patch.
[ vma ][ hole ][ vma ]
[ range ]
2) Hole at middle of range. Current code report EFAULT, no change by
this patch.
[ vma ][ hole ][ vma ]
[ range ]
3) Hole at tail side of range. Current code do not report EFAULT, this
patch fix it.
[ vma ][ hole ][ vma ]
[ range ]
Cases of rule 2:
1) Hole at head side of range. Current code reprot EFAULT, this patch
fix it.
[ vma ][ hole ][ vma ]
[ range ]
2) Hole at middle of range. Current code do not report EFAULT, no change
by this patch.
this patch.
[ vma ][ hole ][ vma]
[ range ]
3) Hole at tail side of range. Current code do not report EFAULT, no
change by this patch.
[ vma ][ hole ][ vma]
[ range ]
This patch has no changes to rule 3.
The unmapped hole checking can also be handled by using .pte_hole(),
instead of .test_walk(). But .pte_hole() is called for holes inside and
outside vma, which causes more cost, so this patch keeps the original
design with .test_walk().
Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hugh Dickins <hughd@google.com>
Cc: linux-man <linux-man@vger.kernel.org>
Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
---
mm/mempolicy.c | 40 +++++++++++++++++++++++++++-------------
1 file changed, 27 insertions(+), 13 deletions(-)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 807f06f..c697b29 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -410,7 +410,9 @@ struct queue_pages {
struct list_head *pagelist;
unsigned long flags;
nodemask_t *nmask;
- struct vm_area_struct *prev;
+ unsigned long start;
+ unsigned long end;
+ struct vm_area_struct *first;
};
/*
@@ -619,14 +621,20 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
unsigned long flags = qp->flags;
/* range check first */
- if (!(flags & MPOL_MF_DISCONTIG_OK)) {
- if (!vma->vm_next && vma->vm_end < end)
- return -EFAULT;
- if (qp->prev && qp->prev->vm_end < vma->vm_start)
+ VM_BUG_ON((vma->vm_start > start) || (vma->vm_end < end));
+
+ if (!qp->first) {
+ qp->first = vma;
+ if (!(flags & MPOL_MF_DISCONTIG_OK) &&
+ (qp->start < vma->vm_start))
+ /* hole at head side of range */
return -EFAULT;
}
-
- qp->prev = vma;
+ if (!(flags & MPOL_MF_DISCONTIG_OK) &&
+ ((vma->vm_end < qp->end) &&
+ (!vma->vm_next || vma->vm_end < vma->vm_next->vm_start)))
+ /* hole at middle or tail of range */
+ return -EFAULT;
/*
* Need check MPOL_MF_STRICT to return -EIO if possible
@@ -638,8 +646,6 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
if (endvma > end)
endvma = end;
- if (vma->vm_start > start)
- start = vma->vm_start;
if (flags & MPOL_MF_LAZY) {
/* Similar to task_numa_work, skip inaccessible VMAs */
@@ -680,14 +686,23 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
nodemask_t *nodes, unsigned long flags,
struct list_head *pagelist)
{
+ int err;
struct queue_pages qp = {
.pagelist = pagelist,
.flags = flags,
.nmask = nodes,
- .prev = NULL,
+ .start = start,
+ .end = end,
+ .first = NULL,
};
- return walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
+ err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
+
+ if (!qp.first)
+ /* whole range in hole */
+ err = -EFAULT;
+
+ return err;
}
/*
@@ -739,8 +754,7 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
unsigned long vmend;
vma = find_vma(mm, start);
- if (!vma || vma->vm_start > start)
- return -EFAULT;
+ VM_BUG_ON(!vma);
prev = vma->vm_prev;
if (start > vma->vm_start)
--
1.8.3.1
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind
2019-11-08 13:01 ` Li Xinhai
@ 2019-11-14 9:30 ` Naoya Horiguchi
2019-11-15 15:33 ` lixinhai.lxh
0 siblings, 1 reply; 4+ messages in thread
From: Naoya Horiguchi @ 2019-11-14 9:30 UTC (permalink / raw)
To: Li Xinhai
Cc: linux-mm, Andrew Morton, Michal Hocko, Vlastimil Babka,
Hugh Dickins, linux-man
On Fri, Nov 08, 2019 at 09:01:44PM +0800, Li Xinhai wrote:
> mbind() is required to report EFAULT if range, specified by addr and len,
> contains unmapped holes. In current implementation, below rules are applied
> for this checking:
> 1 Unmapped holes at any part of the specified range should be reported as
> EFAULT if mbind() for none MPOL_DEFAULT cases;
> 2 Unmapped holes at any part of the specified range should be ignored (do
> not reprot EFAULT) if mbind() for MPOL_DEFAULT case;
> 3 The whole range in an unmapped hole should be reported as EFAULT;
> Note that rule 2 does not fullfill the mbind() API definition, but since
> that behavior has existed for long days (the internal flag
> MPOL_MF_DISCONTIG_OK is for this purpose), this patch does not plan to
> change it.
>
> In current code, application observed inconsistent behavior on rule 1 and
> rule 2 respectively. That inconsistency is fixed as below details.
>
> Cases of rule 1:
> 1) Hole at head side of range. Current code reprot EFAULT, no change by
> this patch.
> [ vma ][ hole ][ vma ]
> [ range ]
> 2) Hole at middle of range. Current code report EFAULT, no change by
> this patch.
> [ vma ][ hole ][ vma ]
> [ range ]
> 3) Hole at tail side of range. Current code do not report EFAULT, this
> patch fix it.
> [ vma ][ hole ][ vma ]
> [ range ]
>
> Cases of rule 2:
> 1) Hole at head side of range. Current code reprot EFAULT, this patch
> fix it.
> [ vma ][ hole ][ vma ]
> [ range ]
> 2) Hole at middle of range. Current code do not report EFAULT, no change
> by this patch.
> this patch.
> [ vma ][ hole ][ vma]
> [ range ]
> 3) Hole at tail side of range. Current code do not report EFAULT, no
> change by this patch.
> [ vma ][ hole ][ vma]
> [ range ]
>
> This patch has no changes to rule 3.
>
> The unmapped hole checking can also be handled by using .pte_hole(),
> instead of .test_walk(). But .pte_hole() is called for holes inside and
> outside vma, which causes more cost, so this patch keeps the original
> design with .test_walk().
>
> Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: linux-man <linux-man@vger.kernel.org>
> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
> ---
> mm/mempolicy.c | 40 +++++++++++++++++++++++++++-------------
> 1 file changed, 27 insertions(+), 13 deletions(-)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 807f06f..c697b29 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -410,7 +410,9 @@ struct queue_pages {
> struct list_head *pagelist;
> unsigned long flags;
> nodemask_t *nmask;
> - struct vm_area_struct *prev;
> + unsigned long start;
> + unsigned long end;
> + struct vm_area_struct *first;
> };
>
> /*
> @@ -619,14 +621,20 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
> unsigned long flags = qp->flags;
>
> /* range check first */
> - if (!(flags & MPOL_MF_DISCONTIG_OK)) {
> - if (!vma->vm_next && vma->vm_end < end)
> - return -EFAULT;
> - if (qp->prev && qp->prev->vm_end < vma->vm_start)
> + VM_BUG_ON((vma->vm_start > start) || (vma->vm_end < end));
> +
> + if (!qp->first) {
> + qp->first = vma;
> + if (!(flags & MPOL_MF_DISCONTIG_OK) &&
> + (qp->start < vma->vm_start))
> + /* hole at head side of range */
> return -EFAULT;
> }
> -
> - qp->prev = vma;
> + if (!(flags & MPOL_MF_DISCONTIG_OK) &&
> + ((vma->vm_end < qp->end) &&
You here have a trailing whitespace.
Otherwise, looks good to me.
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind
2019-11-14 9:30 ` Naoya Horiguchi
@ 2019-11-15 15:33 ` lixinhai.lxh
0 siblings, 0 replies; 4+ messages in thread
From: lixinhai.lxh @ 2019-11-15 15:33 UTC (permalink / raw)
To: n-horiguchi
Cc: linux-mm, akpm, mhocko, Vlastimil Babka, Hugh Dickins, linux-man
On 2019-11-14 at 17:30 Naoya Horiguchi wrote:
>On Fri, Nov 08, 2019 at 09:01:44PM +0800, Li Xinhai wrote:
>> mbind() is required to report EFAULT if range, specified by addr and len,
>> contains unmapped holes. In current implementation, below rules are applied
>> for this checking:
>> 1 Unmapped holes at any part of the specified range should be reported as
>> EFAULT if mbind() for none MPOL_DEFAULT cases;
>> 2 Unmapped holes at any part of the specified range should be ignored (do
>> not reprot EFAULT) if mbind() for MPOL_DEFAULT case;
>> 3 The whole range in an unmapped hole should be reported as EFAULT;
>> Note that rule 2 does not fullfill the mbind() API definition, but since
>> that behavior has existed for long days (the internal flag
>> MPOL_MF_DISCONTIG_OK is for this purpose), this patch does not plan to
>> change it.
>>
>> In current code, application observed inconsistent behavior on rule 1 and
>> rule 2 respectively. That inconsistency is fixed as below details.
>>
>> Cases of rule 1:
>> 1) Hole at head side of range. Current code reprot EFAULT, no change by
>> this patch.
>> [ vma ][ hole ][ vma ]
>> [ range ]
>> 2) Hole at middle of range. Current code report EFAULT, no change by
>> this patch.
>> [ vma ][ hole ][ vma ]
>> [ range ]
>> 3) Hole at tail side of range. Current code do not report EFAULT, this
>> patch fix it.
>> [ vma ][ hole ][ vma ]
>> [ range ]
>>
>> Cases of rule 2:
>> 1) Hole at head side of range. Current code reprot EFAULT, this patch
>> fix it.
>> [ vma ][ hole ][ vma ]
>> [ range ]
>> 2) Hole at middle of range. Current code do not report EFAULT, no change
>> by this patch.
>> this patch.
>> [ vma ][ hole ][ vma]
>> [ range ]
>> 3) Hole at tail side of range. Current code do not report EFAULT, no
>> change by this patch.
>> [ vma ][ hole ][ vma]
>> [ range ]
>>
>> This patch has no changes to rule 3.
>>
>> The unmapped hole checking can also be handled by using .pte_hole(),
>> instead of .test_walk(). But .pte_hole() is called for holes inside and
>> outside vma, which causes more cost, so this patch keeps the original
>> design with .test_walk().
>>
>> Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()")
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Hugh Dickins <hughd@google.com>
>> Cc: linux-man <linux-man@vger.kernel.org>
>> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
>> ---
>> mm/mempolicy.c | 40 +++++++++++++++++++++++++++-------------
>> 1 file changed, 27 insertions(+), 13 deletions(-)
>>
>> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
>> index 807f06f..c697b29 100644
>> --- a/mm/mempolicy.c
>> +++ b/mm/mempolicy.c
>> @@ -410,7 +410,9 @@ struct queue_pages {
>> struct list_head *pagelist;
>> unsigned long flags;
>> nodemask_t *nmask;
>> - struct vm_area_struct *prev;
>> + unsigned long start;
>> + unsigned long end;
>> + struct vm_area_struct *first;
>> };
>>
>> /*
>> @@ -619,14 +621,20 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end,
>> unsigned long flags = qp->flags;
>>
>> /* range check first */
>> - if (!(flags & MPOL_MF_DISCONTIG_OK)) {
>> - if (!vma->vm_next && vma->vm_end < end)
>> - return -EFAULT;
>> - if (qp->prev && qp->prev->vm_end < vma->vm_start)
>> + VM_BUG_ON((vma->vm_start > start) || (vma->vm_end < end));
>> +
>> + if (!qp->first) {
>> + qp->first = vma;
>> + if (!(flags & MPOL_MF_DISCONTIG_OK) &&
>> + (qp->start < vma->vm_start))
>> + /* hole at head side of range */
>> return -EFAULT;
>> }
>> -
>> - qp->prev = vma;
>> + if (!(flags & MPOL_MF_DISCONTIG_OK) &&
>> + ((vma->vm_end < qp->end) &&
>
>You here have a trailing whitespace.
>
>Otherwise, looks good to me.
>
>Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Thanks for review, I'v sent out new patch with same title
- Xinhai
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2019-11-15 15:33 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-15 15:28 [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind Li Xinhai
[not found] <1573218104-11021-1-git-send-email-lixinhai.lxh@gmail.com>
2019-11-08 13:01 ` Li Xinhai
2019-11-14 9:30 ` Naoya Horiguchi
2019-11-15 15:33 ` lixinhai.lxh
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).