* [PATCH v4 0/2] mm: Fix checking unmapped holes for mbind @ 2019-11-08 13:01 Li Xinhai 2019-11-08 13:01 ` [PATCH v4 1/2] mm: Check range first in queue_pages_test_walk Li Xinhai 2019-11-08 13:01 ` [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind Li Xinhai 0 siblings, 2 replies; 5+ messages in thread From: Li Xinhai @ 2019-11-08 13:01 UTC (permalink / raw) To: linux-mm This patchset fix checking unmapped holes for mbind(). First patch makes sure the vma been correctly tracked in .test_walk(), so each time when .test_walk() is called, the neighborhood of two vma is correct. Current problem is that the !vma_migratable() check could cause return immediately without update tracking to vma. Second patch fix the inconsistent report of EFAULT when mbind() is called for MPOL_DEFAULT and non MPOL_DEFAULT cases, so application do not need to have workaround code to handle this special behavior. Currently there are two problems, one is that the .test_walk() can not know there is hole at tail side of range, because .test_walk() only call for vma not for hole. The other one is that mbind_range() checks for hole at head side of range but do not consider the MPOL_MF_DISCONTIG_OK flag as done in .test_walk(). Changes v3->v4: - Split to two patch; - Illustrates the application visible behaviors in changelog; - Hole at tail side of range is checked in .test_walk(), not after walk page. Only the case for whole range in hole is checked after walk page, because our .test_walk() function would not be called at all in this case. Changes v2->v3: - Add more details in change log; - Check holes in .test_walk() and after call walk_page_range(); Li Xinhai (2): mm: Check range first in queue_pages_test_walk mm: Fix checking unmapped holes for mbind mm/mempolicy.c | 47 +++++++++++++++++++++++++++++++---------------- 1 file changed, 31 insertions(+), 16 deletions(-) -- 1.8.3.1 ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v4 1/2] mm: Check range first in queue_pages_test_walk 2019-11-08 13:01 [PATCH v4 0/2] mm: Fix checking unmapped holes for mbind Li Xinhai @ 2019-11-08 13:01 ` Li Xinhai 2019-11-14 9:35 ` Naoya Horiguchi 2019-11-08 13:01 ` [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind Li Xinhai 1 sibling, 1 reply; 5+ messages in thread From: Li Xinhai @ 2019-11-08 13:01 UTC (permalink / raw) To: linux-mm Cc: Andrew Morton, Naoya Horiguchi, Michal Hocko, Vlastimil Babka, Hugh Dickins, linux-man Checking unmapped hole and updating the previous vma must be handled first, otherwise the unmapped hole could be calculated from a wrong previous vma. Several commits were relevant to this error: commit 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()"), This commit was correct, the VM_PFNMAP check was after updateing previous vma commit 48684a65b4e3 (mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP)), This commit added VM_PFNMAP check before updateing previous vma. Then, there were two VM_PFNMAP check did same thing twice. commit acda0c334028 (mm/mempolicy.c: get rid of duplicated check for vma(VM_PFNMAP) in queue_page s_range()), This commit tried to fix the duplicated VM_PFNMAP check, but it wrongly removed the one which was after updateing vma. Fixes: acda0c334028 (mm/mempolicy.c: get rid of duplicated check for vma(VM_PFNMAP) in queue_page s_range()) Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: linux-man <linux-man@vger.kernel.org> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> --- mm/mempolicy.c | 19 ++++++++++--------- 1 file changed, 10 insertions(+), 9 deletions(-) diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 4ae967b..807f06f 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -618,6 +618,16 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end, unsigned long endvma = vma->vm_end; unsigned long flags = qp->flags; + /* range check first */ + if (!(flags & MPOL_MF_DISCONTIG_OK)) { + if (!vma->vm_next && vma->vm_end < end) + return -EFAULT; + if (qp->prev && qp->prev->vm_end < vma->vm_start) + return -EFAULT; + } + + qp->prev = vma; + /* * Need check MPOL_MF_STRICT to return -EIO if possible * regardless of vma_migratable @@ -631,15 +641,6 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end, if (vma->vm_start > start) start = vma->vm_start; - if (!(flags & MPOL_MF_DISCONTIG_OK)) { - if (!vma->vm_next && vma->vm_end < end) - return -EFAULT; - if (qp->prev && qp->prev->vm_end < vma->vm_start) - return -EFAULT; - } - - qp->prev = vma; - if (flags & MPOL_MF_LAZY) { /* Similar to task_numa_work, skip inaccessible VMAs */ if (!is_vm_hugetlb_page(vma) && -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH v4 1/2] mm: Check range first in queue_pages_test_walk 2019-11-08 13:01 ` [PATCH v4 1/2] mm: Check range first in queue_pages_test_walk Li Xinhai @ 2019-11-14 9:35 ` Naoya Horiguchi 0 siblings, 0 replies; 5+ messages in thread From: Naoya Horiguchi @ 2019-11-14 9:35 UTC (permalink / raw) To: Li Xinhai Cc: linux-mm, Andrew Morton, Michal Hocko, Vlastimil Babka, Hugh Dickins, linux-man On Fri, Nov 08, 2019 at 09:01:43PM +0800, Li Xinhai wrote: > Checking unmapped hole and updating the previous vma must be handled first, > otherwise the unmapped hole could be calculated from a wrong previous vma. > > Several commits were relevant to this error: > commit 6f4576e3687b > ("mempolicy: apply page table walker on queue_pages_range()"), > This commit was correct, the VM_PFNMAP check was after updateing previous > vma > > commit 48684a65b4e3 > (mm: pagewalk: fix misbehavior of walk_page_range for vma(VM_PFNMAP)), > This commit added VM_PFNMAP check before updateing previous vma. Then, > there were two VM_PFNMAP check did same thing twice. > > commit acda0c334028 > (mm/mempolicy.c: get rid of duplicated check for vma(VM_PFNMAP) in queue_page > s_range()), > This commit tried to fix the duplicated VM_PFNMAP check, but it wrongly > removed the one which was after updateing vma. > > Fixes: acda0c334028 (mm/mempolicy.c: get rid of duplicated check for vma(VM_PFNMAP) in queue_page > s_range()) > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> > Cc: Michal Hocko <mhocko@suse.com> > Cc: Vlastimil Babka <vbabka@suse.cz> > Cc: Hugh Dickins <hughd@google.com> > Cc: linux-man <linux-man@vger.kernel.org> > Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> ^ permalink raw reply [flat|nested] 5+ messages in thread
* [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind 2019-11-08 13:01 [PATCH v4 0/2] mm: Fix checking unmapped holes for mbind Li Xinhai 2019-11-08 13:01 ` [PATCH v4 1/2] mm: Check range first in queue_pages_test_walk Li Xinhai @ 2019-11-08 13:01 ` Li Xinhai [not found] ` <20191114093018.GA2144@hori.linux.bs1.fc.nec.co.jp> 1 sibling, 1 reply; 5+ messages in thread From: Li Xinhai @ 2019-11-08 13:01 UTC (permalink / raw) To: linux-mm Cc: Andrew Morton, Naoya Horiguchi, Michal Hocko, Vlastimil Babka, Hugh Dickins, linux-man mbind() is required to report EFAULT if range, specified by addr and len, contains unmapped holes. In current implementation, below rules are applied for this checking: 1 Unmapped holes at any part of the specified range should be reported as EFAULT if mbind() for none MPOL_DEFAULT cases; 2 Unmapped holes at any part of the specified range should be ignored (do not reprot EFAULT) if mbind() for MPOL_DEFAULT case; 3 The whole range in an unmapped hole should be reported as EFAULT; Note that rule 2 does not fullfill the mbind() API definition, but since that behavior has existed for long days (the internal flag MPOL_MF_DISCONTIG_OK is for this purpose), this patch does not plan to change it. In current code, application observed inconsistent behavior on rule 1 and rule 2 respectively. That inconsistency is fixed as below details. Cases of rule 1: 1) Hole at head side of range. Current code reprot EFAULT, no change by this patch. [ vma ][ hole ][ vma ] [ range ] 2) Hole at middle of range. Current code report EFAULT, no change by this patch. [ vma ][ hole ][ vma ] [ range ] 3) Hole at tail side of range. Current code do not report EFAULT, this patch fix it. [ vma ][ hole ][ vma ] [ range ] Cases of rule 2: 1) Hole at head side of range. Current code reprot EFAULT, this patch fix it. [ vma ][ hole ][ vma ] [ range ] 2) Hole at middle of range. Current code do not report EFAULT, no change by this patch. this patch. [ vma ][ hole ][ vma] [ range ] 3) Hole at tail side of range. Current code do not report EFAULT, no change by this patch. [ vma ][ hole ][ vma] [ range ] This patch has no changes to rule 3. The unmapped hole checking can also be handled by using .pte_hole(), instead of .test_walk(). But .pte_hole() is called for holes inside and outside vma, which causes more cost, so this patch keeps the original design with .test_walk(). Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()") Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: linux-man <linux-man@vger.kernel.org> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> --- mm/mempolicy.c | 40 +++++++++++++++++++++++++++------------- 1 file changed, 27 insertions(+), 13 deletions(-) diff --git a/mm/mempolicy.c b/mm/mempolicy.c index 807f06f..c697b29 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -410,7 +410,9 @@ struct queue_pages { struct list_head *pagelist; unsigned long flags; nodemask_t *nmask; - struct vm_area_struct *prev; + unsigned long start; + unsigned long end; + struct vm_area_struct *first; }; /* @@ -619,14 +621,20 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end, unsigned long flags = qp->flags; /* range check first */ - if (!(flags & MPOL_MF_DISCONTIG_OK)) { - if (!vma->vm_next && vma->vm_end < end) - return -EFAULT; - if (qp->prev && qp->prev->vm_end < vma->vm_start) + VM_BUG_ON((vma->vm_start > start) || (vma->vm_end < end)); + + if (!qp->first) { + qp->first = vma; + if (!(flags & MPOL_MF_DISCONTIG_OK) && + (qp->start < vma->vm_start)) + /* hole at head side of range */ return -EFAULT; } - - qp->prev = vma; + if (!(flags & MPOL_MF_DISCONTIG_OK) && + ((vma->vm_end < qp->end) && + (!vma->vm_next || vma->vm_end < vma->vm_next->vm_start))) + /* hole at middle or tail of range */ + return -EFAULT; /* * Need check MPOL_MF_STRICT to return -EIO if possible @@ -638,8 +646,6 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end, if (endvma > end) endvma = end; - if (vma->vm_start > start) - start = vma->vm_start; if (flags & MPOL_MF_LAZY) { /* Similar to task_numa_work, skip inaccessible VMAs */ @@ -680,14 +686,23 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end, nodemask_t *nodes, unsigned long flags, struct list_head *pagelist) { + int err; struct queue_pages qp = { .pagelist = pagelist, .flags = flags, .nmask = nodes, - .prev = NULL, + .start = start, + .end = end, + .first = NULL, }; - return walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp); + err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp); + + if (!qp.first) + /* whole range in hole */ + err = -EFAULT; + + return err; } /* @@ -739,8 +754,7 @@ static int mbind_range(struct mm_struct *mm, unsigned long start, unsigned long vmend; vma = find_vma(mm, start); - if (!vma || vma->vm_start > start) - return -EFAULT; + VM_BUG_ON(!vma); prev = vma->vm_prev; if (start > vma->vm_start) -- 1.8.3.1 ^ permalink raw reply related [flat|nested] 5+ messages in thread
[parent not found: <20191114093018.GA2144@hori.linux.bs1.fc.nec.co.jp>]
* Re: [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind [not found] ` <20191114093018.GA2144@hori.linux.bs1.fc.nec.co.jp> @ 2019-11-15 15:33 ` lixinhai.lxh 0 siblings, 0 replies; 5+ messages in thread From: lixinhai.lxh @ 2019-11-15 15:33 UTC (permalink / raw) To: n-horiguchi Cc: linux-mm, akpm, mhocko, Vlastimil Babka, Hugh Dickins, linux-man On 2019-11-14 at 17:30 Naoya Horiguchi wrote: >On Fri, Nov 08, 2019 at 09:01:44PM +0800, Li Xinhai wrote: >> mbind() is required to report EFAULT if range, specified by addr and len, >> contains unmapped holes. In current implementation, below rules are applied >> for this checking: >> 1 Unmapped holes at any part of the specified range should be reported as >> EFAULT if mbind() for none MPOL_DEFAULT cases; >> 2 Unmapped holes at any part of the specified range should be ignored (do >> not reprot EFAULT) if mbind() for MPOL_DEFAULT case; >> 3 The whole range in an unmapped hole should be reported as EFAULT; >> Note that rule 2 does not fullfill the mbind() API definition, but since >> that behavior has existed for long days (the internal flag >> MPOL_MF_DISCONTIG_OK is for this purpose), this patch does not plan to >> change it. >> >> In current code, application observed inconsistent behavior on rule 1 and >> rule 2 respectively. That inconsistency is fixed as below details. >> >> Cases of rule 1: >> 1) Hole at head side of range. Current code reprot EFAULT, no change by >> this patch. >> [ vma ][ hole ][ vma ] >> [ range ] >> 2) Hole at middle of range. Current code report EFAULT, no change by >> this patch. >> [ vma ][ hole ][ vma ] >> [ range ] >> 3) Hole at tail side of range. Current code do not report EFAULT, this >> patch fix it. >> [ vma ][ hole ][ vma ] >> [ range ] >> >> Cases of rule 2: >> 1) Hole at head side of range. Current code reprot EFAULT, this patch >> fix it. >> [ vma ][ hole ][ vma ] >> [ range ] >> 2) Hole at middle of range. Current code do not report EFAULT, no change >> by this patch. >> this patch. >> [ vma ][ hole ][ vma] >> [ range ] >> 3) Hole at tail side of range. Current code do not report EFAULT, no >> change by this patch. >> [ vma ][ hole ][ vma] >> [ range ] >> >> This patch has no changes to rule 3. >> >> The unmapped hole checking can also be handled by using .pte_hole(), >> instead of .test_walk(). But .pte_hole() is called for holes inside and >> outside vma, which causes more cost, so this patch keeps the original >> design with .test_walk(). >> >> Fixes: 6f4576e3687b ("mempolicy: apply page table walker on queue_pages_range()") >> Cc: Andrew Morton <akpm@linux-foundation.org> >> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> >> Cc: Michal Hocko <mhocko@suse.com> >> Cc: Vlastimil Babka <vbabka@suse.cz> >> Cc: Hugh Dickins <hughd@google.com> >> Cc: linux-man <linux-man@vger.kernel.org> >> Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com> >> --- >> mm/mempolicy.c | 40 +++++++++++++++++++++++++++------------- >> 1 file changed, 27 insertions(+), 13 deletions(-) >> >> diff --git a/mm/mempolicy.c b/mm/mempolicy.c >> index 807f06f..c697b29 100644 >> --- a/mm/mempolicy.c >> +++ b/mm/mempolicy.c >> @@ -410,7 +410,9 @@ struct queue_pages { >> struct list_head *pagelist; >> unsigned long flags; >> nodemask_t *nmask; >> - struct vm_area_struct *prev; >> + unsigned long start; >> + unsigned long end; >> + struct vm_area_struct *first; >> }; >> >> /* >> @@ -619,14 +621,20 @@ static int queue_pages_test_walk(unsigned long start, unsigned long end, >> unsigned long flags = qp->flags; >> >> /* range check first */ >> - if (!(flags & MPOL_MF_DISCONTIG_OK)) { >> - if (!vma->vm_next && vma->vm_end < end) >> - return -EFAULT; >> - if (qp->prev && qp->prev->vm_end < vma->vm_start) >> + VM_BUG_ON((vma->vm_start > start) || (vma->vm_end < end)); >> + >> + if (!qp->first) { >> + qp->first = vma; >> + if (!(flags & MPOL_MF_DISCONTIG_OK) && >> + (qp->start < vma->vm_start)) >> + /* hole at head side of range */ >> return -EFAULT; >> } >> - >> - qp->prev = vma; >> + if (!(flags & MPOL_MF_DISCONTIG_OK) && >> + ((vma->vm_end < qp->end) && > >You here have a trailing whitespace. > >Otherwise, looks good to me. > >Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Thanks for review, I'v sent out new patch with same title - Xinhai ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2019-11-15 15:33 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-11-08 13:01 [PATCH v4 0/2] mm: Fix checking unmapped holes for mbind Li Xinhai 2019-11-08 13:01 ` [PATCH v4 1/2] mm: Check range first in queue_pages_test_walk Li Xinhai 2019-11-14 9:35 ` Naoya Horiguchi 2019-11-08 13:01 ` [PATCH v4 2/2] mm: Fix checking unmapped holes for mbind Li Xinhai [not found] ` <20191114093018.GA2144@hori.linux.bs1.fc.nec.co.jp> 2019-11-15 15:33 ` lixinhai.lxh
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).