* [PATCH 0/2] Change PAT to support mremap use-cases @ 2015-12-09 16:26 ` Toshi Kani 0 siblings, 0 replies; 12+ messages in thread From: Toshi Kani @ 2015-12-09 16:26 UTC (permalink / raw) To: tglx, mingo, hpa, bp; +Cc: stsp, x86, linux-mm, linux-kernel This patch-set fixes issues found in the pfn tracking code of PAT when mremap() is used on a VM_PFNMAP range. Patch 1/2 changes untrack_pfn() to handle the case of mremap() with MREMAP_FIXED, i.e. remapping virtual address on a VM_PFNMAP range. Patch 2/2 changes the free_memtype() path to handle the case of mremap() that shrinks the size of a VM_PFNMAP range. Note, mremap() to expand the size of a VM_PFNMAP range is not a valid case as VM_DONTEXPAND is set along with VM_PFNMAP. --- Toshi Kani (2): 1/2 x86/mm/pat: Change untrack_pfn() to handle unmapped vma 2/2 x86/mm/pat: Change free_memtype() to free shrinking range --- arch/x86/mm/pat.c | 19 ++++++++++++------- arch/x86/mm/pat_rbtree.c | 46 +++++++++++++++++++++++++++++++++++++--------- 2 files changed, 49 insertions(+), 16 deletions(-) ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 0/2] Change PAT to support mremap use-cases @ 2015-12-09 16:26 ` Toshi Kani 0 siblings, 0 replies; 12+ messages in thread From: Toshi Kani @ 2015-12-09 16:26 UTC (permalink / raw) To: tglx, mingo, hpa, bp; +Cc: stsp, x86, linux-mm, linux-kernel This patch-set fixes issues found in the pfn tracking code of PAT when mremap() is used on a VM_PFNMAP range. Patch 1/2 changes untrack_pfn() to handle the case of mremap() with MREMAP_FIXED, i.e. remapping virtual address on a VM_PFNMAP range. Patch 2/2 changes the free_memtype() path to handle the case of mremap() that shrinks the size of a VM_PFNMAP range. Note, mremap() to expand the size of a VM_PFNMAP range is not a valid case as VM_DONTEXPAND is set along with VM_PFNMAP. --- Toshi Kani (2): 1/2 x86/mm/pat: Change untrack_pfn() to handle unmapped vma 2/2 x86/mm/pat: Change free_memtype() to free shrinking range --- arch/x86/mm/pat.c | 19 ++++++++++++------- arch/x86/mm/pat_rbtree.c | 46 +++++++++++++++++++++++++++++++++++++--------- 2 files changed, 49 insertions(+), 16 deletions(-) -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/2] x86/mm/pat: Change untrack_pfn() to handle unmapped vma 2015-12-09 16:26 ` Toshi Kani @ 2015-12-09 16:26 ` Toshi Kani -1 siblings, 0 replies; 12+ messages in thread From: Toshi Kani @ 2015-12-09 16:26 UTC (permalink / raw) To: tglx, mingo, hpa, bp Cc: stsp, x86, linux-mm, linux-kernel, Toshi Kani, Borislav Petkov mremap() with MREMAP_FIXED, remapping to a new virtual address, on a VM_PFNMAP range causes the following WARN_ON_ONCE() message in untrack_pfn(). WARNING: CPU: 1 PID: 3493 at arch/x86/mm/pat.c:985 untrack_pfn+0xbd/0xd0() Call Trace: [<ffffffff817729ea>] dump_stack+0x45/0x57 [<ffffffff8109e4b6>] warn_slowpath_common+0x86/0xc0 [<ffffffff8109e5ea>] warn_slowpath_null+0x1a/0x20 [<ffffffff8106a88d>] untrack_pfn+0xbd/0xd0 [<ffffffff811d2d5e>] unmap_single_vma+0x80e/0x860 [<ffffffff811d3725>] unmap_vmas+0x55/0xb0 [<ffffffff811d916c>] unmap_region+0xac/0x120 [<ffffffff811db86a>] do_munmap+0x28a/0x460 [<ffffffff811dec33>] move_vma+0x1b3/0x2e0 [<ffffffff811df113>] SyS_mremap+0x3b3/0x510 [<ffffffff817793ee>] entry_SYSCALL_64_fastpath+0x12/0x71 MREMAP_FIXED moves a virtual address of VM_PFNMAP, but keeps the pfn and cache type. In this case, untrack_pfn() is called with the old vma after its translation has removed. Hence, when follow_phys() fails, track_pfn() is changed to keep the pfn tracked and clears VM_PAT from the old vma, instead of WARN_ON_ONCE() on the case. Reference: https://lkml.org/lkml/2015/10/28/865 Reported-by: Stas Sergeev <stsp@list.ru> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> --- arch/x86/mm/pat.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 188e3e0..f3e391e 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -966,8 +966,14 @@ int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, /* * untrack_pfn is called while unmapping a pfnmap for a region. - * untrack can be called for a specific region indicated by pfn and size or - * can be for the entire vma (in which case pfn, size are zero). + * untrack_pfn can be called for a specific region indicated by pfn and + * size or can be for the entire vma (in which case pfn, size are zero). + * + * NOTE: mremap may move a virtual address of VM_PFNMAP, but keeps the + * pfn and cache type. In this case, untrack_pfn() is called with the + * old vma after its translation has removed. Hence, when follow_phys() + * fails, track_pfn() keeps the pfn tracked and clears VM_PAT from the + * old vma. */ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn, unsigned long size) @@ -981,14 +987,13 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn, /* free the chunk starting from pfn or the whole chunk */ paddr = (resource_size_t)pfn << PAGE_SHIFT; if (!paddr && !size) { - if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) { - WARN_ON_ONCE(1); - return; - } + if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) + goto out; size = vma->vm_end - vma->vm_start; } free_pfn_range(paddr, size); +out: vma->vm_flags &= ~VM_PAT; } ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 1/2] x86/mm/pat: Change untrack_pfn() to handle unmapped vma @ 2015-12-09 16:26 ` Toshi Kani 0 siblings, 0 replies; 12+ messages in thread From: Toshi Kani @ 2015-12-09 16:26 UTC (permalink / raw) To: tglx, mingo, hpa, bp Cc: stsp, x86, linux-mm, linux-kernel, Toshi Kani, Borislav Petkov mremap() with MREMAP_FIXED, remapping to a new virtual address, on a VM_PFNMAP range causes the following WARN_ON_ONCE() message in untrack_pfn(). WARNING: CPU: 1 PID: 3493 at arch/x86/mm/pat.c:985 untrack_pfn+0xbd/0xd0() Call Trace: [<ffffffff817729ea>] dump_stack+0x45/0x57 [<ffffffff8109e4b6>] warn_slowpath_common+0x86/0xc0 [<ffffffff8109e5ea>] warn_slowpath_null+0x1a/0x20 [<ffffffff8106a88d>] untrack_pfn+0xbd/0xd0 [<ffffffff811d2d5e>] unmap_single_vma+0x80e/0x860 [<ffffffff811d3725>] unmap_vmas+0x55/0xb0 [<ffffffff811d916c>] unmap_region+0xac/0x120 [<ffffffff811db86a>] do_munmap+0x28a/0x460 [<ffffffff811dec33>] move_vma+0x1b3/0x2e0 [<ffffffff811df113>] SyS_mremap+0x3b3/0x510 [<ffffffff817793ee>] entry_SYSCALL_64_fastpath+0x12/0x71 MREMAP_FIXED moves a virtual address of VM_PFNMAP, but keeps the pfn and cache type. In this case, untrack_pfn() is called with the old vma after its translation has removed. Hence, when follow_phys() fails, track_pfn() is changed to keep the pfn tracked and clears VM_PAT from the old vma, instead of WARN_ON_ONCE() on the case. Reference: https://lkml.org/lkml/2015/10/28/865 Reported-by: Stas Sergeev <stsp@list.ru> Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> --- arch/x86/mm/pat.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index 188e3e0..f3e391e 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -966,8 +966,14 @@ int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, /* * untrack_pfn is called while unmapping a pfnmap for a region. - * untrack can be called for a specific region indicated by pfn and size or - * can be for the entire vma (in which case pfn, size are zero). + * untrack_pfn can be called for a specific region indicated by pfn and + * size or can be for the entire vma (in which case pfn, size are zero). + * + * NOTE: mremap may move a virtual address of VM_PFNMAP, but keeps the + * pfn and cache type. In this case, untrack_pfn() is called with the + * old vma after its translation has removed. Hence, when follow_phys() + * fails, track_pfn() keeps the pfn tracked and clears VM_PAT from the + * old vma. */ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn, unsigned long size) @@ -981,14 +987,13 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn, /* free the chunk starting from pfn or the whole chunk */ paddr = (resource_size_t)pfn << PAGE_SHIFT; if (!paddr && !size) { - if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) { - WARN_ON_ONCE(1); - return; - } + if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) + goto out; size = vma->vm_end - vma->vm_start; } free_pfn_range(paddr, size); +out: vma->vm_flags &= ~VM_PAT; } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] x86/mm/pat: Change untrack_pfn() to handle unmapped vma 2015-12-09 16:26 ` Toshi Kani (?) @ 2015-12-20 9:21 ` Thomas Gleixner 2015-12-22 17:02 ` Toshi Kani -1 siblings, 1 reply; 12+ messages in thread From: Thomas Gleixner @ 2015-12-20 9:21 UTC (permalink / raw) To: Toshi Kani Cc: mingo, hpa, bp, stsp, x86, linux-mm, linux-kernel, Borislav Petkov Toshi, On Wed, 9 Dec 2015, Toshi Kani wrote: > diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c > index 188e3e0..f3e391e 100644 > --- a/arch/x86/mm/pat.c > +++ b/arch/x86/mm/pat.c > @@ -966,8 +966,14 @@ int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot, > > /* > * untrack_pfn is called while unmapping a pfnmap for a region. > - * untrack can be called for a specific region indicated by pfn and size or > - * can be for the entire vma (in which case pfn, size are zero). > + * untrack_pfn can be called for a specific region indicated by pfn and > + * size or can be for the entire vma (in which case pfn, size are zero). > + * > + * NOTE: mremap may move a virtual address of VM_PFNMAP, but keeps the > + * pfn and cache type. In this case, untrack_pfn() is called with the > + * old vma after its translation has removed. Hence, when follow_phys() > + * fails, track_pfn() keeps the pfn tracked and clears VM_PAT from the > + * old vma. > */ > void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn, > unsigned long size) > @@ -981,14 +987,13 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn, > /* free the chunk starting from pfn or the whole chunk */ > paddr = (resource_size_t)pfn << PAGE_SHIFT; > if (!paddr && !size) { > - if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) { > - WARN_ON_ONCE(1); > - return; > - } > + if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) > + goto out; Shouldn't we have an explicit call in the mremap code which clears the PAT flag on the mm instead of removing this sanity check? Because that's what we end up with there. We just clear the PAT flag. I rather prefer to do that explicitely, so the following call to untrack_pfn() from move_vma()->do_munmap() ... will see the PAT flag cleared. untrack_moved_pfn() or such. Thanks, tglx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] x86/mm/pat: Change untrack_pfn() to handle unmapped vma 2015-12-20 9:21 ` Thomas Gleixner @ 2015-12-22 17:02 ` Toshi Kani 0 siblings, 0 replies; 12+ messages in thread From: Toshi Kani @ 2015-12-22 17:02 UTC (permalink / raw) To: Thomas Gleixner Cc: mingo, hpa, bp, stsp, x86, linux-mm, linux-kernel, Borislav Petkov On Sun, 2015-12-20 at 10:21 +0100, Thomas Gleixner wrote: > Toshi, > > On Wed, 9 Dec 2015, Toshi Kani wrote: > > diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c > > index 188e3e0..f3e391e 100644 > > --- a/arch/x86/mm/pat.c > > +++ b/arch/x86/mm/pat.c > > @@ -966,8 +966,14 @@ int track_pfn_insert(struct vm_area_struct *vma, > > pgprot_t *prot, > > > > /* > > * untrack_pfn is called while unmapping a pfnmap for a region. > > - * untrack can be called for a specific region indicated by pfn and > > size or > > - * can be for the entire vma (in which case pfn, size are zero). > > + * untrack_pfn can be called for a specific region indicated by pfn > > and > > + * size or can be for the entire vma (in which case pfn, size are > > zero). > > + * > > + * NOTE: mremap may move a virtual address of VM_PFNMAP, but keeps the > > + * pfn and cache type. In this case, untrack_pfn() is called with the > > + * old vma after its translation has removed. Hence, when > > follow_phys() > > + * fails, track_pfn() keeps the pfn tracked and clears VM_PAT from the > > + * old vma. > > */ > > void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn, > > unsigned long size) > > @@ -981,14 +987,13 @@ void untrack_pfn(struct vm_area_struct *vma, > > unsigned long pfn, > > /* free the chunk starting from pfn or the whole chunk */ > > paddr = (resource_size_t)pfn << PAGE_SHIFT; > > if (!paddr && !size) { > > - if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) > > { > > - WARN_ON_ONCE(1); > > - return; > > - } > > + if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) > > + goto out; > > Shouldn't we have an explicit call in the mremap code which clears the > PAT flag on the mm instead of removing this sanity check? > > Because that's what we end up with there. We just clear the PAT flag. > > I rather prefer to do that explicitely, so the following call to > untrack_pfn() from move_vma()->do_munmap() ... will see the PAT flag > cleared. untrack_moved_pfn() or such. Agreed. I will add untrack_pfn_moved(), which clears the PAT flag. Thanks! -Toshi ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 1/2] x86/mm/pat: Change untrack_pfn() to handle unmapped vma @ 2015-12-22 17:02 ` Toshi Kani 0 siblings, 0 replies; 12+ messages in thread From: Toshi Kani @ 2015-12-22 17:02 UTC (permalink / raw) To: Thomas Gleixner Cc: mingo, hpa, bp, stsp, x86, linux-mm, linux-kernel, Borislav Petkov On Sun, 2015-12-20 at 10:21 +0100, Thomas Gleixner wrote: > Toshi, > > On Wed, 9 Dec 2015, Toshi Kani wrote: > > diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c > > index 188e3e0..f3e391e 100644 > > --- a/arch/x86/mm/pat.c > > +++ b/arch/x86/mm/pat.c > > @@ -966,8 +966,14 @@ int track_pfn_insert(struct vm_area_struct *vma, > > pgprot_t *prot, > > > > /* > > * untrack_pfn is called while unmapping a pfnmap for a region. > > - * untrack can be called for a specific region indicated by pfn and > > size or > > - * can be for the entire vma (in which case pfn, size are zero). > > + * untrack_pfn can be called for a specific region indicated by pfn > > and > > + * size or can be for the entire vma (in which case pfn, size are > > zero). > > + * > > + * NOTE: mremap may move a virtual address of VM_PFNMAP, but keeps the > > + * pfn and cache type. In this case, untrack_pfn() is called with the > > + * old vma after its translation has removed. Hence, when > > follow_phys() > > + * fails, track_pfn() keeps the pfn tracked and clears VM_PAT from the > > + * old vma. > > */ > > void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn, > > unsigned long size) > > @@ -981,14 +987,13 @@ void untrack_pfn(struct vm_area_struct *vma, > > unsigned long pfn, > > /* free the chunk starting from pfn or the whole chunk */ > > paddr = (resource_size_t)pfn << PAGE_SHIFT; > > if (!paddr && !size) { > > - if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) > > { > > - WARN_ON_ONCE(1); > > - return; > > - } > > + if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) > > + goto out; > > Shouldn't we have an explicit call in the mremap code which clears the > PAT flag on the mm instead of removing this sanity check? > > Because that's what we end up with there. We just clear the PAT flag. > > I rather prefer to do that explicitely, so the following call to > untrack_pfn() from move_vma()->do_munmap() ... will see the PAT flag > cleared. untrack_moved_pfn() or such. Agreed. I will add untrack_pfn_moved(), which clears the PAT flag. Thanks! -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 2/2] x86/mm/pat: Change free_memtype() to free shrinking range 2015-12-09 16:26 ` Toshi Kani @ 2015-12-09 16:26 ` Toshi Kani -1 siblings, 0 replies; 12+ messages in thread From: Toshi Kani @ 2015-12-09 16:26 UTC (permalink / raw) To: tglx, mingo, hpa, bp Cc: stsp, x86, linux-mm, linux-kernel, Toshi Kani, Borislav Petkov mremap() to shrink the map size of a VM_PFNMAP range causes the following error message, and leaves the pfn range allocated. x86/PAT: test:3493 freeing invalid memtype [mem 0x483200000-0x4863fffff] rbt_memtype_erase(), called from free_memtype() with spin_lock held, only accepts to free a whole memtype node in memtype_rbroot. Change rbt_memtype_erase() to accept a request that shrinks the size of a memtype node for mremap(). memtype_rb_exact_match() is renamed to memtype_rb_match(), which now performs exact match or shrink match per match_type. Since the memtype_rbroot tree allows overlapping ranges, rbt_memtype_erase() checks exact match first to ensure the case of freeing a whole node, which is the normal case. For the shrink case, rbt_memtype_erase() proceeds in two steps, 1) remove the node, and then 2) insert the updated node. This allows proper update of augmented values, subtree_max_end, in the tree. Reference: https://lkml.org/lkml/2015/10/28/865 Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> --- arch/x86/mm/pat.c | 2 +- arch/x86/mm/pat_rbtree.c | 46 +++++++++++++++++++++++++++++++++++++--------- 2 files changed, 38 insertions(+), 10 deletions(-) diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index f3e391e..f277efd 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -586,7 +586,7 @@ int free_memtype(u64 start, u64 end) entry = rbt_memtype_erase(start, end); spin_unlock(&memtype_lock); - if (!entry) { + if (IS_ERR(entry)) { pr_info("x86/PAT: %s:%d freeing invalid memtype [mem %#010Lx-%#010Lx]\n", current->comm, current->pid, start, end - 1); return -EINVAL; diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c index 6393108..d6faef8 100644 --- a/arch/x86/mm/pat_rbtree.c +++ b/arch/x86/mm/pat_rbtree.c @@ -98,8 +98,13 @@ static struct memtype *memtype_rb_lowest_match(struct rb_root *root, return last_lower; /* Returns NULL if there is no overlap */ } -static struct memtype *memtype_rb_exact_match(struct rb_root *root, - u64 start, u64 end) +enum { + MEMTYPE_EXACT_MATCH = 0, + MEMTYPE_SHRINK_MATCH = 1 +}; + +static struct memtype *memtype_rb_match(struct rb_root *root, + u64 start, u64 end, int match_type) { struct memtype *match; @@ -107,7 +112,12 @@ static struct memtype *memtype_rb_exact_match(struct rb_root *root, while (match != NULL && match->start < end) { struct rb_node *node; - if (match->start == start && match->end == end) + if ((match_type == MEMTYPE_EXACT_MATCH) && + (match->start == start) && (match->end == end)) + return match; + + if ((match_type == MEMTYPE_SHRINK_MATCH) && + (match->start < start) && (match->end == end)) return match; node = rb_next(&match->rb); @@ -117,7 +127,7 @@ static struct memtype *memtype_rb_exact_match(struct rb_root *root, match = NULL; } - return NULL; /* Returns NULL if there is no exact match */ + return NULL; /* Returns NULL if there is no match */ } static int memtype_rb_check_conflict(struct rb_root *root, @@ -210,12 +220,30 @@ struct memtype *rbt_memtype_erase(u64 start, u64 end) { struct memtype *data; - data = memtype_rb_exact_match(&memtype_rbroot, start, end); - if (!data) - goto out; + /* Exact match takes precedence over shrink match */ + data = memtype_rb_match(&memtype_rbroot, start, end, + MEMTYPE_EXACT_MATCH); + if (!data) { + data = memtype_rb_match(&memtype_rbroot, start, end, + MEMTYPE_SHRINK_MATCH); + if (!data) + return ERR_PTR(-EINVAL); + } + + if (data->start == start) { + /* Exact match: erase this node */ + rb_erase_augmented(&data->rb, &memtype_rbroot, + &memtype_rb_augment_cb); + } else { + /* Shrink match: update the end value of this node */ + rb_erase_augmented(&data->rb, &memtype_rbroot, + &memtype_rb_augment_cb); + data->end = start; + data->subtree_max_end = data->end; + memtype_rb_insert(&memtype_rbroot, data); + return NULL; + } - rb_erase_augmented(&data->rb, &memtype_rbroot, &memtype_rb_augment_cb); -out: return data; } ^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/2] x86/mm/pat: Change free_memtype() to free shrinking range @ 2015-12-09 16:26 ` Toshi Kani 0 siblings, 0 replies; 12+ messages in thread From: Toshi Kani @ 2015-12-09 16:26 UTC (permalink / raw) To: tglx, mingo, hpa, bp Cc: stsp, x86, linux-mm, linux-kernel, Toshi Kani, Borislav Petkov mremap() to shrink the map size of a VM_PFNMAP range causes the following error message, and leaves the pfn range allocated. x86/PAT: test:3493 freeing invalid memtype [mem 0x483200000-0x4863fffff] rbt_memtype_erase(), called from free_memtype() with spin_lock held, only accepts to free a whole memtype node in memtype_rbroot. Change rbt_memtype_erase() to accept a request that shrinks the size of a memtype node for mremap(). memtype_rb_exact_match() is renamed to memtype_rb_match(), which now performs exact match or shrink match per match_type. Since the memtype_rbroot tree allows overlapping ranges, rbt_memtype_erase() checks exact match first to ensure the case of freeing a whole node, which is the normal case. For the shrink case, rbt_memtype_erase() proceeds in two steps, 1) remove the node, and then 2) insert the updated node. This allows proper update of augmented values, subtree_max_end, in the tree. Reference: https://lkml.org/lkml/2015/10/28/865 Signed-off-by: Toshi Kani <toshi.kani@hpe.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@suse.de> --- arch/x86/mm/pat.c | 2 +- arch/x86/mm/pat_rbtree.c | 46 +++++++++++++++++++++++++++++++++++++--------- 2 files changed, 38 insertions(+), 10 deletions(-) diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c index f3e391e..f277efd 100644 --- a/arch/x86/mm/pat.c +++ b/arch/x86/mm/pat.c @@ -586,7 +586,7 @@ int free_memtype(u64 start, u64 end) entry = rbt_memtype_erase(start, end); spin_unlock(&memtype_lock); - if (!entry) { + if (IS_ERR(entry)) { pr_info("x86/PAT: %s:%d freeing invalid memtype [mem %#010Lx-%#010Lx]\n", current->comm, current->pid, start, end - 1); return -EINVAL; diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c index 6393108..d6faef8 100644 --- a/arch/x86/mm/pat_rbtree.c +++ b/arch/x86/mm/pat_rbtree.c @@ -98,8 +98,13 @@ static struct memtype *memtype_rb_lowest_match(struct rb_root *root, return last_lower; /* Returns NULL if there is no overlap */ } -static struct memtype *memtype_rb_exact_match(struct rb_root *root, - u64 start, u64 end) +enum { + MEMTYPE_EXACT_MATCH = 0, + MEMTYPE_SHRINK_MATCH = 1 +}; + +static struct memtype *memtype_rb_match(struct rb_root *root, + u64 start, u64 end, int match_type) { struct memtype *match; @@ -107,7 +112,12 @@ static struct memtype *memtype_rb_exact_match(struct rb_root *root, while (match != NULL && match->start < end) { struct rb_node *node; - if (match->start == start && match->end == end) + if ((match_type == MEMTYPE_EXACT_MATCH) && + (match->start == start) && (match->end == end)) + return match; + + if ((match_type == MEMTYPE_SHRINK_MATCH) && + (match->start < start) && (match->end == end)) return match; node = rb_next(&match->rb); @@ -117,7 +127,7 @@ static struct memtype *memtype_rb_exact_match(struct rb_root *root, match = NULL; } - return NULL; /* Returns NULL if there is no exact match */ + return NULL; /* Returns NULL if there is no match */ } static int memtype_rb_check_conflict(struct rb_root *root, @@ -210,12 +220,30 @@ struct memtype *rbt_memtype_erase(u64 start, u64 end) { struct memtype *data; - data = memtype_rb_exact_match(&memtype_rbroot, start, end); - if (!data) - goto out; + /* Exact match takes precedence over shrink match */ + data = memtype_rb_match(&memtype_rbroot, start, end, + MEMTYPE_EXACT_MATCH); + if (!data) { + data = memtype_rb_match(&memtype_rbroot, start, end, + MEMTYPE_SHRINK_MATCH); + if (!data) + return ERR_PTR(-EINVAL); + } + + if (data->start == start) { + /* Exact match: erase this node */ + rb_erase_augmented(&data->rb, &memtype_rbroot, + &memtype_rb_augment_cb); + } else { + /* Shrink match: update the end value of this node */ + rb_erase_augmented(&data->rb, &memtype_rbroot, + &memtype_rb_augment_cb); + data->end = start; + data->subtree_max_end = data->end; + memtype_rb_insert(&memtype_rbroot, data); + return NULL; + } - rb_erase_augmented(&data->rb, &memtype_rbroot, &memtype_rb_augment_cb); -out: return data; } -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] x86/mm/pat: Change free_memtype() to free shrinking range 2015-12-09 16:26 ` Toshi Kani (?) @ 2015-12-20 9:27 ` Thomas Gleixner 2015-12-22 17:19 ` Toshi Kani -1 siblings, 1 reply; 12+ messages in thread From: Thomas Gleixner @ 2015-12-20 9:27 UTC (permalink / raw) To: Toshi Kani Cc: mingo, hpa, bp, stsp, x86, linux-mm, linux-kernel, Borislav Petkov Toshi, On Wed, 9 Dec 2015, Toshi Kani wrote: > diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c > index 6393108..d6faef8 100644 > --- a/arch/x86/mm/pat_rbtree.c > +++ b/arch/x86/mm/pat_rbtree.c > @@ -107,7 +112,12 @@ static struct memtype *memtype_rb_exact_match(struct rb_root *root, > while (match != NULL && match->start < end) { > struct rb_node *node; > > - if (match->start == start && match->end == end) > + if ((match_type == MEMTYPE_EXACT_MATCH) && > + (match->start == start) && (match->end == end)) > + return match; > + > + if ((match_type == MEMTYPE_SHRINK_MATCH) && > + (match->start < start) && (match->end == end)) Confused. If we shrink a mapping then I'd expect that the start of the mapping stays the same and the end changes. I certainly miss something here, but if the above is correct, then it definitely needs a big fat comment explaining it. Thanks, tglx ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] x86/mm/pat: Change free_memtype() to free shrinking range 2015-12-20 9:27 ` Thomas Gleixner @ 2015-12-22 17:19 ` Toshi Kani 0 siblings, 0 replies; 12+ messages in thread From: Toshi Kani @ 2015-12-22 17:19 UTC (permalink / raw) To: Thomas Gleixner Cc: mingo, hpa, bp, stsp, x86, linux-mm, linux-kernel, Borislav Petkov On Sun, 2015-12-20 at 10:27 +0100, Thomas Gleixner wrote: > Toshi, > > On Wed, 9 Dec 2015, Toshi Kani wrote: > > diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c > > index 6393108..d6faef8 100644 > > --- a/arch/x86/mm/pat_rbtree.c > > +++ b/arch/x86/mm/pat_rbtree.c > > @@ -107,7 +112,12 @@ static struct memtype > > *memtype_rb_exact_match(struct rb_root *root, > > while (match != NULL && match->start < end) { > > struct rb_node *node; > > > > - if (match->start == start && match->end == end) > > + if ((match_type == MEMTYPE_EXACT_MATCH) && > > + (match->start == start) && (match->end == end)) > > + return match; > > + > > + if ((match_type == MEMTYPE_SHRINK_MATCH) && > > + (match->start < start) && (match->end == end)) > > Confused. If we shrink a mapping then I'd expect that the start of the > mapping stays the same and the end changes. Yes, that is correct after this request is done. > I certainly miss something here, but if the above is correct, then it > definitely needs a big fat comment explaining it. This request specifies a range being "unmapped", not the remaining mapped range. So, when the mapping range is going to shrink from the end, the unmapping range has a bigger 'start' value and the same 'end' value. Thanks, -Toshi ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH 2/2] x86/mm/pat: Change free_memtype() to free shrinking range @ 2015-12-22 17:19 ` Toshi Kani 0 siblings, 0 replies; 12+ messages in thread From: Toshi Kani @ 2015-12-22 17:19 UTC (permalink / raw) To: Thomas Gleixner Cc: mingo, hpa, bp, stsp, x86, linux-mm, linux-kernel, Borislav Petkov On Sun, 2015-12-20 at 10:27 +0100, Thomas Gleixner wrote: > Toshi, > > On Wed, 9 Dec 2015, Toshi Kani wrote: > > diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c > > index 6393108..d6faef8 100644 > > --- a/arch/x86/mm/pat_rbtree.c > > +++ b/arch/x86/mm/pat_rbtree.c > > @@ -107,7 +112,12 @@ static struct memtype > > *memtype_rb_exact_match(struct rb_root *root, > > while (match != NULL && match->start < end) { > > struct rb_node *node; > > > > - if (match->start == start && match->end == end) > > + if ((match_type == MEMTYPE_EXACT_MATCH) && > > + (match->start == start) && (match->end == end)) > > + return match; > > + > > + if ((match_type == MEMTYPE_SHRINK_MATCH) && > > + (match->start < start) && (match->end == end)) > > Confused. If we shrink a mapping then I'd expect that the start of the > mapping stays the same and the end changes. Yes, that is correct after this request is done. > I certainly miss something here, but if the above is correct, then it > definitely needs a big fat comment explaining it. This request specifies a range being "unmapped", not the remaining mapped range. So, when the mapping range is going to shrink from the end, the unmapping range has a bigger 'start' value and the same 'end' value. Thanks, -Toshi -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a> ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2015-12-22 17:20 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2015-12-09 16:26 [PATCH 0/2] Change PAT to support mremap use-cases Toshi Kani 2015-12-09 16:26 ` Toshi Kani 2015-12-09 16:26 ` [PATCH 1/2] x86/mm/pat: Change untrack_pfn() to handle unmapped vma Toshi Kani 2015-12-09 16:26 ` Toshi Kani 2015-12-20 9:21 ` Thomas Gleixner 2015-12-22 17:02 ` Toshi Kani 2015-12-22 17:02 ` Toshi Kani 2015-12-09 16:26 ` [PATCH 2/2] x86/mm/pat: Change free_memtype() to free shrinking range Toshi Kani 2015-12-09 16:26 ` Toshi Kani 2015-12-20 9:27 ` Thomas Gleixner 2015-12-22 17:19 ` Toshi Kani 2015-12-22 17:19 ` Toshi Kani
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.