* [PATCH 0/2] Change PAT to support mremap use-cases
@ 2015-12-09 16:26 ` Toshi Kani
  0 siblings, 0 replies; 12+ messages in thread
From: Toshi Kani @ 2015-12-09 16:26 UTC (permalink / raw)
  To: tglx, mingo, hpa, bp; +Cc: stsp, x86, linux-mm, linux-kernel

This patch-set fixes issues found in the pfn tracking code of PAT
when mremap() is used on a VM_PFNMAP range.

Patch 1/2 changes untrack_pfn() to handle the case of mremap() with
MREMAP_FIXED, i.e. remapping the virtual address of a VM_PFNMAP range.

Patch 2/2 changes the free_memtype() path to handle the case of
mremap() shrinking the size of a VM_PFNMAP range.

Note that expanding the size of a VM_PFNMAP range with mremap() is not
a valid case, since VM_DONTEXPAND is set along with VM_PFNMAP.
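
For reference, below is a minimal userspace sketch of the two use-cases.
It is illustrative only: 'fd' is assumed to be a device file whose mmap
handler sets up a VM_PFNMAP mapping (e.g. via remap_pfn_range()),
'fixed_addr' is assumed to be a free fixed address, and error checking
is omitted.

  #define _GNU_SOURCE
  #include <sys/mman.h>

  static void pfnmap_mremap_cases(int fd, void *fixed_addr)
  {
          size_t sz = 64UL << 20;        /* 64MB of device memory */

          void *old = mmap(NULL, sz, PROT_READ | PROT_WRITE,
                           MAP_SHARED, fd, 0);

          /* use-case 1 (patch 1/2): move the mapping to a fixed new
           * virtual address; the pfn range and cache type are unchanged */
          void *new = mremap(old, sz, sz,
                             MREMAP_MAYMOVE | MREMAP_FIXED, fixed_addr);

          /* use-case 2 (patch 2/2): shrink the mapping; the tail of the
           * tracked pfn range must be freed */
          mremap(new, sz, sz / 2, 0);
  }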

---
Toshi Kani (2):
 1/2 x86/mm/pat: Change untrack_pfn() to handle unmapped vma
 2/2 x86/mm/pat: Change free_memtype() to free shrinking range

---
 arch/x86/mm/pat.c        | 19 ++++++++++++-------
 arch/x86/mm/pat_rbtree.c | 46 +++++++++++++++++++++++++++++++++++++---------
 2 files changed, 49 insertions(+), 16 deletions(-)

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH 1/2] x86/mm/pat: Change untrack_pfn() to handle unmapped vma
  2015-12-09 16:26 ` Toshi Kani
@ 2015-12-09 16:26   ` Toshi Kani
  -1 siblings, 0 replies; 12+ messages in thread
From: Toshi Kani @ 2015-12-09 16:26 UTC (permalink / raw)
  To: tglx, mingo, hpa, bp
  Cc: stsp, x86, linux-mm, linux-kernel, Toshi Kani, Borislav Petkov

mremap() with MREMAP_FIXED, which remaps a VM_PFNMAP range to a new
virtual address, causes the following WARN_ON_ONCE() message in
untrack_pfn().

  WARNING: CPU: 1 PID: 3493 at arch/x86/mm/pat.c:985 untrack_pfn+0xbd/0xd0()
  Call Trace:
  [<ffffffff817729ea>] dump_stack+0x45/0x57
  [<ffffffff8109e4b6>] warn_slowpath_common+0x86/0xc0
  [<ffffffff8109e5ea>] warn_slowpath_null+0x1a/0x20
  [<ffffffff8106a88d>] untrack_pfn+0xbd/0xd0
  [<ffffffff811d2d5e>] unmap_single_vma+0x80e/0x860
  [<ffffffff811d3725>] unmap_vmas+0x55/0xb0
  [<ffffffff811d916c>] unmap_region+0xac/0x120
  [<ffffffff811db86a>] do_munmap+0x28a/0x460
  [<ffffffff811dec33>] move_vma+0x1b3/0x2e0
  [<ffffffff811df113>] SyS_mremap+0x3b3/0x510
  [<ffffffff817793ee>] entry_SYSCALL_64_fastpath+0x12/0x71

MREMAP_FIXED moves the virtual address of a VM_PFNMAP mapping, but
keeps the pfn and cache type.  In this case, untrack_pfn() is called
with the old vma after its translation has been removed.  Hence, when
follow_phys() fails, untrack_pfn() is changed to keep the pfn tracked
and to clear VM_PAT from the old vma, instead of hitting the
WARN_ON_ONCE().

Reference: https://lkml.org/lkml/2015/10/28/865
Reported-by: Stas Sergeev <stsp@list.ru>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
---
 arch/x86/mm/pat.c |   17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 188e3e0..f3e391e 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -966,8 +966,14 @@ int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
 
 /*
  * untrack_pfn is called while unmapping a pfnmap for a region.
- * untrack can be called for a specific region indicated by pfn and size or
- * can be for the entire vma (in which case pfn, size are zero).
+ * untrack_pfn can be called for a specific region indicated by pfn and
+ * size or can be for the entire vma (in which case pfn, size are zero).
+ *
+ * NOTE: mremap may move a virtual address of VM_PFNMAP, but keeps the
+ * pfn and cache type.  In this case, untrack_pfn() is called with the
+ * old vma after its translation has removed.  Hence, when follow_phys()
+ * fails, track_pfn() keeps the pfn tracked and clears VM_PAT from the
+ * old vma.
  */
 void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 		 unsigned long size)
@@ -981,14 +987,13 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
 	/* free the chunk starting from pfn or the whole chunk */
 	paddr = (resource_size_t)pfn << PAGE_SHIFT;
 	if (!paddr && !size) {
-		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) {
-			WARN_ON_ONCE(1);
-			return;
-		}
+		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr))
+			goto out;
 
 		size = vma->vm_end - vma->vm_start;
 	}
 	free_pfn_range(paddr, size);
+out:
 	vma->vm_flags &= ~VM_PAT;
 }
 

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] x86/mm/pat: Change free_memtype() to free shrinking range
  2015-12-09 16:26 ` Toshi Kani
@ 2015-12-09 16:26   ` Toshi Kani
  -1 siblings, 0 replies; 12+ messages in thread
From: Toshi Kani @ 2015-12-09 16:26 UTC (permalink / raw)
  To: tglx, mingo, hpa, bp
  Cc: stsp, x86, linux-mm, linux-kernel, Toshi Kani, Borislav Petkov

mremap() to shrink the map size of a VM_PFNMAP range causes
the following error message, and leaves the pfn range allocated.

 x86/PAT: test:3493 freeing invalid memtype [mem 0x483200000-0x4863fffff]

rbt_memtype_erase(), called from free_memtype() with the memtype
spinlock held, only supports freeing a whole memtype node in
memtype_rbroot.  Change rbt_memtype_erase() to also accept a request
that shrinks the size of a memtype node, as needed by mremap().

memtype_rb_exact_match() is renamed to memtype_rb_match(), which now
performs an exact match or a shrink match depending on match_type.
Since the memtype_rbroot tree allows overlapping ranges,
rbt_memtype_erase() checks for an exact match first, covering the
normal case of freeing a whole node.  For the shrink case,
rbt_memtype_erase() proceeds in two steps: 1) remove the node, and
2) re-insert the updated node.  This allows the augmented value,
subtree_max_end, to be properly updated in the tree.

Reference: https://lkml.org/lkml/2015/10/28/865
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@suse.de>
---
 arch/x86/mm/pat.c        |    2 +-
 arch/x86/mm/pat_rbtree.c |   46 +++++++++++++++++++++++++++++++++++++---------
 2 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index f3e391e..f277efd 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -586,7 +586,7 @@ int free_memtype(u64 start, u64 end)
 	entry = rbt_memtype_erase(start, end);
 	spin_unlock(&memtype_lock);
 
-	if (!entry) {
+	if (IS_ERR(entry)) {
 		pr_info("x86/PAT: %s:%d freeing invalid memtype [mem %#010Lx-%#010Lx]\n",
 			current->comm, current->pid, start, end - 1);
 		return -EINVAL;
diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
index 6393108..d6faef8 100644
--- a/arch/x86/mm/pat_rbtree.c
+++ b/arch/x86/mm/pat_rbtree.c
@@ -98,8 +98,13 @@ static struct memtype *memtype_rb_lowest_match(struct rb_root *root,
 	return last_lower; /* Returns NULL if there is no overlap */
 }
 
-static struct memtype *memtype_rb_exact_match(struct rb_root *root,
-				u64 start, u64 end)
+enum {
+	MEMTYPE_EXACT_MATCH  = 0,
+	MEMTYPE_SHRINK_MATCH = 1
+};
+
+static struct memtype *memtype_rb_match(struct rb_root *root,
+				u64 start, u64 end, int match_type)
 {
 	struct memtype *match;
 
@@ -107,7 +112,12 @@ static struct memtype *memtype_rb_exact_match(struct rb_root *root,
 	while (match != NULL && match->start < end) {
 		struct rb_node *node;
 
-		if (match->start == start && match->end == end)
+		if ((match_type == MEMTYPE_EXACT_MATCH) &&
+		    (match->start == start) && (match->end == end))
+			return match;
+
+		if ((match_type == MEMTYPE_SHRINK_MATCH) &&
+		    (match->start < start) && (match->end == end))
 			return match;
 
 		node = rb_next(&match->rb);
@@ -117,7 +127,7 @@ static struct memtype *memtype_rb_exact_match(struct rb_root *root,
 			match = NULL;
 	}
 
-	return NULL; /* Returns NULL if there is no exact match */
+	return NULL; /* Returns NULL if there is no match */
 }
 
 static int memtype_rb_check_conflict(struct rb_root *root,
@@ -210,12 +220,30 @@ struct memtype *rbt_memtype_erase(u64 start, u64 end)
 {
 	struct memtype *data;
 
-	data = memtype_rb_exact_match(&memtype_rbroot, start, end);
-	if (!data)
-		goto out;
+	/* Exact match takes precedence over shrink match */
+	data = memtype_rb_match(&memtype_rbroot, start, end,
+				MEMTYPE_EXACT_MATCH);
+	if (!data) {
+		data = memtype_rb_match(&memtype_rbroot, start, end,
+					MEMTYPE_SHRINK_MATCH);
+		if (!data)
+			return ERR_PTR(-EINVAL);
+	}
+
+	if (data->start == start) {
+		/* Exact match: erase this node */
+		rb_erase_augmented(&data->rb, &memtype_rbroot,
+					&memtype_rb_augment_cb);
+	} else {
+		/* Shrink match: update the end value of this node */
+		rb_erase_augmented(&data->rb, &memtype_rbroot,
+					&memtype_rb_augment_cb);
+		data->end = start;
+		data->subtree_max_end = data->end;
+		memtype_rb_insert(&memtype_rbroot, data);
+		return NULL;
+	}
 
-	rb_erase_augmented(&data->rb, &memtype_rbroot, &memtype_rb_augment_cb);
-out:
 	return data;
 }
 

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] x86/mm/pat: Change untrack_pfn() to handle unmapped vma
  2015-12-09 16:26   ` Toshi Kani
@ 2015-12-20  9:21   ` Thomas Gleixner
  2015-12-22 17:02       ` Toshi Kani
  -1 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2015-12-20  9:21 UTC (permalink / raw)
  To: Toshi Kani
  Cc: mingo, hpa, bp, stsp, x86, linux-mm, linux-kernel, Borislav Petkov

Toshi,

On Wed, 9 Dec 2015, Toshi Kani wrote:
> diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
> index 188e3e0..f3e391e 100644
> --- a/arch/x86/mm/pat.c
> +++ b/arch/x86/mm/pat.c
> @@ -966,8 +966,14 @@ int track_pfn_insert(struct vm_area_struct *vma, pgprot_t *prot,
>  
>  /*
>   * untrack_pfn is called while unmapping a pfnmap for a region.
> - * untrack can be called for a specific region indicated by pfn and size or
> - * can be for the entire vma (in which case pfn, size are zero).
> + * untrack_pfn can be called for a specific region indicated by pfn and
> + * size or can be for the entire vma (in which case pfn, size are zero).
> + *
> + * NOTE: mremap may move a virtual address of VM_PFNMAP, but keeps the
> + * pfn and cache type.  In this case, untrack_pfn() is called with the
> + * old vma after its translation has removed.  Hence, when follow_phys()
> + * fails, track_pfn() keeps the pfn tracked and clears VM_PAT from the
> + * old vma.
>   */
>  void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
>  		 unsigned long size)
> @@ -981,14 +987,13 @@ void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
>  	/* free the chunk starting from pfn or the whole chunk */
>  	paddr = (resource_size_t)pfn << PAGE_SHIFT;
>  	if (!paddr && !size) {
> -		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr)) {
> -			WARN_ON_ONCE(1);
> -			return;
> -		}
> +		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr))
> +			goto out;

Shouldn't we have an explicit call in the mremap code which clears the
PAT flag on the mm instead of removing this sanity check?
  
Because that's what we end up with there. We just clear the PAT flag.

I'd rather prefer to do that explicitly, so the following call to
untrack_pfn() from move_vma()->do_munmap() ... will see the PAT flag
cleared. untrack_moved_pfn() or such.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] x86/mm/pat: Change free_memtype() to free shrinking range
  2015-12-09 16:26   ` Toshi Kani
@ 2015-12-20  9:27   ` Thomas Gleixner
  2015-12-22 17:19       ` Toshi Kani
  -1 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2015-12-20  9:27 UTC (permalink / raw)
  To: Toshi Kani
  Cc: mingo, hpa, bp, stsp, x86, linux-mm, linux-kernel, Borislav Petkov

Toshi,

On Wed, 9 Dec 2015, Toshi Kani wrote:
> diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
> index 6393108..d6faef8 100644
> --- a/arch/x86/mm/pat_rbtree.c
> +++ b/arch/x86/mm/pat_rbtree.c
> @@ -107,7 +112,12 @@ static struct memtype *memtype_rb_exact_match(struct rb_root *root,
>  	while (match != NULL && match->start < end) {
>  		struct rb_node *node;
>  
> -		if (match->start == start && match->end == end)
> +		if ((match_type == MEMTYPE_EXACT_MATCH) &&
> +		    (match->start == start) && (match->end == end))
> +			return match;
> +
> +		if ((match_type == MEMTYPE_SHRINK_MATCH) &&
> +		    (match->start < start) && (match->end == end))

Confused. If we shrink a mapping then I'd expect that the start of the
mapping stays the same and the end changes. I certainly miss something
here, but if the above is correct, then it definitely needs a big fat
comment explaining it.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 1/2] x86/mm/pat: Change untrack_pfn() to handle unmapped vma
  2015-12-20  9:21   ` Thomas Gleixner
@ 2015-12-22 17:02       ` Toshi Kani
  0 siblings, 0 replies; 12+ messages in thread
From: Toshi Kani @ 2015-12-22 17:02 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: mingo, hpa, bp, stsp, x86, linux-mm, linux-kernel, Borislav Petkov

On Sun, 2015-12-20 at 10:21 +0100, Thomas Gleixner wrote:
> Toshi,
> 
> On Wed, 9 Dec 2015, Toshi Kani wrote:
> > diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
> > index 188e3e0..f3e391e 100644
> > --- a/arch/x86/mm/pat.c
> > +++ b/arch/x86/mm/pat.c
> > @@ -966,8 +966,14 @@ int track_pfn_insert(struct vm_area_struct *vma,
> > pgprot_t *prot,
> >  
> >  /*
> >   * untrack_pfn is called while unmapping a pfnmap for a region.
> > - * untrack can be called for a specific region indicated by pfn and
> > size or
> > - * can be for the entire vma (in which case pfn, size are zero).
> > + * untrack_pfn can be called for a specific region indicated by pfn
> > and
> > + * size or can be for the entire vma (in which case pfn, size are
> > zero).
> > + *
> > + * NOTE: mremap may move a virtual address of VM_PFNMAP, but keeps the
> > + * pfn and cache type.  In this case, untrack_pfn() is called with the
> > + * old vma after its translation has removed.  Hence, when
> > follow_phys()
> > + * fails, track_pfn() keeps the pfn tracked and clears VM_PAT from the
> > + * old vma.
> >   */
> >  void untrack_pfn(struct vm_area_struct *vma, unsigned long pfn,
> >  		 unsigned long size)
> > @@ -981,14 +987,13 @@ void untrack_pfn(struct vm_area_struct *vma,
> > unsigned long pfn,
> >  	/* free the chunk starting from pfn or the whole chunk */
> >  	paddr = (resource_size_t)pfn << PAGE_SHIFT;
> >  	if (!paddr && !size) {
> > -		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr))
> > {
> > -			WARN_ON_ONCE(1);
> > -			return;
> > -		}
> > +		if (follow_phys(vma, vma->vm_start, 0, &prot, &paddr))
> > +			goto out;
> 
> Shouldn't we have an explicit call in the mremap code which clears the
> PAT flag on the mm instead of removing this sanity check?
>   
> Because that's what we end up with there. We just clear the PAT flag.
> 
> I'd rather prefer to do that explicitly, so the following call to
> untrack_pfn() from move_vma()->do_munmap() ... will see the PAT flag
> cleared. untrack_moved_pfn() or such.

Agreed.  I will add untrack_pfn_moved(), which clears the PAT flag.
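
As a rough sketch (final naming and placement to be decided), such a
helper could be as simple as the following, assuming it only needs to
clear VM_PAT on the old vma since the new vma keeps tracking the same
pfn range and cache type:

  /* Called while mremapping a pfnmap to a new region, with the old
   * vma after its page table entries have been moved.  The new vma
   * still maps the same pfn range with the same cache type, so the
   * memtype stays tracked; only the old vma's VM_PAT flag is cleared. */
  void untrack_pfn_moved(struct vm_area_struct *vma)
  {
          vma->vm_flags &= ~VM_PAT;
  }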

Thanks!
-Toshi

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] x86/mm/pat: Change free_memtype() to free shrinking range
  2015-12-20  9:27   ` Thomas Gleixner
@ 2015-12-22 17:19       ` Toshi Kani
  0 siblings, 0 replies; 12+ messages in thread
From: Toshi Kani @ 2015-12-22 17:19 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: mingo, hpa, bp, stsp, x86, linux-mm, linux-kernel, Borislav Petkov

On Sun, 2015-12-20 at 10:27 +0100, Thomas Gleixner wrote:
> Toshi,
> 
> On Wed, 9 Dec 2015, Toshi Kani wrote:
> > diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
> > index 6393108..d6faef8 100644
> > --- a/arch/x86/mm/pat_rbtree.c
> > +++ b/arch/x86/mm/pat_rbtree.c
> > @@ -107,7 +112,12 @@ static struct memtype
> > *memtype_rb_exact_match(struct rb_root *root,
> >  	while (match != NULL && match->start < end) {
> >  		struct rb_node *node;
> >  
> > -		if (match->start == start && match->end == end)
> > +		if ((match_type == MEMTYPE_EXACT_MATCH) &&
> > +		    (match->start == start) && (match->end == end))
> > +			return match;
> > +
> > +		if ((match_type == MEMTYPE_SHRINK_MATCH) &&
> > +		    (match->start < start) && (match->end == end))
> 
> Confused. If we shrink a mapping then I'd expect that the start of the
> mapping stays the same and the end changes. 

Yes, that is correct for the remaining mapping once this request is done.

> I certainly miss something here, but if the above is correct, then it 
> definitely needs a big fat comment explaining it.

This request specifies the range being "unmapped", not the remaining
mapped range.  So, when the mapping shrinks from the end, the unmapped
range has a larger 'start' value and the same 'end' value as the
tracked node.
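
As a concrete example (with made-up numbers, loosely based on the error
message in the patch description):

  /* A tracked memtype node covers [0x483000000, 0x486400000).
   * mremap() shrinks the mapping so that only the first 2MB,
   * [0x483000000, 0x483200000), remains mapped.  Unmapping the
   * tail results in:
   *
   *     free_memtype(0x483200000, 0x486400000);
   *
   * i.e. 'start' is larger than the node's start while 'end' equals
   * the node's end -- the MEMTYPE_SHRINK_MATCH condition.  The node
   * is then re-inserted as [0x483000000, 0x483200000). */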

Thanks,
-Toshi

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread

Thread overview: 7 messages
2015-12-09 16:26 [PATCH 0/2] Change PAT to support mremap use-cases Toshi Kani
2015-12-09 16:26 ` [PATCH 1/2] x86/mm/pat: Change untrack_pfn() to handle unmapped vma Toshi Kani
2015-12-20  9:21   ` Thomas Gleixner
2015-12-22 17:02     ` Toshi Kani
2015-12-09 16:26 ` [PATCH 2/2] x86/mm/pat: Change free_memtype() to free shrinking range Toshi Kani
2015-12-20  9:27   ` Thomas Gleixner
2015-12-22 17:19     ` Toshi Kani
