linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] Memory policy corruption fixes V2
@ 2012-08-20 16:36 Mel Gorman
  2012-08-20 16:36 ` [PATCH 1/5] Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" Mel Gorman
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
  To: Andrew Morton, KOSAKI Motohiro
  Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
	Hugh Dickins, LKML, Linux-MM, Mel Gorman

This is a rebase with some small changes to Kosaki's "mempolicy memory
corruption fixlet" series. I had expected that Kosaki would have revised
the series by now but it's been waiting a long time.

Changelog since V1
o Rebase to 3.6-rc2
o Editted some of the changelogs
o Converted sp->lock to sp->mutex to close a race in shared_policy_replace()
o Reworked the refcount imbalance fix slightly
o Do not call mpol_put in shmem_alloc_page.

I tested this with trinity with CONFIG_DEBUG_SLAB enabled and it passed. I
did not test LTP such as Josh reported a problem with or with a database that
used shared policies like Andi tested. The series is almost all Kosaki's
work of course. If he has a revised series that simply got delayed in
posting it should take precedence.

 include/linux/mempolicy.h |    2 +-
 mm/mempolicy.c            |  142 +++++++++++++++++++++++++++++----------------
 2 files changed, 93 insertions(+), 51 deletions(-)

-- 
1.7.7


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 1/5] Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages"
  2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
@ 2012-08-20 16:36 ` Mel Gorman
  2012-08-20 16:36 ` [PATCH 2/5] mempolicy: Remove mempolicy sharing Mel Gorman
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
  To: Andrew Morton, KOSAKI Motohiro
  Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
	Hugh Dickins, LKML, Linux-MM, Mel Gorman

From: KOSAKI Motohiro <kosaki.motohiro@gmail.com>

commit [05f144a0: mm: mempolicy: Let vma_merge and vma_split handle
vma->vm_policy linkages] removed vma->vm_policy updates code but it is
the purpose of mbind_range(). Now, mbind_range() is virtually a no-op
and while it does not allow memory corruption it is not the right fix.
This patch is a revert.

[mgorman@suse.de: Edited changelog]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/mempolicy.c |   41 ++++++++++++++++++++++++-----------------
 1 files changed, 24 insertions(+), 17 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index bd92431..22dcf9a 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -607,6 +607,27 @@ check_range(struct mm_struct *mm, unsigned long start, unsigned long end,
 	return first;
 }
 
+/* Apply policy to a single VMA */
+static int policy_vma(struct vm_area_struct *vma, struct mempolicy *new)
+{
+	int err = 0;
+	struct mempolicy *old = vma->vm_policy;
+
+	pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
+		 vma->vm_start, vma->vm_end, vma->vm_pgoff,
+		 vma->vm_ops, vma->vm_file,
+		 vma->vm_ops ? vma->vm_ops->set_policy : NULL);
+
+	if (vma->vm_ops && vma->vm_ops->set_policy)
+		err = vma->vm_ops->set_policy(vma, new);
+	if (!err) {
+		mpol_get(new);
+		vma->vm_policy = new;
+		mpol_put(old);
+	}
+	return err;
+}
+
 /* Step 2: apply policy to a range and do splits. */
 static int mbind_range(struct mm_struct *mm, unsigned long start,
 		       unsigned long end, struct mempolicy *new_pol)
@@ -655,23 +676,9 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
 			if (err)
 				goto out;
 		}
-
-		/*
-		 * Apply policy to a single VMA. The reference counting of
-		 * policy for vma_policy linkages has already been handled by
-		 * vma_merge and split_vma as necessary. If this is a shared
-		 * policy then ->set_policy will increment the reference count
-		 * for an sp node.
-		 */
-		pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
-			vma->vm_start, vma->vm_end, vma->vm_pgoff,
-			vma->vm_ops, vma->vm_file,
-			vma->vm_ops ? vma->vm_ops->set_policy : NULL);
-		if (vma->vm_ops && vma->vm_ops->set_policy) {
-			err = vma->vm_ops->set_policy(vma, new_pol);
-			if (err)
-				goto out;
-		}
+		err = policy_vma(vma, new_pol);
+		if (err)
+			goto out;
 	}
 
  out:
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 2/5] mempolicy: Remove mempolicy sharing
  2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
  2012-08-20 16:36 ` [PATCH 1/5] Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" Mel Gorman
@ 2012-08-20 16:36 ` Mel Gorman
  2012-08-20 19:35   ` Christoph Lameter
  2012-08-22 19:03   ` Andrew Morton
  2012-08-20 16:36 ` [PATCH 3/5] mempolicy: fix a race in shared_policy_replace() Mel Gorman
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
  To: Andrew Morton, KOSAKI Motohiro
  Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
	Hugh Dickins, LKML, Linux-MM, Mel Gorman

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

Dave Jones' system call fuzz testing tool "trinity" triggered the following
bug error with slab debugging enabled

[ 7613.229315] =============================================================================
[ 7613.229955] BUG numa_policy (Not tainted): Poison overwritten
[ 7613.230560] -----------------------------------------------------------------------------
[ 7613.230560]
[ 7613.231834] INFO: 0xffff880146498250-0xffff880146498250. First byte 0x6a instead of 0x6b
[ 7613.232518] INFO: Allocated in mpol_new+0xa3/0x140 age=46310 cpu=6 pid=32154
[ 7613.233188]  __slab_alloc+0x3d3/0x445
[ 7613.233877]  kmem_cache_alloc+0x29d/0x2b0
[ 7613.234564]  mpol_new+0xa3/0x140
[ 7613.235236]  sys_mbind+0x142/0x620
[ 7613.235929]  system_call_fastpath+0x16/0x1b
[ 7613.236640] INFO: Freed in __mpol_put+0x27/0x30 age=46268 cpu=6 pid=32154
[ 7613.237354]  __slab_free+0x2e/0x1de
[ 7613.238080]  kmem_cache_free+0x25a/0x260
[ 7613.238799]  __mpol_put+0x27/0x30
[ 7613.239515]  remove_vma+0x68/0x90
[ 7613.240223]  exit_mmap+0x118/0x140
[ 7613.240939]  mmput+0x73/0x110
[ 7613.241651]  exit_mm+0x108/0x130
[ 7613.242367]  do_exit+0x162/0xb90
[ 7613.243074]  do_group_exit+0x4f/0xc0
[ 7613.243790]  sys_exit_group+0x17/0x20
[ 7613.244507]  system_call_fastpath+0x16/0x1b
[ 7613.245212] INFO: Slab 0xffffea0005192600 objects=27 used=27 fp=0x          (null) flags=0x20000000004080
[ 7613.246000] INFO: Object 0xffff880146498250 @offset=592 fp=0xffff88014649b9d0

The problem is that the structure is being prematurely freed due to a
reference count imbalance. In the following case mbind(addr, len) should
replace the memory policies of both vma1 and vma2 and thus they will
become to share the same mempolicy and the new mempolicy will have the
MPOL_F_SHARED flag.

  +-------------------+-------------------+
  |     vma1          |     vma2(shmem)   |
  +-------------------+-------------------+
  |                                       |
 addr                                 addr+len

alloc_pages_vma() uses get_vma_policy() and mpol_cond_put() pair for
maintaining the mempolicy reference count. The current rule is that
get_vma_policy() only increments refcount for shmem VMA and mpol_conf_put()
only decrements refcount if the policy has MPOL_F_SHARED.

In above case, vma1 is not shmem vma and vma->policy has MPOL_F_SHARED!
The reference count will be decreased even though was not increased whenever
alloc_page_vma() is called. This has been broken since commit [52cd3b07:
mempolicy: rework mempolicy Reference Counting] in 2008.

There is another serious bug with the sharing of memory policies. Currently,
mempolicy rebind logic (it is called from cpuset rebinding) ignores a refcount
of mempolicy and override it forcibly. Thus, any mempolicy sharing may cause
mempolicy corruption. The bug was introduced by commit [68860ec1: cpusets:
automatic numa mempolicy rebinding].

Ideally, the shared policy handling would be rewritten to either properly
handle COW of the policy structures or at least reference count MPOL_F_SHARED
based exclusively on information within the policy.  However, this patch takes
the easier approach of disabling any policy sharing between VMAs. Each new
range allocated with sp_alloc will allocate a new policy, set the reference
count to 1 and drop the reference count of the old policy. This increases
the memory footprint but is not expected to be a major problem as mbind()
is unlikely to be used for fine-grained ranges. It is also inefficient
because it means we allocate a new policy even in cases where mbind_range()
could use the new_policy passed to it. However, it is more straight-forward
and the change should be invisible to the user.

[mgorman@suse.de: Edited changelog]
Reported-by: Dave Jones <davej@redhat.com>,
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>,
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/mempolicy.c |   52 ++++++++++++++++++++++++++++++++++++++--------------
 1 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 22dcf9a..8025720 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -607,24 +607,39 @@ check_range(struct mm_struct *mm, unsigned long start, unsigned long end,
 	return first;
 }
 
-/* Apply policy to a single VMA */
-static int policy_vma(struct vm_area_struct *vma, struct mempolicy *new)
+/*
+ * Apply policy to a single VMA
+ * This must be called with the mmap_sem held for writing.
+ */
+static int vma_replace_policy(struct vm_area_struct *vma,
+						struct mempolicy *pol)
 {
-	int err = 0;
-	struct mempolicy *old = vma->vm_policy;
+	int err;
+	struct mempolicy *old;
+	struct mempolicy *new;
 
 	pr_debug("vma %lx-%lx/%lx vm_ops %p vm_file %p set_policy %p\n",
 		 vma->vm_start, vma->vm_end, vma->vm_pgoff,
 		 vma->vm_ops, vma->vm_file,
 		 vma->vm_ops ? vma->vm_ops->set_policy : NULL);
 
-	if (vma->vm_ops && vma->vm_ops->set_policy)
+	new = mpol_dup(pol);
+	if (IS_ERR(new))
+		return PTR_ERR(new);
+
+	if (vma->vm_ops && vma->vm_ops->set_policy) {
 		err = vma->vm_ops->set_policy(vma, new);
-	if (!err) {
-		mpol_get(new);
-		vma->vm_policy = new;
-		mpol_put(old);
+		if (err)
+			goto err_out;
 	}
+
+	old = vma->vm_policy;
+	vma->vm_policy = new; /* protected by mmap_sem */
+	mpol_put(old);
+
+	return 0;
+ err_out:
+	mpol_put(new);
 	return err;
 }
 
@@ -676,7 +691,7 @@ static int mbind_range(struct mm_struct *mm, unsigned long start,
 			if (err)
 				goto out;
 		}
-		err = policy_vma(vma, new_pol);
+		err = vma_replace_policy(vma, new_pol);
 		if (err)
 			goto out;
 	}
@@ -2153,15 +2168,24 @@ static void sp_delete(struct shared_policy *sp, struct sp_node *n)
 static struct sp_node *sp_alloc(unsigned long start, unsigned long end,
 				struct mempolicy *pol)
 {
-	struct sp_node *n = kmem_cache_alloc(sn_cache, GFP_KERNEL);
+	struct sp_node *n;
+	struct mempolicy *newpol;
 
+	n = kmem_cache_alloc(sn_cache, GFP_KERNEL);
 	if (!n)
 		return NULL;
+
+	newpol = mpol_dup(pol);
+	if (IS_ERR(newpol)) {
+		kmem_cache_free(sn_cache, n);
+		return NULL;
+	}
+	newpol->flags |= MPOL_F_SHARED;
+
 	n->start = start;
 	n->end = end;
-	mpol_get(pol);
-	pol->flags |= MPOL_F_SHARED;	/* for unref */
-	n->policy = pol;
+	n->policy = newpol;
+
 	return n;
 }
 
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 3/5] mempolicy: fix a race in shared_policy_replace()
  2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
  2012-08-20 16:36 ` [PATCH 1/5] Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" Mel Gorman
  2012-08-20 16:36 ` [PATCH 2/5] mempolicy: Remove mempolicy sharing Mel Gorman
@ 2012-08-20 16:36 ` Mel Gorman
  2012-08-20 19:52   ` Christoph Lameter
  2012-09-07 22:59   ` KOSAKI Motohiro
  2012-08-20 16:36 ` [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy() Mel Gorman
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
  To: Andrew Morton, KOSAKI Motohiro
  Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
	Hugh Dickins, LKML, Linux-MM, Mel Gorman

shared_policy_replace() use of sp_alloc() is unsafe. 1) sp_node cannot
be dereferenced if sp->lock is not held and 2) another thread can modify
sp_node between spin_unlock for allocating a new sp node and next spin_lock.
The bug was introduced before 2.6.12-rc2.

Kosaki's original patch for this problem was to allocate an sp node and policy
within shared_policy_replace and initialise it when the lock is reacquired. I
was not keen on this approach because it partially duplicates sp_alloc(). As
the paths were sp->lock is taken are not that performance critical this
patch converts sp->lock to sp->mutex so it can sleep when calling sp_alloc().

[kosaki.motohiro@jp.fujitsu.com: Original patch]
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 include/linux/mempolicy.h |    2 +-
 mm/mempolicy.c            |   37 ++++++++++++++++---------------------
 2 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 95b738c..df08254 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -188,7 +188,7 @@ struct sp_node {
 
 struct shared_policy {
 	struct rb_root root;
-	spinlock_t lock;
+	struct mutex mutex;
 };
 
 void mpol_shared_policy_init(struct shared_policy *sp, struct mempolicy *mpol);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 8025720..d297929 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2083,7 +2083,7 @@ bool __mpol_equal(struct mempolicy *a, struct mempolicy *b)
  */
 
 /* lookup first element intersecting start-end */
-/* Caller holds sp->lock */
+/* Caller holds sp->mutex */
 static struct sp_node *
 sp_lookup(struct shared_policy *sp, unsigned long start, unsigned long end)
 {
@@ -2147,13 +2147,13 @@ mpol_shared_policy_lookup(struct shared_policy *sp, unsigned long idx)
 
 	if (!sp->root.rb_node)
 		return NULL;
-	spin_lock(&sp->lock);
+	mutex_lock(&sp->mutex);
 	sn = sp_lookup(sp, idx, idx+1);
 	if (sn) {
 		mpol_get(sn->policy);
 		pol = sn->policy;
 	}
-	spin_unlock(&sp->lock);
+	mutex_unlock(&sp->mutex);
 	return pol;
 }
 
@@ -2193,10 +2193,10 @@ static struct sp_node *sp_alloc(unsigned long start, unsigned long end,
 static int shared_policy_replace(struct shared_policy *sp, unsigned long start,
 				 unsigned long end, struct sp_node *new)
 {
-	struct sp_node *n, *new2 = NULL;
+	struct sp_node *n;
+	int ret = 0;
 
-restart:
-	spin_lock(&sp->lock);
+	mutex_lock(&sp->mutex);
 	n = sp_lookup(sp, start, end);
 	/* Take care of old policies in the same range. */
 	while (n && n->start < end) {
@@ -2209,16 +2209,14 @@ restart:
 		} else {
 			/* Old policy spanning whole new range. */
 			if (n->end > end) {
+				struct sp_node *new2;
+				new2 = sp_alloc(end, n->end, n->policy);
 				if (!new2) {
-					spin_unlock(&sp->lock);
-					new2 = sp_alloc(end, n->end, n->policy);
-					if (!new2)
-						return -ENOMEM;
-					goto restart;
+					ret = -ENOMEM;
+					goto out;
 				}
 				n->end = start;
 				sp_insert(sp, new2);
-				new2 = NULL;
 				break;
 			} else
 				n->end = start;
@@ -2229,12 +2227,9 @@ restart:
 	}
 	if (new)
 		sp_insert(sp, new);
-	spin_unlock(&sp->lock);
-	if (new2) {
-		mpol_put(new2->policy);
-		kmem_cache_free(sn_cache, new2);
-	}
-	return 0;
+out:
+	mutex_unlock(&sp->mutex);
+	return ret;
 }
 
 /**
@@ -2252,7 +2247,7 @@ void mpol_shared_policy_init(struct shared_policy *sp, struct mempolicy *mpol)
 	int ret;
 
 	sp->root = RB_ROOT;		/* empty tree == default mempolicy */
-	spin_lock_init(&sp->lock);
+	mutex_init(&sp->mutex);
 
 	if (mpol) {
 		struct vm_area_struct pvma;
@@ -2318,7 +2313,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
 
 	if (!p->root.rb_node)
 		return;
-	spin_lock(&p->lock);
+	mutex_lock(&p->mutex);
 	next = rb_first(&p->root);
 	while (next) {
 		n = rb_entry(next, struct sp_node, nd);
@@ -2327,7 +2322,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
 		mpol_put(n->policy);
 		kmem_cache_free(sn_cache, n);
 	}
-	spin_unlock(&p->lock);
+	mutex_unlock(&p->mutex);
 }
 
 /* assumes fs == KERNEL_DS */
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy()
  2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
                   ` (2 preceding siblings ...)
  2012-08-20 16:36 ` [PATCH 3/5] mempolicy: fix a race in shared_policy_replace() Mel Gorman
@ 2012-08-20 16:36 ` Mel Gorman
  2012-08-20 19:46   ` Christoph Lameter
  2012-08-20 16:36 ` [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() Mel Gorman
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
  To: Andrew Morton, KOSAKI Motohiro
  Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
	Hugh Dickins, LKML, Linux-MM, Mel Gorman

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

When shared_policy_replace() fails to allocate new->policy is not freed
correctly by mpol_set_shared_policy().  The problem is that shared mempolicy
code directly call kmem_cache_free() in multiple places where it is easy
to make a mistake.

This patch creates an sp_free wrapper function and uses it. The bug was
introduced pre-git age (IOW, before 2.6.12-rc2).

[mgorman@suse.de: Editted changelog]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/mempolicy.c |   15 +++++++++------
 1 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index d297929..45f9825 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2157,12 +2157,17 @@ mpol_shared_policy_lookup(struct shared_policy *sp, unsigned long idx)
 	return pol;
 }
 
+static void sp_free(struct sp_node *n)
+{
+	mpol_put(n->policy);
+	kmem_cache_free(sn_cache, n);
+}
+
 static void sp_delete(struct shared_policy *sp, struct sp_node *n)
 {
 	pr_debug("deleting %lx-l%lx\n", n->start, n->end);
 	rb_erase(&n->nd, &sp->root);
-	mpol_put(n->policy);
-	kmem_cache_free(sn_cache, n);
+	sp_free(n);
 }
 
 static struct sp_node *sp_alloc(unsigned long start, unsigned long end,
@@ -2301,7 +2306,7 @@ int mpol_set_shared_policy(struct shared_policy *info,
 	}
 	err = shared_policy_replace(info, vma->vm_pgoff, vma->vm_pgoff+sz, new);
 	if (err && new)
-		kmem_cache_free(sn_cache, new);
+		sp_free(new);
 	return err;
 }
 
@@ -2318,9 +2323,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
 	while (next) {
 		n = rb_entry(next, struct sp_node, nd);
 		next = rb_next(&n->nd);
-		rb_erase(&n->nd, &p->root);
-		mpol_put(n->policy);
-		kmem_cache_free(sn_cache, n);
+		sp_delete(p, n);
 	}
 	mutex_unlock(&p->mutex);
 }
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
  2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
                   ` (3 preceding siblings ...)
  2012-08-20 16:36 ` [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy() Mel Gorman
@ 2012-08-20 16:36 ` Mel Gorman
  2012-08-20 19:51   ` Christoph Lameter
  2012-08-21  7:29 ` [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
  2012-08-21 21:46 ` Andi Kleen
  6 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2012-08-20 16:36 UTC (permalink / raw)
  To: Andrew Morton, KOSAKI Motohiro
  Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
	Hugh Dickins, LKML, Linux-MM, Mel Gorman

[cc9a6c87: cpuset: mm: reduce large amounts of memory barrier related damage
v3] introduced a potential memory corruption. shmem_alloc_page() uses a
pseudo vma and it has one significant unique combination, vma->vm_ops=NULL
and vma->policy->flags & MPOL_F_SHARED.

get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL and
mpol_cond_put() DOES decrease a policy ref when a policy has MPOL_F_SHARED.
Therefore, when a cpuset update race occurs, alloc_pages_vma() falls in 'goto
retry_cpuset' path, decrements the reference count and frees the policy
prematurely.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/mempolicy.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 45f9825..82e872f 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1545,15 +1545,28 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
 		struct vm_area_struct *vma, unsigned long addr)
 {
 	struct mempolicy *pol = task->mempolicy;
+	int got_ref;
 
 	if (vma) {
 		if (vma->vm_ops && vma->vm_ops->get_policy) {
 			struct mempolicy *vpol = vma->vm_ops->get_policy(vma,
 									addr);
-			if (vpol)
+			if (vpol) {
 				pol = vpol;
-		} else if (vma->vm_policy)
+				got_ref = 1;
+			}
+		} else if (vma->vm_policy) {
 			pol = vma->vm_policy;
+
+			/*
+			 * shmem_alloc_page() passes MPOL_F_SHARED policy with
+			 * a pseudo vma whose vma->vm_ops=NULL. Take a reference
+			 * count on these policies which will be dropped by
+			 * mpol_cond_put() later
+			 */
+			if (mpol_needs_cond_ref(pol))
+				mpol_get(pol);
+		}
 	}
 	if (!pol)
 		pol = &default_policy;
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/5] mempolicy: Remove mempolicy sharing
  2012-08-20 16:36 ` [PATCH 2/5] mempolicy: Remove mempolicy sharing Mel Gorman
@ 2012-08-20 19:35   ` Christoph Lameter
  2012-08-22 19:03   ` Andrew Morton
  1 sibling, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2012-08-20 19:35 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
	Andi Kleen, Hugh Dickins, LKML, Linux-MM

On Mon, 20 Aug 2012, Mel Gorman wrote:

> Ideally, the shared policy handling would be rewritten to either properly
> handle COW of the policy structures or at least reference count MPOL_F_SHARED
> based exclusively on information within the policy.  However, this patch takes
> the easier approach of disabling any policy sharing between VMAs. Each new
> range allocated with sp_alloc will allocate a new policy, set the reference
> count to 1 and drop the reference count of the old policy. This increases
> the memory footprint but is not expected to be a major problem as mbind()
> is unlikely to be used for fine-grained ranges. It is also inefficient
> because it means we allocate a new policy even in cases where mbind_range()
> could use the new_policy passed to it. However, it is more straight-forward
> and the change should be invisible to the user.


Hmmm. I dont like the additional memory use but this is definitely an
issue that needs addressing.

Reviewed-by: Christoph Lameter <cl@linux.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy()
  2012-08-20 16:36 ` [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy() Mel Gorman
@ 2012-08-20 19:46   ` Christoph Lameter
  2012-08-21  7:15     ` Mel Gorman
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Lameter @ 2012-08-20 19:46 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
	Andi Kleen, Hugh Dickins, LKML, Linux-MM

On Mon, 20 Aug 2012, Mel Gorman wrote:

> @@ -2318,9 +2323,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
>  	while (next) {
>  		n = rb_entry(next, struct sp_node, nd);
>  		next = rb_next(&n->nd);
> -		rb_erase(&n->nd, &p->root);

Looks like we need to keep the above line? sp_delete does not remove the
tree entry.

> -		mpol_put(n->policy);
> -		kmem_cache_free(sn_cache, n);
> +		sp_delete(p, n);
>  	}
>  	mutex_unlock(&p->mutex);
>  }
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
  2012-08-20 16:36 ` [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() Mel Gorman
@ 2012-08-20 19:51   ` Christoph Lameter
  2012-08-21  7:26     ` Mel Gorman
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Lameter @ 2012-08-20 19:51 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
	Andi Kleen, Hugh Dickins, LKML, Linux-MM

On Mon, 20 Aug 2012, Mel Gorman wrote:

> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 45f9825..82e872f 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1545,15 +1545,28 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
>  		struct vm_area_struct *vma, unsigned long addr)
>  {
>  	struct mempolicy *pol = task->mempolicy;
> +	int got_ref;

New variable. Need to set it to zero?

>
>  	if (vma) {
>  		if (vma->vm_ops && vma->vm_ops->get_policy) {
>  			struct mempolicy *vpol = vma->vm_ops->get_policy(vma,
>  									addr);
> -			if (vpol)
> +			if (vpol) {
>  				pol = vpol;
> -		} else if (vma->vm_policy)
> +				got_ref = 1;

Set the new variable. But it was not initialzed before. So now its 1 or
undefined?

> +			}
> +		} else if (vma->vm_policy) {
>  			pol = vma->vm_policy;
> +
> +			/*
> +			 * shmem_alloc_page() passes MPOL_F_SHARED policy with
> +			 * a pseudo vma whose vma->vm_ops=NULL. Take a reference
> +			 * count on these policies which will be dropped by
> +			 * mpol_cond_put() later
> +			 */
> +			if (mpol_needs_cond_ref(pol))
> +				mpol_get(pol);
> +		}
>  	}
>  	if (!pol)
>  		pol = &default_policy;
>

I do not see any use of got_ref. Can we get rid of the variable?



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/5] mempolicy: fix a race in shared_policy_replace()
  2012-08-20 16:36 ` [PATCH 3/5] mempolicy: fix a race in shared_policy_replace() Mel Gorman
@ 2012-08-20 19:52   ` Christoph Lameter
  2012-09-07 22:59   ` KOSAKI Motohiro
  1 sibling, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2012-08-20 19:52 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
	Andi Kleen, Hugh Dickins, LKML, Linux-MM

On Mon, 20 Aug 2012, Mel Gorman wrote:

> Kosaki's original patch for this problem was to allocate an sp node and policy
> within shared_policy_replace and initialise it when the lock is reacquired. I
> was not keen on this approach because it partially duplicates sp_alloc(). As
> the paths were sp->lock is taken are not that performance critical this
> patch converts sp->lock to sp->mutex so it can sleep when calling sp_alloc().

Reviewed-by: Christoph Lameter <cl@linux.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy()
  2012-08-20 19:46   ` Christoph Lameter
@ 2012-08-21  7:15     ` Mel Gorman
  2012-08-21 15:38       ` Christoph Lameter
  0 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2012-08-21  7:15 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
	Andi Kleen, Hugh Dickins, LKML, Linux-MM

On Mon, Aug 20, 2012 at 07:46:09PM +0000, Christoph Lameter wrote:
> On Mon, 20 Aug 2012, Mel Gorman wrote:
> 
> > @@ -2318,9 +2323,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
> >  	while (next) {
> >  		n = rb_entry(next, struct sp_node, nd);
> >  		next = rb_next(&n->nd);
> > -		rb_erase(&n->nd, &p->root);
> 
> Looks like we need to keep the above line? sp_delete does not remove the
> tree entry.
> 
> > -		mpol_put(n->policy);
> > -		kmem_cache_free(sn_cache, n);
> > +		sp_delete(p, n);

Yes it does, could you have accidentally mixed up sp_free (which does not
remove the tree entry) and sp_delete (which does)? The altered code ends
up looking like this;

static void sp_delete(struct shared_policy *sp, struct sp_node *n)
{
        pr_debug("deleting %lx-l%lx\n", n->start, n->end);
        rb_erase(&n->nd, &sp->root);				<----- frees node here
        sp_free(n);
}

void mpol_free_shared_policy(struct shared_policy *p)
{
        struct sp_node *n;
        struct rb_node *next;

        if (!p->root.rb_node)
                return;
        mutex_lock(&p->mutex);
        next = rb_first(&p->root);
        while (next) {
                n = rb_entry(next, struct sp_node, nd);
                next = rb_next(&n->nd);
                sp_delete(p, n);				<---- equivalent to rb_erase(&n->nd, &p->root); sp_free(n);
        }
        mutex_unlock(&p->mutex);
}

Thanks Christoph for looking at this.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
  2012-08-20 19:51   ` Christoph Lameter
@ 2012-08-21  7:26     ` Mel Gorman
  2012-08-21 15:37       ` Christoph Lameter
  2012-09-07 23:06       ` KOSAKI Motohiro
  0 siblings, 2 replies; 24+ messages in thread
From: Mel Gorman @ 2012-08-21  7:26 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
	Andi Kleen, Hugh Dickins, LKML, Linux-MM

On Mon, Aug 20, 2012 at 07:51:10PM +0000, Christoph Lameter wrote:
> On Mon, 20 Aug 2012, Mel Gorman wrote:
> 
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index 45f9825..82e872f 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -1545,15 +1545,28 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
> >  		struct vm_area_struct *vma, unsigned long addr)
> >  {
> >  	struct mempolicy *pol = task->mempolicy;
> > +	int got_ref;
> 
> New variable. Need to set it to zero?
> 

Not needed at all, I was meant to get rid of it. Ben had pointed out this
exact problem with the initialisation.

> >
> >  	if (vma) {
> >  		if (vma->vm_ops && vma->vm_ops->get_policy) {
> >  			struct mempolicy *vpol = vma->vm_ops->get_policy(vma,
> >  									addr);
> > -			if (vpol)
> > +			if (vpol) {
> >  				pol = vpol;
> > -		} else if (vma->vm_policy)
> > +				got_ref = 1;
> 
> Set the new variable. But it was not initialzed before. So now its 1 or
> undefined?
> 

It's not even needed because the next block is the code that originally
cared about the value of got_ref.

> > +			}
> > +		} else if (vma->vm_policy) {
> >  			pol = vma->vm_policy;
> > +
> > +			/*
> > +			 * shmem_alloc_page() passes MPOL_F_SHARED policy with
> > +			 * a pseudo vma whose vma->vm_ops=NULL. Take a reference
> > +			 * count on these policies which will be dropped by
> > +			 * mpol_cond_put() later
> > +			 */
> > +			if (mpol_needs_cond_ref(pol))
> > +				mpol_get(pol);
> > +		}
> >  	}
> >  	if (!pol)
> >  		pol = &default_policy;
> >
> 
> I do not see any use of got_ref. Can we get rid of the variable?
> 

Yes, here is a correct version of the patch. Thanks Christoph.

---8<---
mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()

[cc9a6c87: cpuset: mm: reduce large amounts of memory barrier related damage
v3] introduced a potential memory corruption. shmem_alloc_page() uses a
pseudo vma and it has one significant unique combination, vma->vm_ops=NULL
and vma->policy->flags & MPOL_F_SHARED.

get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL and
mpol_cond_put() DOES decrease a policy ref when a policy has MPOL_F_SHARED.
Therefore, when a cpuset update race occurs, alloc_pages_vma() falls in 'goto
retry_cpuset' path, decrements the reference count and frees the policy
prematurely.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/mempolicy.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 45f9825..9842ef5 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1552,8 +1552,18 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
 									addr);
 			if (vpol)
 				pol = vpol;
-		} else if (vma->vm_policy)
+		} else if (vma->vm_policy) {
 			pol = vma->vm_policy;
+
+			/*
+			 * shmem_alloc_page() passes MPOL_F_SHARED policy with
+			 * a pseudo vma whose vma->vm_ops=NULL. Take a reference
+			 * count on these policies which will be dropped by
+			 * mpol_cond_put() later
+			 */
+			if (mpol_needs_cond_ref(pol))
+				mpol_get(pol);
+		}
 	}
 	if (!pol)
 		pol = &default_policy;

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5] Memory policy corruption fixes V2
  2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
                   ` (4 preceding siblings ...)
  2012-08-20 16:36 ` [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() Mel Gorman
@ 2012-08-21  7:29 ` Mel Gorman
  2012-09-06 12:40   ` Josh Boyer
  2012-08-21 21:46 ` Andi Kleen
  6 siblings, 1 reply; 24+ messages in thread
From: Mel Gorman @ 2012-08-21  7:29 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
	Hugh Dickins, Andrew Morton, KOSAKI Motohiro, LKML, Linux-MM

On Mon, Aug 20, 2012 at 05:36:29PM +0100, Mel Gorman wrote:
> This is a rebase with some small changes to Kosaki's "mempolicy memory
> corruption fixlet" series. I had expected that Kosaki would have revised
> the series by now but it's been waiting a long time.
> 
> Changelog since V1
> o Rebase to 3.6-rc2
> o Editted some of the changelogs
> o Converted sp->lock to sp->mutex to close a race in shared_policy_replace()
> o Reworked the refcount imbalance fix slightly
> o Do not call mpol_put in shmem_alloc_page.
> 
> I tested this with trinity with CONFIG_DEBUG_SLAB enabled and it passed. I
> did not test LTP such as Josh reported a problem with or with a database that
> used shared policies like Andi tested. The series is almost all Kosaki's
> work of course. If he has a revised series that simply got delayed in
> posting it should take precedence.

I meant to add Josh to the cc, adding him now.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
  2012-08-21  7:26     ` Mel Gorman
@ 2012-08-21 15:37       ` Christoph Lameter
  2012-09-07 23:06       ` KOSAKI Motohiro
  1 sibling, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2012-08-21 15:37 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
	Andi Kleen, Hugh Dickins, LKML, Linux-MM

On Tue, 21 Aug 2012, Mel Gorman wrote:

> mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()

Reviewed-by: Christoph Lameter <cl@linux.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy()
  2012-08-21  7:15     ` Mel Gorman
@ 2012-08-21 15:38       ` Christoph Lameter
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Lameter @ 2012-08-21 15:38 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Ben Hutchings,
	Andi Kleen, Hugh Dickins, LKML, Linux-MM

On Tue, 21 Aug 2012, Mel Gorman wrote:

> On Mon, Aug 20, 2012 at 07:46:09PM +0000, Christoph Lameter wrote:
> > On Mon, 20 Aug 2012, Mel Gorman wrote:
> >
> > > @@ -2318,9 +2323,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
> > >  	while (next) {
> > >  		n = rb_entry(next, struct sp_node, nd);
> > >  		next = rb_next(&n->nd);
> > > -		rb_erase(&n->nd, &p->root);
> >
> > Looks like we need to keep the above line? sp_delete does not remove the
> > tree entry.
> >
> > > -		mpol_put(n->policy);
> > > -		kmem_cache_free(sn_cache, n);
> > > +		sp_delete(p, n);
>
> Yes it does, could you have accidentally mixed up sp_free (which does not
> remove the tree entry) and sp_delete (which does)? The altered code ends
> up looking like this;

Yup I got that mixed up.

Reviewed-by: Christoph Lameter <cl@linux.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5] Memory policy corruption fixes V2
  2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
                   ` (5 preceding siblings ...)
  2012-08-21  7:29 ` [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
@ 2012-08-21 21:46 ` Andi Kleen
  6 siblings, 0 replies; 24+ messages in thread
From: Andi Kleen @ 2012-08-21 21:46 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Andrew Morton, KOSAKI Motohiro, Dave Jones, Christoph Lameter,
	Ben Hutchings, Hugh Dickins, LKML, Linux-MM

> I tested this with trinity with CONFIG_DEBUG_SLAB enabled and it passed. I
> did not test LTP such as Josh reported a problem with or with a database that
> used shared policies like Andi tested. The series is almost all Kosaki's
> work of course. If he has a revised series that simply got delayed in
> posting it should take precedence.

Initial tests of this patchkit look with a test programgood, full database tests 
are still pending.

-Andi
-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/5] mempolicy: Remove mempolicy sharing
  2012-08-20 16:36 ` [PATCH 2/5] mempolicy: Remove mempolicy sharing Mel Gorman
  2012-08-20 19:35   ` Christoph Lameter
@ 2012-08-22 19:03   ` Andrew Morton
  2012-08-22 19:33     ` Mel Gorman
  2012-08-22 19:35     ` Andi Kleen
  1 sibling, 2 replies; 24+ messages in thread
From: Andrew Morton @ 2012-08-22 19:03 UTC (permalink / raw)
  To: Mel Gorman
  Cc: KOSAKI Motohiro, Dave Jones, Christoph Lameter, Ben Hutchings,
	Andi Kleen, Hugh Dickins, LKML, Linux-MM

On Mon, 20 Aug 2012 17:36:31 +0100
Mel Gorman <mgorman@suse.de> wrote:

> From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> 
> Dave Jones' system call fuzz testing tool "trinity" triggered the following
> bug error with slab debugging enabled
> 
> ...
>
> Cc: <stable@vger.kernel.org>

The patch dosn't apply to 3.5 at all well.  I don't see much point in
retaining the stable tag so I think I'll remove it, and suggest that
you prepare a fresh patch for Greg and explain the situation?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/5] mempolicy: Remove mempolicy sharing
  2012-08-22 19:03   ` Andrew Morton
@ 2012-08-22 19:33     ` Mel Gorman
  2012-08-22 19:35     ` Andi Kleen
  1 sibling, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2012-08-22 19:33 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KOSAKI Motohiro, Dave Jones, Christoph Lameter, Ben Hutchings,
	Andi Kleen, Hugh Dickins, LKML, Linux-MM

On Wed, Aug 22, 2012 at 12:03:14PM -0700, Andrew Morton wrote:
> On Mon, 20 Aug 2012 17:36:31 +0100
> Mel Gorman <mgorman@suse.de> wrote:
> 
> > From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > 
> > Dave Jones' system call fuzz testing tool "trinity" triggered the following
> > bug error with slab debugging enabled
> > 
> > ...
> >
> > Cc: <stable@vger.kernel.org>
> 
> The patch dosn't apply to 3.5 at all well.  I don't see much point in
> retaining the stable tag so I think I'll remove it, and suggest that
> you prepare a fresh patch for Greg and explain the situation?
> 

Sure. I'll do the backport at the time they get merged to mainline and
jump through the hoops. Thanks.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 2/5] mempolicy: Remove mempolicy sharing
  2012-08-22 19:03   ` Andrew Morton
  2012-08-22 19:33     ` Mel Gorman
@ 2012-08-22 19:35     ` Andi Kleen
  1 sibling, 0 replies; 24+ messages in thread
From: Andi Kleen @ 2012-08-22 19:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, KOSAKI Motohiro, Dave Jones, Christoph Lameter,
	Ben Hutchings, Hugh Dickins, LKML, Linux-MM

On Wed, Aug 22, 2012 at 12:03:14PM -0700, Andrew Morton wrote:
> On Mon, 20 Aug 2012 17:36:31 +0100
> Mel Gorman <mgorman@suse.de> wrote:
> 
> > From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > 
> > Dave Jones' system call fuzz testing tool "trinity" triggered the following
> > bug error with slab debugging enabled
> > 
> > ...
> >
> > Cc: <stable@vger.kernel.org>
> 
> The patch dosn't apply to 3.5 at all well.  I don't see much point in
> retaining the stable tag so I think I'll remove it, and suggest that
> you prepare a fresh patch for Greg and explain the situation?

Everything applies fine if you redo the revert manually.

BTW we tested it now and the new patchkit indeed fixes the database.
Please queue for 3.6 and 3.5 stable.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5] Memory policy corruption fixes V2
  2012-08-21  7:29 ` [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
@ 2012-09-06 12:40   ` Josh Boyer
  2012-09-07  9:43     ` Mel Gorman
  0 siblings, 1 reply; 24+ messages in thread
From: Josh Boyer @ 2012-09-06 12:40 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
	Hugh Dickins, Andrew Morton, KOSAKI Motohiro, LKML, Linux-MM

On Tue, Aug 21, 2012 at 3:29 AM, Mel Gorman <mgorman@suse.de> wrote:
> On Mon, Aug 20, 2012 at 05:36:29PM +0100, Mel Gorman wrote:
>> This is a rebase with some small changes to Kosaki's "mempolicy memory
>> corruption fixlet" series. I had expected that Kosaki would have revised
>> the series by now but it's been waiting a long time.
>>
>> Changelog since V1
>> o Rebase to 3.6-rc2
>> o Editted some of the changelogs
>> o Converted sp->lock to sp->mutex to close a race in shared_policy_replace()
>> o Reworked the refcount imbalance fix slightly
>> o Do not call mpol_put in shmem_alloc_page.
>>
>> I tested this with trinity with CONFIG_DEBUG_SLAB enabled and it passed. I
>> did not test LTP such as Josh reported a problem with or with a database that
>> used shared policies like Andi tested. The series is almost all Kosaki's
>> work of course. If he has a revised series that simply got delayed in
>> posting it should take precedence.
>
> I meant to add Josh to the cc, adding him now.

Thank you.

I see Andi has done some testing and Acked this patchset.  Christoph
appears to have Acked it as well.  Is there anything else needed for
it to get in mainline?  Just want to make sure this doesn't get dropped
because we all forgot about it after KS/Plumbers.

josh

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 0/5] Memory policy corruption fixes V2
  2012-09-06 12:40   ` Josh Boyer
@ 2012-09-07  9:43     ` Mel Gorman
  0 siblings, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2012-09-07  9:43 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Dave Jones, Christoph Lameter, Ben Hutchings, Andi Kleen,
	Hugh Dickins, Andrew Morton, KOSAKI Motohiro, LKML, Linux-MM

On Thu, Sep 06, 2012 at 08:40:21AM -0400, Josh Boyer wrote:
> On Tue, Aug 21, 2012 at 3:29 AM, Mel Gorman <mgorman@suse.de> wrote:
> > On Mon, Aug 20, 2012 at 05:36:29PM +0100, Mel Gorman wrote:
> >> This is a rebase with some small changes to Kosaki's "mempolicy memory
> >> corruption fixlet" series. I had expected that Kosaki would have revised
> >> the series by now but it's been waiting a long time.
> >>
> >> Changelog since V1
> >> o Rebase to 3.6-rc2
> >> o Editted some of the changelogs
> >> o Converted sp->lock to sp->mutex to close a race in shared_policy_replace()
> >> o Reworked the refcount imbalance fix slightly
> >> o Do not call mpol_put in shmem_alloc_page.
> >>
> >> I tested this with trinity with CONFIG_DEBUG_SLAB enabled and it passed. I
> >> did not test LTP such as Josh reported a problem with or with a database that
> >> used shared policies like Andi tested. The series is almost all Kosaki's
> >> work of course. If he has a revised series that simply got delayed in
> >> posting it should take precedence.
> >
> > I meant to add Josh to the cc, adding him now.
> 
> Thank you.
> 
> I see Andi has done some testing and Acked this patchset.  Christoph
> appears to have Acked it as well.  Is there anything else needed for
> it to get in mainline? 

Not that I'm aware of.

> Just want to make sure this doesn't get dropped
> because we all forgot about it after KS/Plumbers.
> 

Right now it's in linux-next via Andrew. Bar major surprises I expect
it'll end up in mainline in due course. When it does, I'll do a -stable
backport.

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 3/5] mempolicy: fix a race in shared_policy_replace()
  2012-08-20 16:36 ` [PATCH 3/5] mempolicy: fix a race in shared_policy_replace() Mel Gorman
  2012-08-20 19:52   ` Christoph Lameter
@ 2012-09-07 22:59   ` KOSAKI Motohiro
  1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2012-09-07 22:59 UTC (permalink / raw)
  To: mgorman
  Cc: akpm, kosaki.motohiro, davej, cl, ben, ak, hughd, linux-kernel, linux-mm

First off, thank you very much for reworking this for me. I haven't got a chance
to get a test machine for this.

> shared_policy_replace() use of sp_alloc() is unsafe. 1) sp_node cannot
> be dereferenced if sp->lock is not held and 2) another thread can modify
> sp_node between spin_unlock for allocating a new sp node and next spin_lock.
> The bug was introduced before 2.6.12-rc2.
> 
> Kosaki's original patch for this problem was to allocate an sp node and policy
> within shared_policy_replace and initialise it when the lock is reacquired. I
> was not keen on this approach because it partially duplicates sp_alloc(). As
> the paths were sp->lock is taken are not that performance critical this
> patch converts sp->lock to sp->mutex so it can sleep when calling sp_alloc().

Looks make sense.

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
  2012-08-21  7:26     ` Mel Gorman
  2012-08-21 15:37       ` Christoph Lameter
@ 2012-09-07 23:06       ` KOSAKI Motohiro
  1 sibling, 0 replies; 24+ messages in thread
From: KOSAKI Motohiro @ 2012-09-07 23:06 UTC (permalink / raw)
  To: mgorman
  Cc: cl, akpm, kosaki.motohiro, davej, ben, ak, hughd, linux-kernel, linux-mm

> mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma()
> 
> [cc9a6c87: cpuset: mm: reduce large amounts of memory barrier related damage
> v3] introduced a potential memory corruption. shmem_alloc_page() uses a
> pseudo vma and it has one significant unique combination, vma->vm_ops=NULL
> and vma->policy->flags & MPOL_F_SHARED.
> 
> get_vma_policy() does NOT increase a policy ref when vma->vm_ops=NULL and
> mpol_cond_put() DOES decrease a policy ref when a policy has MPOL_F_SHARED.
> Therefore, when a cpuset update race occurs, alloc_pages_vma() falls in 'goto
> retry_cpuset' path, decrements the reference count and frees the policy
> prematurely.
> 
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
> ---
>  mm/mempolicy.c |   12 +++++++++++-
>  1 files changed, 11 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 45f9825..9842ef5 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1552,8 +1552,18 @@ struct mempolicy *get_vma_policy(struct task_struct *task,
>  									addr);
>  			if (vpol)
>  				pol = vpol;
> -		} else if (vma->vm_policy)
> +		} else if (vma->vm_policy) {
>  			pol = vma->vm_policy;
> +
> +			/*
> +			 * shmem_alloc_page() passes MPOL_F_SHARED policy with
> +			 * a pseudo vma whose vma->vm_ops=NULL. Take a reference
> +			 * count on these policies which will be dropped by
> +			 * mpol_cond_put() later
> +			 */
> +			if (mpol_needs_cond_ref(pol))
> +				mpol_get(pol);
> +		}

Ok, looks sene change. thank you.


Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

>  	}
>  	if (!pol)
>  		pol = &default_policy;
> 


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy()
  2012-10-09 16:58 [PATCH 0/5] Memory policy corruption fixes -stable Mel Gorman
@ 2012-10-09 16:58 ` Mel Gorman
  0 siblings, 0 replies; 24+ messages in thread
From: Mel Gorman @ 2012-10-09 16:58 UTC (permalink / raw)
  To: Stable
  Cc: Andi Kleen, Andrew Morton, KOSAKI Motohiro, Dave Jones,
	Christoph Lameter, Hugh Dickins, LKML, Linux-MM, Mel Gorman

From: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

commit 63f74ca21f1fad36d075e063f06dcc6d39fe86b2 upstream.

When shared_policy_replace() fails to allocate new->policy is not freed
correctly by mpol_set_shared_policy().  The problem is that shared
mempolicy code directly call kmem_cache_free() in multiple places where
it is easy to make a mistake.

This patch creates an sp_free wrapper function and uses it. The bug was
introduced pre-git age (IOW, before 2.6.12-rc2).

[mgorman@suse.de: Editted changelog]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Christoph Lameter <cl@linux.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Mel Gorman <mgorman@suse.de>
---
 mm/mempolicy.c |   15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b2f12ec..1763418 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -2157,12 +2157,17 @@ mpol_shared_policy_lookup(struct shared_policy *sp, unsigned long idx)
 	return pol;
 }
 
+static void sp_free(struct sp_node *n)
+{
+	mpol_put(n->policy);
+	kmem_cache_free(sn_cache, n);
+}
+
 static void sp_delete(struct shared_policy *sp, struct sp_node *n)
 {
 	pr_debug("deleting %lx-l%lx\n", n->start, n->end);
 	rb_erase(&n->nd, &sp->root);
-	mpol_put(n->policy);
-	kmem_cache_free(sn_cache, n);
+	sp_free(n);
 }
 
 static struct sp_node *sp_alloc(unsigned long start, unsigned long end,
@@ -2301,7 +2306,7 @@ int mpol_set_shared_policy(struct shared_policy *info,
 	}
 	err = shared_policy_replace(info, vma->vm_pgoff, vma->vm_pgoff+sz, new);
 	if (err && new)
-		kmem_cache_free(sn_cache, new);
+		sp_free(new);
 	return err;
 }
 
@@ -2318,9 +2323,7 @@ void mpol_free_shared_policy(struct shared_policy *p)
 	while (next) {
 		n = rb_entry(next, struct sp_node, nd);
 		next = rb_next(&n->nd);
-		rb_erase(&n->nd, &p->root);
-		mpol_put(n->policy);
-		kmem_cache_free(sn_cache, n);
+		sp_delete(p, n);
 	}
 	mutex_unlock(&p->mutex);
 }
-- 
1.7.9.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2012-10-09 16:59 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-20 16:36 [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
2012-08-20 16:36 ` [PATCH 1/5] Revert "mm: mempolicy: Let vma_merge and vma_split handle vma->vm_policy linkages" Mel Gorman
2012-08-20 16:36 ` [PATCH 2/5] mempolicy: Remove mempolicy sharing Mel Gorman
2012-08-20 19:35   ` Christoph Lameter
2012-08-22 19:03   ` Andrew Morton
2012-08-22 19:33     ` Mel Gorman
2012-08-22 19:35     ` Andi Kleen
2012-08-20 16:36 ` [PATCH 3/5] mempolicy: fix a race in shared_policy_replace() Mel Gorman
2012-08-20 19:52   ` Christoph Lameter
2012-09-07 22:59   ` KOSAKI Motohiro
2012-08-20 16:36 ` [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy() Mel Gorman
2012-08-20 19:46   ` Christoph Lameter
2012-08-21  7:15     ` Mel Gorman
2012-08-21 15:38       ` Christoph Lameter
2012-08-20 16:36 ` [PATCH 5/5] mempolicy: fix a memory corruption by refcount imbalance in alloc_pages_vma() Mel Gorman
2012-08-20 19:51   ` Christoph Lameter
2012-08-21  7:26     ` Mel Gorman
2012-08-21 15:37       ` Christoph Lameter
2012-09-07 23:06       ` KOSAKI Motohiro
2012-08-21  7:29 ` [PATCH 0/5] Memory policy corruption fixes V2 Mel Gorman
2012-09-06 12:40   ` Josh Boyer
2012-09-07  9:43     ` Mel Gorman
2012-08-21 21:46 ` Andi Kleen
2012-10-09 16:58 [PATCH 0/5] Memory policy corruption fixes -stable Mel Gorman
2012-10-09 16:58 ` [PATCH 4/5] mempolicy: fix refcount leak in mpol_set_shared_policy() Mel Gorman

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).