linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2] revert changes to zcache_do_preload()
@ 2012-08-23 15:33 Seth Jennings
  2012-08-23 15:33 ` [PATCH 1/2] Revert "staging: zcache: cleanup zcache_do_preload and zcache_put_page" Seth Jennings
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Seth Jennings @ 2012-08-23 15:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Seth Jennings, Andrew Morton, Nitin Gupta, Minchan Kim,
	Konrad Rzeszutek Wilk, Dan Magenheimer, linux-mm, linux-kernel,
	devel

This patchset fixes a regression in 3.6 by reverting two dependent
commits that made changes to zcache_do_preload().

The commits undermine an assumption made by tmem_put() in
the cleancache path that preemption is disabled.  This change
introduces a race condition that can result in the wrong page
being returned by tmem_get(), causing assorted errors (segfaults,
apparent file corruption, etc) in userspace.
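
For illustration, here is a minimal userspace sketch of the "allocate first, then re-check the slot" idiom that these reverts restore in zcache_do_preload(). It is a sketch under stated assumptions, not the driver code: `struct preload`, `preload_fill()` and `surplus_freed` are hypothetical names, and malloc()/free() stand in for the kernel allocators. In the kernel the slot is per-CPU, so the check and the store are only safe while the task cannot be preempted or migrated; the sketch does not model that part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-CPU zcache_preloads structure. */
struct preload {
	void *obj;	/* preallocated object, or NULL if the slot is empty */
};

/* Counts surplus allocations that were freed; for illustration only. */
static int surplus_freed;

/*
 * Allocate an object, then install it only if the slot is still empty.
 * The allocation is done before the slot is inspected because, in the
 * kernel, the allocation may sleep and the slot may have been filled in
 * the meantime. A surplus object is freed rather than leaked.
 * Returns 0 on success, -1 if the allocation failed.
 */
static int preload_fill(struct preload *kp, size_t size)
{
	void *obj;

	assert(kp != NULL);

	obj = malloc(size);
	if (obj == NULL)
		return -1;

	if (kp->obj == NULL) {
		kp->obj = obj;		/* slot was empty: keep the object */
	} else {
		free(obj);		/* slot already filled: drop the surplus */
		surplus_freed++;
	}
	return 0;
}
```

Calling `preload_fill()` twice on the same `struct preload` keeps the first object and frees the second, which is the behaviour the `kp->obj == NULL` checks in the patch below are meant to guarantee.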

The corruption was discussed in this thread:
https://lkml.org/lkml/2012/8/17/494

Please apply this patchset to 3.6.  This problem didn't exist
in previous releases so nothing need be done for the stable trees.

Seth Jennings (2):
  Revert "staging: zcache: cleanup zcache_do_preload and
    zcache_put_page"
  Revert "staging: zcache: optimize zcache_do_preload"

 drivers/staging/zcache/zcache-main.c |   54 +++++++++++++++++++---------------
 1 file changed, 31 insertions(+), 23 deletions(-)

-- 
1.7.9.5



* [PATCH 1/2] Revert "staging: zcache: cleanup zcache_do_preload and zcache_put_page"
  2012-08-23 15:33 [PATCH 0/2] revert changes to zcache_do_preload() Seth Jennings
@ 2012-08-23 15:33 ` Seth Jennings
  2012-08-23 15:33 ` [PATCH 2/2] Revert "staging: zcache: optimize zcache_do_preload" Seth Jennings
  2012-08-23 20:56 ` [PATCH 0/2] revert changes to zcache_do_preload() Minchan Kim
  2 siblings, 0 replies; 9+ messages in thread
From: Seth Jennings @ 2012-08-23 15:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Seth Jennings, Andrew Morton, Nitin Gupta, Minchan Kim,
	Konrad Rzeszutek Wilk, Dan Magenheimer, linux-mm, linux-kernel,
	devel

This reverts commit b71f3bcc5ab5e76a22d7ad82b3795602fcf0e0af.

This commit results in memory corruption in the cleancache case.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Reported-by: Dan Magenheimer <dan.magenheimer@oracle.com>
---
 drivers/staging/zcache/zcache-main.c |   37 +++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 16 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index c214977..8a335b9 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1048,24 +1048,29 @@ static int zcache_do_preload(struct tmem_pool *pool)
 		kp->objnodes[kp->nr++] = objnode;
 	}
 
-	if (!kp->obj) {
-		obj = kmem_cache_alloc(zcache_obj_cache, ZCACHE_GFP_MASK);
-		if (unlikely(obj == NULL)) {
-			zcache_failed_alloc++;
-			goto out;
-		}
-		kp->obj = obj;
+	obj = kmem_cache_alloc(zcache_obj_cache, ZCACHE_GFP_MASK);
+	if (unlikely(obj == NULL)) {
+		zcache_failed_alloc++;
+		goto out;
 	}
 
-	if (!kp->page) {
-		page = (void *)__get_free_page(ZCACHE_GFP_MASK);
-		if (unlikely(page == NULL)) {
-			zcache_failed_get_free_pages++;
-			goto out;
-		}
-		kp->page =  page;
+	page = (void *)__get_free_page(ZCACHE_GFP_MASK);
+	if (unlikely(page == NULL)) {
+		zcache_failed_get_free_pages++;
+		kmem_cache_free(zcache_obj_cache, obj);
+		goto out;
 	}
 
+	if (kp->obj == NULL)
+		kp->obj = obj;
+	else
+		kmem_cache_free(zcache_obj_cache, obj);
+
+	if (kp->page == NULL)
+		kp->page = page;
+	else
+		free_page((unsigned long)page);
+
 	ret = 0;
 out:
 	return ret;
@@ -1575,14 +1580,14 @@ static int zcache_put_page(int cli_id, int pool_id, struct tmem_oid *oidp,
 			else
 				zcache_failed_pers_puts++;
 		}
+		zcache_put_pool(pool);
 	} else {
 		zcache_put_to_flush++;
 		if (atomic_read(&pool->obj_count) > 0)
 			/* the put fails whether the flush succeeds or not */
 			(void)tmem_flush_page(pool, oidp, index);
+		zcache_put_pool(pool);
 	}
-
-	zcache_put_pool(pool);
 out:
 	return ret;
 }
-- 
1.7.9.5



* [PATCH 2/2] Revert "staging: zcache: optimize zcache_do_preload"
  2012-08-23 15:33 [PATCH 0/2] revert changes to zcache_do_preload() Seth Jennings
  2012-08-23 15:33 ` [PATCH 1/2] Revert "staging: zcache: cleanup zcache_do_preload and zcache_put_page" Seth Jennings
@ 2012-08-23 15:33 ` Seth Jennings
  2012-08-23 20:56 ` [PATCH 0/2] revert changes to zcache_do_preload() Minchan Kim
  2 siblings, 0 replies; 9+ messages in thread
From: Seth Jennings @ 2012-08-23 15:33 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Seth Jennings, Andrew Morton, Nitin Gupta, Minchan Kim,
	Konrad Rzeszutek Wilk, Dan Magenheimer, linux-mm, linux-kernel,
	devel

This reverts commit 79c0d92c5b6175c1462fbe38bf44180f325aa478.

This commit results in memory corruption in the cleancache case.

Signed-off-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
Reported-by: Dan Magenheimer <dan.magenheimer@oracle.com>
---
 drivers/staging/zcache/zcache-main.c |   21 ++++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
index 8a335b9..4f92d87 100644
--- a/drivers/staging/zcache/zcache-main.c
+++ b/drivers/staging/zcache/zcache-main.c
@@ -1034,43 +1034,45 @@ static int zcache_do_preload(struct tmem_pool *pool)
 		goto out;
 	if (unlikely(zcache_obj_cache == NULL))
 		goto out;
-
-	/* IRQ has already been disabled. */
+	preempt_disable();
 	kp = &__get_cpu_var(zcache_preloads);
 	while (kp->nr < ARRAY_SIZE(kp->objnodes)) {
+		preempt_enable_no_resched();
 		objnode = kmem_cache_alloc(zcache_objnode_cache,
 				ZCACHE_GFP_MASK);
 		if (unlikely(objnode == NULL)) {
 			zcache_failed_alloc++;
 			goto out;
 		}
-
-		kp->objnodes[kp->nr++] = objnode;
+		preempt_disable();
+		kp = &__get_cpu_var(zcache_preloads);
+		if (kp->nr < ARRAY_SIZE(kp->objnodes))
+			kp->objnodes[kp->nr++] = objnode;
+		else
+			kmem_cache_free(zcache_objnode_cache, objnode);
 	}
-
+	preempt_enable_no_resched();
 	obj = kmem_cache_alloc(zcache_obj_cache, ZCACHE_GFP_MASK);
 	if (unlikely(obj == NULL)) {
 		zcache_failed_alloc++;
 		goto out;
 	}
-
 	page = (void *)__get_free_page(ZCACHE_GFP_MASK);
 	if (unlikely(page == NULL)) {
 		zcache_failed_get_free_pages++;
 		kmem_cache_free(zcache_obj_cache, obj);
 		goto out;
 	}
-
+	preempt_disable();
+	kp = &__get_cpu_var(zcache_preloads);
 	if (kp->obj == NULL)
 		kp->obj = obj;
 	else
 		kmem_cache_free(zcache_obj_cache, obj);
-
 	if (kp->page == NULL)
 		kp->page = page;
 	else
 		free_page((unsigned long)page);
-
 	ret = 0;
 out:
 	return ret;
@@ -1581,6 +1583,7 @@ static int zcache_put_page(int cli_id, int pool_id, struct tmem_oid *oidp,
 				zcache_failed_pers_puts++;
 		}
 		zcache_put_pool(pool);
+		preempt_enable_no_resched();
 	} else {
 		zcache_put_to_flush++;
 		if (atomic_read(&pool->obj_count) > 0)
-- 
1.7.9.5



* Re: [PATCH 0/2] revert changes to zcache_do_preload()
  2012-08-23 15:33 [PATCH 0/2] revert changes to zcache_do_preload() Seth Jennings
  2012-08-23 15:33 ` [PATCH 1/2] Revert "staging: zcache: cleanup zcache_do_preload and zcache_put_page" Seth Jennings
  2012-08-23 15:33 ` [PATCH 2/2] Revert "staging: zcache: optimize zcache_do_preload" Seth Jennings
@ 2012-08-23 20:56 ` Minchan Kim
  2012-08-23 22:10   ` Seth Jennings
  2 siblings, 1 reply; 9+ messages in thread
From: Minchan Kim @ 2012-08-23 20:56 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta, Minchan Kim,
	Konrad Rzeszutek Wilk, Dan Magenheimer, linux-mm, linux-kernel,
	devel, xiaoguangrong

Hi Seth,

On Thu, Aug 23, 2012 at 10:33:09AM -0500, Seth Jennings wrote:
> This patchset fixes a regression in 3.6 by reverting two dependent
> commits that made changes to zcache_do_preload().
> 
> The commits undermine an assumption made by tmem_put() in
> the cleancache path that preemption is disabled.  This change
> introduces a race condition that can result in the wrong page
> being returned by tmem_get(), causing assorted errors (segfaults,
> apparent file corruption, etc) in userspace.
> 
> The corruption was discussed in this thread:
> https://lkml.org/lkml/2012/8/17/494

I think the changelog isn't enough to explain what the race is.
Could you write it down in detail?

And you should have Cc'ed Xiao, who is the author of the reverted patch.

> 
> Please apply this patchset to 3.6.  This problem didn't exist
> in previous releases so nothing need be done for the stable trees.
> 
> Seth Jennings (2):
>   Revert "staging: zcache: cleanup zcache_do_preload and
>     zcache_put_page"
>   Revert "staging: zcache: optimize zcache_do_preload"
> 
>  drivers/staging/zcache/zcache-main.c |   54 +++++++++++++++++++---------------
>  1 file changed, 31 insertions(+), 23 deletions(-)
> 
> -- 
> 1.7.9.5
> 


* Re: [PATCH 0/2] revert changes to zcache_do_preload()
  2012-08-23 20:56 ` [PATCH 0/2] revert changes to zcache_do_preload() Minchan Kim
@ 2012-08-23 22:10   ` Seth Jennings
  2012-08-23 23:28     ` Minchan Kim
  0 siblings, 1 reply; 9+ messages in thread
From: Seth Jennings @ 2012-08-23 22:10 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta,
	Konrad Rzeszutek Wilk, Dan Magenheimer, linux-mm, linux-kernel,
	devel, xiaoguangrong

On 08/23/2012 03:56 PM, Minchan Kim wrote:
> Hi Seth,
> 
> On Thu, Aug 23, 2012 at 10:33:09AM -0500, Seth Jennings wrote:
>> This patchset fixes a regression in 3.6 by reverting two dependent
>> commits that made changes to zcache_do_preload().
>>
>> The commits undermine an assumption made by tmem_put() in
>> the cleancache path that preemption is disabled.  This change
>> introduces a race condition that can result in the wrong page
>> being returned by tmem_get(), causing assorted errors (segfaults,
>> apparent file corruption, etc) in userspace.
>>
>> The corruption was discussed in this thread:
>> https://lkml.org/lkml/2012/8/17/494
> 
> I think the changelog isn't enough to explain what the race is.
> Could you write it down in detail?

I didn't come upon this solution via code inspection, but
rather through discovering that the issue didn't exist in
v3.5 and just looking at the changes since then.

> And you should have Cc'ed Xiao, who is the author of the reverted patch.

Thanks for adding Xiao.  I meant to do this. For some reason
I thought that you submitted that patchset :-/
My bad.

Seth



* Re: [PATCH 0/2] revert changes to zcache_do_preload()
  2012-08-23 22:10   ` Seth Jennings
@ 2012-08-23 23:28     ` Minchan Kim
  2012-08-24  2:21       ` Xiao Guangrong
  2012-08-24 20:57       ` Seth Jennings
  0 siblings, 2 replies; 9+ messages in thread
From: Minchan Kim @ 2012-08-23 23:28 UTC (permalink / raw)
  To: Seth Jennings
  Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta,
	Konrad Rzeszutek Wilk, Dan Magenheimer, linux-mm, linux-kernel,
	devel, xiaoguangrong

On Thu, Aug 23, 2012 at 05:10:00PM -0500, Seth Jennings wrote:
> On 08/23/2012 03:56 PM, Minchan Kim wrote:
> > Hi Seth,
> > 
> > On Thu, Aug 23, 2012 at 10:33:09AM -0500, Seth Jennings wrote:
> >> This patchset fixes a regression in 3.6 by reverting two dependent
> >> commits that made changes to zcache_do_preload().
> >>
> >> The commits undermine an assumption made by tmem_put() in
> >> the cleancache path that preemption is disabled.  This change
> >> introduces a race condition that can result in the wrong page
> >> being returned by tmem_get(), causing assorted errors (segfaults,
> >> apparent file corruption, etc) in userspace.
> >>
> >> The corruption was discussed in this thread:
> >> https://lkml.org/lkml/2012/8/17/494
> > 
> > I think the changelog isn't enough to explain what the race is.
> > Could you write it down in detail?
> 
> I didn't come upon this solution via code inspection, but
> rather through discovering that the issue didn't exist in
> v3.5 and just looking at the changes since then.

Okay, then why do you think these patches are the culprit?
I didn't look at Xiao's cleanup patch series at the time, so I
could be wrong, but having just looked through the patch
"zcache: optimize zcache_do_preload", I can't find any fault in it.
zcache_put_page checks that IRQs are disabled, so we don't need
to disable preemption, and the patch looks correct to me.
If the race were caused by preemption, the BUG_ON in zcache_put_page
should catch it.

What do you mean by this? Do you have any clue in mind?

        The commits undermine an assumption made by tmem_put() in
        the cleancache path that preemption is disabled.

> 
> > And you should have Cc'ed Xiao, who is the author of the reverted patch.
> 
> Thanks for adding Xiao.  I meant to do this. For some reason
> I thought that you submitted that patchset :-/

I didn't even notice that patchset at the time. :)

> My bad.
> 
> Seth
> 

-- 
Kind regards,
Minchan Kim


* Re: [PATCH 0/2] revert changes to zcache_do_preload()
  2012-08-23 23:28     ` Minchan Kim
@ 2012-08-24  2:21       ` Xiao Guangrong
  2012-08-24 20:57       ` Seth Jennings
  1 sibling, 0 replies; 9+ messages in thread
From: Xiao Guangrong @ 2012-08-24  2:21 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Seth Jennings, Greg Kroah-Hartman, Andrew Morton, Nitin Gupta,
	Konrad Rzeszutek Wilk, Dan Magenheimer, linux-mm, linux-kernel,
	devel

On 08/24/2012 07:28 AM, Minchan Kim wrote:
> On Thu, Aug 23, 2012 at 05:10:00PM -0500, Seth Jennings wrote:
>> On 08/23/2012 03:56 PM, Minchan Kim wrote:
>>> Hi Seth,
>>>
>>> On Thu, Aug 23, 2012 at 10:33:09AM -0500, Seth Jennings wrote:
>>>> This patchset fixes a regression in 3.6 by reverting two dependent
>>>> commits that made changes to zcache_do_preload().
>>>>
>>>> The commits undermine an assumption made by tmem_put() in
>>>> the cleancache path that preemption is disabled.  This change
>>>> introduces a race condition that can result in the wrong page
>>>> being returned by tmem_get(), causing assorted errors (segfaults,
>>>> apparent file corruption, etc) in userspace.
>>>>
>>>> The corruption was discussed in this thread:
>>>> https://lkml.org/lkml/2012/8/17/494
>>>
>>> I think the changelog isn't enough to explain what the race is.
>>> Could you write it down in detail?
>>
>> I didn't come upon this solution via code inspection, but
>> rather through discovering that the issue didn't exist in
>> v3.5 and just looking at the changes since then.
> 
> Okay, then why do you think these patches are the culprit?
> I didn't look at Xiao's cleanup patch series at the time, so I
> could be wrong, but having just looked through the patch
> "zcache: optimize zcache_do_preload", I can't find any fault in it.
> zcache_put_page checks that IRQs are disabled, so we don't need
> to disable preemption, and the patch looks correct to me.
> If the race were caused by preemption, the BUG_ON in zcache_put_page
> should catch it.

That confuses me too!

And the first patch is just a cleanup; the behavior is the same
before and after it. What did I miss?



* Re: [PATCH 0/2] revert changes to zcache_do_preload()
  2012-08-23 23:28     ` Minchan Kim
  2012-08-24  2:21       ` Xiao Guangrong
@ 2012-08-24 20:57       ` Seth Jennings
  2012-08-29 17:42         ` Seth Jennings
  1 sibling, 1 reply; 9+ messages in thread
From: Seth Jennings @ 2012-08-24 20:57 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta,
	Konrad Rzeszutek Wilk, Dan Magenheimer, linux-mm, linux-kernel,
	devel, xiaoguangrong

On 08/23/2012 06:28 PM, Minchan Kim wrote:
> Okay, then why do you think these patches are the culprit?
> I didn't look at Xiao's cleanup patch series at the time, so I
> could be wrong, but having just looked through the patch
> "zcache: optimize zcache_do_preload", I can't find any fault in it.
> zcache_put_page checks that IRQs are disabled, so we don't need
> to disable preemption, and the patch looks correct to me.
> If the race were caused by preemption, the BUG_ON in zcache_put_page
> should catch it.
> 
> What do you mean by this? Do you have any clue in mind?
> 
>         The commits undermine an assumption made by tmem_put() in
>         the cleancache path that preemption is disabled.

I do not have an explanation right now for why these commits
expose this issue.  The patch looks like it should be fine
to me, hence my Ack at the time.

I understand and agree with you that the zcache shim
functions zcache_put_page(), zcache_get_page(),
zcache_flush_page(), and zcache_flush_object() all disable
interrupts (or make sure that interrupts are already
disabled) which implicitly disables preemption.
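
A minimal userspace sketch of that relationship, assuming a simplified model of one CPU's state. All names here (`struct cpu_state` and the `model_*` functions) are hypothetical and exist only for illustration; the predicate is modeled on the kernel's preemptible() check, which, as far as I know, requires both a zero preempt count and enabled interrupts.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one CPU's scheduling-related state. */
struct cpu_state {
	int preempt_count;	/* nesting depth of preempt-disable calls */
	bool irqs_disabled;	/* true while local interrupts are masked */
};

static void model_preempt_disable(struct cpu_state *c)
{
	c->preempt_count++;
}

static void model_preempt_enable(struct cpu_state *c)
{
	assert(c->preempt_count > 0);	/* enables must pair with disables */
	c->preempt_count--;
}

static void model_irq_disable(struct cpu_state *c)
{
	c->irqs_disabled = true;
}

static void model_irq_enable(struct cpu_state *c)
{
	c->irqs_disabled = false;
}

/*
 * The running task can be preempted only if neither mechanism is
 * holding preemption off: no outstanding preempt-disable and
 * interrupts enabled.
 */
static bool model_preemptible(const struct cpu_state *c)
{
	return c->preempt_count == 0 && !c->irqs_disabled;
}
```

In this model, a path that runs with interrupts disabled is never preemptible regardless of the preempt count, which is why an explicit preempt-disable should be redundant there.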

I'm still trying to find root cause here.

Seth



* Re: [PATCH 0/2] revert changes to zcache_do_preload()
  2012-08-24 20:57       ` Seth Jennings
@ 2012-08-29 17:42         ` Seth Jennings
  0 siblings, 0 replies; 9+ messages in thread
From: Seth Jennings @ 2012-08-29 17:42 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Greg Kroah-Hartman, Andrew Morton, Nitin Gupta,
	Konrad Rzeszutek Wilk, Dan Magenheimer, linux-mm, linux-kernel,
	devel, xiaoguangrong

Forget this whole thing, these reverts do _not_ fix the issue.

I wrote a test program to exercise cleancache and
determined that this problem has existed as far
back as v3.1 (basically the beginning).

No recent commit caused this.

Seth

