* [PATCH next/mmotm] slub: partly fix freeze in __slab_free
@ 2011-07-11 18:58 Hugh Dickins
  2011-07-11 19:39 ` Eric Dumazet
  0 siblings, 1 reply; 12+ messages in thread
From: Hugh Dickins @ 2011-07-11 18:58 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Andrew Morton, linux-kernel

My load tests on PowerPC freeze within minutes in __slab_free().
I happened to try PPC first, didn't try without this fix on x86.

It looks as if the author was interrupted while devising the new
cmpxchg_double_slab() version of __slab_free(): its decision to
spin_lock_irqsave() depends on several uninitialized fields,
and fixing that (by copying page to new) mostly fixes it.

But I didn't think about it very much, and this may well not be what
the author intends; and I have seen a couple of much rarer freezes
in __slab_free() on PPC (not yet on x86) even after applying this.

Signed-off-by: Hugh Dickins <hughd@google.com>
---
 mm/slub.c |    1 +
 1 file changed, 1 insertion(+)

--- mmotm/mm/slub.c	2011-07-08 18:59:44.135443127 -0700
+++ linux/mm/slub.c	2011-07-10 05:07:08.000000000 -0700
@@ -2217,6 +2217,7 @@ static void __slab_free(struct kmem_cach
 		return;
 
 	do {
+		new = *page;
 		prior = page->freelist;
 		counters = page->counters;
 		set_freepointer(s, object, prior);
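
The effect of the one-liner can be sketched in plain user-space C (the struct and function names here are illustrative stand-ins, not the kernel's): each retry iteration must start from a full snapshot of *page, so that fields the surrounding code inspects afterwards, such as frozen, are initialized rather than stack garbage.

```c
#include <assert.h>

/* Minimal stand-in for the fields __slab_free cares about
 * (hypothetical layout, not the real struct page). */
struct slab_page {
	void *freelist;
	unsigned long counters;
	unsigned frozen;
};

/* The pattern the fix establishes: snapshot first, then modify.
 * Without the full copy, new->frozen would be whatever happened
 * to be on the stack when the decision to lock is taken. */
static void prepare_update(const struct slab_page *page,
			   struct slab_page *new, void *object)
{
	*new = *page;		/* the one-line fix: copy before modifying */
	new->freelist = object;	/* then apply the intended changes */
	new->counters = page->counters + 1;
}
```

In the real loop this corresponds to the `new = *page;` line sitting ahead of the prior/counters reads and the cmpxchg_double_slab() retry.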

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-11 18:58 [PATCH next/mmotm] slub: partly fix freeze in __slab_free Hugh Dickins
@ 2011-07-11 19:39 ` Eric Dumazet
  2011-07-11 20:46   ` Hugh Dickins
  2011-07-12  6:13   ` Pekka Enberg
  0 siblings, 2 replies; 12+ messages in thread
From: Eric Dumazet @ 2011-07-11 19:39 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Christoph Lameter, Pekka Enberg, Andrew Morton, linux-kernel

On Monday 11 July 2011 at 11:58 -0700, Hugh Dickins wrote:
> My load tests on PowerPC freeze within minutes in __slab_free().
> I happened to try PPC first, didn't try without this fix on x86.
> 
> It looks as if the author was interrupted while devising the new
> cmpxchg_double_slab() version of __slab_free(): its decision to
> spin_lock_irqsave() depends on several uninitialized fields,
> and fixing that (by copying page to new) mostly fixes it.
> 
> But I didn't think about it very much, and this may well not be what
> the author intends; and I have seen a couple of much rarer freezes
> in __slab_free() on PPC (not yet on x86) even after applying this.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>
> ---
>  mm/slub.c |    1 +
>  1 file changed, 1 insertion(+)
> 
> --- mmotm/mm/slub.c	2011-07-08 18:59:44.135443127 -0700
> +++ linux/mm/slub.c	2011-07-10 05:07:08.000000000 -0700
> @@ -2217,6 +2217,7 @@ static void __slab_free(struct kmem_cach
>  		return;
>  
>  	do {
> +		new = *page;
>  		prior = page->freelist;
>  		counters = page->counters;
>  		set_freepointer(s, object, prior);
> --

I suspect you hit the bug on a 32-bit arch?

What about the following patch instead?

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3d76a43..1351d28 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -62,7 +62,7 @@ struct page {
 		struct {			/* SLUB cmpxchg_double area */
 			void *freelist;
 			union {
-				unsigned long counters;
+				u64	counters;
 				struct {
 					unsigned inuse:16;
 					unsigned objects:15;




* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-11 19:39 ` Eric Dumazet
@ 2011-07-11 20:46   ` Hugh Dickins
  2011-07-12  6:13   ` Pekka Enberg
  1 sibling, 0 replies; 12+ messages in thread
From: Hugh Dickins @ 2011-07-11 20:46 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Christoph Lameter, Pekka Enberg, Andrew Morton, linux-kernel


On Mon, 11 Jul 2011, Eric Dumazet wrote:
> On Monday 11 July 2011 at 11:58 -0700, Hugh Dickins wrote:
> > My load tests on PowerPC freeze within minutes in __slab_free().
> > I happened to try PPC first, didn't try without this fix on x86.
> 
> I suspect you hit the bug on a 32-bit arch?

No, it was ppc64.  I've not actually tried that load on mmotm 32bit
at all yet, better do so tonight!

Hugh

> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 3d76a43..1351d28 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -62,7 +62,7 @@ struct page {
>  		struct {			/* SLUB cmpxchg_double area */
>  			void *freelist;
>  			union {
> -				unsigned long counters;
> +				u64	counters;
>  				struct {
>  					unsigned inuse:16;
>  					unsigned objects:15;


* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-11 19:39 ` Eric Dumazet
  2011-07-11 20:46   ` Hugh Dickins
@ 2011-07-12  6:13   ` Pekka Enberg
  2011-07-12 15:11     ` Christoph Lameter
  1 sibling, 1 reply; 12+ messages in thread
From: Pekka Enberg @ 2011-07-12  6:13 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Hugh Dickins, Christoph Lameter, Andrew Morton, linux-kernel

On 7/11/11 10:39 PM, Eric Dumazet wrote:
> On Monday 11 July 2011 at 11:58 -0700, Hugh Dickins wrote:
>> My load tests on PowerPC freeze within minutes in __slab_free().
>> I happened to try PPC first, didn't try without this fix on x86.
>>
>> It looks as if the author was interrupted while devising the new
>> cmpxchg_double_slab() version of __slab_free(): its decision to
>> spin_lock_irqsave() depends on several uninitialized fields,
>> and fixing that (by copying page to new) mostly fixes it.
>>
>> But I didn't think about it very much, and this may well not be what
>> the author intends; and I have seen a couple of much rarer freezes
>> in __slab_free() on PPC (not yet on x86) even after applying this.
>>
>> Signed-off-by: Hugh Dickins<hughd@google.com>
>> ---
>>   mm/slub.c |    1 +
>>   1 file changed, 1 insertion(+)
>>
>> --- mmotm/mm/slub.c	2011-07-08 18:59:44.135443127 -0700
>> +++ linux/mm/slub.c	2011-07-10 05:07:08.000000000 -0700
>> @@ -2217,6 +2217,7 @@ static void __slab_free(struct kmem_cach
>>   		return;
>>
>>   	do {
>> +		new = *page;
>>   		prior = page->freelist;
>>   		counters = page->counters;
>>   		set_freepointer(s, object, prior);
>> --
> I suspect you hit the bug on a 32-bit arch?
>
> What about the following patch instead?
>
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 3d76a43..1351d28 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -62,7 +62,7 @@ struct page {
>   		struct {			/* SLUB cmpxchg_double area */
>   			void *freelist;
>   			union {
> -				unsigned long counters;
> +				u64	counters;
>   				struct {
>   					unsigned inuse:16;
>   					unsigned objects:15;
>

Christoph?


* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-12  6:13   ` Pekka Enberg
@ 2011-07-12 15:11     ` Christoph Lameter
  2011-07-12 15:55       ` Christoph Lameter
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Lameter @ 2011-07-12 15:11 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Eric Dumazet, Hugh Dickins, Andrew Morton, linux-kernel

On Tue, 12 Jul 2011, Pekka Enberg wrote:

> > diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> > index 3d76a43..1351d28 100644
> > --- a/include/linux/mm_types.h
> > +++ b/include/linux/mm_types.h
> > @@ -62,7 +62,7 @@ struct page {
> >   		struct {			/* SLUB cmpxchg_double area */
> >   			void *freelist;
> >   			union {
> > -				unsigned long counters;
> > +				u64	counters;
> >   				struct {
> >   					unsigned inuse:16;
> >   					unsigned objects:15;
> >
>
> Christoph?


counters needs to overlay the bitfields as well as _count. That is the problem
with the earlier fix.

Sorry I have had a power outage for the last two days (strange Chicago
weather) and my server is only up for a few minutes and will go down
periodically due to a UPS malfunction. On a generator. Sigh.




* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-12 15:11     ` Christoph Lameter
@ 2011-07-12 15:55       ` Christoph Lameter
  2011-07-13 11:28         ` Hugh Dickins
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Lameter @ 2011-07-12 15:55 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Eric Dumazet, Hugh Dickins, Andrew Morton, linux-kernel

I wish there were some easier way to overlay fields. The counters
field exists to allow an easier reference to the bitfield structure in
order to pass it to cmpxchg_double_slab.

On 64-bit, the counters field needs to overlay the bitfields and _count,
since we can only pass 64-bit integers to cmpxchg_double_slab.

On 32-bit, the counters field only covers the bitfields.

The "new" page struct in __slab_free is only partially populated and the
overlays are used for conversion between structs and words in order to be
able to pass the struct contents by value.

Signed-off-by: Christoph Lameter <cl@linux.com>


---
 include/linux/mm_types.h |   22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

Index: linux-2.6/include/linux/mm_types.h
===================================================================
--- linux-2.6.orig/include/linux/mm_types.h	2011-07-12 10:14:08.057706223 -0500
+++ linux-2.6/include/linux/mm_types.h	2011-07-12 10:14:14.117706184 -0500
@@ -56,20 +56,26 @@ struct page {
 		};

 		union {
-			atomic_t _mapcount;	/* Count of ptes mapped in mms,
+			/* Used for cmpxchg_double in slub */
+			unsigned long counters;
+
+			struct {
+
+				union {
+					atomic_t _mapcount;	/* Count of ptes mapped in mms,
 							 * to show when page is mapped
 							 * & limit reverse map searches.
 							 */

-			/* Used for cmpxchg_double in slub */
-			unsigned long counters;
-			struct {
-				unsigned inuse:16;
-				unsigned objects:15;
-				unsigned frozen:1;
+					struct {
+						unsigned inuse:16;
+						unsigned objects:15;
+						unsigned frozen:1;
+					};
+				};
+				atomic_t _count;		/* Usage count, see below. */
 			};
 		};
-		atomic_t _count;		/* Usage count, see below. */
 	};

 	/* Third double word block */
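
Christoph's overlay can be modeled in user space. This is a sketch only (the union and bitfield names mirror the thread; the surrounding struct page is omitted), and whether counters aliases the bitfields the same way everywhere is exactly the implementation-defined behavior under suspicion here: bitfield allocation order differs between, say, little-endian x86-64 and big-endian ppc64. The layout assumed below is GCC on little-endian x86-64.

```c
#include <assert.h>

/* User-space model of "counters overlays the bitfields".
 * Anonymous union members require C11 or the GNU extension. */
union slub_counters {
	unsigned long counters;		/* word handed to cmpxchg_double_slab */
	struct {
		unsigned inuse:16;
		unsigned objects:15;
		unsigned frozen:1;
	};
};

/* Pack the three bitfields, then read them back through the overlay. */
static unsigned long pack_counters(unsigned inuse, unsigned objects,
				   unsigned frozen)
{
	union slub_counters c = { .counters = 0 };

	c.inuse = inuse;
	c.objects = objects;
	c.frozen = frozen;
	return c.counters;
}
```

On a big-endian target the bit positions read back from counters would differ, which is why an endianness assumption hidden in this overlay is a plausible suspect later in the thread.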


* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-12 15:55       ` Christoph Lameter
@ 2011-07-13 11:28         ` Hugh Dickins
  2011-07-13 15:47           ` Christoph Lameter
  0 siblings, 1 reply; 12+ messages in thread
From: Hugh Dickins @ 2011-07-13 11:28 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: Pekka Enberg, Eric Dumazet, Andrew Morton, linux-kernel

On Tue, 12 Jul 2011, Christoph Lameter wrote:
> I wish there were some easier way to overlay fields.

Yes, struct page has become ... a mess.

(I'd use stronger language, but I'm being a little careful at the
moment, not to say something that Andrew might then throw back at
me in our radix-tree discussion ;)

I'd be happy for the 32bit and 64bit cases to be separated entirely,
if that would help seeing their layouts clearly (but I don't think 
that's actually the problem).

It is the (standard e.g. x86_64) 64bit layout to which you are adding
8 bytes, isn't it?  Not in the patch below, but in your cmpxchg_double
slub series.  Which aligns it nicely for everyone, but does use
significantly more mem_map memory - I wonder how many people
have really taken that on board.

> The counters
> field exists to allow an easier reference to the bitfield structure in
> order to pass it to cmpxchg_double_slab.
> 
> On 64-bit, the counters field needs to overlay the bitfields and _count,
> since we can only pass 64-bit integers to cmpxchg_double_slab.
> 
> On 32-bit, the counters field only covers the bitfields.
> 
> The "new" page struct in __slab_free is only partially populated and the
> overlays are used for conversion between structs and words in order to be
> able to pass the struct contents by value.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

As I reported in Fengguang's more_io thread, I couldn't get this patch
to apply, unsurprisingly to mmotm, more surprisingly to Pekka's for-next
(which adds in the _count build fix).  Did you try to tidy up the
indentations a little in between?

At the bottom I've put the patch against mmotm which I started testing
instead on ppc64 a few hours ago.  It again seized up in __slab_free()
after two and a half hours.  So either I screwed up my transliteration
of your patch, or your patch is still wrong.  Please check mine out.

Is yours actually tested on PowerPC or ARM, say?  I wonder if you're
making endianness or bitfield ordering assumptions; but I've made no
attempt to work it out.  Anything I can send you?  Disassembly of
__slab_free()?

I didn't actually try the patch on x86 at all.

> 
> ---
>  include/linux/mm_types.h |   22 ++++++++++++++--------
>  1 file changed, 14 insertions(+), 8 deletions(-)
> 
> Index: linux-2.6/include/linux/mm_types.h
> ===================================================================
> --- linux-2.6.orig/include/linux/mm_types.h	2011-07-12 10:14:08.057706223 -0500
> +++ linux-2.6/include/linux/mm_types.h	2011-07-12 10:14:14.117706184 -0500
> @@ -56,20 +56,26 @@ struct page {
>  		};
> 
>  		union {
> -			atomic_t _mapcount;	/* Count of ptes mapped in mms,
> +			/* Used for cmpxchg_double in slub */
> +			unsigned long counters;
> +
> +			struct {
> +
> +				union {
> +					atomic_t _mapcount;	/* Count of ptes mapped in mms,
>  							 * to show when page is mapped
>  							 * & limit reverse map searches.
>  							 */
> 
> -			/* Used for cmpxchg_double in slub */
> -			unsigned long counters;
> -			struct {
> -				unsigned inuse:16;
> -				unsigned objects:15;
> -				unsigned frozen:1;
> +					struct {
> +						unsigned inuse:16;
> +						unsigned objects:15;
> +						unsigned frozen:1;
> +					};
> +				};
> +				atomic_t _count;		/* Usage count, see below. */
>  			};
>  		};
> -		atomic_t _count;		/* Usage count, see below. */
>  	};
> 
>  	/* Third double word block */

This is what I tested with: is it what you intended against mmotm?

--- mmotm/include/linux/mm_types.h	2011-07-09 11:25:18.000000000 -0700
+++ linux/include/linux/mm_types.h	2011-07-12 22:04:01.000000000 -0700
@@ -49,30 +49,31 @@ struct page {
 					 * see PAGE_MAPPING_ANON below.
 					 */
 	/* Second double word */
-	union {
-		struct {
+	struct {
+		union {
 			pgoff_t index;		/* Our offset within mapping. */
-			atomic_t _mapcount;	/* Count of ptes mapped in mms,
+			void *freelist;		/* slub first free object */
+		};
+
+		union {
+			/* Used for cmpxchg_double in slub */
+			unsigned long counters;
+
+			struct {
+
+				union {
+					atomic_t _mapcount;	/* Count of ptes mapped in mms,
 							 * to show when page is mapped
 							 * & limit reverse map searches.
 							 */
-			atomic_t _count;		/* Usage count, see below. */
-		};
 
-		struct {			/* SLUB cmpxchg_double area */
-			void *freelist;
-			union {
-				unsigned long counters;
-				struct {
-					unsigned inuse:16;
-					unsigned objects:15;
-					unsigned frozen:1;
-					/*
-					 * Kernel may make use of this field even when slub
-					 * uses the rest of the double word!
-					 */
-					atomic_t _count;
+					struct {
+						unsigned inuse:16;
+						unsigned objects:15;
+						unsigned frozen:1;
+					};
 				};
+				atomic_t _count;		/* Usage count, see below. */
 			};
 		};
 	};


* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-13 11:28         ` Hugh Dickins
@ 2011-07-13 15:47           ` Christoph Lameter
  2011-07-13 15:55             ` Christoph Lameter
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Lameter @ 2011-07-13 15:47 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Pekka Enberg, Eric Dumazet, Andrew Morton, linux-kernel

On Wed, 13 Jul 2011, Hugh Dickins wrote:

> As I reported in Fengguang's more_io thread, I couldn't get this patch
> to apply, unsurprisingly to mmotm, more surprisingly to Pekka's for-next
> (which adds in the _count build fix).  Did you try to tidy up the
> indentations a little in between?

Nope. There are multiple patches that fix up things in mm_types now and
they need to be stacked. I will try to avoid that.

> At the bottom I've put the patch against mmotm which I started testing
> instead on ppc64 a few hours ago.  It again seized up in __slab_free()
> after two and a half hours.  So either I screwed up my transliteration
> of your patch, or your patch is still wrong.  Please check mine out.
>
> Is yours actually tested on PowerPC or ARM, say?  I wonder if you're
> making endianness or bitfield ordering assumptions; but I've made no
> attempt to work it out.  Anything I can send you?  Disassembly of
> __slab_free()?

No, I don't have PowerPC or ARM here. The problem is that without the
cmpxchg_double we fall back to slab_lock/slab_unlock. That code is not
that well tested. It is also used for slub_debug and that is how I tested
it.

The patch looks fine. I suspect that the reason lies elsewhere.

__slab_free does not disable interrupts. The use of cmpxchg_double_slab
would have to disable them if it falls back to the pagelock. Sigh.





* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-13 15:47           ` Christoph Lameter
@ 2011-07-13 15:55             ` Christoph Lameter
  2011-07-14 16:14               ` Hugh Dickins
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Lameter @ 2011-07-13 15:55 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Pekka Enberg, Eric Dumazet, Andrew Morton, linux-kernel


Subject: slub: disable interrupts in cmpxchg_double_slab when falling back to pagelock.

Disable interrupts when falling back to the use of the page lock in
cmpxchg_double_slab().

Becomes necessary since __slab_free will call cmpxchg_double_slab without disabling
interrupts.

A bit bad because the invocations from the allocation path of cmpxchg_double_slab
occur with interrupts already disabled (but there are plans to enable
interrupts there as well in the future).

Signed-off-by: Christoph Lameter <cl@linux.com>


---
 mm/slub.c |    5 +++++
 1 file changed, 5 insertions(+)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2011-07-13 10:51:07.867138834 -0500
+++ linux-2.6/mm/slub.c	2011-07-13 10:51:20.957138752 -0500
@@ -368,14 +368,19 @@ static inline bool cmpxchg_double_slab(s
 	} else
 #endif
 	{
+		unsigned long flags;
+
+		local_irq_save(flags);
 		slab_lock(page);
 		if (page->freelist == freelist_old && page->counters == counters_old) {
 			page->freelist = freelist_new;
 			page->counters = counters_new;
 			slab_unlock(page);
+			local_irq_restore(flags);
 			return 1;
 		}
 		slab_unlock(page);
+		local_irq_restore(flags);
 	}

 	cpu_relax();


* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-13 15:55             ` Christoph Lameter
@ 2011-07-14 16:14               ` Hugh Dickins
  2011-07-14 17:26                 ` Christoph Lameter
  0 siblings, 1 reply; 12+ messages in thread
From: Hugh Dickins @ 2011-07-14 16:14 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, Eric Dumazet, Andrew Morton, Wu Fengguang, linux-kernel

On Wed, 13 Jul 2011, Christoph Lameter wrote:
> 
> Subject: slub: disable interrupts in cmpxchg_double_slab when falling back to pagelock.
> 
> Disable interrupts when falling back to the use of the page lock in
> cmpxchg_double_slab().
> 
> Becomes necessary since __slab_free will call cmpxchg_double_slab without disabling
> interrupts.
> 
> A bit bad because the invocations from the allocation path of cmpxchg_double_slab
> occur with interrupts already disabled (but there are plans to enable
> interrupts there as well in the future).
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

Success, thank you: I ran my load for 22 hours on the ppc64,
no problem this time, where before it froze in 2.5 hours.

I was running rc6-mm1 plus this patch, plus (my transcription of) your
mm_types.h struct page patch, plus Fengguang's redirty_tail patch;
leaving out my "*new = page" patch which kicked this thread off.

Feel free to add
Tested-by: Hugh Dickins <hughd@google.com>

> 
> 
> ---
>  mm/slub.c |    5 +++++
>  1 file changed, 5 insertions(+)
> 
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c	2011-07-13 10:51:07.867138834 -0500
> +++ linux-2.6/mm/slub.c	2011-07-13 10:51:20.957138752 -0500
> @@ -368,14 +368,19 @@ static inline bool cmpxchg_double_slab(s
>  	} else
>  #endif
>  	{
> +		unsigned long flags;
> +
> +		local_irq_save(flags);
>  		slab_lock(page);
>  		if (page->freelist == freelist_old && page->counters == counters_old) {
>  			page->freelist = freelist_new;
>  			page->counters = counters_new;
>  			slab_unlock(page);
> +			local_irq_restore(flags);
>  			return 1;
>  		}
>  		slab_unlock(page);
> +		local_irq_restore(flags);
>  	}
> 
>  	cpu_relax();
> 


* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-14 16:14               ` Hugh Dickins
@ 2011-07-14 17:26                 ` Christoph Lameter
  2011-07-15 19:22                   ` Hugh Dickins
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Lameter @ 2011-07-14 17:26 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Pekka Enberg, Eric Dumazet, Andrew Morton, Wu Fengguang, linux-kernel

On Thu, 14 Jul 2011, Hugh Dickins wrote:

> Success, thank you: I ran my load for 22 hours on the ppc64,
> no problem this time, where before it froze in 2.5 hours.
>
> I was running rc6-mm1 plus this patch, plus (my transcription of) your
> mm_types.h struct page patch, plus Fengguang's redirty_tail patch;
> leaving out my "*new = page" patch which kicked this thread off.
>
> Feel free to add
> Tested-by: Hugh Dickins <hughd@google.com>

Ok, now we know the cause. The cure needs to be a bit different, I think,
since this is performance-critical code and the interrupt operations are
high-latency.

Subject: slub: disable interrupts in cmpxchg_double_slab when falling back to pagelock (V2)

Split cmpxchg_double_slab into two functions. One for the case where we know that
interrupts are disabled (and therefore the fallback does not need to disable
interrupts) and one for the other cases where fallback will also disable interrupts.

This fixes the issue that __slab_free called cmpxchg_double_slab in some scenarios
without disabling interrupts.

Signed-off-by: Christoph Lameter <cl@linux.com>

---
 mm/slub.c |   49 +++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 45 insertions(+), 4 deletions(-)

Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c	2011-07-14 12:23:01.617611569 -0500
+++ linux-2.6/mm/slub.c	2011-07-14 12:23:04.117611553 -0500
@@ -354,6 +354,42 @@ static __always_inline void slab_unlock(
 	__bit_spin_unlock(PG_locked, &page->flags);
 }

+/* Interrupts must be disabled (for the fallback code to work right) */
+static inline bool __cmpxchg_double_slab(struct kmem_cache *s, struct page *page,
+		void *freelist_old, unsigned long counters_old,
+		void *freelist_new, unsigned long counters_new,
+		const char *n)
+{
+	VM_BUG_ON(!irqs_disabled());
+#ifdef CONFIG_CMPXCHG_DOUBLE
+	if (s->flags & __CMPXCHG_DOUBLE) {
+		if (cmpxchg_double(&page->freelist,
+			freelist_old, counters_old,
+			freelist_new, counters_new))
+		return 1;
+	} else
+#endif
+	{
+		slab_lock(page);
+		if (page->freelist == freelist_old && page->counters == counters_old) {
+			page->freelist = freelist_new;
+			page->counters = counters_new;
+			slab_unlock(page);
+			return 1;
+		}
+		slab_unlock(page);
+	}
+
+	cpu_relax();
+	stat(s, CMPXCHG_DOUBLE_FAIL);
+
+#ifdef SLUB_DEBUG_CMPXCHG
+	printk(KERN_INFO "%s %s: cmpxchg double redo ", n, s->name);
+#endif
+
+	return 0;
+}
+
 static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct page *page,
 		void *freelist_old, unsigned long counters_old,
 		void *freelist_new, unsigned long counters_new,
@@ -368,14 +404,19 @@ static inline bool cmpxchg_double_slab(s
 	} else
 #endif
 	{
+		unsigned long flags;
+
+		local_irq_save(flags);
 		slab_lock(page);
 		if (page->freelist == freelist_old && page->counters == counters_old) {
 			page->freelist = freelist_new;
 			page->counters = counters_new;
 			slab_unlock(page);
+			local_irq_restore(flags);
 			return 1;
 		}
 		slab_unlock(page);
+		local_irq_restore(flags);
 	}

 	cpu_relax();
@@ -1471,7 +1512,7 @@ static inline int acquire_slab(struct km
 		VM_BUG_ON(new.frozen);
 		new.frozen = 1;

-	} while (!cmpxchg_double_slab(s, page,
+	} while (!__cmpxchg_double_slab(s, page,
 			freelist, counters,
 			NULL, new.counters,
 			"lock and freeze"));
@@ -1709,7 +1750,7 @@ static void deactivate_slab(struct kmem_
 			new.inuse--;
 			VM_BUG_ON(!new.frozen);

-		} while (!cmpxchg_double_slab(s, page,
+		} while (!__cmpxchg_double_slab(s, page,
 			prior, counters,
 			freelist, new.counters,
 			"drain percpu freelist"));
@@ -1798,7 +1839,7 @@ redo:
 	}

 	l = m;
-	if (!cmpxchg_double_slab(s, page,
+	if (!__cmpxchg_double_slab(s, page,
 				old.freelist, old.counters,
 				new.freelist, new.counters,
 				"unfreezing slab"))
@@ -1992,7 +2033,7 @@ static void *__slab_alloc(struct kmem_ca
 		new.inuse = page->objects;
 		new.frozen = object != NULL;

-	} while (!cmpxchg_double_slab(s, page,
+	} while (!__cmpxchg_double_slab(s, page,
 			object, counters,
 			NULL, new.counters,
 			"__slab_alloc"));
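
The division of labor in V2 can be mimicked in user space with the irq state reduced to a flag. Everything below is an analogue with made-up names, not kernel code: the double-underscore variant insists the caller already disabled interrupts (as the patch's VM_BUG_ON does), while the plain variant saves and restores the state itself around the locked fallback.

```c
#include <assert.h>
#include <stdbool.h>

static bool irqs_enabled = true;	/* stand-in for the CPU irq state */

struct fake_page {
	void *freelist;
	unsigned long counters;
};

/* Analogue of __cmpxchg_double_slab(): caller must have irqs off. */
static bool __cmpxchg_double_page(struct fake_page *page,
		void *freelist_old, unsigned long counters_old,
		void *freelist_new, unsigned long counters_new)
{
	assert(!irqs_enabled);	/* mirrors VM_BUG_ON(!irqs_disabled()) */
	if (page->freelist == freelist_old && page->counters == counters_old) {
		page->freelist = freelist_new;
		page->counters = counters_new;
		return true;
	}
	return false;
}

/* Analogue of cmpxchg_double_slab(): callable with irqs on; the
 * fallback disables them itself, as local_irq_save()/restore() do. */
static bool cmpxchg_double_page(struct fake_page *page,
		void *freelist_old, unsigned long counters_old,
		void *freelist_new, unsigned long counters_new)
{
	bool saved = irqs_enabled;	/* local_irq_save(flags) */
	bool ret;

	irqs_enabled = false;
	ret = __cmpxchg_double_page(page, freelist_old, counters_old,
				    freelist_new, counters_new);
	irqs_enabled = saved;		/* local_irq_restore(flags) */
	return ret;
}
```

The allocation-path callers, which run with interrupts already off, take the cheap double-underscore route; __slab_free, which runs with interrupts on, takes the self-disabling one, avoiding the deadlock without taxing the hot path.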


* Re: [PATCH next/mmotm] slub: partly fix freeze in __slab_free
  2011-07-14 17:26                 ` Christoph Lameter
@ 2011-07-15 19:22                   ` Hugh Dickins
  0 siblings, 0 replies; 12+ messages in thread
From: Hugh Dickins @ 2011-07-15 19:22 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, Eric Dumazet, Andrew Morton, Wu Fengguang, linux-kernel

On Thu, 14 Jul 2011, Christoph Lameter wrote:
> On Thu, 14 Jul 2011, Hugh Dickins wrote:
> 
> > Success, thank you: I ran my load for 22 hours on the ppc64,
> > no problem this time, where before it froze in 2.5 hours.
> >
> > I was running rc6-mm1 plus this patch, plus (my transcription of) your
> > mm_types.h struct page patch, plus Fengguang's redirty_tail patch;
> > leaving out my "*new = page" patch which kicked this thread off.
> >
> > Feel free to add
> > Tested-by: Hugh Dickins <hughd@google.com>
> 
> Ok now we know the cause. The cure needs to be a bit different I think
> since this is performance critical code and the interrupt operations are
> high latency.
> 
> Subject: slub: disable interrupts in cmpxchg_double_slab when falling back to pagelock (V2)
> 
> Split cmpxchg_double_slab into two functions. One for the case where we know that
> interrupts are disabled (and therefore the fallback does not need to disable
> interrupts) and one for the other cases where fallback will also disable interrupts.
> 
> This fixes the issue that __slab_free called cmpxchg_double_slab in some scenarios
> without disabling interrupts.
> 
> Signed-off-by: Christoph Lameter <cl@linux.com>

As you'd expect, this patch worked too.

> 
> ---
>  mm/slub.c |   49 +++++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 45 insertions(+), 4 deletions(-)
> 
> Index: linux-2.6/mm/slub.c
> ===================================================================
> --- linux-2.6.orig/mm/slub.c	2011-07-14 12:23:01.617611569 -0500
> +++ linux-2.6/mm/slub.c	2011-07-14 12:23:04.117611553 -0500
> @@ -354,6 +354,42 @@ static __always_inline void slab_unlock(
>  	__bit_spin_unlock(PG_locked, &page->flags);
>  }
> 
> +/* Interrupts must be disabled (for the fallback code to work right) */
> +static inline bool __cmpxchg_double_slab(struct kmem_cache *s, struct page *page,
> +		void *freelist_old, unsigned long counters_old,
> +		void *freelist_new, unsigned long counters_new,
> +		const char *n)
> +{
> +	VM_BUG_ON(!irqs_disabled());
> +#ifdef CONFIG_CMPXCHG_DOUBLE
> +	if (s->flags & __CMPXCHG_DOUBLE) {
> +		if (cmpxchg_double(&page->freelist,
> +			freelist_old, counters_old,
> +			freelist_new, counters_new))
> +		return 1;
> +	} else
> +#endif
> +	{
> +		slab_lock(page);
> +		if (page->freelist == freelist_old && page->counters == counters_old) {
> +			page->freelist = freelist_new;
> +			page->counters = counters_new;
> +			slab_unlock(page);
> +			return 1;
> +		}
> +		slab_unlock(page);
> +	}
> +
> +	cpu_relax();
> +	stat(s, CMPXCHG_DOUBLE_FAIL);
> +
> +#ifdef SLUB_DEBUG_CMPXCHG
> +	printk(KERN_INFO "%s %s: cmpxchg double redo ", n, s->name);
> +#endif
> +
> +	return 0;
> +}
> +
>  static inline bool cmpxchg_double_slab(struct kmem_cache *s, struct page *page,
>  		void *freelist_old, unsigned long counters_old,
>  		void *freelist_new, unsigned long counters_new,
> @@ -368,14 +404,19 @@ static inline bool cmpxchg_double_slab(s
>  	} else
>  #endif
>  	{
> +		unsigned long flags;
> +
> +		local_irq_save(flags);
>  		slab_lock(page);
>  		if (page->freelist == freelist_old && page->counters == counters_old) {
>  			page->freelist = freelist_new;
>  			page->counters = counters_new;
>  			slab_unlock(page);
> +			local_irq_restore(flags);
>  			return 1;
>  		}
>  		slab_unlock(page);
> +		local_irq_restore(flags);
>  	}
> 
>  	cpu_relax();
> @@ -1471,7 +1512,7 @@ static inline int acquire_slab(struct km
>  		VM_BUG_ON(new.frozen);
>  		new.frozen = 1;
> 
> -	} while (!cmpxchg_double_slab(s, page,
> +	} while (!__cmpxchg_double_slab(s, page,
>  			freelist, counters,
>  			NULL, new.counters,
>  			"lock and freeze"));
> @@ -1709,7 +1750,7 @@ static void deactivate_slab(struct kmem_
>  			new.inuse--;
>  			VM_BUG_ON(!new.frozen);
> 
> -		} while (!cmpxchg_double_slab(s, page,
> +		} while (!__cmpxchg_double_slab(s, page,
>  			prior, counters,
>  			freelist, new.counters,
>  			"drain percpu freelist"));
> @@ -1798,7 +1839,7 @@ redo:
>  	}
> 
>  	l = m;
> -	if (!cmpxchg_double_slab(s, page,
> +	if (!__cmpxchg_double_slab(s, page,
>  				old.freelist, old.counters,
>  				new.freelist, new.counters,
>  				"unfreezing slab"))
> @@ -1992,7 +2033,7 @@ static void *__slab_alloc(struct kmem_ca
>  		new.inuse = page->objects;
>  		new.frozen = object != NULL;
> 
> -	} while (!cmpxchg_double_slab(s, page,
> +	} while (!__cmpxchg_double_slab(s, page,
>  			object, counters,
>  			NULL, new.counters,
>  			"__slab_alloc"));

