linux-kernel.vger.kernel.org archive mirror
* [patch] Cache align futex hash buckets
@ 2006-02-20 23:32 Ravikiran G Thirumalai
  2006-02-20 23:33 ` Andrew Morton
  2006-02-21  3:30 ` Nick Piggin
  0 siblings, 2 replies; 22+ messages in thread
From: Ravikiran G Thirumalai @ 2006-02-20 23:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, Shai Fultheim (Shai@scalex86.org)

The following change places each element of the futex_queues hashtable on a
different cacheline.  Otherwise, the spinlocks of adjacent hash buckets lie on
the same cacheline.

Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Shai Fultheim <shai@scalex86.org>

Index: linux-2.6.16-rc2/kernel/futex.c
===================================================================
--- linux-2.6.16-rc2.orig/kernel/futex.c	2006-02-07 23:14:04.000000000 -0800
+++ linux-2.6.16-rc2/kernel/futex.c	2006-02-09 14:04:22.000000000 -0800
@@ -100,9 +100,10 @@ struct futex_q {
 struct futex_hash_bucket {
        spinlock_t              lock;
        struct list_head       chain;
-};
+} ____cacheline_internodealigned_in_smp;
 
-static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
+static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS] 
+				__cacheline_aligned_in_smp;
 
 /* Futex-fs vfsmount entry: */
 static struct vfsmount *futex_mnt;
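
For context, ____cacheline_internodealigned_in_smp pads the structure to the
"internode" cacheline size rather than the ordinary L1 line.  The definition
below is reproduced approximately from include/linux/cache.h of that era (from
memory, so treat the exact #ifdefs as an assumption):

#ifdef CONFIG_SMP
#define ____cacheline_internodealigned_in_smp \
        __attribute__((__aligned__(1 << (INTERNODE_CACHE_SHIFT))))
#else
#define ____cacheline_internodealigned_in_smp
#endif
/* INTERNODE_CACHE_SHIFT is normally L1_CACHE_SHIFT (6 or 7), but 12,
 * i.e. 4096 bytes, when CONFIG_X86_VSMP is set -- see Andrew's point
 * further down the thread. */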


* Re: [patch] Cache align futex hash buckets
  2006-02-20 23:32 [patch] Cache align futex hash buckets Ravikiran G Thirumalai
@ 2006-02-20 23:33 ` Andrew Morton
  2006-02-20 23:34   ` Andrew Morton
  2006-02-21  3:30 ` Nick Piggin
  1 sibling, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2006-02-20 23:33 UTC (permalink / raw)
  To: Ravikiran G Thirumalai; +Cc: linux-kernel, shai

Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
>
> Following change places each element of the futex_queues hashtable on a 
> different cacheline.  Spinlocks of adjacent hash buckets lie on the same 
> cacheline otherwise.
> 
> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
> Signed-off-by: Shai Fultheim <shai@scalex86.org>
> 
> Index: linux-2.6.16-rc2/kernel/futex.c
> ===================================================================
> --- linux-2.6.16-rc2.orig/kernel/futex.c	2006-02-07 23:14:04.000000000 -0800
> +++ linux-2.6.16-rc2/kernel/futex.c	2006-02-09 14:04:22.000000000 -0800
> @@ -100,9 +100,10 @@ struct futex_q {
>  struct futex_hash_bucket {
>         spinlock_t              lock;
>         struct list_head       chain;
> -};
> +} ____cacheline_internodealigned_in_smp;
>  
> -static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
> +static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS] 
> +				__cacheline_aligned_in_smp;
>  

How much memory does that thing end up consuming?


* Re: [patch] Cache align futex hash buckets
  2006-02-20 23:33 ` Andrew Morton
@ 2006-02-20 23:34   ` Andrew Morton
  2006-02-21  0:09     ` Ravikiran G Thirumalai
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2006-02-20 23:34 UTC (permalink / raw)
  To: kiran, linux-kernel, shai

Andrew Morton <akpm@osdl.org> wrote:
>
> Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
> >
> > Following change places each element of the futex_queues hashtable on a 
> > different cacheline.  Spinlocks of adjacent hash buckets lie on the same 
> > cacheline otherwise.
> > 
> > Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
> > Signed-off-by: Shai Fultheim <shai@scalex86.org>
> > 
> > Index: linux-2.6.16-rc2/kernel/futex.c
> > ===================================================================
> > --- linux-2.6.16-rc2.orig/kernel/futex.c	2006-02-07 23:14:04.000000000 -0800
> > +++ linux-2.6.16-rc2/kernel/futex.c	2006-02-09 14:04:22.000000000 -0800
> > @@ -100,9 +100,10 @@ struct futex_q {
> >  struct futex_hash_bucket {
> >         spinlock_t              lock;
> >         struct list_head       chain;
> > -};
> > +} ____cacheline_internodealigned_in_smp;
> >  
> > -static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
> > +static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS] 
> > +				__cacheline_aligned_in_smp;
> >  
> 
> How much memory does that thing end up consuming?

I think a megabyte?


* Re: [patch] Cache align futex hash buckets
  2006-02-20 23:34   ` Andrew Morton
@ 2006-02-21  0:09     ` Ravikiran G Thirumalai
  2006-02-21  0:23       ` Andrew Morton
  0 siblings, 1 reply; 22+ messages in thread
From: Ravikiran G Thirumalai @ 2006-02-21  0:09 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, shai

On Mon, Feb 20, 2006 at 03:34:19PM -0800, Andrew Morton wrote:
> Andrew Morton <akpm@osdl.org> wrote:
> >
> > > @@ -100,9 +100,10 @@ struct futex_q {
> > >  struct futex_hash_bucket {
> > >         spinlock_t              lock;
> > >         struct list_head       chain;
> > > -};
> > > +} ____cacheline_internodealigned_in_smp;
> > >  
> > > -static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
> > > +static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS] 
> > > +				__cacheline_aligned_in_smp;
> > >  
> > 
> > How much memory does that thing end up consuming?
> 
> I think a megabyte?

On most machines it would be 256 * 128 = 32K, or 16K on arches with 64-byte
cachelines.  This looked like a simpler solution to the problem of spinlocks
falling on the same cacheline.  So is 16/32K unreasonable?

Thanks,
Kiran


* Re: [patch] Cache align futex hash buckets
  2006-02-21  0:09     ` Ravikiran G Thirumalai
@ 2006-02-21  0:23       ` Andrew Morton
  2006-02-21  1:04         ` Ravikiran G Thirumalai
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2006-02-21  0:23 UTC (permalink / raw)
  To: Ravikiran G Thirumalai; +Cc: linux-kernel, shai

Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
>
> On Mon, Feb 20, 2006 at 03:34:19PM -0800, Andrew Morton wrote:
> > Andrew Morton <akpm@osdl.org> wrote:
> > >
> > > > @@ -100,9 +100,10 @@ struct futex_q {
> > > >  struct futex_hash_bucket {
> > > >         spinlock_t              lock;
> > > >         struct list_head       chain;
> > > > -};
> > > > +} ____cacheline_internodealigned_in_smp;
> > > >  
> > > > -static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
> > > > +static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS] 
> > > > +				__cacheline_aligned_in_smp;
> > > >  
> > > 
> > > How much memory does that thing end up consuming?
> > 
> > I think a megabyte?
> 
> On most machines it would be 256 * 128 = 32k. or 16k on arches with 64B 
> cachelines.  This looked like a simpler solution for spinlocks falling on
> the same cacheline.  So is 16/32k unreasonable?
> 

CONFIG_X86_VSMP enables 4096-byte padding for
____cacheline_internodealigned_in_smp, so the table becomes a megabyte.


* Re: [patch] Cache align futex hash buckets
  2006-02-21  0:23       ` Andrew Morton
@ 2006-02-21  1:04         ` Ravikiran G Thirumalai
  2006-02-21  1:09           ` Andrew Morton
  0 siblings, 1 reply; 22+ messages in thread
From: Ravikiran G Thirumalai @ 2006-02-21  1:04 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, shai

On Mon, Feb 20, 2006 at 04:23:17PM -0800, Andrew Morton wrote:
> Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
> >
> > On Mon, Feb 20, 2006 at 03:34:19PM -0800, Andrew Morton wrote:
> > > Andrew Morton <akpm@osdl.org> wrote:
> > > >
> > > > > @@ -100,9 +100,10 @@ struct futex_q {
> > > > >  struct futex_hash_bucket {
> > > > >         spinlock_t              lock;
> > > > >         struct list_head       chain;
> > > > > -};
> > > > > +} ____cacheline_internodealigned_in_smp;
> > > > >  
> > > > > -static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
> > > > > +static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS] 
> > > > > +				__cacheline_aligned_in_smp;
> > > > >  
> > > > 
> > > > How much memory does that thing end up consuming?
> > > 
> > > I think a megabyte?
> > 
> > On most machines it would be 256 * 128 = 32k. or 16k on arches with 64B 
> > cachelines.  This looked like a simpler solution for spinlocks falling on
> > the same cacheline.  So is 16/32k unreasonable?
> > 
> 
> CONFIG_X86_VSMP enables 4096-byte padding for
> ____cacheline_internodealigned_in_smp.    It's a megabyte.

Yes, only on vSMPowered systems.  Well, we have a large internode cacheline,
but these machines have lots of memory too.  I thought a simpler padding
solution might be acceptable, as futex_queues would be large only on our boxes.
Maybe we could dynamically allocate the spinlocks, but it is not yet clear
which node each spinlock should come from.



* Re: [patch] Cache align futex hash buckets
  2006-02-21  1:04         ` Ravikiran G Thirumalai
@ 2006-02-21  1:09           ` Andrew Morton
  2006-02-21  1:39             ` Ravikiran G Thirumalai
  2006-02-21 14:44             ` Andi Kleen
  0 siblings, 2 replies; 22+ messages in thread
From: Andrew Morton @ 2006-02-21  1:09 UTC (permalink / raw)
  To: Ravikiran G Thirumalai; +Cc: linux-kernel, shai

Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
>
> On Mon, Feb 20, 2006 at 04:23:17PM -0800, Andrew Morton wrote:
> > Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
> > >
> > > On Mon, Feb 20, 2006 at 03:34:19PM -0800, Andrew Morton wrote:
> > > > Andrew Morton <akpm@osdl.org> wrote:
> > > > >
> > > > > > @@ -100,9 +100,10 @@ struct futex_q {
> > > > > >  struct futex_hash_bucket {
> > > > > >         spinlock_t              lock;
> > > > > >         struct list_head       chain;
> > > > > > -};
> > > > > > +} ____cacheline_internodealigned_in_smp;
> > > > > >  
> > > > > > -static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
> > > > > > +static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS] 
> > > > > > +				__cacheline_aligned_in_smp;
> > > > > >  
> > > > > 
> > > > > How much memory does that thing end up consuming?
> > > > 
> > > > I think a megabyte?
> > > 
> > > On most machines it would be 256 * 128 = 32k. or 16k on arches with 64B 
> > > cachelines.  This looked like a simpler solution for spinlocks falling on
> > > the same cacheline.  So is 16/32k unreasonable?
> > > 
> > 
> > CONFIG_X86_VSMP enables 4096-byte padding for
> > ____cacheline_internodealigned_in_smp.    It's a megabyte.
> 
> Yes, only on vSMPowered systems.  Well, we have a large 
> internode cacheline, but these machines have lots of memory too.  I 
> thought a  simpler padding solution might be acceptable as futex_queues 
> would be large only on our boxes.

Well it's your architecture...  As long as you're finding this to be a
sufficiently large problem in testing to justify consuming a meg of memory
then fine, let's do it.

But your initial changelog was rather benchmark-free?  It's always nice to
see numbers accompanying a purported optimisation patch.



* Re: [patch] Cache align futex hash buckets
  2006-02-21  1:09           ` Andrew Morton
@ 2006-02-21  1:39             ` Ravikiran G Thirumalai
  2006-02-21 14:44             ` Andi Kleen
  1 sibling, 0 replies; 22+ messages in thread
From: Ravikiran G Thirumalai @ 2006-02-21  1:39 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, shai

On Mon, Feb 20, 2006 at 05:09:40PM -0800, Andrew Morton wrote:
> Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
> >
> > 
> > Yes, only on vSMPowered systems.  Well, we have a large 
> > internode cacheline, but these machines have lots of memory too.  I 
> > thought a  simpler padding solution might be acceptable as futex_queues 
> > would be large only on our boxes.
> 
> Well it's your architecture...  As long as you're finding this to be a
> sufficiently large problem in testing to justify consuming a meg of memory
> then fine, let's do it.
> 
> But your initial changelog was rather benchmark-free?  It's always nice to
> see numbers accompanying a purported optimisation patch.

With this patch we saw 30% better elapsed time on a threaded benchmark on our
systems.  That said, we would like to avoid this bloat on our systems too, and
some work needs to be done to improve futex hashing on NUMA.  But for now,
this patch should be good enough.

Thanks,
Kiran


* Re: [patch] Cache align futex hash buckets
  2006-02-20 23:32 [patch] Cache align futex hash buckets Ravikiran G Thirumalai
  2006-02-20 23:33 ` Andrew Morton
@ 2006-02-21  3:30 ` Nick Piggin
  2006-02-21 18:33   ` Christoph Lameter
  2006-02-21 20:20   ` Ravikiran G Thirumalai
  1 sibling, 2 replies; 22+ messages in thread
From: Nick Piggin @ 2006-02-21  3:30 UTC (permalink / raw)
  To: Ravikiran G Thirumalai
  Cc: Andrew Morton, linux-kernel, Shai Fultheim (Shai@scalex86.org)

Ravikiran G Thirumalai wrote:
> Following change places each element of the futex_queues hashtable on a 
> different cacheline.  Spinlocks of adjacent hash buckets lie on the same 
> cacheline otherwise.
> 

It does not make sense to add swaths of unused memory into a hashtable for
this purpose, does it?

For a minimal, naive solution you just increase the size of the hash table.
This will (given a decent hash function) provide the same reduction in
cacheline contention, while also reducing collisions.

A smarter solution might have a single lock per cacheline, and multiple hash
slots. This could make the indexing code a bit more awkward though.
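
A rough sketch of that second idea, purely for illustration --
futex_hash_cluster, SLOTS_PER_LOCK and futex_cluster() are invented names, not
existing kernel code:

#define SLOTS_PER_LOCK  4       /* assumption: chains sharing one lock/cacheline */

struct futex_hash_cluster {
        spinlock_t              lock;   /* guards all chains below */
        struct list_head        chain[SLOTS_PER_LOCK];
} ____cacheline_aligned_in_smp;

static struct futex_hash_cluster
        futex_clusters[(1 << FUTEX_HASHBITS) / SLOTS_PER_LOCK];

/* Two-level indexing: high bits pick the cluster (and hence the lock),
 * low bits pick the chain within it. */
static inline struct futex_hash_cluster *futex_cluster(u32 hash)
{
        return &futex_clusters[(hash / SLOTS_PER_LOCK) %
                               ((1 << FUTEX_HASHBITS) / SLOTS_PER_LOCK)];
}

A waiter would take futex_cluster(hash)->lock and walk
chain[hash % SLOTS_PER_LOCK], so buckets sharing a cacheline also share one
lock instead of bouncing several.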

> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
> Signed-off-by: Shai Fultheim <shai@scalex86.org>
> 
> Index: linux-2.6.16-rc2/kernel/futex.c
> ===================================================================
> --- linux-2.6.16-rc2.orig/kernel/futex.c	2006-02-07 23:14:04.000000000 -0800
> +++ linux-2.6.16-rc2/kernel/futex.c	2006-02-09 14:04:22.000000000 -0800
> @@ -100,9 +100,10 @@ struct futex_q {
>  struct futex_hash_bucket {
>         spinlock_t              lock;
>         struct list_head       chain;
> -};
> +} ____cacheline_internodealigned_in_smp;
>  
> -static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
> +static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS] 
> +				__cacheline_aligned_in_smp;
>  
>  /* Futex-fs vfsmount entry: */
>  static struct vfsmount *futex_mnt;


-- 
SUSE Labs, Novell Inc.


* Re: [patch] Cache align futex hash buckets
  2006-02-21  1:09           ` Andrew Morton
  2006-02-21  1:39             ` Ravikiran G Thirumalai
@ 2006-02-21 14:44             ` Andi Kleen
  1 sibling, 0 replies; 22+ messages in thread
From: Andi Kleen @ 2006-02-21 14:44 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel, shai, kiran

Andrew Morton <akpm@osdl.org> writes:
> 
> Well it's your architecture...  As long as you're finding this to be a
> sufficiently large problem in testing to justify consuming a meg of memory
> then fine, let's do it.

I think it's a waste of memory even with 128-byte cache lines.

If it's really a big problem, some more clever solution would be better,
e.g. allocate a hash table per node and use a hierarchical hash, trying to
put stuff on the local node first.

-Andi
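
A very rough sketch of the per-node direction, for illustration only --
futex_queues_node and hash_futex_local() are made-up names, allocation of the
per-node tables is omitted, and the hash shown mirrors (from memory) the
existing jhash2()-based hash_futex().  Crucially, a real design still needs a
policy for futexes shared between nodes, which is the hard part Kiran
mentions:

static struct futex_hash_bucket *futex_queues_node[MAX_NUMNODES];

static struct futex_hash_bucket *hash_futex_local(union futex_key *key)
{
        int node = numa_node_id();              /* try the local node first */
        u32 hash = jhash2((u32 *)&key->both.word,
                          (sizeof(key->both.word) + sizeof(key->both.ptr)) / 4,
                          key->both.offset);

        /* A waiter and its waker on different nodes must still find the
         * same bucket, so this cannot be the whole story. */
        return &futex_queues_node[node][hash & ((1 << FUTEX_HASHBITS) - 1)];
}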


* Re: [patch] Cache align futex hash buckets
  2006-02-21  3:30 ` Nick Piggin
@ 2006-02-21 18:33   ` Christoph Lameter
  2006-02-21 23:14     ` Christoph Lameter
  2006-02-22  0:40     ` Nick Piggin
  2006-02-21 20:20   ` Ravikiran G Thirumalai
  1 sibling, 2 replies; 22+ messages in thread
From: Christoph Lameter @ 2006-02-21 18:33 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ravikiran G Thirumalai, Andrew Morton, linux-kernel,
	Shai Fultheim (Shai@scalex86.org)

On Tue, 21 Feb 2006, Nick Piggin wrote:

> Ravikiran G Thirumalai wrote:
> > Following change places each element of the futex_queues hashtable on a
> > different cacheline.  Spinlocks of adjacent hash buckets lie on the same
> > cacheline otherwise.
> > 
> 
> It does not make sense to add swaths of unused memory into a hashtable for
> this purpose, does it?

It does if you essentially have a 4k cacheline (because you are doing NUMA 
in software with multiple PCs....) and transferring control of that 
cacheline is comparatively expensive.


* Re: [patch] Cache align futex hash buckets
  2006-02-21  3:30 ` Nick Piggin
  2006-02-21 18:33   ` Christoph Lameter
@ 2006-02-21 20:20   ` Ravikiran G Thirumalai
  2006-02-22  0:45     ` Nick Piggin
  1 sibling, 1 reply; 22+ messages in thread
From: Ravikiran G Thirumalai @ 2006-02-21 20:20 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Andrew Morton, linux-kernel, Shai Fultheim (Shai@scalex86.org)

On Tue, Feb 21, 2006 at 02:30:00PM +1100, Nick Piggin wrote:
> Ravikiran G Thirumalai wrote:
> >Following change places each element of the futex_queues hashtable on a 
> >different cacheline.  Spinlocks of adjacent hash buckets lie on the same 
> >cacheline otherwise.
> >
> 
> It does not make sense to add swaths of unused memory into a hashtable for
> this purpose, does it?

I don't know if having two (or more) spinlocks on the same cacheline is a good
idea.  Right now, on a 128-byte cacheline we have 10 spinlocks on the same
cacheline here!!  Things get worse if two futexes from different nodes hash
onto adjacent, or even nearly adjacent, hash buckets.
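
For reference, the arithmetic behind that figure -- the sizes below are
assumptions for a 32-bit build without spinlock debugging, not measurements
from the thread:

struct futex_hash_bucket {
        spinlock_t              lock;   /* ~4 bytes without CONFIG_DEBUG_SPINLOCK */
        struct list_head        chain;  /* two pointers: 8 bytes on 32-bit */
};
/* ~12 bytes per bucket, so a 128-byte line holds roughly 10 of them
 * (and roughly 5 on 64-bit, where list_head alone is 16 bytes). */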

> 
> For a minimal, naive solution you just increase the size of the hash table.
> This will (given a decent hash function) provide the same reduction in
> cacheline contention, while also reducing collisions.

Given a decent hash function.  I am not sure the hashing function is smart
enough as of now.  Hashing is not a function of the node id, and we have some
instrumentation results which show that hashing on NUMA is not good yet: there
are collisions from other nodes onto the same hash bucket, and nearby buckets
have high hit rates too.

I think some sort of NUMA-friendly hashing, where futexes from the same node
hash onto a node-local hash table, would be a decent solution here.
As I mentioned earlier, we are working on that, and we can probably allocate
the spinlocks from node-local memory then and avoid this bloat.
We are hoping to have this as a stop-gap fix until we get there.

Thanks,
Kiran



* Re: [patch] Cache align futex hash buckets
  2006-02-21 18:33   ` Christoph Lameter
@ 2006-02-21 23:14     ` Christoph Lameter
  2006-02-22  0:40     ` Nick Piggin
  1 sibling, 0 replies; 22+ messages in thread
From: Christoph Lameter @ 2006-02-21 23:14 UTC (permalink / raw)
  To: Nick Piggin
  Cc: Ravikiran G Thirumalai, Andrew Morton, linux-kernel,
	Shai Fultheim (Shai@scalex86.org)

On Tue, 21 Feb 2006, Christoph Lameter wrote:

> It does if you essentially have a 4k cacheline (because you are doing NUMA 
> in software with multiple PCs....) and transferring control of that 
> cacheline is comparatively expensive.

Of course, the above statement is a rather strong simplification of what is
actually happening... but I hope it helps.



* Re: [patch] Cache align futex hash buckets
  2006-02-21 18:33   ` Christoph Lameter
  2006-02-21 23:14     ` Christoph Lameter
@ 2006-02-22  0:40     ` Nick Piggin
  2006-02-22  2:08       ` Andrew Morton
  1 sibling, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2006-02-22  0:40 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Ravikiran G Thirumalai, Andrew Morton, linux-kernel,
	Shai Fultheim (Shai@scalex86.org)

Christoph Lameter wrote:
> On Tue, 21 Feb 2006, Nick Piggin wrote:
> 
> 
>>Ravikiran G Thirumalai wrote:
>>
>>>Following change places each element of the futex_queues hashtable on a
>>>different cacheline.  Spinlocks of adjacent hash buckets lie on the same
>>>cacheline otherwise.
>>>
>>
>>It does not make sense to add swaths of unused memory into a hashtable for
>>this purpose, does it?
> 
> 
> It does if you essentially have a 4k cacheline (because you are doing NUMA 
> in software with multiple PCs....) and transferring control of that 
> cacheline is comparatively expensive.
> 

Instead of 1MB hash with 256 entries in it covering 256 cachelines, you
have a 1MB hash with 65536(ish) entries covering 256 cachelines.

-
SUSE Labs, Novell Inc.


* Re: [patch] Cache align futex hash buckets
  2006-02-21 20:20   ` Ravikiran G Thirumalai
@ 2006-02-22  0:45     ` Nick Piggin
  2006-02-22  2:09       ` Andrew Morton
  0 siblings, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2006-02-22  0:45 UTC (permalink / raw)
  To: Ravikiran G Thirumalai
  Cc: Andrew Morton, linux-kernel, Shai Fultheim (Shai@scalex86.org)

Ravikiran G Thirumalai wrote:
> On Tue, Feb 21, 2006 at 02:30:00PM +1100, Nick Piggin wrote:
> 
>>Ravikiran G Thirumalai wrote:
>>
>>>Following change places each element of the futex_queues hashtable on a 
>>>different cacheline.  Spinlocks of adjacent hash buckets lie on the same 
>>>cacheline otherwise.
>>>
>>
>>It does not make sense to add swaths of unused memory into a hashtable for
>>this purpose, does it?
> 
> 
> I don't know if having two (or more) spinlocks on the same cacheline is a good 
> idea.  Right now, on a 128 B cacheline we have 10 spinlocks on the
> same cacheline here!! Things get worse if two futexes from different nodes 
> hash on to adjacent, or even nearly adjacent hash buckets.
> 

It is no worse than having a hash collision and having to take the same lock.

(as I said, a really good solution might just use a single lock per cacheline,
but that would make hash indexing a bit more difficult.)

> 
>>For a minimal, naive solution you just increase the size of the hash table.
>>This will (given a decent hash function) provide the same reduction in
>>cacheline contention, while also reducing collisions.
> 
> 
> Given a decent hash function.  I am not sure the hashing function is smart 
> enough as of now.  Hashing is not a function of nodeid, and we have some 
> instrumentation results which show hashing on NUMA is not good as yet, and
> there are collisions from other nodes onto the same hashbucket; Nearby 
> buckets have high hit rates too.
> 

But it is decent in that it is random. The probability of an address hashing
to the same cacheline as another should be more or less the same as with your
padded scheme.

> I think some sort of NUMA friendly hashing, where futexes from same nodes
> hash onto a node local hash table, would be a decent solution here.
> As I mentioned earlier, we are working on that, and we can probably allocate
> the spinlock from nodelocal memory then and avoid this bloat.
> We are hoping to have this as a stop gap fix until we get there.
> 

But my point still stands that it never makes sense to add empty space into
a hash table. Add hash slots instead and you (for free) get shorter hash
chains.

-- 
SUSE Labs, Novell Inc.


* Re: [patch] Cache align futex hash buckets
  2006-02-22  0:40     ` Nick Piggin
@ 2006-02-22  2:08       ` Andrew Morton
  2006-02-22  2:35         ` Ravikiran G Thirumalai
  2006-02-22  2:37         ` Nick Piggin
  0 siblings, 2 replies; 22+ messages in thread
From: Andrew Morton @ 2006-02-22  2:08 UTC (permalink / raw)
  To: Nick Piggin; +Cc: clameter, kiran, linux-kernel, shai

Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> Christoph Lameter wrote:
> > On Tue, 21 Feb 2006, Nick Piggin wrote:
> > 
> > 
> >>Ravikiran G Thirumalai wrote:
> >>
> >>>Following change places each element of the futex_queues hashtable on a
> >>>different cacheline.  Spinlocks of adjacent hash buckets lie on the same
> >>>cacheline otherwise.
> >>>
> >>
> >>It does not make sense to add swaths of unused memory into a hashtable for
> >>this purpose, does it?
> > 
> > 
> > It does if you essentially have a 4k cacheline (because you are doing NUMA 
> > in software with multiple PCs....) and transferring control of that 
> > cacheline is comparatively expensive.
> > 
> 
> Instead of 1MB hash with 256 entries in it covering 256 cachelines, you
> have a 1MB hash with 65536(ish) entries covering 256 cachelines.
> 

Good (if accidental) point.  Kiran, if you're going to gobble a megabyte, you
might as well use all of it and make the hashtable larger, rather than just
leaving 99% of that memory unused...


* Re: [patch] Cache align futex hash buckets
  2006-02-22  0:45     ` Nick Piggin
@ 2006-02-22  2:09       ` Andrew Morton
  0 siblings, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2006-02-22  2:09 UTC (permalink / raw)
  To: Nick Piggin; +Cc: kiran, linux-kernel, shai

Nick Piggin <nickpiggin@yahoo.com.au> wrote:
>
> But my point still stands that it never makes sense to add empty space into
>  a hash table. Add hash slots instead and you (for free) get shorter hash
>  chains.

OK, not an accident ;)


* Re: [patch] Cache align futex hash buckets
  2006-02-22  2:08       ` Andrew Morton
@ 2006-02-22  2:35         ` Ravikiran G Thirumalai
  2006-02-22  2:37         ` Nick Piggin
  1 sibling, 0 replies; 22+ messages in thread
From: Ravikiran G Thirumalai @ 2006-02-22  2:35 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nick Piggin, clameter, linux-kernel, shai

On Tue, Feb 21, 2006 at 06:08:45PM -0800, Andrew Morton wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> >
> > Christoph Lameter wrote:
> > > On Tue, 21 Feb 2006, Nick Piggin wrote:
> > > 
> > > 
> > >>Ravikiran G Thirumalai wrote:
> > >>
> > >>>Following change places each element of the futex_queues hashtable on a
> > >>>different cacheline.  Spinlocks of adjacent hash buckets lie on the same
> > >>>cacheline otherwise.
> > >>>
> > >>
> > >>It does not make sense to add swaths of unused memory into a hashtable for
> > >>this purpose, does it?
> > > 
> > > 
> > > It does if you essentially have a 4k cacheline (because you are doing NUMA 
> > > in software with multiple PCs....) and transferring control of that 
> > > cacheline is comparatively expensive.
> > > 
> > 
> > Instead of 1MB hash with 256 entries in it covering 256 cachelines, you
> > have a 1MB hash with 65536(ish) entries covering 256 cachelines.
> > 
> 
> Good (if accidental point).  Kiran, if you're going to gobble a megabyte,
> you might as well use all of it and make the hashtable larger, rather than
> just leaving 99% of that memory unused...

Yes, good (intentional :)) point.  I am rerunning my tests with a larger hash
table (with as many slots as the padding takes away).  If we get the same or
better results, we can just increase the number of hash slots.

Thanks,
Kiran


* Re: [patch] Cache align futex hash buckets
  2006-02-22  2:08       ` Andrew Morton
  2006-02-22  2:35         ` Ravikiran G Thirumalai
@ 2006-02-22  2:37         ` Nick Piggin
  2006-02-22 20:17           ` Ravikiran G Thirumalai
  1 sibling, 1 reply; 22+ messages in thread
From: Nick Piggin @ 2006-02-22  2:37 UTC (permalink / raw)
  To: Andrew Morton; +Cc: clameter, kiran, linux-kernel, shai

Andrew Morton wrote:
> Nick Piggin <nickpiggin@yahoo.com.au> wrote:

>>Instead of 1MB hash with 256 entries in it covering 256 cachelines, you
>>have a 1MB hash with 65536(ish) entries covering 256 cachelines.
>>
> 
> 
> Good (if accidental point).  Kiran, if you're going to gobble a megabyte,
> you might as well use all of it and make the hashtable larger, rather than
> just leaving 99% of that memory unused...
> 

We could probably also convert the list_head over to an hlist_head, for a
modest saving in size (although that's more important for cache footprint
than for cacheline bouncing).
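
A sketch of what that conversion might look like (illustrative only, not a
tested patch; note that hlist has no cheap tail insertion, so the FIFO
queueing of waiters currently done with list_add_tail() would need
rethinking):

struct futex_hash_bucket {
        spinlock_t              lock;
        struct hlist_head       chain;          /* one pointer instead of two */
} ____cacheline_internodealigned_in_smp;

struct futex_q {
        struct hlist_node       list;           /* was: struct list_head list */
        /* ... remaining fields unchanged ... */
};

/* Queueing would then use hlist_add_head(&q->list, &hb->chain) and
 * hlist_del(&q->list) in place of the list_add_tail()/list_del() pair. */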

Although speaking of cacheline footprint: making the hash table so
large will increase the "real" CPU cacheline footprint on your VSMP
systems, so perhaps it is not always such an easy decision.

Definitely for "normal" systems, we do not want to pad out to a
single entry per cacheline, so the current patch can not go upstream
as is.

-- 
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com 


* Re: [patch] Cache align futex hash buckets
  2006-02-22  2:37         ` Nick Piggin
@ 2006-02-22 20:17           ` Ravikiran G Thirumalai
  2006-02-22 20:50             ` Andrew Morton
  0 siblings, 1 reply; 22+ messages in thread
From: Ravikiran G Thirumalai @ 2006-02-22 20:17 UTC (permalink / raw)
  To: Nick Piggin; +Cc: Andrew Morton, clameter, linux-kernel, shai

On Wed, Feb 22, 2006 at 01:37:10PM +1100, Nick Piggin wrote:
> Andrew Morton wrote:
> >Nick Piggin <nickpiggin@yahoo.com.au> wrote:
> 
> >>Instead of 1MB hash with 256 entries in it covering 256 cachelines, you
> >>have a 1MB hash with 65536(ish) entries covering 256 cachelines.
> >>
> >
> >
> >Good (if accidental point).  Kiran, if you're going to gobble a megabyte,
> >you might as well use all of it and make the hashtable larger, rather than
> >just leaving 99% of that memory unused...
> >
> 
> We chould probably also convert the list_head over to an hlist_head,
> for a modest saving in size (although that's more important from a
> cache footprint POV rather than improving cacheline bouncing).
> 
> Although speaking of cacheline footprint: making the hash table so
> large will increase the "real" CPU cacheline footprint on your VSMP
> systems, so perhaps it is not always such an easy decision.

We re-ran the threaded benchmark with FUTEX_HASHBITS set to 15 (32K hash
slots) and got the same performance as with the padding approach, so
increasing the hash table size helps.  We also tried 256, 512, and 1024 hash
slots, but we did not get good results with those.
We also collected hash collision statistics for 1024 slots and found that 50%
of the slots did not take any hit!!  So maybe we should revisit the hashing
function before settling on the optimal number of hash slots.

Thanks,
Kiran


* Re: [patch] Cache align futex hash buckets
  2006-02-22 20:17           ` Ravikiran G Thirumalai
@ 2006-02-22 20:50             ` Andrew Morton
       [not found]               ` <20060223015144.GC3663@localhost.localdomain>
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2006-02-22 20:50 UTC (permalink / raw)
  To: Ravikiran G Thirumalai; +Cc: nickpiggin, clameter, linux-kernel, shai

Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
>
> We also collected hash collision statistics for 1024 slots.  We found that
>  50% of the slots did not take any hit!!  So maybe we should revisit the
>  hashing function before settling on the optimal number of hash slots. 

You could try switching from jhash2() to hash_long().
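
If it helps to picture the change: the current hash (reproduced here from
memory, so only approximately) keys jhash2() on the futex key, while a
hash_long() variant would first have to fold the key into a single word -- how
to mix the fields is an open question, and the XOR below is just one
assumption:

#include <linux/hash.h>         /* for hash_long() */

/* Roughly the existing scheme in kernel/futex.c: */
static struct futex_hash_bucket *hash_futex(union futex_key *key)
{
        u32 hash = jhash2((u32 *)&key->both.word,
                          (sizeof(key->both.word) + sizeof(key->both.ptr)) / 4,
                          key->both.offset);

        return &futex_queues[hash & ((1 << FUTEX_HASHBITS) - 1)];
}

/* A possible hash_long() variant: */
static struct futex_hash_bucket *hash_futex_alt(union futex_key *key)
{
        unsigned long mix = key->both.word ^
                            (unsigned long)key->both.ptr ^
                            key->both.offset;

        return &futex_queues[hash_long(mix, FUTEX_HASHBITS)];
}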

Was there any particular pattern to the unused slots?  Not something silly
like every second one?


* Re: [patch] Cache align futex hash buckets
       [not found]               ` <20060223015144.GC3663@localhost.localdomain>
@ 2006-02-23  2:08                 ` Ravikiran G Thirumalai
  0 siblings, 0 replies; 22+ messages in thread
From: Ravikiran G Thirumalai @ 2006-02-23  2:08 UTC (permalink / raw)
  To: Andrew Morton; +Cc: nickpiggin, clameter, linux-kernel, shai, Nippun Goel

On Wed, Feb 22, 2006 at 05:51:44PM -0800, Ravikiran G Thirumalai wrote:
> On Wed, Feb 22, 2006 at 12:50:49PM -0800, Andrew Morton wrote:
> > Ravikiran G Thirumalai <kiran@scalex86.org> wrote:
> > >
> > > We also collected hash collision statistics for 1024 slots.  We found that
> > >  50% of the slots did not take any hit!!  So maybe we should revisit the
> > >  hashing function before settling on the optimal number of hash slots. 
> > 
> > You could try switching from jhash2() to hash_long().
> 
> OK, I will try that.
> 
> > 
> > Was there any particular pattern to the unused slots?  Not something silly
> > like every second one?
> 
> The distribution seems OK with 256 buckets, but with 1024 buckets, it goes
> bad.  We see high hits every 4-6 buckets.  Attaching the distribution

This pattern might not be bad, as it avoids bouncing of spinlocks on adjacent
hash buckets... but then there are high hits on adjacent buckets too.

(instrumentation done by Nippun)

Thanks,
Kiran


Thread overview: 22 messages
2006-02-20 23:32 [patch] Cache align futex hash buckets Ravikiran G Thirumalai
2006-02-20 23:33 ` Andrew Morton
2006-02-20 23:34   ` Andrew Morton
2006-02-21  0:09     ` Ravikiran G Thirumalai
2006-02-21  0:23       ` Andrew Morton
2006-02-21  1:04         ` Ravikiran G Thirumalai
2006-02-21  1:09           ` Andrew Morton
2006-02-21  1:39             ` Ravikiran G Thirumalai
2006-02-21 14:44             ` Andi Kleen
2006-02-21  3:30 ` Nick Piggin
2006-02-21 18:33   ` Christoph Lameter
2006-02-21 23:14     ` Christoph Lameter
2006-02-22  0:40     ` Nick Piggin
2006-02-22  2:08       ` Andrew Morton
2006-02-22  2:35         ` Ravikiran G Thirumalai
2006-02-22  2:37         ` Nick Piggin
2006-02-22 20:17           ` Ravikiran G Thirumalai
2006-02-22 20:50             ` Andrew Morton
     [not found]               ` <20060223015144.GC3663@localhost.localdomain>
2006-02-23  2:08                 ` Ravikiran G Thirumalai
2006-02-21 20:20   ` Ravikiran G Thirumalai
2006-02-22  0:45     ` Nick Piggin
2006-02-22  2:09       ` Andrew Morton
