linux-kernel.vger.kernel.org archive mirror
From: Andrew Morton <akpm@linux-foundation.org>
To: Ravikiran G Thirumalai <kiran@scalex86.org>
Cc: linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
	shai@scalex86.org
Subject: Re: [rfc] [patch 1/2 ] Process private hash tables for private futexes
Date: Sat, 21 Mar 2009 04:35:14 -0700
Message-ID: <20090321043514.69f8243d.akpm@linux-foundation.org>
In-Reply-To: <20090321044637.GA7278@localdomain>

On Fri, 20 Mar 2009 21:46:37 -0700 Ravikiran G Thirumalai <kiran@scalex86.org> wrote:

> Patch to have a process private hash table for 'PRIVATE' futexes.
> 
> On systems with large core counts, running multiple threaded processes
> causes false sharing on the global futex hash table.  The global futex hash
> table is an array of struct futex_hash_bucket, which is defined as:
> 
> struct futex_hash_bucket {
>         spinlock_t lock;
>         struct plist_head chain;
> };
> 
> static struct futex_hash_bucket futex_queues[1<<FUTEX_HASHBITS];
> 
> Needless to say, this causes multiple spinlocks to reside on the
> same cacheline, which is very bad when multiple unrelated processes
> hash onto adjacent hash buckets.  The probability of unrelated futexes
> ending up on adjacent hash buckets increases with the number of cores in the
> system (more cores available translates to more processes/more threads
> being run on a system).  The effects of false sharing are tangible on
> machines with more than 32 cores.  We have noticed this with the workload
> of a certain multithreaded FEA (Finite Element Analysis) solver.
> We reported this problem a couple of years ago, which eventually resulted in
> a new API for private futexes to avoid mmap_sem.  The fix for false sharing on
> the global futex hash was put off pending glibc changes to accommodate
> the futex private APIs.  Now that the glibc changes are in, and
> multicore is more prevalent, maybe it is time to fix this problem.
> 
> The root cause of the problem is a global futex hash table even for process
> private futexes.  Process private futexes can be hashed on process private
> hash tables, avoiding the global hash and a longer hash table walk when
> there are a lot more futexes in the workload.  However, this results in an
> addition of one extra pointer to the mm_struct.  Hence, this implementation
> of a process private hash table is based off a config option, which can be
> turned off for smaller core count systems.  Furthermore, a subsequent patch
> will introduce a sysctl to dynamically turn on private futex hash tables.
> 
> We found this patch to improve the runtime of a certain FEA solver by about
> 15% on a 32 core vSMP system.
> 
> Signed-off-by: Ravikiran Thirumalai <kiran@scalex86.org>
> Signed-off-by: Shai Fultheim <shai@scalex86.org>
> 
> Index: linux-2.6.28.6/include/linux/mm_types.h
> ===================================================================
> --- linux-2.6.28.6.orig/include/linux/mm_types.h	2009-03-11 16:52:06.000000000 -0800
> +++ linux-2.6.28.6/include/linux/mm_types.h	2009-03-11 16:52:23.000000000 -0800
> @@ -256,6 +256,10 @@ struct mm_struct {
>  #ifdef CONFIG_MMU_NOTIFIER
>  	struct mmu_notifier_mm *mmu_notifier_mm;
>  #endif
> +#ifdef CONFIG_PROCESS_PRIVATE_FUTEX
> +	/* Process private futex hash table */
> +	struct futex_hash_bucket *htb;
> +#endif
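
The lookup code is not part of the hunk quoted above, so the following is only a rough user-space sketch of the idea as described: a private futex hashes into the per-process table when the mm has one, and everything else falls back to the global table. The names `struct bucket`, `struct mm`, `hash_addr()` and `pick_bucket()`, the table size, and the hash function are all hypothetical stand-ins, not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

#define HASH_BITS 8                       /* assumed table size, not from the patch */
#define HASH_SIZE (1u << HASH_BITS)

/* Stand-in for struct futex_hash_bucket (lock + chain). */
struct bucket {
	int lock;
	void *chain;
};

/* Stand-in for mm_struct; htb is NULL when no private table exists. */
struct mm {
	struct bucket *htb;
};

static struct bucket global_table[HASH_SIZE];

/* Illustrative multiplicative hash; the kernel's own hash is not reproduced. */
static unsigned int hash_addr(uintptr_t uaddr)
{
	uint32_t h = (uint32_t)(uaddr >> 2) * 2654435761u;

	return h >> (32 - HASH_BITS);
}

/* Choose the per-process table for private futexes, else the global one. */
static struct bucket *pick_bucket(struct mm *mm, uintptr_t uaddr, int is_private)
{
	struct bucket *table = global_table;

	if (is_private && mm && mm->htb)
		table = mm->htb;

	return &table[hash_addr(uaddr)];
}
```

With this shape, threads of unrelated processes never touch each other's buckets for private futexes, but all threads within one process still share that process's table.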

So we're effectively improving the hashing operation by splitting the
single hash table into multiple ones.

But was that the best way of speeding up the hashing operation?  I'd have
thought that for some workloads, there will still be tremendous amounts of
contention for the per-mm hashtable?  In which case it is but a partial fix
for certain workloads.

Whereas a more general hashing optimisation (if we can come up with it)
would benefit both types of workload?
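
To make the false-sharing concern concrete, here is a small user-space sketch of the layout issue in the quoted description. It is not something proposed in this thread: it assumes a 64-byte cache line (the real size depends on the CPU), uses plain ints and pointers as stand-ins for `spinlock_t` and `struct plist_head`, and `packed_bucket`, `padded_bucket`, `buckets_per_line()` and `same_line()` are hypothetical names.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed line size; varies by CPU */

/* Small bucket: several of these fit in one cache line. */
struct packed_bucket {
	int lock;
	void *head;
	void *tail;
};

/* Same fields, but each bucket is aligned to its own cache line. */
struct padded_bucket {
	alignas(CACHE_LINE) int lock;
	void *head;
	void *tail;
};

/* How many buckets of a given size can share one cache line. */
static size_t buckets_per_line(size_t bucket_size)
{
	return bucket_size >= CACHE_LINE ? 1 : CACHE_LINE / bucket_size;
}

/* True if two addresses fall in the same cache line. */
static int same_line(const void *a, const void *b)
{
	return ((uintptr_t)a / CACHE_LINE) == ((uintptr_t)b / CACHE_LINE);
}
```

Padding trades memory for isolation: locks in adjacent buckets no longer bounce the same line between CPUs, but it does nothing for contention on a single hot bucket.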


