linux-kernel.vger.kernel.org archive mirror
* [PATCH] 6/6 cacheline align files_lock
@ 2003-03-07 23:36 Martin J. Bligh
       [not found] ` <20030308073535.B24272@infradead.org>
  0 siblings, 1 reply; 8+ messages in thread
From: Martin J. Bligh @ 2003-03-07 23:36 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

I'm getting a lot of cacheline bounce from .text.lock.file_table due to
false sharing of the cacheline. The following patch just aligns the lock
into its own cacheline.

The only changes in the profile over 50 ticks are:

 -4832   -22.2% .text.lock.file_table
 -6357   -12.8% default_idle
-10374    -6.2% total

Difference in results below (note system times as well as elapsed).

Kernbench: (make -j N vmlinux, where N = 2 x num_cpus)
                              Elapsed      System        User         CPU
                 no-align       44.09       94.38      557.26     1477.00
                    align       44.38       94.18      558.00     1468.25

Kernbench: (make -j N vmlinux, where N = 16 x num_cpus)
                              Elapsed      System        User         CPU
                 no-align       45.53      118.06      560.48     1489.50
                    align       44.84      111.77      560.63     1502.50

Kernbench: (make -j vmlinux, maximal tasks)
                              Elapsed      System        User         CPU
                 no-align       45.17      117.80      560.62     1500.50
                    align       44.94      113.36      560.59     1500.00

diff -urpN -X /home/fletch/.diff.exclude 020-prof_docs/fs/file_table.c 030-align_files_lock/fs/file_table.c
--- 020-prof_docs/fs/file_table.c	Tue Feb 25 23:03:49 2003
+++ 030-align_files_lock/fs/file_table.c	Wed Mar  5 07:49:20 2003
@@ -27,7 +27,7 @@ static LIST_HEAD(anon_list);
 /* And here the free ones sit */
 static LIST_HEAD(free_list);
 /* public *and* exported. Not pretty! */
-spinlock_t files_lock = SPIN_LOCK_UNLOCKED;
+spinlock_t files_lock __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 /* Find an unused file structure and return a pointer to it.
  * Returns NULL, if there are no more free file structures or


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] 6/6 cacheline align files_lock
       [not found] ` <20030308073535.B24272@infradead.org>
@ 2003-03-08 15:41   ` Martin J. Bligh
  2003-03-08 15:54     ` Christoph Hellwig
  2003-03-08 16:38     ` William Lee Irwin III
  0 siblings, 2 replies; 8+ messages in thread
From: Martin J. Bligh @ 2003-03-08 15:41 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Linus Torvalds, linux-kernel, Andrew Morton

>> I'm getting a lot of cacheline bounce from .text.lock.file_table due to 
>> false sharing of the cacheline. The following patch just aligns the lock
>> into its own cacheline.
> 
> What about fixing its use instead?

Seriously, I'd love to do that ... files_lock is the highest item in
the whole kernel compile profile overall, and accounts for about 10%
of all system time.

By inlining the spinlocks, I can see the heavy users seem to be:

      6278   166.0% get_empty_filp
      5482   239.3% __fput
      4770   254.7% file_move
    -14464  -100.0% .text.lock.file_table

It seems to lock several different things though - free lists, tty lists,
etc, etc. I could break it up to one per list, but that means taking two
locks to shift between lists, which I'd prefer not to do if possible.

Andrew was talking about something more dramatic for it, but I'm not
sure exactly what ... any suggestions on how to drive a stake through
its heart gladly accepted.

Ironically, the latest scheduler changes *seem* to get rid of about
57% of the cost of it .... I have *no* idea why. Diffprofile between
64 and 64-bk is just weird:

      7222    45.1% page_remove_rmap
       521    18.4% __copy_from_user_ll
       422     8.9% __copy_to_user_ll
       334   105.4% __wake_up
       248    48.5% pte_alloc_one
       243   205.9% pgd_ctor
       211     4.6% vm_enough_memory
       198     6.3% zap_pte_range
       179    29.3% kmem_cache_free
       118    12.2% clear_page_tables
        99    13.2% __pte_chain_free
        84     6.1% do_no_page
        76     6.6% release_pages
        74     9.0% strnlen_user
        67     5.6% free_hot_cold_page
        65    14.4% copy_page_range
...
       -54   -26.1% generic_file_open
       -54    -7.8% dentry_open
       -55   -16.3% do_lookup
       -61    -1.7% find_get_page
       -74   -58.7% __wake_up_sync
       -74    -8.2% current_kernel_time
       -82   -16.1% file_ra_state_init
      -127    -8.0% schedule
      -132   -19.1% fd_install
      -148    -2.2% page_add_rmap
      -246    -3.0% d_lookup
      -263    -9.3% atomic_dec_and_lock
      -407   -25.7% file_move
      -455   -20.5% __fput
      -853   -21.1% get_empty_filp
      -870    -1.9% default_idle
     -1724    -1.0% total
     -8473   -59.8% .text.lock.file_table

page_remove_rmap I'm not too worried about, as I know objrmap will kill
that already (tried to combine them last night, but got an oops, which
I'm just about to try to recreate with just virgin -bk). But *why*
things are shifting about so much from (presumably) just the scheduler
changes is a mystery to me at least. 

Any ideas?

M.


* Re: [PATCH] 6/6 cacheline align files_lock
  2003-03-08 15:41   ` Martin J. Bligh
@ 2003-03-08 15:54     ` Christoph Hellwig
  2003-03-08 16:10       ` Linus Torvalds
  2003-03-08 16:38     ` William Lee Irwin III
  1 sibling, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2003-03-08 15:54 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Linus Torvalds, linux-kernel, Andrew Morton

On Sat, Mar 08, 2003 at 07:41:42AM -0800, Martin J. Bligh wrote:
> It seems to lock several different things though - free lists, tty lists,
> etc, etc. I could break it up to one per list, but that means taking two
> locks to shift between lists, which I'd prefer not to do if possible.

The free list should go away - we have slab for that.  The tty stuff should
get a per-tty lock.



* Re: [PATCH] 6/6 cacheline align files_lock
  2003-03-08 15:54     ` Christoph Hellwig
@ 2003-03-08 16:10       ` Linus Torvalds
  2003-03-08 18:10         ` Christoph Hellwig
  0 siblings, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2003-03-08 16:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Martin J. Bligh, linux-kernel, Andrew Morton


On Sat, 8 Mar 2003, Christoph Hellwig wrote:
> 
> The free list should go away - we have slab for that.  The tty stuff should
> get a per-tty lock.

I doubt either of these would actually fix the lock contention, though.

The tty stuff is not likely to be a real issue for any real load (just how
often do you kill off sessions etc?) And the free list isn't the reason
for the file lock - yes, the file lock protects it, but every time we
touch the free list we touch _real_ lists too (ie either we move a file
from the free list to another list, _or_ we move an unused entry from a
real list to the free list), so we'd need the lock anyway.

So to actually fix files_lock, you need to do something else. I _think_
that "something else" may be to make it a per-super-block lock, since I
think that's the only thing the f_list thing is actually used for. Then
you should probably pass in the superblock pointer to "get_empty_filp()",
and _then_ you can get rid of the free list and the current global lock.

Oh, and you need to make the "tty" stuff be per-superblock too. Of course, 
it might actually be a perfectly fine thing to make that tty stuff use a 
totally separate pointer chain anyway; the current thing makes me worry 
that "umount()" might actually do the wrong thing if the only files open on 
a filesystem are tty files.

		Linus



* Re: [PATCH] 6/6 cacheline align files_lock
  2003-03-08 15:41   ` Martin J. Bligh
  2003-03-08 15:54     ` Christoph Hellwig
@ 2003-03-08 16:38     ` William Lee Irwin III
  1 sibling, 0 replies; 8+ messages in thread
From: William Lee Irwin III @ 2003-03-08 16:38 UTC (permalink / raw)
  To: Martin J. Bligh
  Cc: Christoph Hellwig, Linus Torvalds, linux-kernel, Andrew Morton

On Sat, Mar 08, 2003 at 07:41:42AM -0800, Martin J. Bligh wrote:
> Ironically, the latest scheduler changes *seem* to get rid of about
> 57% of the cost of it .... I have *no* idea why. Diffprofile between
> 64 and 64-bk is just weird:
>       7222    45.1% page_remove_rmap
>        521    18.4% __copy_from_user_ll
>        422     8.9% __copy_to_user_ll
>        334   105.4% __wake_up
>        248    48.5% pte_alloc_one
>        243   205.9% pgd_ctor
>        211     4.6% vm_enough_memory
>        198     6.3% zap_pte_range
>        179    29.3% kmem_cache_free
>        118    12.2% clear_page_tables
>         99    13.2% __pte_chain_free
>         84     6.1% do_no_page
>         76     6.6% release_pages
>         74     9.0% strnlen_user
>         67     5.6% free_hot_cold_page
>         65    14.4% copy_page_range

The high count for pgd_ctor() is bizarre. It fiddles with less data and
should be called less often than pmd_ctor() or pte_alloc_one(), both of
which take the cache hit for constructing full pages, and pgd_ctor()
can't really hope to do better. This otherwise appears to be in order
of cacheline bounciness. DCU_MISS_OUTSTANDING and IFU_MEM_STALL?

page_remove_rmap() as usual takes top position b/c of its long list
walks; I think you've got something brewing for that already.


-- wli


* Re: [PATCH] 6/6 cacheline align files_lock
  2003-03-08 16:10       ` Linus Torvalds
@ 2003-03-08 18:10         ` Christoph Hellwig
  2003-03-08 18:20           ` Linus Torvalds
  0 siblings, 1 reply; 8+ messages in thread
From: Christoph Hellwig @ 2003-03-08 18:10 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin J. Bligh, linux-kernel, Andrew Morton

On Sat, Mar 08, 2003 at 08:10:29AM -0800, Linus Torvalds wrote:
> So to actually fix file_lock, you need to do something else. I _think_
> that "something else" may be to make it be a per-super-block lock, since I
> think that's the only thing the f_list thing is actually used for. Then
> you should probably pass in the superblock pointer to "get_empty_filp()", 
> and _then_ you can get rid of the free list and the current global lock.

Agreed, that's what I actually thought when looking into that stuff in more
detail a while ago - I just couldn't remember everything now that Martin
brought it up again.  Killing the freelist seems like a good idea anyway
(or rather, keep a small list of reserved filps that is used only
_after_ kmem_cache_alloc() has failed)



* Re: [PATCH] 6/6 cacheline align files_lock
  2003-03-08 18:10         ` Christoph Hellwig
@ 2003-03-08 18:20           ` Linus Torvalds
  0 siblings, 0 replies; 8+ messages in thread
From: Linus Torvalds @ 2003-03-08 18:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Martin J. Bligh, linux-kernel, Andrew Morton


On Sat, 8 Mar 2003, Christoph Hellwig wrote:
> 
> Agreed, that's what I actually thought when looking into that stuff in more
> detail a while ago - I just couldn't remember everything now that Martin
> brought it up again.  Killing the freelist seems like a good idea anyway
> (or rather, keep a small list of reserved filps that is used only
> _after_ kmem_cache_alloc() has failed)

Well, the sad part, though, is that the file lists are almost never 
actually _used_, so even if this is made per-superblock that won't get 
over the fact that

 (a) much of the time you'll still get the lock overhead (not very many
     superblocks would be impacted: it would usually spread out the
     current load over two or three super-blocks: pipes, networking and
     "real" filesystem) and

 (b) it would _still_ be for something that almost never happens -
     relatively speaking (ie it's currently used for hanging up terminals
     and for unmounts, and nothing else, I think)

I think we might actually be able to make it really go away, if the file
list was made per-dentry or something like that (and then use the existing
dentry->lock instead of dcache_lock). We already keep track of dentries
for "umount", and if we made the tty layer use another list, we'd get rid 
of this whole lock entirely.

At this point, though, the 2.6.x thing is clearly to just avoid the false 
sharing. But somebody can work on this if they feel like a 2.7.x 
challenge.

		Linus



* [PATCH] 6/6 cacheline align files_lock
  2003-03-05 17:26       ` [PATCH] 5/6 Provide basic documentation for profiling Martin J. Bligh
@ 2003-03-05 17:27         ` Martin J. Bligh
  0 siblings, 0 replies; 8+ messages in thread
From: Martin J. Bligh @ 2003-03-05 17:27 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-kernel

I'm getting a lot of cacheline bounce from .text.lock.file_table due to 
false sharing of the cacheline. The following patch just aligns the lock
into its own cacheline.

The only changes in the profile over 50 ticks are:

 -4832   -22.2% .text.lock.file_table
 -6357   -12.8% default_idle
-10374    -6.2% total

Difference in results below (note system times as well as elapsed).

Kernbench: (make -j N vmlinux, where N = 2 x num_cpus)
                              Elapsed      System        User         CPU
                 no-align       44.09       94.38      557.26     1477.00
                    align       44.38       94.18      558.00     1468.25

Kernbench: (make -j N vmlinux, where N = 16 x num_cpus)
                              Elapsed      System        User         CPU
                 no-align       45.53      118.06      560.48     1489.50
                    align       44.84      111.77      560.63     1502.50

Kernbench: (make -j vmlinux, maximal tasks)
                              Elapsed      System        User         CPU
                 no-align       45.17      117.80      560.62     1500.50
                    align       44.94      113.36      560.59     1500.00

diff -urpN -X /home/fletch/.diff.exclude 020-prof_docs/fs/file_table.c 030-align_files_lock/fs/file_table.c
--- 020-prof_docs/fs/file_table.c	Tue Feb 25 23:03:49 2003
+++ 030-align_files_lock/fs/file_table.c	Wed Mar  5 07:49:20 2003
@@ -27,7 +27,7 @@ static LIST_HEAD(anon_list);
 /* And here the free ones sit */
 static LIST_HEAD(free_list);
 /* public *and* exported. Not pretty! */
-spinlock_t files_lock = SPIN_LOCK_UNLOCKED;
+spinlock_t files_lock __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
 
 /* Find an unused file structure and return a pointer to it.
  * Returns NULL, if there are no more free file structures or



end of thread, other threads:[~2003-03-08 18:12 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-03-07 23:36 [PATCH] 6/6 cacheline align files_lock Martin J. Bligh
     [not found] ` <20030308073535.B24272@infradead.org>
2003-03-08 15:41   ` Martin J. Bligh
2003-03-08 15:54     ` Christoph Hellwig
2003-03-08 16:10       ` Linus Torvalds
2003-03-08 18:10         ` Christoph Hellwig
2003-03-08 18:20           ` Linus Torvalds
2003-03-08 16:38     ` William Lee Irwin III
  -- strict thread matches above, loose matches on Subject: below --
2003-03-05 17:23 [PATCH] 1/6 Share common physnode_map code between NUMA-Q and Summit Martin J. Bligh
2003-03-05 17:24 ` [PATCH] 2/6 Make CONFIG_NUMA work on non-numa machines Martin J. Bligh
2003-03-05 17:24   ` [PATCH] 3/6 Convert physnode_map to u8 Martin J. Bligh
2003-03-05 17:25     ` [PATCH] 4/6 Fix the type of get_zholes_size for NUMA-Q Martin J. Bligh
2003-03-05 17:26       ` [PATCH] 5/6 Provide basic documentation for profiling Martin J. Bligh
2003-03-05 17:27         ` [PATCH] 6/6 cacheline align files_lock Martin J. Bligh
