linux-fsdevel.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@kernel.org>
To: Waiman Long <waiman.long@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Jeff Layton <jlayton@redhat.com>,
	Miklos Szeredi <mszeredi@suse.cz>, Ingo Molnar <mingo@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Andi Kleen <andi@firstfloor.org>,
	"Chandramouleeswaran, Aswin" <aswin@hp.com>,
	"Norton, Scott J" <scott.norton@hp.com>
Subject: Re: [PATCH v7 1/4] spinlock: A new lockref structure for lockless update of refcount
Date: Thu, 5 Sep 2013 15:31:23 +0200
Message-ID: <20130905133123.GA24351@gmail.com>
In-Reply-To: <52274943.1040005@hp.com>


* Waiman Long <waiman.long@hp.com> wrote:

> On 09/03/2013 03:09 PM, Linus Torvalds wrote:
> >On Tue, Sep 3, 2013 at 8:34 AM, Linus Torvalds
> ><torvalds@linux-foundation.org>  wrote:
> >>I suspect the tty_ldisc_lock() could be made to go away if we care.
> >Heh. I just pulled the tty patches from Greg, and the locking has
> >changed completely.
> >
> >It may actually fix your AIM7 test-case, because while the global
> >spinlock remains (it got renamed to "tty_ldiscs_lock" - there's an
> >added "s"), the common operations now take the per-tty lock to get the
> >ldisc for that tty, rather than that global spinlock (which just
> >protects the actual ldisc array now).
> >
> >That said, I don't know what AIM7 really ends up doing, but your
> >profile seems to have every access through tty_ldisc_[de]ref() that
> >now uses only the per-tty lock. Of course, how much that helps ends up
> >depending on whether AIM7 uses lots of tty's or just one shared one.
> >
> >Anyway, it might be worth testing my current -git tree.
> >
> >                   Linus
> 
> The latest tty patches did work. The tty related spinlock contention
> is now completely gone. The short workload can now reach over 8M JPM
> which is the highest I have ever seen.
> 
> The perf profile was:
> 
> 5.85%     reaim  reaim                 [.] mul_short
> 4.87%     reaim  [kernel.kallsyms]     [k] ebitmap_get_bit
> 4.72%     reaim  reaim                 [.] mul_int
> 4.71%     reaim  reaim                 [.] mul_long
> 2.67%     reaim  libc-2.12.so          [.] __random_r
> 2.64%     reaim  [kernel.kallsyms]     [k] lockref_get_not_zero
> 1.58%     reaim  [kernel.kallsyms]     [k] copy_user_generic_string
> 1.48%     reaim  [kernel.kallsyms]     [k] mls_level_isvalid
> 1.35%     reaim  [kernel.kallsyms]     [k] find_next_bit

6%+ spent in ebitmap_get_bit() and mls_level_isvalid() looks like 
something worth optimizing.

Is that called very often, or is it perhaps cache-bouncing for some 
reason?
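
The "6%+" figure is just the sum of the two entries in the profile 
above. A minimal Python sketch of that arithmetic, assuming the profile 
text is pasted in as-is (the helper name overhead_by_symbol() is made up 
for illustration):

```python
# Profile lines copied from the quoted perf output above.
profile = """\
5.85%     reaim  reaim                 [.] mul_short
4.87%     reaim  [kernel.kallsyms]     [k] ebitmap_get_bit
4.72%     reaim  reaim                 [.] mul_int
4.71%     reaim  reaim                 [.] mul_long
2.67%     reaim  libc-2.12.so          [.] __random_r
2.64%     reaim  [kernel.kallsyms]     [k] lockref_get_not_zero
1.58%     reaim  [kernel.kallsyms]     [k] copy_user_generic_string
1.48%     reaim  [kernel.kallsyms]     [k] mls_level_isvalid
1.35%     reaim  [kernel.kallsyms]     [k] find_next_bit
"""


def overhead_by_symbol(text):
    """Map each symbol name to its overhead percentage."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Expect: percent, command, object, [.]/[k] marker, symbol.
        if len(fields) < 5 or not fields[0].endswith("%"):
            continue
        result[fields[-1]] = float(fields[0].rstrip("%"))
    return result


overheads = overhead_by_symbol(profile)
symbols = ("ebitmap_get_bit", "mls_level_isvalid")
total = sum(overheads[name] for name in symbols)
print(f"{total:.2f}% in {' + '.join(symbols)}")
```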

Btw., you ought to be able to see instructions where the CPU is in some 
sort of stall (either it ran out of work, or it is cache-missing, or it is 
executing something complex), via:

  perf top -e stalled-cycles-frontend -e stalled-cycles-backend

Run it for a while, then pick whichever profile has more entries and have 
a look at it. Both profiles will keep updating in the background.

(Note: on Haswell CPUs stalled-cycles events are not yet available.)

Another performance analysis trick is to run this while your workload is 
executing:

  perf stat -a -ddd sleep 60

and have a look at the output - it will color-code suspicious looking 
counts.

For example, this is on a 32-way box running a kernel build:

vega:~> perf stat -addd sleep 10

 Performance counter stats for 'sleep 10':

     320753.639566 task-clock                #   32.068 CPUs utilized           [100.00%]
           187,962 context-switches          #    0.586 K/sec                   [100.00%]
            22,989 cpu-migrations            #    0.072 K/sec                   [100.00%]
         6,622,424 page-faults               #    0.021 M/sec                  
   817,576,186,285 cycles                    #    2.549 GHz                     [27.82%]
   214,366,744,930 stalled-cycles-frontend   #   26.22% frontend cycles idle    [16.75%]
    45,454,323,703 stalled-cycles-backend    #    5.56% backend  cycles idle    [16.72%]
   474,770,833,376 instructions              #    0.58  insns per cycle        
                                             #    0.45  stalled cycles per insn [22.27%]
   105,860,676,229 branches                  #  330.037 M/sec                   [33.37%]
     5,964,088,457 branch-misses             #    5.63% of all branches         [33.36%]
   244,982,563,232 L1-dcache-loads           #  763.772 M/sec                   [33.35%]
     7,503,377,286 L1-dcache-load-misses     #    3.06% of all L1-dcache hits   [33.36%]
    19,606,194,180 LLC-loads                 #   61.125 M/sec                   [22.26%]
     1,232,340,603 LLC-load-misses           #    6.29% of all LL-cache hits    [16.69%]
   261,749,251,526 L1-icache-loads           #  816.045 M/sec                   [16.69%]
    11,821,747,974 L1-icache-load-misses     #    4.52% of all L1-icache hits   [16.67%]
   244,253,746,431 dTLB-loads                #  761.500 M/sec                   [27.78%]
       126,546,407 dTLB-load-misses          #    0.05% of all dTLB cache hits  [33.30%]
   260,909,042,891 iTLB-loads                #  813.425 M/sec                   [33.31%]
        73,911,000 iTLB-load-misses          #    0.03% of all iTLB cache hits  [33.31%]
     7,989,072,388 L1-dcache-prefetches      #   24.907 M/sec                   [27.76%]
                 0 L1-dcache-prefetch-misses #    0.000 K/sec                   [33.32%]

      10.002245831 seconds time elapsed

The system is nicely saturated and caches are more or less normally 
utilized, but about a quarter of all cycles (26.22%) are idle in the 
frontend.
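
The derived columns on the right of that output can be reproduced from 
the raw counts. A minimal Python sketch using the numbers from the run 
above; the dictionary keys and the derive() helper are names made up for 
illustration, and the stalled-cycles-per-insn formula (the larger of the 
two stall counts divided by instructions) is an assumption about how perf 
computes it:

```python
# Raw counts copied from the perf stat output above.
counts = {
    "task_clock_ms": 320753.639566,
    "elapsed_s": 10.002245831,
    "cycles": 817_576_186_285,
    "stalled_frontend": 214_366_744_930,
    "stalled_backend": 45_454_323_703,
    "instructions": 474_770_833_376,
    "branches": 105_860_676_229,
    "branch_misses": 5_964_088_457,
}


def derive(c):
    """Compute the ratios perf prints after the '#' in each row."""
    # Assumption: perf uses the larger stall count for the per-insn ratio.
    worst_stall = max(c["stalled_frontend"], c["stalled_backend"])
    return {
        "cpus_utilized": c["task_clock_ms"] / (c["elapsed_s"] * 1000.0),
        "frontend_idle_pct": 100.0 * c["stalled_frontend"] / c["cycles"],
        "backend_idle_pct": 100.0 * c["stalled_backend"] / c["cycles"],
        "ipc": c["instructions"] / c["cycles"],
        "stalled_per_insn": worst_stall / c["instructions"],
        "branch_miss_pct": 100.0 * c["branch_misses"] / c["branches"],
    }


metrics = derive(counts)
for name, value in metrics.items():
    print(f"{name:18s} {value:8.3f}")
```

The rounded results agree with the percentages and ratios perf printed 
for this run, including the 26.22% frontend figure.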

So then I ran:

  perf top -e stalled-cycles-frontend

which gave me this profile:

     2.21%            cc1  cc1                                [.] ht_lookup_with_hash                                  
     1.86%            cc1  libc-2.15.so                       [.] _int_malloc                                          
     1.66%            cc1  [kernel.kallsyms]                  [k] page_fault                                           
     1.48%            cc1  cc1                                [.] _cpp_lex_direct                                      
     1.33%            cc1  cc1                                [.] grokdeclarator                                       
     1.26%            cc1  cc1                                [.] ggc_internal_alloc_stat                              
     1.19%            cc1  cc1                                [.] ggc_internal_cleared_alloc_stat                      
     1.12%            cc1  libc-2.15.so                       [.] _int_free                                            
     1.10%            cc1  libc-2.15.so                       [.] malloc                                               
     0.95%            cc1  cc1                                [.] c_lex_with_flags                                     
     0.95%            cc1  cc1                                [.] cpp_get_token_1                                      
     0.92%            cc1  cc1                                [.] c_parser_declspecs                                   

where gcc's ht_lookup_with_hash() is having trouble. Annotating that 
function shows it visibly stalling on a hash walk:

    0.79 :        a0f303:       addl   $0x1,0x80(%rdi)
    0.01 :        a0f30a:       mov    %rdi,%rbx
    0.18 :        a0f30d:       mov    %rsi,%r15
    0.00 :        a0f310:       lea    -0x1(%r14),%r13d
    0.13 :        a0f314:       and    %r13d,%r10d
    0.05 :        a0f317:       mov    %r10d,%eax
    0.34 :        a0f31a:       mov    (%rcx,%rax,8),%r9
   24.87 :        a0f31e:       lea    0x0(,%rax,8),%rdx
    0.02 :        a0f326:       test   %r9,%r9
    0.00 :        a0f329:       je     a0f5bd <ht_lookup_with_hash+0x2ed>
    0.31 :        a0f32f:       cmp    $0xffffffffffffffff,%r9
    0.18 :        a0f333:       je     a0f6e1 <ht_lookup_with_hash+0x411>
    0.37 :        a0f339:       cmp    %r12d,0xc(%r9)
   24.41 :        a0f33d:       jne    a0f3a0 <ht_lookup_with_hash+0xd0>

So giving that function some attention would probably give the most bang 
for the buck on this particular workload.
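
The annotated code contains two dependent loads: one fetches the slot 
from the table, the next reads a field of the node that slot points to. 
Either can miss the cache, and the samples most likely land on the 
instruction following each load. A simplified Python sketch of that kind 
of open-addressing lookup follows. It is not gcc's code; Node, DELETED, 
insert(), lookup() and the probe step are all illustrative assumptions:

```python
DELETED = object()  # hypothetical tombstone (the disassembly tests for -1)


class Node:
    def __init__(self, key, hash_value):
        self.key = key
        self.hash_value = hash_value


def probe_step(hash_value, mask):
    # Assumed step: forced odd, so with a power-of-two table size the
    # probe sequence visits every slot.
    return ((hash_value * 17) & mask) | 1


def insert(slots, key, hash_value):
    """Place a new node in the first free or deleted slot."""
    mask = len(slots) - 1
    index = hash_value & mask
    step = probe_step(hash_value, mask)
    for _ in range(len(slots)):
        if slots[index] is None or slots[index] is DELETED:
            slots[index] = Node(key, hash_value)
            return index
        index = (index + step) & mask
    raise RuntimeError("table full")


def lookup(slots, key, hash_value):
    """Walk the probe sequence until the key or an empty slot is found."""
    mask = len(slots) - 1
    index = hash_value & mask
    step = probe_step(hash_value, mask)
    for _ in range(len(slots)):
        node = slots[index]  # load 1: table slot -> node reference
        if node is None:
            return None
        # load 2: dereference the node to compare its stored hash
        if (node is not DELETED and node.hash_value == hash_value
                and node.key == key):
            return node
        index = (index + step) & mask
    return None


slots = [None] * 8            # size must be a power of two
insert(slots, "foo", 3)
insert(slots, "bar", 11)      # 11 & 7 == 3, collides with "foo"
print(lookup(slots, "bar", 11).key)
```

In a table much larger than the cache, each probe touches an unrelated 
slot and then an unrelated node, which is the access pattern that would 
produce stalls like the ones annotated above.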

Thanks,

	Ingo
