* spin_lock cause ?
@ 2016-06-01 17:05 Kelly, Mark (RIS-BCT)
  2016-06-01 17:37 ` Milian Wolff
  0 siblings, 1 reply; 10+ messages in thread
From: Kelly, Mark (RIS-BCT) @ 2016-06-01 17:05 UTC (permalink / raw)
  To: linux-perf-users

Hi,

Perf report for our application shows:

  34.27%  [kernel]                    [k] _spin_lock
   8.19%  libjlib.so                  [.] CLZWExpander::expand(void*)
   4.62%  libjhtree.so                [.] CMRUCacheOf<CKeyIdAndPos, CJHTreeNode>
   1.97%  [kernel]                    [k] futex_wait_setup
   1.51%  [kernel]                    [k] native_write_msr_safe
   1.34%  [kernel]                    [k] futex_wake

Any suggestions for how to trace where/who is responsible for the spin_lock?

thank you in advance,
mark





* Re: spin_lock cause ?
  2016-06-01 17:05 spin_lock cause ? Kelly, Mark (RIS-BCT)
@ 2016-06-01 17:37 ` Milian Wolff
  2016-06-01 17:43   ` M Kelly
  0 siblings, 1 reply; 10+ messages in thread
From: Milian Wolff @ 2016-06-01 17:37 UTC (permalink / raw)
  To: Mark.Kelly; +Cc: linux-perf-users


On Wednesday, June 1, 2016 1:05:40 PM CEST Kelly, Mark (RIS-BCT) wrote:
> Hi,
> 
> Perf report for our application shows:
> 
>   34.27%  [kernel]                    [k] _spin_lock
>    8.19%  libjlib.so                  [.] CLZWExpander::expand(void*)
>    4.62%  libjhtree.so                [.] CMRUCacheOf<CKeyIdAndPos, CJHTreeNode>
>    1.97%  [kernel]                    [k] futex_wait_setup
>    1.51%  [kernel]                    [k] native_write_msr_safe
>    1.34%  [kernel]                    [k] futex_wake
> 
> Any suggestions for how to trace where/who is responsible for the spin_lock?

For both cases, you should collect backtraces which should allow you to find 
culprits. Have you tried:

perf record --call-graph dwarf

or similar?
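
For example, something like this (the process name is a placeholder, and
DWARF unwinding also needs a perf built with libunwind support):

	perf record --call-graph dwarf -p $(pidof yourapp) sleep 10
	perf report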

Cheers

-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts



* Re: spin_lock cause ?
  2016-06-01 17:37 ` Milian Wolff
@ 2016-06-01 17:43   ` M Kelly
  2016-06-01 18:11     ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 10+ messages in thread
From: M Kelly @ 2016-06-01 17:43 UTC (permalink / raw)
  To: linux-perf-users

Milian Wolff <milian.wolff <at> kdab.com> writes:

> 
> On Wednesday, June 1, 2016 1:05:40 PM CEST Kelly, Mark (RIS-BCT) wrote:
> > Hi,
> > 
> > Perf report for our application shows:
> > 
> >   34.27%  [kernel]                    [k] _spin_lock
> >    8.19%  libjlib.so                  [.] CLZWExpander::expand(void*)
> >    4.62%  libjhtree.so                [.] CMRUCacheOf<CKeyIdAndPos, CJHTreeNode>
> >    1.97%  [kernel]                    [k] futex_wait_setup
> >    1.51%  [kernel]                    [k] native_write_msr_safe
> >    1.34%  [kernel]                    [k] futex_wake
> > 
> > Any suggestions for how to trace where/who is responsible for the spin_lock?
> 
> For both cases, you should collect backtraces which should allow you to
> find culprits. Have you tried:
> 
> perf record --call-graph dwarf
> 
> or similar?
> 
> Cheers
> 

ok, I will try it, thank you.  Will I have to upgrade perf or move to a
newer kernel?  dwarf does not seem to be supported on my current
system.  I will report back what it shows once I get it working.

thanks,
mark


* Re: spin_lock cause ?
  2016-06-01 17:43   ` M Kelly
@ 2016-06-01 18:11     ` Arnaldo Carvalho de Melo
  2016-06-01 18:49       ` M Kelly
  0 siblings, 1 reply; 10+ messages in thread
From: Arnaldo Carvalho de Melo @ 2016-06-01 18:11 UTC (permalink / raw)
  To: M Kelly; +Cc: linux-perf-users

On Wed, Jun 01, 2016 at 05:43:12PM +0000, M Kelly wrote:
> Milian Wolff <milian.wolff <at> kdab.com> writes:
> > On Wednesday, June 1, 2016 1:05:40 PM CEST Kelly, Mark (RIS-BCT) wrote:
> > > Perf report for our application shows:
> > > 
> > >   34.27%  [kernel]                    [k] _spin_lock
> > >    8.19%  libjlib.so                  [.] CLZWExpander::expand(void*)
> > >    4.62%  libjhtree.so                [.] CMRUCacheOf<CKeyIdAndPos, CJHTreeNode>
> > >    1.97%  [kernel]                    [k] futex_wait_setup

> > > Any suggestions for how to trace where/who is responsible for the spin_lock?

> > For both cases, you should collect backtraces which should allow you
> > to find culprits. Have you tried:

> > perf record --call-graph dwarf

> > or similar?
 
> ok, I will try it, thank you.  Will I have to upgrade perf or move to a
> newer kernel?  dwarf does not seem to be supported on my current
> system.  I will report back what it shows once I get it working.

No need for DWARF if you want to know what leads to the spinlock just up
to the syscall layer or if your userland components have frame pointers.
So first try with:

	perf record -g ./workload

Or:

	perf record -g -p `pidof workload`
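
Then look at what got recorded with, e.g.:

	perf report --stdio -g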

- Arnaldo


* Re: spin_lock cause ?
  2016-06-01 18:11     ` Arnaldo Carvalho de Melo
@ 2016-06-01 18:49       ` M Kelly
  2016-06-01 18:53         ` Rick Jones
  2016-06-02  9:18         ` Milian Wolff
  0 siblings, 2 replies; 10+ messages in thread
From: M Kelly @ 2016-06-01 18:49 UTC (permalink / raw)
  To: linux-perf-users

Arnaldo Carvalho de Melo <acme <at> kernel.org> writes:

> No need for DWARF if you want to know what leads to the spinlock just up
> to the syscall layer or if your userland components have frame pointers.

Hi,

ok, I rebuilt our app using -fno-omit-frame-pointer 
and did the perf record -g and when looking at the 
report output I now see:

    40.90%  thorslave_mytho  [kernel.kallsyms]                   [k] _spin_lock                                                                       
            |
            --- _spin_lock
               |          
               |--49.11%-- futex_wake
               |          do_futex
               |          sys_futex
               |          system_call_fastpath
               |          __lll_unlock_wake
               |          |          
               |          |--99.81%-- CKeyCursor::gtEqual(char const*, char*, bool)
               |          |          _ZN16CKeyLevelManager7_lookupEbj.constprop.100
               |
               |--44.31%-- futex_wait_setup
               |          futex_wait
               |          do_futex
               |          sys_futex
               |          system_call_fastpath
               |          __lll_lock_wait
               |          |          
               |          |--99.83%-- CKeyCursor::gtEqual(char const*, char*, bool)
               |          |          _ZN16CKeyLevelManager7_lookupEbj.constprop.100
               |
               |--5.46%-- task_rq_lock
               |          try_to_wake_up
               |          |          
               |          |--99.98%-- wake_up_state
               |          |          wake_futex
               |          |          futex_wake
               |          |          do_futex
               |          |          sys_futex
               |          |          system_call_fastpath
               |          |          __lll_unlock_wake
               |          |          |          
               |          |          |--99.75%-- CKeyCursor::gtEqual(char const*, char*, bool)
               |          |          |          _ZN16CKeyLevelManager7_lookupEbj.constprop.100
               |          
               |--0.73%-- try_to_wake_up
               |          |          
               |          |--99.83%-- wake_up_state
               |          |          wake_futex
               |          |          futex_wake
               |          |          do_futex
               |          |          sys_futex
               |          |          system_call_fastpath
               |          |          __lll_unlock_wake
               |          |          |          
               |          |          |--99.88%-- CKeyCursor::gtEqual(char const*, char*, bool)
               |          |          |          _ZN16CKeyLevelManager7_lookupEbj.constprop.100

So that is helpful.  
I do not understand the system_call_fastpath->futex->_spin_lock part yet, 
but that is a separate issue!

thanks so much,
mark


* Re: spin_lock cause ?
  2016-06-01 18:49       ` M Kelly
@ 2016-06-01 18:53         ` Rick Jones
  2016-06-01 19:25           ` M Kelly
  2016-06-02  9:18         ` Milian Wolff
  1 sibling, 1 reply; 10+ messages in thread
From: Rick Jones @ 2016-06-01 18:53 UTC (permalink / raw)
  To: M Kelly, linux-perf-users

On 06/01/2016 11:49 AM, M Kelly wrote:
> So that is helpful.
> I do not understand the system_call_fastpath->futex->_spin_lock part yet,
> but that is a separate issue!

Just a guess, but if your application is using threads, contention on
the synchronization primitives you use to keep things straight can
end up punting to the kernel.
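
(Just as an illustration: attaching something like `strace -c -f -e
trace=futex -p <pid>` to the process for a few seconds would show how much
futex traffic those contended mutexes generate; <pid> being your process id.)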

rick jones


* Re: spin_lock cause ?
  2016-06-01 18:53         ` Rick Jones
@ 2016-06-01 19:25           ` M Kelly
  2016-06-01 20:24             ` Rick Jones
  0 siblings, 1 reply; 10+ messages in thread
From: M Kelly @ 2016-06-01 19:25 UTC (permalink / raw)
  To: linux-perf-users

Rick Jones <rick.jones2 <at> hpe.com> writes:

> 
> On 06/01/2016 11:49 AM, M Kelly wrote:
> > So that is helpful.
> > I do not understand the system_call_fastpath->futex->_spin_lock part
> > yet,
> > but that is a separate issue!
> 
> Just a guess, but if your application is using threads, contention on
> the synchronization primitives you use to keep things straight can
> end up punting to the kernel.
> 
> rick jones
> 

Hi,

I think I recall you from hp, years ago when I was there :-)
Thanks for the info.  I suspect we are too heavy on 
pthread_mutex_lock()/unlock() 
which could probably produce these call graphs:

            --- _spin_lock
               |          
               |--49.11%-- futex_wake
               |          do_futex
               |          sys_futex
               |          system_call_fastpath
               |          __lll_unlock_wake
               |          
               |--44.31%-- futex_wait_setup
               |          futex_wait
               |          do_futex
               |          sys_futex
               |          system_call_fastpath
               |          __lll_lock_wait

thanks again,
mark


* Re: spin_lock cause ?
  2016-06-01 19:25           ` M Kelly
@ 2016-06-01 20:24             ` Rick Jones
  2016-06-01 21:10               ` David Ahern
  0 siblings, 1 reply; 10+ messages in thread
From: Rick Jones @ 2016-06-01 20:24 UTC (permalink / raw)
  To: M Kelly, linux-perf-users

On 06/01/2016 12:25 PM, M Kelly wrote:
> I think I recall you from hp, years ago when I was there :-)
> Thanks for the info.  I suspect we are too heavy on
> pthread_mutex_lock()/unlock()

HP-UX has some ways/tools to track mutex contention.  Some extensions to
give names to mutexes and then get stats on how often a thread went to grab
one and found it already held, etc.  It's been ages since I used them - I'd
hope there is something similar for Linux, but I've not played with threads
in a very long time.

happy benchmarking,

rick jones


* Re: spin_lock cause ?
  2016-06-01 20:24             ` Rick Jones
@ 2016-06-01 21:10               ` David Ahern
  0 siblings, 0 replies; 10+ messages in thread
From: David Ahern @ 2016-06-01 21:10 UTC (permalink / raw)
  To: Rick Jones, M Kelly, linux-perf-users

On 6/1/16 2:24 PM, Rick Jones wrote:
> On 06/01/2016 12:25 PM, M Kelly wrote:
>> I think I recall you from hp, years ago when I was there :-)
>> Thanks for the info.  I suspect we are too heavy on
>> pthread_mutex_lock()/unlock()
>
> HP-UX has some ways/tools to track mutex contention.  Some extensions to
> give names to mutexes and then get stats on how often a thread went to grab
> one and found it already held, etc.  It's been ages since I used them - I'd
> hope there is something similar for Linux, but I've not played with threads
> in a very long time.

If the kernel is new enough (3.8 or 3.10, I forget when userspace return
probes were added) you can use probes on libpthread plus the futex system
call tracepoints to analyze lock contention.

From an old script for x86:

PL=/lib64/libpthread.so.0

# Entry probes: capture the mutex address (first argument, %di on x86-64)
perf probe -x ${PL} -a 'mutex_lock=pthread_mutex_lock addr=%di'
perf probe -x ${PL} -a 'mutex_trylock=pthread_mutex_trylock addr=%di'
perf probe -x ${PL} -a 'mutex_timedlock=pthread_mutex_timedlock addr=%di'
perf probe -x ${PL} -a 'mutex_unlock=pthread_mutex_unlock addr=%di'

# Return probes: capture the return value (%ax)
perf probe -x ${PL} -a 'mutex_lock_ret=pthread_mutex_lock%return ret=%ax'
perf probe -x ${PL} -a 'mutex_trylock_ret=pthread_mutex_trylock%return ret=%ax'
perf probe -x ${PL} -a 'mutex_timedlock_ret=pthread_mutex_timedlock%return ret=%ax'

# Record all the probes plus futex syscall entry/exit, system-wide
perf record -a -e 'probe_libpthread:*' \
    -e syscalls:sys_enter_futex,syscalls:sys_exit_futex -- <workload spec>

perf script -g python

--> edit perf-script.py for analysis of interest.

perf script -s perf-script.py

e.g., you can track the thread id holding a lock and for how long, how
many contentions, who got the lock when it was released, etc.
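
As a rough sketch of the analysis side - the handler names and argument
lists below are illustrative; copy the exact signatures from the skeleton
that `perf script -g python` generates:

import collections

pending = {}                        # tid -> (mutex addr, lock attempt time)
owner = {}                          # mutex addr -> (tid, acquire time)
wait_ns = collections.Counter()     # mutex addr -> total ns spent acquiring
contended = collections.Counter()   # mutex addr -> times found already held

def ts(secs, nsecs):
    return secs * 1000000000 + nsecs

def probe_libpthread__mutex_lock(event_name, context, common_cpu,
        common_secs, common_nsecs, common_pid, common_comm, addr):
    if addr in owner:               # someone is already holding this mutex
        contended[addr] += 1
    pending[common_pid] = (addr, ts(common_secs, common_nsecs))

def probe_libpthread__mutex_lock_ret(event_name, context, common_cpu,
        common_secs, common_nsecs, common_pid, common_comm, ret):
    if common_pid in pending:       # lock acquired: account the wait time
        addr, t0 = pending.pop(common_pid)
        now = ts(common_secs, common_nsecs)
        wait_ns[addr] += now - t0
        owner[addr] = (common_pid, now)

def probe_libpthread__mutex_unlock(event_name, context, common_cpu,
        common_secs, common_nsecs, common_pid, common_comm, addr):
    owner.pop(addr, None)           # released; mutex is free again

def trace_end():
    for addr, ns in wait_ns.most_common(10):
        print("mutex 0x%x: %d contentions, %d us waiting" %
              (addr, contended[addr], ns / 1000))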

Many years ago I used the above to understand a nasty apparent deadlock 
in a process.


* Re: spin_lock cause ?
  2016-06-01 18:49       ` M Kelly
  2016-06-01 18:53         ` Rick Jones
@ 2016-06-02  9:18         ` Milian Wolff
  1 sibling, 0 replies; 10+ messages in thread
From: Milian Wolff @ 2016-06-02  9:18 UTC (permalink / raw)
  To: mark.kelly; +Cc: linux-perf-users


On Wednesday, June 1, 2016 18:49:01 CEST M Kelly wrote:
> Arnaldo Carvalho de Melo <acme <at> kernel.org> writes:
> > No need for DWARF if you want to know what leads to the spinlock just up
> > to the syscall layer or if your userland components have frame pointers.
> 
> Hi,
> 
> ok, I rebuilt our app using -fno-omit-frame-pointer
> and did the perf record -g and when looking at the
> report output I now see:
> 
>     40.90%  thorslave_mytho  [kernel.kallsyms]                   [k] _spin_lock
>             |
>             --- _spin_lock
>                |
>                |--49.11%-- futex_wake
>                |          do_futex
>                |          sys_futex
>                |          system_call_fastpath
>                |          __lll_unlock_wake
>                |          |
>                |          |--99.81%-- CKeyCursor::gtEqual(char const*, char*, bool)
>                |          |          _ZN16CKeyLevelManager7_lookupEbj.constprop.100

<snip>

You are running an older perf; I recommend you rerun the report step and pass
`-g graph`. The old default of `-g fractal` outputs relative-to-parent
numbers, which is often misleading.
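
I.e. something like:

	perf report -g graph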

The above already shows that you are suffering from lock contention issues,
and where from (i.e. CKeyCursor::gtEqual). As the above only shows CPU time
spent, I recommend additionally looking at sleep times, e.g. via

https://perf.wiki.kernel.org/index.php/Tutorial#Profiling_sleep_times
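
Roughly, the recipe on that page looks like the following (double-check the
exact event names against your kernel; sched_stat_sleep needs
CONFIG_SCHEDSTATS, and ./yourapp is a placeholder):

	perf record -e sched:sched_stat_sleep -e sched:sched_switch \
	    -e sched:sched_process_exit -g -o perf.data.raw ./yourapp
	perf inject -v -s -i perf.data.raw -o perf.data
	perf report --stdio --show-total-period -i perf.data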

You can easily put the above commands into a shell helper file; I did just that:

https://github.com/milianw/shell-helpers/blob/master/perf-sleep-record

HTH
-- 
Milian Wolff | milian.wolff@kdab.com | Software Engineer
KDAB (Deutschland) GmbH&Co KG, a KDAB Group company
Tel: +49-30-521325470
KDAB - The Qt Experts




Thread overview: 10+ messages
2016-06-01 17:05 spin_lock cause ? Kelly, Mark (RIS-BCT)
2016-06-01 17:37 ` Milian Wolff
2016-06-01 17:43   ` M Kelly
2016-06-01 18:11     ` Arnaldo Carvalho de Melo
2016-06-01 18:49       ` M Kelly
2016-06-01 18:53         ` Rick Jones
2016-06-01 19:25           ` M Kelly
2016-06-01 20:24             ` Rick Jones
2016-06-01 21:10               ` David Ahern
2016-06-02  9:18         ` Milian Wolff
