linux-kernel.vger.kernel.org archive mirror
* hackbench regression with 2.6.36-rc1
@ 2010-08-18  6:18 Zhang, Yanmin
  2010-08-18 10:56 ` Eric W. Biederman
  0 siblings, 1 reply; 4+ messages in thread
From: Zhang, Yanmin @ 2010-08-18  6:18 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: LKML, alex.shi, Pavel Emelyanov, David S. Miller

Compared with 2.6.35, hackbench (thread mode) shows about an 80%
regression on a dual-socket Nehalem machine and about a 90% regression
on 4-socket Tigerton machines.

Command to start hackbench:
#./hackbench 100 thread 2000

process mode has no such regression.
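
For readers unfamiliar with the benchmark, here is a minimal sketch of the kind of traffic thread mode generates: two threads of one process exchanging fixed-size messages over an AF_UNIX stream socket pair. It is not hackbench source. The name run_pair, the 100-byte MSG_SIZE and the single sender/receiver pair are assumptions made for illustration. Each write() and read() on such a socket goes through the unix_stream_sendmsg and unix_stream_recvmsg paths that appear in the profile below.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define MSG_SIZE 100            /* illustrative message size (assumption) */

struct sender_args {
	int fd;                 /* sending end of the socket pair */
	int loops;              /* number of messages to send */
};

/* Sender thread: write `loops` messages, then close its end. */
static void *sender(void *p)
{
	struct sender_args *a = p;
	char buf[MSG_SIZE] = { 0 };

	for (int i = 0; i < a->loops; i++) {
		size_t done = 0;

		/* A stream socket may accept a partial write, so loop. */
		while (done < sizeof(buf)) {
			ssize_t n = write(a->fd, buf + done, sizeof(buf) - done);

			if (n <= 0)
				goto out;
			done += (size_t)n;
		}
	}
out:
	close(a->fd);
	return NULL;
}

/*
 * Run one sender thread against a receiver in the calling thread.
 * Returns the number of bytes received, or -1 on setup failure.
 */
long run_pair(int loops)
{
	int sv[2];
	pthread_t tid;
	struct sender_args args;
	char buf[MSG_SIZE];
	long total = 0;
	ssize_t n;

	assert(loops >= 0);

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	args.fd = sv[0];
	args.loops = loops;
	if (pthread_create(&tid, NULL, sender, &args) != 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	/* Read until the sender closes its end (read() returns 0). */
	while ((n = read(sv[1], buf, sizeof(buf))) > 0)
		total += n;

	pthread_join(tid, NULL);
	close(sv[1]);
	return total;
}
```

Calling run_pair(2000) moves 2000 messages between two threads that share one process, which is the case that regresses. A process-mode equivalent would use fork() instead of pthread_create().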

Profiling shows:
#perf top
             samples  pcnt function                 DSO
             _______ _____ ________________________ ________________________

            74415.00 29.9% put_pid                  [kernel.kallsyms]       
            38395.00 15.4% unix_stream_recvmsg      [kernel.kallsyms]       
            34877.00 14.0% unix_stream_sendmsg      [kernel.kallsyms]       
            25204.00 10.1% pid_vnr                  [kernel.kallsyms]       
            21864.00  8.8% unix_scm_to_skb          [kernel.kallsyms]       
            13637.00  5.5% cred_to_ucred            [kernel.kallsyms]       
             6520.00  2.6% unix_destruct_scm        [kernel.kallsyms]       
             4731.00  1.9% sock_alloc_send_pskb     [kernel.kallsyms]       


With 2.6.35, perf doesn't show put_pid/pid_vnr.

Alex Shi and I did a quick bisect and located the 2 patches below.
1) commit 7361c36c5224519b258219fe3d0e8abc865d8134
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Sun Jun 13 03:34:33 2010 +0000

    af_unix: Allow credentials to work across user and pid namespaces.

    In unix_skb_parms store pointers to struct pid and struct cred instead
    of raw uid, gid, and pid values, then translate the credentials on
    reception into values that are meaningful in the receiving processes
    namespaces.


2) commit 257b5358b32f17e0603b6ff57b13610b0e02348f
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Sun Jun 13 03:32:34 2010 +0000

    scm: Capture the full credentials of the scm sender.

    Start capturing not only the userspace pid, uid and gid values of the
    sending process but also the struct pid and struct cred of the sending
    process as well.
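
Read together, the two commit messages suggest that each message on an AF_UNIX socket now carries pointers to the sender's struct pid and struct cred, and that the receiver translates them on reception. The userspace model below is only a sketch of that lifecycle for the pid. struct model_pid, struct model_msg and all helper names are hypothetical, and the translation step is reduced to a simple offset, which is not how the kernel maps pids between namespaces.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted kernel object. */
struct model_pid {
	atomic_int count;       /* reference count */
	int nr;                 /* pid number as seen by the sender */
};

/* Hypothetical stand-in for the per-message credential slot. */
struct model_msg {
	struct model_pid *pid;  /* pointer instead of a raw pid value */
};

static struct model_pid *model_get_pid(struct model_pid *p)
{
	atomic_fetch_add(&p->count, 1);
	return p;
}

static void model_put_pid(struct model_pid *p)
{
	atomic_fetch_sub(&p->count, 1);
}

/* Send side: attach the sender's pid to the message, taking a reference. */
void model_send(struct model_msg *m, struct model_pid *sender)
{
	assert(m != NULL && sender != NULL);
	m->pid = model_get_pid(sender);
}

/*
 * Receive side: translate the pid into the receiver's view, then drop
 * the reference. `ns_offset` is a toy translation used only in this sketch.
 */
int model_recv(struct model_msg *m, int ns_offset)
{
	int seen = m->pid->nr - ns_offset;

	model_put_pid(m->pid);
	m->pid = NULL;
	return seen;
}

/* Current reference count, for inspection. */
int model_refs(struct model_pid *p)
{
	return atomic_load(&p->count);
}
```

In this model every message costs one atomic increment and one atomic decrement on the sender's object. When all senders are threads of one process and share that object, those atomics all hit the same memory.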






* Re: hackbench regression with 2.6.36-rc1
  2010-08-18  6:18 hackbench regression with 2.6.36-rc1 Zhang, Yanmin
@ 2010-08-18 10:56 ` Eric W. Biederman
  2010-08-19  8:54   ` Zhang, Yanmin
  0 siblings, 1 reply; 4+ messages in thread
From: Eric W. Biederman @ 2010-08-18 10:56 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: LKML, alex.shi, Pavel Emelyanov, David S. Miller

"Zhang, Yanmin" <yanmin_zhang@linux.intel.com> writes:

> Compared with 2.6.35, hackbench (thread mode) shows about an 80%
> regression on a dual-socket Nehalem machine and about a 90% regression
> on 4-socket Tigerton machines.

That seems unfortunate.  Do you only show a regression in the pthread
hackbench test?  Do you show a regression when you use pipes?

Does the size of the regression vary based on the number of loop
iterations?  I ask because it appears that on the last message the
sender will exit, necessitating that the receiver put the sender's pid,
which should be atypical.

> Command to start hackbench:
> #./hackbench 100 thread 2000
>
> process mode has no such regression.
>
> Profiling shows:
> #perf top
>              samples  pcnt function                 DSO
>              _______ _____ ________________________ ________________________
>
>             74415.00 29.9% put_pid                  [kernel.kallsyms]       
>             38395.00 15.4% unix_stream_recvmsg      [kernel.kallsyms]       
>             34877.00 14.0% unix_stream_sendmsg      [kernel.kallsyms]       
>             25204.00 10.1% pid_vnr                  [kernel.kallsyms]       
>             21864.00  8.8% unix_scm_to_skb          [kernel.kallsyms]       
>             13637.00  5.5% cred_to_ucred            [kernel.kallsyms]       
>              6520.00  2.6% unix_destruct_scm        [kernel.kallsyms]       
>              4731.00  1.9% sock_alloc_send_pskb     [kernel.kallsyms]       
>
>
> With 2.6.35, perf doesn't show put_pid/pid_vnr.

Yes.  2.6.35 is imperfect and can report the wrong pid in some
circumstances.  I am surprised that nothing related to the reference
count on struct cred shows up in your profiling traces.

You are performing statistical sampling so I don't believe the
percentage of hits per function is the same as the percentage of
time per function.

Given that we are talking about a scheduler benchmark that is
doing something rather artificial (inter thread communication via
sockets), I don't know that this case is worth worrying about.

> Alex Shi and I did a quick bisect and located the 2 patches below.

That is a plausible result.  The atomic reference counts may
be causing you to ping pong cache lines between cpus.
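
The cache-line hypothesis can be explored in userspace with a small sketch. Each worker thread below models one message per iteration as an atomic increment followed by an atomic decrement, either on a single shared counter or on its own counter padded to a separate cache line. The names run_refcount_test and struct padded_ref are hypothetical, and the 64-byte CACHE_LINE value is an assumption about the hardware.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* One counter per cache line so private counters never share a line. */
struct padded_ref {
	_Alignas(CACHE_LINE) atomic_long count;
};

struct worker {
	struct padded_ref *ref; /* counter this thread hammers */
	long iters;             /* number of get/put pairs */
};

/* Each iteration models one message: take a reference, then drop it. */
static void *worker_fn(void *p)
{
	struct worker *w = p;

	for (long i = 0; i < w->iters; i++) {
		atomic_fetch_add(&w->ref->count, 1);    /* "get" on send */
		atomic_fetch_sub(&w->ref->count, 1);    /* "put" on receive */
	}
	return NULL;
}

/*
 * Run `nthreads` workers. With shared != 0 they all use refs[0], so every
 * CPU writes the same cache line; otherwise each uses its own counter.
 * Returns the sum of all counters afterwards (nthreads if every get was
 * matched by a put), or -1 on failure.
 */
long run_refcount_test(int nthreads, long iters, int shared)
{
	struct padded_ref *refs;
	struct worker *workers;
	pthread_t *tids;
	long sum = 0;
	int started = 0;

	assert(nthreads > 0 && iters >= 0);

	refs = aligned_alloc(CACHE_LINE, (size_t)nthreads * sizeof(*refs));
	workers = malloc((size_t)nthreads * sizeof(*workers));
	tids = malloc((size_t)nthreads * sizeof(*tids));
	if (!refs || !workers || !tids) {
		sum = -1;
		goto out;
	}

	for (int i = 0; i < nthreads; i++) {
		atomic_init(&refs[i].count, 1); /* one initial reference each */
		workers[i].ref = shared ? &refs[0] : &refs[i];
		workers[i].iters = iters;
	}
	for (int i = 0; i < nthreads; i++) {
		if (pthread_create(&tids[i], NULL, worker_fn, &workers[i]) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	if (started != nthreads) {
		sum = -1;
		goto out;
	}
	for (int i = 0; i < nthreads; i++)
		sum += atomic_load(&refs[i].count);
out:
	free(refs);
	free(workers);
	free(tids);
	return sum;
}
```

Timing run_refcount_test(n, iters, 1) against run_refcount_test(n, iters, 0), for example with clock_gettime(CLOCK_MONOTONIC), on a multi-socket machine would show whether the shared counter is measurably slower there. No result is claimed here.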

Eric


> 1) commit 7361c36c5224519b258219fe3d0e8abc865d8134
> Author: Eric W. Biederman <ebiederm@xmission.com>
> Date:   Sun Jun 13 03:34:33 2010 +0000
>
>     af_unix: Allow credentials to work across user and pid namespaces.
>
>     In unix_skb_parms store pointers to struct pid and struct cred instead
>     of raw uid, gid, and pid values, then translate the credentials on
>     reception into values that are meaningful in the receiving processes
>     namespaces.
>
>
> 2) commit 257b5358b32f17e0603b6ff57b13610b0e02348f
> Author: Eric W. Biederman <ebiederm@xmission.com>
> Date:   Sun Jun 13 03:32:34 2010 +0000
>
>     scm: Capture the full credentials of the scm sender.
>
>     Start capturing not only the userspace pid, uid and gid values of the
>     sending process but also the struct pid and struct cred of the sending
>     process as well.


* Re: hackbench regression with 2.6.36-rc1
  2010-08-18 10:56 ` Eric W. Biederman
@ 2010-08-19  8:54   ` Zhang, Yanmin
  2010-08-19 20:25     ` Eric W. Biederman
  0 siblings, 1 reply; 4+ messages in thread
From: Zhang, Yanmin @ 2010-08-19  8:54 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: LKML, alex.shi, Pavel Emelyanov, David S. Miller

On Wed, 2010-08-18 at 03:56 -0700, Eric W. Biederman wrote:
> "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> writes:
> 
> > Compared with 2.6.35, hackbench (thread mode) shows about an 80%
> > regression on a dual-socket Nehalem machine and about a 90% regression
> > on 4-socket Tigerton machines.
> 
> That seems unfortunate.  

> Do you only show a regression in the pthread
> hackbench test?
Yes.

>   Do you show a regression when you use pipes?
No.

> 
> Does the size of the regression vary based on the number of loop
> iterations?
No. I tried 1000 and got a similar regression ratio.
I chose a large loop count of 2000 because I wanted a stable result.

It's easy to reproduce. We found it on almost all our machines.

>   I ask because it appears that on the last message the
> sender will exit, necessitating that the receiver put the sender's pid,
> which should be atypical.
I don't agree with that. With hackbench, a sender sends loops*receiver_num_per_group
messages before exiting.
In addition, 'perf top' shows put_pid as the hottest function right from the
beginning, just after I start hackbench.
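
The message count behind this argument can be worked out directly. The sketch below assumes hackbench's usual default of 20 senders and 20 receivers per group, which is not stated in this thread. The function names are illustrative.

```c
#include <assert.h>

/* Messages one sender emits before exiting: `loops` to each receiver. */
long msgs_per_sender(long loops, long receivers_per_group)
{
	assert(loops >= 0 && receivers_per_group >= 0);
	return loops * receivers_per_group;
}

/* Messages sent across the whole run. */
long msgs_total(long groups, long senders_per_group,
		long receivers_per_group, long loops)
{
	assert(groups >= 0 && senders_per_group >= 0);
	return groups * senders_per_group *
	       msgs_per_sender(loops, receivers_per_group);
}
```

For `./hackbench 100 thread 2000` under that assumption, msgs_per_sender(2000, 20) is 40000 and msgs_total(100, 20, 20, 2000) is 80000000, so a cost paid only on each sender's last message would be a tiny share of the run.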

> 
> > Command to start hackbench:
> > #./hackbench 100 thread 2000
> >
> > process mode has no such regression.
> >
> > Profiling shows:
> > #perf top
> >              samples  pcnt function                 DSO
> >              _______ _____ ________________________ ________________________
> >
> >             74415.00 29.9% put_pid                  [kernel.kallsyms]       
> >             38395.00 15.4% unix_stream_recvmsg      [kernel.kallsyms]       
> >             34877.00 14.0% unix_stream_sendmsg      [kernel.kallsyms]       
> >             25204.00 10.1% pid_vnr                  [kernel.kallsyms]       
> >             21864.00  8.8% unix_scm_to_skb          [kernel.kallsyms]       
> >             13637.00  5.5% cred_to_ucred            [kernel.kallsyms]       
> >              6520.00  2.6% unix_destruct_scm        [kernel.kallsyms]       
> >              4731.00  1.9% sock_alloc_send_pskb     [kernel.kallsyms]       
> >
> >
> > With 2.6.35, perf doesn't show put_pid/pid_vnr.
> 
> Yes.  2.6.35 is imperfect and can report the wrong pid in some
> circumstances.  I am surprised that nothing related to the reference
> count on struct cred shows up in your profiling traces.
> 

> You are performing statistical sampling so I don't believe the
> percentage of hits per function is the same as the percentage of
> time per function.
Agreed. But from a performance tuning point of view, the percentage of hits is enough
to help developers investigate.

I provided the 'perf top' data to help you debug, not to prove that your patches
cause the regression. We used bisect to locate them.

> 
> Given that we are talking about a scheduler benchmark that is
> doing something rather artificial (inter thread communication via
> sockets), I don't know that this case is worth worrying about.
Good question. How about the scenario below?
Start 2 processes, and have every process create many threads. Threads of process 1
communicate with threads of process 2.

> 
> > Alex Shi and I did a quick bisect and located the 2 patches below.
> 
> That is a plausible result.  

> The atomic reference counts may
> be causing you to ping pong cache lines between cpus.
Agree.




* Re: hackbench regression with 2.6.36-rc1
  2010-08-19  8:54   ` Zhang, Yanmin
@ 2010-08-19 20:25     ` Eric W. Biederman
  0 siblings, 0 replies; 4+ messages in thread
From: Eric W. Biederman @ 2010-08-19 20:25 UTC (permalink / raw)
  To: Zhang, Yanmin; +Cc: LKML, alex.shi, Pavel Emelyanov, David S. Miller

"Zhang, Yanmin" <yanmin_zhang@linux.intel.com> writes:

> On Wed, 2010-08-18 at 03:56 -0700, Eric W. Biederman wrote:
>> "Zhang, Yanmin" <yanmin_zhang@linux.intel.com> writes:
>> 
>> > Compared with 2.6.35, hackbench (thread mode) shows about an 80%
>> > regression on a dual-socket Nehalem machine and about a 90% regression
>> > on 4-socket Tigerton machines.
>> 
>> That seems unfortunate.  
>
>> Do you only show a regression in the pthread
>> hackbench test?
> Yes.
>
>>   Do you show a regression when you use pipes?
> No.
>
>> 
>> Does the size of the regression vary based on the number of loop
>> iterations?
> No. I tried 1000 and got a similar regression ratio.
> I chose a large loop count of 2000 because I wanted a stable
> result.
>
> It's easy to reproduce. We found it on almost all our machines.
>
>>   I ask because it appears that on the last message the
>> sender will exit, necessitating that the receiver put the sender's pid,
>> which should be atypical.
> I don't agree with that. With hackbench, a sender sends loops*receiver_num_per_group
> messages before exiting.
> In addition, 'perf top' shows put_pid as the hottest function right from the
> beginning, just after I start hackbench.

If increasing the number of loops does not improve the performance, the
hypothesis that it is only the last message that has the regression
is shot.


>> > Command to start hackbench:
>> > #./hackbench 100 thread 2000
>> >
>> > process mode has no such regression.
>> >
>> > Profiling shows:
>> > #perf top
>> >              samples  pcnt function                 DSO
>> >              _______ _____ ________________________ ________________________
>> >
>> >             74415.00 29.9% put_pid                  [kernel.kallsyms]       
>> >             38395.00 15.4% unix_stream_recvmsg      [kernel.kallsyms]       
>> >             34877.00 14.0% unix_stream_sendmsg      [kernel.kallsyms]       
>> >             25204.00 10.1% pid_vnr                  [kernel.kallsyms]       
>> >             21864.00  8.8% unix_scm_to_skb          [kernel.kallsyms]       
>> >             13637.00  5.5% cred_to_ucred            [kernel.kallsyms]       
>> >              6520.00  2.6% unix_destruct_scm        [kernel.kallsyms]       
>> >              4731.00  1.9% sock_alloc_send_pskb     [kernel.kallsyms]       
>> >
>> >
>> > With 2.6.35, perf doesn't show put_pid/pid_vnr.
>> 
>> Yes.  2.6.35 is imperfect and can report the wrong pid in some
>> circumstances.  I am surprised that nothing related to the reference
>> count on struct cred shows up in your profiling traces.
>> 
>
>> You are performing statistical sampling so I don't believe the
>> percentage of hits per function is the same as the percentage of
>> time per function.
> Agreed. But from a performance tuning point of view, the percentage of hits is enough
> to help developers investigate.
>
> I provided the 'perf top' data to help you debug, not to prove that your patches
> cause the regression. We used bisect to locate them.

Sure.  I was just trying to figure out how to explain why the creds
don't show a similar hit.  I still don't have a complete explanation
for the profile, but the cred put and get are inline functions, so they
won't be present as distinct functions in the profile.
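
The inlining point can be illustrated with a small userspace sketch. A per-function profile attributes samples to the symbol whose code contains the sampled instruction, so a helper that the compiler inlines is charged to its caller, while an out-of-line helper gets its own row. The names below are hypothetical. The always_inline and noinline attributes are GCC/Clang extensions, used here to force each behaviour.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct obj {
	atomic_int count;       /* reference count */
};

/*
 * Forced inline: its instructions live inside each caller, so a profile
 * shows the cost under the caller's symbol.
 */
static inline __attribute__((always_inline)) void ref_put_inline(struct obj *o)
{
	atomic_fetch_sub(&o->count, 1);
}

/*
 * Forced out of line: it keeps its own symbol and appears as a separate
 * row in a per-function profile.
 */
__attribute__((noinline)) void ref_put_outofline(struct obj *o)
{
	atomic_fetch_sub(&o->count, 1);
}

/* Caller that drops two references, one through each helper. */
int drop_both(struct obj *o)
{
	assert(o != NULL);
	ref_put_inline(o);
	ref_put_outofline(o);
	return atomic_load(&o->count);
}
```

In a profile of a loop around drop_both(), the decrement done by ref_put_inline() would be counted under drop_both, and only ref_put_outofline would show up by name.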

>> Given that we are talking about a scheduler benchmark that is
>> doing something rather artificial (inter thread communication via
>> sockets), I don't know that this case is worth worrying about.
> Good question. How about the scenario below?
> Start 2 processes, and have every process create many threads. Threads of process 1
> communicate with threads of process 2.

Maybe.  A lot depends on the timing, and what it takes to trigger
the cross cpu cache line bounce.

And we still have pipes for ultimate performance.  Grrr.

I will give it some thought to see if I can find a less expensive way
but I don't have any good ideas at the moment.


Eric
