linux-kernel.vger.kernel.org archive mirror
* hyperthreading performance with dbt-2 on 2.6.0-test11
From: markw @ 2003-12-08 23:54 UTC (permalink / raw)
  To: linux-kernel; +Cc: osdldbt-general

Hello, I have some data with hyperthreading I wanted to share.

I've seen about a 15% decrease in performance on a 4-way
Xeon system when I enable hyperthreading for my DBT-2 workload.  I also
tried Ingo's test11-C1 patch, which someone pointed me to, and saw only
a 12% decrease. Has anyone found this to be common with any specific
workloads?

I'm not really sure what to look for, but I do see some changes in the
readprofile data, which I've copied in part below.  It appears that
schedule, __make_request, and try_to_wake_up are the only functions at
the top of the profile whose counts are significantly different.
The links I have posted also point to oprofile data and annotated
assembly source output, if that interests anyone.  If I can
provide any other details, let me know.

For the test with no hyperthreading
(http://developer.osdl.org/markw/dbt2-pgsql/258/):

8199701 poll_idle                                141374.1552
159442 schedule                                  93.5144
133488 __copy_from_user_ll                      1059.4286
128589 __copy_to_user_ll                        1071.5750
 73209 DAC960_LP_InterruptHandler               367.8844
 51885 __make_request                            36.0062
 35614 try_to_wake_up                            55.6469


For the test with hyperthreading
(http://developer.osdl.org/markw/dbt2-pgsql/253/):

20893773 poll_idle                                360237.4655
351988 schedule                                 206.4446
155826 __copy_from_user_ll                      1038.8400
152346 __copy_to_user_ll                        1269.5500
 90983 DAC960_LP_InterruptHandler               457.2010
 86936 try_to_wake_up                           135.8375
 70122 __make_request                            48.6620


For the test with hyperthreading and Ingo's patch
(http://developer.osdl.org/markw/dbt2-pgsql/260/):

20544823 poll_idle                                354221.0862
520575 schedule                                 231.6756
159609 __copy_from_user_ll                      1266.7381
153279 __copy_to_user_ll                        1277.3250
139321 try_to_wake_up                           221.4960
 92447 DAC960_LP_InterruptHandler               464.5578
 72162 __make_request                            50.0777
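
The comparison described above can be sketched in a few lines of Python. This is only an illustration: it parses the three readprofile excerpts pasted in this message and prints each function's tick count relative to the run without hyperthreading. The names `NO_HT`, `HT`, `HT_INGO`, `parse_readprofile`, and `tick_ratios` are made up for the example, and the raw tick counts are not normalized for run length or for the number of logical CPUs, so the ratios are a rough guide at best.

```python
# Tick counts copied verbatim from the readprofile excerpts in this message.
# Columns are: ticks, function name, normalized load.
NO_HT = """
8199701 poll_idle 141374.1552
159442 schedule 93.5144
133488 __copy_from_user_ll 1059.4286
128589 __copy_to_user_ll 1071.5750
73209 DAC960_LP_InterruptHandler 367.8844
51885 __make_request 36.0062
35614 try_to_wake_up 55.6469
"""

HT = """
20893773 poll_idle 360237.4655
351988 schedule 206.4446
155826 __copy_from_user_ll 1038.8400
152346 __copy_to_user_ll 1269.5500
90983 DAC960_LP_InterruptHandler 457.2010
86936 try_to_wake_up 135.8375
70122 __make_request 48.6620
"""

HT_INGO = """
20544823 poll_idle 354221.0862
520575 schedule 231.6756
159609 __copy_from_user_ll 1266.7381
153279 __copy_to_user_ll 1277.3250
139321 try_to_wake_up 221.4960
92447 DAC960_LP_InterruptHandler 464.5578
72162 __make_request 50.0777
"""


def parse_readprofile(text):
    """Return {function_name: ticks} for lines of the form 'ticks name load'."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed lines
        ticks, name, _load = fields
        counts[name] = int(ticks)
    return counts


def tick_ratios(baseline, other):
    """Return other/baseline tick ratios for functions present in both runs."""
    return {
        name: other[name] / baseline[name]
        for name in baseline
        if name in other
    }


if __name__ == "__main__":
    base = parse_readprofile(NO_HT)
    for label, text in (("HT", HT), ("HT + Ingo's patch", HT_INGO)):
        ratios = tick_ratios(base, parse_readprofile(text))
        print(label)
        # Largest relative increase first.
        for name, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
            print(f"  {name:30s} {ratio:5.2f}x")
```

Sorting by ratio rather than by raw count makes it easier to see which functions grew the most relative to the baseline, independent of how large they were to begin with.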


-- 
Mark Wong - - markw@osdl.org
Open Source Development Lab Inc - A non-profit corporation
12725 SW Millikan Way - Suite 400 - Beaverton, OR 97005
(503) 626-2455 x 32 (office)
(503) 626-2436      (fax)
http://developer.osdl.org/markw/


* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Nick Piggin @ 2003-12-09  0:17 UTC (permalink / raw)
  To: markw; +Cc: linux-kernel, osdldbt-general

Hi Mark,
It could be cache contention, which I think is typically the reason
hyperthreading can hurt performance. It's basically impossible for
the scheduler to correct this automatically (access to performance
counters might make it slightly less impossible).

Probably the CPU hotplug interface would enable a tool to effectively
turn HT on or off and it would be up to an administrator to tune
performance.
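
As a rough illustration of such a tool, here is a Python sketch that works out which logical CPUs are secondary hyperthread siblings and prints the commands that would take them offline. It assumes the sysfs files `topology/thread_siblings_list` and `online` under `/sys/devices/system/cpu/cpuN/`, which exist in later kernels but not necessarily in the kernel discussed here. The function names are hypothetical, and the script is a dry run that changes nothing.

```python
from pathlib import Path

# Assumed sysfs location; present on later Linux kernels.
CPU_ROOT = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0,4' or '0-1' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def secondary_siblings(sibling_lists):
    """Given {cpu: sibling list text}, return the CPUs that are not the
    lowest-numbered thread of their core, in ascending order."""
    secondary = set()
    for cpu, text in sibling_lists.items():
        siblings = parse_cpu_list(text)
        if siblings and cpu != min(siblings):
            secondary.add(cpu)
    return sorted(secondary)


def read_sibling_lists(root=CPU_ROOT):
    """Collect thread_siblings_list for every cpuN directory that has one."""
    lists = {}
    for cpu_dir in root.glob("cpu[0-9]*"):
        path = cpu_dir / "topology" / "thread_siblings_list"
        if path.exists():
            lists[int(cpu_dir.name[3:])] = path.read_text()
    return lists


if __name__ == "__main__":
    # Dry run: only print what an administrator could run as root.
    for cpu in secondary_siblings(read_sibling_lists()):
        print(f"echo 0 > {CPU_ROOT}/cpu{cpu}/online")
```

Keeping the lowest-numbered thread of each core online is an arbitrary choice made for the sketch; any one sibling per core would do.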

You could try my scheduler patchset if you like. I have recently got
HT support working (it's against test11; you need to turn CONFIG_SMT
on), although if Ingo's patch doesn't help much, mine probably won't
either.


[-- Attachment #2: w26p21.gz --]
[-- Type: application/x-tar, Size: 13256 bytes --]


* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Mark Wong @ 2003-12-09  3:12 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel, osdldbt-general

Hi Nick,

Went ahead and tried your patch, but it looks like something's wrong.  If
this helps any:

Process postmaster (pid: 1086, threadinfo=f5c40000 task=f5c586b0)
Stack: f5c5007b 0000007b ffffffff c011f488 00000060 00010046 00000005 c322ccc0 
       c322c060 00000002 f5c5bdbc f686cb90 f686cb90 f5c41d4c f7f93940 f5ca6080 
       00000007 00000000 c322ccc0 00006f12 03d99f34 0000033d f5c586b0 f5c41dbc 
Call Trace:
 [<c011f488>] schedule+0x380/0x705
 [<c01ff412>] sys_semtimedop+0x460/0x530
 [<c011e5bb>] find_busiest_group+0x2bc/0x2e3
 [<c02c69ea>] p4_check_ctrs+0xab/0x11b      
 [<c011e5bb>] find_busiest_group+0x2bc/0x2e3
 [<c02c5a5d>] nmi_callback+0x25/0x29        
 [<c014218d>] buffered_rmqueue+0xea/0x199
 [<c010b8f5>] nmi_stack_correct+0x1e/0x2e
 [<c01422eb>] __alloc_pages+0xaf/0x334   
 [<c014007b>] generic_file_aio_write_nolock+0x298/0xa9e
 [<c014d0c2>] do_anonymous_page+0x16a/0x28b            
 [<c014d80b>] handle_mm_fault+0x101/0x1ad  
 [<c011c4f5>] do_page_fault+0x2fa/0x4fc  
 [<c011eb6d>] rebalance_tick+0x8a/0x91 
 [<c02c4368>] oprofile_add_sample+0x9b/0x117
 [<c02c69ea>] p4_check_ctrs+0xab/0x11b      
 [<c0111ec4>] sys_ipc+0x61/0x2ae      
 [<c02c5a5d>] nmi_callback+0x25/0x29
 [<c010c839>] do_nmi+0x39/0x5a      
 [<c010ad4d>] sysenter_past_esp+0x52/0x71
                                         
Code:  Bad EIP value.



* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Nick Piggin @ 2003-12-09  3:27 UTC (permalink / raw)
  To: Mark Wong; +Cc: linux-kernel, osdldbt-general



Mark Wong wrote:

>Hi Nick,
>
>Went ahead and tried your patch, but it looks like something's wrong.  If
>this helps any:
>

Doh. OK, did you manage to capture anything above this?

>
>Process postmaster (pid: 1086, threadinfo=f5c40000 task=f5c586b0)
>Stack: f5c5007b 0000007b ffffffff c011f488 00000060 00010046 00000005 c322ccc0 
>       c322c060 00000002 f5c5bdbc f686cb90 f686cb90 f5c41d4c f7f93940 f5ca6080 
>       00000007 00000000 c322ccc0 00006f12 03d99f34 0000033d f5c586b0 f5c41dbc 
>Call Trace:
> [<c011f488>] schedule+0x380/0x705
> [<c01ff412>] sys_semtimedop+0x460/0x530
> [<c011e5bb>] find_busiest_group+0x2bc/0x2e3
> [<c02c69ea>] p4_check_ctrs+0xab/0x11b      
> [<c011e5bb>] find_busiest_group+0x2bc/0x2e3
> [<c02c5a5d>] nmi_callback+0x25/0x29        
> [<c014218d>] buffered_rmqueue+0xea/0x199
> [<c010b8f5>] nmi_stack_correct+0x1e/0x2e
> [<c01422eb>] __alloc_pages+0xaf/0x334   
> [<c014007b>] generic_file_aio_write_nolock+0x298/0xa9e
> [<c014d0c2>] do_anonymous_page+0x16a/0x28b            
> [<c014d80b>] handle_mm_fault+0x101/0x1ad  
> [<c011c4f5>] do_page_fault+0x2fa/0x4fc  
> [<c011eb6d>] rebalance_tick+0x8a/0x91 
> [<c02c4368>] oprofile_add_sample+0x9b/0x117
> [<c02c69ea>] p4_check_ctrs+0xab/0x11b      
> [<c0111ec4>] sys_ipc+0x61/0x2ae      
> [<c02c5a5d>] nmi_callback+0x25/0x29
> [<c010c839>] do_nmi+0x39/0x5a      
> [<c010ad4d>] sysenter_past_esp+0x52/0x71
>                                         
>Code:  Bad EIP value.



* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Mark Wong @ 2003-12-09  3:37 UTC (permalink / raw)
  To: Nick Piggin; +Cc: linux-kernel, osdldbt-general

Oops, sorry, I must have overlooked it because it looked like garbage.  Um,
I'll copy it anyway just in case it helps, but it didn't echo cleanly for
some reason:

<4>>UU:b e 4                           
 aF AGSr u7f e0d0ress P00s03t80xe eainredc
  pcinci8 g eip::00000002 0 espo ince: cd11e1fbr  cdx<40 a00v60t
                              : f5c41d04
0d0:                                    
     000   es: 0000   ss: 0068


