* hyperthreading performance with dbt-2 on 2.6.0-test11
From: markw @ 2003-12-08 23:54 UTC
To: linux-kernel; +Cc: osdldbt-general
Hello, I have some data with hyperthreading I wanted to share.

I've seen about a 15% decrease in performance on a 4-way Xeon system
when I enable hyperthreading for my DBT-2 workload. I also tried Ingo's
test11-C1 patch, which someone pointed me to, and saw only a 12%
decrease. Has anyone found this to be common with any specific
workloads?

I'm not really sure what to look for, but I do see some changes in the
readprofile data, which I've copied in part below. It appears that
schedule, __make_request, and try_to_wake_up are the only functions at
the top of the profile whose counts are significantly different. The
links I have posted also have pointers to oprofile data as well as
annotated assembly source output, if that interests anyone. If I can
provide any other details, let me know.

For the test with no hyperthreading
(http://developer.osdl.org/markw/dbt2-pgsql/258/):
 8199701 poll_idle                  141374.1552
  159442 schedule                       93.5144
  133488 __copy_from_user_ll          1059.4286
  128589 __copy_to_user_ll            1071.5750
   73209 DAC960_LP_InterruptHandler    367.8844
   51885 __make_request                 36.0062
   35614 try_to_wake_up                 55.6469
For the test with hyperthreading
(http://developer.osdl.org/markw/dbt2-pgsql/253/):
20893773 poll_idle                  360237.4655
  351988 schedule                      206.4446
  155826 __copy_from_user_ll          1038.8400
  152346 __copy_to_user_ll            1269.5500
   90983 DAC960_LP_InterruptHandler    457.2010
   86936 try_to_wake_up                135.8375
   70122 __make_request                 48.6620
For the test with hyperthreading and Ingo's patch
(http://developer.osdl.org/markw/dbt2-pgsql/260/):
20544823 poll_idle                  354221.0862
  520575 schedule                      231.6756
  159609 __copy_from_user_ll          1266.7381
  153279 __copy_to_user_ll            1277.3250
  139321 try_to_wake_up                221.4960
   92447 DAC960_LP_InterruptHandler    464.5578
   72162 __make_request                 50.0777
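
To put rough numbers on "significantly different", the tick counts from
the three readprofile excerpts can be divided by the counts from the
no-hyperthreading run. The following is only a sketch in Python: the
counts are copied from the excerpts in this message, while the
dictionary names and the ratios() and report() helpers are made up for
illustration. Raw counts are probably not directly comparable between
runs (poll_idle alone grows by about 2.5x, presumably because more
logical CPUs are being sampled), so the ratios are a starting point
rather than a conclusion.

```python
# Tick counts copied from the readprofile excerpts in this message.
NO_HT = {
    "poll_idle": 8199701,
    "schedule": 159442,
    "__copy_from_user_ll": 133488,
    "__copy_to_user_ll": 128589,
    "DAC960_LP_InterruptHandler": 73209,
    "__make_request": 51885,
    "try_to_wake_up": 35614,
}
HT = {
    "poll_idle": 20893773,
    "schedule": 351988,
    "__copy_from_user_ll": 155826,
    "__copy_to_user_ll": 152346,
    "DAC960_LP_InterruptHandler": 90983,
    "try_to_wake_up": 86936,
    "__make_request": 70122,
}
HT_PATCHED = {
    "poll_idle": 20544823,
    "schedule": 520575,
    "__copy_from_user_ll": 159609,
    "__copy_to_user_ll": 153279,
    "try_to_wake_up": 139321,
    "DAC960_LP_InterruptHandler": 92447,
    "__make_request": 72162,
}


def ratios(baseline, other):
    """Return {function: other_count / baseline_count} for shared functions."""
    return {name: other[name] / baseline[name]
            for name in baseline if name in other}


def report():
    """Build one text line per function, largest HT ratio first."""
    ht = ratios(NO_HT, HT)
    patched = ratios(NO_HT, HT_PATCHED)
    lines = []
    for name in sorted(ht, key=ht.get, reverse=True):
        lines.append(f"{name:28s} {ht[name]:5.2f}x {patched[name]:5.2f}x")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

With these counts, schedule comes out at roughly 2.2x and
try_to_wake_up at roughly 2.4x with hyperthreading enabled, while the
copy routines stay below 1.2x.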
--
Mark Wong - - markw@osdl.org
Open Source Development Lab Inc - A non-profit corporation
12725 SW Millikan Way - Suite 400 - Beaverton, OR 97005
(503) 626-2455 x 32 (office)
(503) 626-2436 (fax)
http://developer.osdl.org/markw/
* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Nick Piggin @ 2003-12-09 0:17 UTC
To: markw; +Cc: linux-kernel, osdldbt-general
Hi Mark,
It could be cache contention, which I think is typically the reason
hyperthreading can hurt performance. It's basically impossible for
the scheduler to correct this automatically (access to performance
counters might make it slightly less impossible).
Probably the CPU hotplug interface would enable a tool to effectively
turn HT on or off and it would be up to an administrator to tune
performance.
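
On kernels that expose CPU hotplug through sysfs, such a tool could
look roughly like the sketch below. It assumes the sysfs files
/sys/devices/system/cpu/cpuN/online and
cpuN/topology/thread_siblings_list, a layout that is not described in
this thread and is treated here as an assumption; the helper names are
hypothetical, and writing to "online" needs root.

```python
import glob
import os

# Assumed sysfs location of per-CPU controls (not taken from this thread).
SYSFS_CPU = "/sys/devices/system/cpu"


def parse_cpu_list(text):
    """Expand a kernel CPU list such as '0,4' or '0-1,8-9' into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def secondary_siblings(sibling_lists):
    """Given one sibling-list string per CPU, return every CPU that is not
    the lowest-numbered thread of its core (the ones to take offline)."""
    secondary = set()
    for text in sibling_lists:
        siblings = parse_cpu_list(text)
        secondary.update(siblings[1:])
    return sorted(secondary)


def read_sibling_lists():
    """Read thread_siblings_list for every CPU that currently exposes it."""
    pattern = os.path.join(SYSFS_CPU, "cpu[0-9]*", "topology",
                           "thread_siblings_list")
    contents = []
    for path in glob.glob(pattern):
        with open(path) as f:
            contents.append(f.read())
    return contents


def set_online(cpu, online):
    """Write 1 or 0 to cpuN/online; requires root and hotplug support."""
    with open(os.path.join(SYSFS_CPU, f"cpu{cpu}", "online"), "w") as f:
        f.write("1" if online else "0")


if __name__ == "__main__":
    # Dry run: only report which logical CPUs would be taken offline.
    for cpu in secondary_siblings(read_sibling_lists()):
        print(f"would offline cpu{cpu}")  # set_online(cpu, False) to apply
```

The script only prints by default, so the administrator still decides
whether to apply the change for a given workload.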
You could try my scheduler patchset if you like. I have recently got
HT support working (it's against test11, and you need to turn
CONFIG_SMT on), although if Ingo's patch doesn't help much, mine
probably won't either.
[-- Attachment #2: w26p21.gz --]
[-- Type: application/x-tar, Size: 13256 bytes --]
* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Mark Wong @ 2003-12-09 3:12 UTC
To: Nick Piggin; +Cc: linux-kernel, osdldbt-general
Hi Nick,
Went ahead and tried your patch, but it looks like something's wrong. If
this helps any:
Process postmaster (pid: 1086, threadinfo=f5c40000 task=f5c586b0)
Stack: f5c5007b 0000007b ffffffff c011f488 00000060 00010046 00000005 c322ccc0
c322c060 00000002 f5c5bdbc f686cb90 f686cb90 f5c41d4c f7f93940 f5ca6080
00000007 00000000 c322ccc0 00006f12 03d99f34 0000033d f5c586b0 f5c41dbc
Call Trace: [<c011f488>] schedule+0x380/0x705
[<c01ff412>] sys_semtimedop+0x460/0x530
[<c011e5bb>] find_busiest_group+0x2bc/0x2e3
[<c02c69ea>] p4_check_ctrs+0xab/0x11b
[<c011e5bb>] find_busiest_group+0x2bc/0x2e3
[<c02c5a5d>] nmi_callback+0x25/0x29
[<c014218d>] buffered_rmqueue+0xea/0x199
[<c010b8f5>] nmi_stack_correct+0x1e/0x2e
[<c01422eb>] __alloc_pages+0xaf/0x334
[<c014007b>] generic_file_aio_write_nolock+0x298/0xa9e
[<c014d0c2>] do_anonymous_page+0x16a/0x28b
[<c014d80b>] handle_mm_fault+0x101/0x1ad
[<c011c4f5>] do_page_fault+0x2fa/0x4fc
[<c011eb6d>] rebalance_tick+0x8a/0x91
[<c02c4368>] oprofile_add_sample+0x9b/0x117
[<c02c69ea>] p4_check_ctrs+0xab/0x11b
[<c0111ec4>] sys_ipc+0x61/0x2ae
[<c02c5a5d>] nmi_callback+0x25/0x29
[<c010c839>] do_nmi+0x39/0x5a
[<c010ad4d>] sysenter_past_esp+0x52/0x71
Code: Bad EIP value.
* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Nick Piggin @ 2003-12-09 3:27 UTC
To: Mark Wong; +Cc: linux-kernel, osdldbt-general
Mark Wong wrote:
>Hi Nick,
>
>Went ahead and tried your patch, but it looks like something's wrong. If
>this helps any:
>
Doh. OK, did you manage to capture anything above this?
* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Mark Wong @ 2003-12-09 3:37 UTC
To: Nick Piggin; +Cc: linux-kernel, osdldbt-general
Oops, sorry, I must have overlooked it because it looked like garbage.
Um, I'll copy it anyway just in case it helps, but it didn't echo
cleanly for some reason:
<4>>UU:b e 4
aF AGSr u7f e0d0ress P00s03t80xe eainredc
pcinci8 g eip::00000002 0 espo ince: cd11e1fbr cdx<40 a00v60t
: f5c41d04
0d0:
000 es: 0000 ss: 0068