* hyperthreading performance with dbt-2 on 2.6.0-test11
From: markw @ 2003-12-08 23:54 UTC
To: linux-kernel; +Cc: osdldbt-general

Hello, I have some data with hyperthreading I wanted to share.

I've seen about a 15% decrease in performance on a 4-way Xeon system
when I enable hyperthreading for my DBT-2 workload.  I also gave Ingo's
test11-C1 patch that someone pointed me to a try and only saw a 12%
decrease.  Has anyone found this to be common with any specific
workloads?

I'm not really sure what to look for, but I do see some changes in the
readprofile data, which I've copied in part below.  It appears that
schedule, __make_request, and try_to_wake_up are the only functions at
the top of the profile whose counts are significantly different.
The links I have posted also have pointers to oprofile data as well as
annotated assembly source output, if that interests anyone.  If I can
provide any other details, let me know.

For the test with no hyperthreading
(http://developer.osdl.org/markw/dbt2-pgsql/258/):
 8199701 poll_idle                  141374.1552
  159442 schedule                       93.5144
  133488 __copy_from_user_ll          1059.4286
  128589 __copy_to_user_ll            1071.5750
   73209 DAC960_LP_InterruptHandler    367.8844
   51885 __make_request                 36.0062
   35614 try_to_wake_up                 55.6469

For the test with hyperthreading
(http://developer.osdl.org/markw/dbt2-pgsql/253/):
20893773 poll_idle                  360237.4655
  351988 schedule                      206.4446
  155826 __copy_from_user_ll          1038.8400
  152346 __copy_to_user_ll            1269.5500
   90983 DAC960_LP_InterruptHandler    457.2010
   86936 try_to_wake_up                135.8375
   70122 __make_request                 48.6620

For the test with hyperthreading and Ingo's patch
(http://developer.osdl.org/markw/dbt2-pgsql/260/):
20544823 poll_idle                  354221.0862
  520575 schedule                      231.6756
  159609 __copy_from_user_ll          1266.7381
  153279 __copy_to_user_ll            1277.3250
  139321 try_to_wake_up                221.4960
   92447 DAC960_LP_InterruptHandler    464.5578
   72162 __make_request                 50.0777

-- 
Mark Wong - - markw@osdl.org
Open Source Development Lab Inc - A non-profit corporation
12725 SW Millikan Way - Suite 400 - Beaverton, OR 97005
(503) 626-2455 x 32 (office)
(503) 626-2436      (fax)
http://developer.osdl.org/markw/
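
One rough way to see which functions changed most between the runs is to divide each tick count by its no-hyperthreading baseline. The sketch below does that with the counts copied from the readprofile excerpts in this message. The helper name `tick_ratios` and the dictionary names are made up for illustration. The raw counts are not normalized for the number of logical CPUs or for run length, so the ratios are only a first-pass comparison, not a measurement of where the lost throughput went.

```python
# Tick counts copied from the readprofile excerpts in the message above.
NO_HT = {
    "poll_idle": 8199701,
    "schedule": 159442,
    "__copy_from_user_ll": 133488,
    "__copy_to_user_ll": 128589,
    "DAC960_LP_InterruptHandler": 73209,
    "__make_request": 51885,
    "try_to_wake_up": 35614,
}
HT = {
    "poll_idle": 20893773,
    "schedule": 351988,
    "__copy_from_user_ll": 155826,
    "__copy_to_user_ll": 152346,
    "DAC960_LP_InterruptHandler": 90983,
    "try_to_wake_up": 86936,
    "__make_request": 70122,
}
HT_PATCHED = {
    "poll_idle": 20544823,
    "schedule": 520575,
    "__copy_from_user_ll": 159609,
    "__copy_to_user_ll": 153279,
    "try_to_wake_up": 139321,
    "DAC960_LP_InterruptHandler": 92447,
    "__make_request": 72162,
}


def tick_ratios(baseline, other):
    """Return {function: other_ticks / baseline_ticks} for functions in both profiles."""
    return {name: other[name] / baseline[name] for name in baseline if name in other}


if __name__ == "__main__":
    for label, profile in (("HT", HT), ("HT + Ingo's patch", HT_PATCHED)):
        print(f"{label} relative to no HT:")
        ratios = tick_ratios(NO_HT, profile)
        # Largest relative increase first.
        for name, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
            print(f"  {name:28s} {ratio:5.2f}x")
```

Sorting by ratio rather than by raw count keeps poll_idle, which dominates every profile, from hiding the smaller functions that grew the most.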
* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Nick Piggin @ 2003-12-09 0:17 UTC
To: markw; +Cc: linux-kernel, osdldbt-general

markw@osdl.org wrote:

>Hello, I have some data with hyperthreading I wanted to share.
>
>I've seen about a 15% decrease in performance on a 4-way Xeon system
>when I enable hyperthreading for my DBT-2 workload.  I also gave Ingo's
>test11-C1 patch that someone pointed me to a try and only saw a 12%
>decrease.  Has anyone found this to be common with any specific
>workloads?
>
>I'm not really sure what to look for, but I do see some changes in the
>readprofile data, which I've copied in part below.  It appears that
>schedule, __make_request, and try_to_wake_up are the only functions at
>the top of the profile whose counts are significantly different.
>The links I have posted also have pointers to oprofile data as well as
>annotated assembly source output, if that interests anyone.  If I can
>provide any other details, let me know.
>

Hi Mark,
It could be cache contention, which I think is typically the reason
hyperthreading can hurt performance. It's basically impossible for
the scheduler to correct this automatically (access to performance
counters might make it slightly less impossible).

Probably the CPU hotplug interface would enable a tool to effectively
turn HT on or off and it would be up to an administrator to tune
performance.

You could try my scheduler patchset if you like. I have recently got
HT support working (it's against test11, you need to turn CONFIG_SMT
on), although if Ingo's patch doesn't help much, mine probably won't
either.

[-- Attachment #2: w26p21.gz --]
[-- Type: application/x-tar, Size: 13256 bytes --]
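
To make the hotplug idea concrete: on later kernels that expose CPU topology and hotplug through sysfs, a tool along the lines Nick describes could work out which logical CPUs are secondary hyperthreads and take them offline. The sketch below is a dry run that only prints what it would do. It assumes the `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list` and `/sys/devices/system/cpu/cpuN/online` files found on modern Linux, which were not available in 2.6.0-test11, and the function names are made up for illustration.

```python
import glob
import re


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0,4" or "0-1,8-9" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def secondary_siblings(sibling_lists):
    """Return the CPUs to offline so that one hardware thread per core stays online.

    sibling_lists maps a CPU number to the text of its thread_siblings_list.
    The lowest-numbered sibling of each core is kept.
    """
    to_offline = set()
    for text in sibling_lists.values():
        siblings = parse_cpu_list(text)
        to_offline.update(siblings[1:])
    return sorted(to_offline)


if __name__ == "__main__":
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = {}
    for path in glob.glob(pattern):
        cpu = int(re.search(r"cpu(\d+)/topology", path).group(1))
        with open(path) as f:
            lists[cpu] = f.read()
    # Dry run: print the writes a real tool would perform as root.
    for cpu in secondary_siblings(lists):
        print(f"echo 0 > /sys/devices/system/cpu/cpu{cpu}/online")
```

Keeping the topology parsing separate from the sysfs access means the policy can be checked against made-up sibling lists before anything is written to a live system.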
* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Mark Wong @ 2003-12-09 3:12 UTC
To: Nick Piggin; +Cc: linux-kernel, osdldbt-general

On Tue, Dec 09, 2003 at 11:17:26AM +1100, Nick Piggin wrote:
> Hi Mark,
> It could be cache contention, which I think is typically the reason
> hyperthreading can hurt performance. It's basically impossible for
> the scheduler to correct this automatically (access to performance
> counters might make it slightly less impossible).
>
> Probably the CPU hotplug interface would enable a tool to effectively
> turn HT on or off and it would be up to an administrator to tune
> performance.
>
> You could try my scheduler patchset if you like. I have recently got
> HT support working (it's against test11, you need to turn CONFIG_SMT
> on), although if Ingo's patch doesn't help much, mine probably won't
> either.

Hi Nick,

Went ahead and tried your patch, but it looks like something's wrong.
If this helps any:

Process postmaster (pid: 1086, threadinfo=f5c40000 task=f5c586b0)
Stack: f5c5007b 0000007b ffffffff c011f488 00000060 00010046 00000005 c322ccc0
       c322c060 00000002 f5c5bdbc f686cb90 f686cb90 f5c41d4c f7f93940 f5ca6080
       00000007 00000000 c322ccc0 00006f12 03d99f34 0000033d f5c586b0 f5c41dbc
Call Trace:
 [<c011f488>] schedule+0x380/0x705
 [<c01ff412>] sys_semtimedop+0x460/0x530
 [<c011e5bb>] find_busiest_group+0x2bc/0x2e3
 [<c02c69ea>] p4_check_ctrs+0xab/0x11b
 [<c011e5bb>] find_busiest_group+0x2bc/0x2e3
 [<c02c5a5d>] nmi_callback+0x25/0x29
 [<c014218d>] buffered_rmqueue+0xea/0x199
 [<c010b8f5>] nmi_stack_correct+0x1e/0x2e
 [<c01422eb>] __alloc_pages+0xaf/0x334
 [<c014007b>] generic_file_aio_write_nolock+0x298/0xa9e
 [<c014d0c2>] do_anonymous_page+0x16a/0x28b
 [<c014d80b>] handle_mm_fault+0x101/0x1ad
 [<c011c4f5>] do_page_fault+0x2fa/0x4fc
 [<c011eb6d>] rebalance_tick+0x8a/0x91
 [<c02c4368>] oprofile_add_sample+0x9b/0x117
 [<c02c69ea>] p4_check_ctrs+0xab/0x11b
 [<c0111ec4>] sys_ipc+0x61/0x2ae
 [<c02c5a5d>] nmi_callback+0x25/0x29
 [<c010c839>] do_nmi+0x39/0x5a
 [<c010ad4d>] sysenter_past_esp+0x52/0x71

Code: Bad EIP value.
* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Nick Piggin @ 2003-12-09 3:27 UTC
To: Mark Wong; +Cc: linux-kernel, osdldbt-general

Mark Wong wrote:

>Hi Nick,
>
>Went ahead and tried your patch, but it looks like something's wrong.  If
>this helps any:
>

Doh. OK, did you manage to capture anything above this?

>
>Process postmaster (pid: 1086, threadinfo=f5c40000 task=f5c586b0)
>Stack: f5c5007b 0000007b ffffffff c011f488 00000060 00010046 00000005 c322ccc0
>       c322c060 00000002 f5c5bdbc f686cb90 f686cb90 f5c41d4c f7f93940 f5ca6080
>       00000007 00000000 c322ccc0 00006f12 03d99f34 0000033d f5c586b0 f5c41dbc
>Call Trace: [<c011f488>] schedule+0x380/0x705
> [<c01ff412>] sys_semtimedop+0x460/0x530
> [<c011e5bb>] find_busiest_group+0x2bc/0x2e3
> [<c02c69ea>] p4_check_ctrs+0xab/0x11b
> [<c011e5bb>] find_busiest_group+0x2bc/0x2e3
> [<c02c5a5d>] nmi_callback+0x25/0x29
> [<c014218d>] buffered_rmqueue+0xea/0x199
> [<c010b8f5>] nmi_stack_correct+0x1e/0x2e
> [<c01422eb>] __alloc_pages+0xaf/0x334
> [<c014007b>] generic_file_aio_write_nolock+0x298/0xa9e
> [<c014d0c2>] do_anonymous_page+0x16a/0x28b
> [<c014d80b>] handle_mm_fault+0x101/0x1ad
> [<c011c4f5>] do_page_fault+0x2fa/0x4fc
> [<c011eb6d>] rebalance_tick+0x8a/0x91
> [<c02c4368>] oprofile_add_sample+0x9b/0x117
> [<c02c69ea>] p4_check_ctrs+0xab/0x11b
> [<c0111ec4>] sys_ipc+0x61/0x2ae
> [<c02c5a5d>] nmi_callback+0x25/0x29
> [<c010c839>] do_nmi+0x39/0x5a
> [<c010ad4d>] sysenter_past_esp+0x52/0x71
>
>Code: Bad EIP value.
* Re: hyperthreading performance with dbt-2 on 2.6.0-test11
From: Mark Wong @ 2003-12-09 3:37 UTC
To: Nick Piggin; +Cc: linux-kernel, osdldbt-general

Oops, sorry, I must have overlooked it because it looked like garbage.
Um, I'll copy it anyway just in case it helps, but it didn't echo
cleanly for some reason:

<4>>UU:b e 4 aF AGSr u7f e0d0ress P00s03t80xe eainredc pcinci8 g eip::00000002 0 espo ince: cd11e1fbr cdx<40 a00v60t : f5c41d04 0d0: 000 es: 0000 ss: 0068

On Tue, Dec 09, 2003 at 02:27:57PM +1100, Nick Piggin wrote:
> Doh. OK, did you manage to capture anything above this?

-- 
Mark Wong - - markw@osdl.org
Open Source Development Lab Inc - A non-profit corporation
12725 SW Millikan Way - Suite 400 - Beaverton, OR 97005
(503) 626-2455 x 32 (office)
(503) 626-2436      (fax)
http://developer.osdl.org/markw/