* [PATCH 1/1] Change IBS PMU to use perf_hw_context
@ 2012-12-14 20:57 suravee.suthikulpanit
  2012-12-16  9:04 ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: suravee.suthikulpanit @ 2012-12-14 20:57 UTC
  To: a.p.zijlstra, paulus, mingo, acme, tglx, hpa, rric, x86, linux-kernel
  Cc: suravee.suthikulpanit, jacob.shin

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

Currently, the AMD IBS PMU initializes pmu.task_ctx_nr to
perf_invalid_context, which allows IBS to run only in
system-wide mode (e.g. perf record -a). IBS hardware is
available in each core and should be per-context. This patch
modifies task_ctx_nr to use perf_hw_context (the default)
instead.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

diff --git a/arch/x86/kernel/cpu/perf_event_amd_ibs.c b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
index 6336bcb..08fa71a 100644
--- a/arch/x86/kernel/cpu/perf_event_amd_ibs.c
+++ b/arch/x86/kernel/cpu/perf_event_amd_ibs.c
@@ -466,8 +466,6 @@ static struct attribute *ibs_op_format_attrs[] = {
 
 static struct perf_ibs perf_ibs_fetch = {
 	.pmu = {
-		.task_ctx_nr	= perf_invalid_context,
-
 		.event_init	= perf_ibs_init,
 		.add		= perf_ibs_add,
 		.del		= perf_ibs_del,
@@ -490,8 +488,6 @@ static struct perf_ibs perf_ibs_fetch = {
 
 static struct perf_ibs perf_ibs_op = {
 	.pmu = {
-		.task_ctx_nr	= perf_invalid_context,
-
 		.event_init	= perf_ibs_init,
 		.add		= perf_ibs_add,
 		.del		= perf_ibs_del,
-- 
1.7.10.4
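
The effect of task_ctx_nr described in the changelog above can be
sketched with a small user-space model. This is only an illustration
under stated assumptions, not kernel code: the enum values are assumed
to mirror the kernel's enum perf_event_task_context
(perf_invalid_context = -1, perf_hw_context = 0), while struct pmu_model,
model_find_context() and the two ibs_op_* objects are hypothetical names
invented for this example. It also shows why simply deleting the
initializer is enough: an omitted field is zero-initialized, and zero is
the hw context.

```c
#include <errno.h>

/* Assumed to mirror the kernel's enum perf_event_task_context. */
enum model_task_context {
	model_invalid_context = -1,
	model_hw_context = 0,
	model_sw_context,
};

/* Hypothetical stand-in for the one struct pmu field the patch touches. */
struct pmu_model {
	const char *name;
	int task_ctx_nr;
};

/*
 * Hypothetical model of the context lookup done when an event is opened.
 * task_bound is nonzero for a per-task event and zero for a CPU-wide one.
 * Returns 0 if the event can be attached, -EINVAL otherwise.
 */
static int model_find_context(const struct pmu_model *pmu, int task_bound)
{
	if (!task_bound)
		return 0;		/* CPU-wide events use the per-CPU context */
	if (pmu->task_ctx_nr < 0)
		return -EINVAL;		/* no task context: per-task use is rejected */
	return 0;
}

/* Before the patch: explicitly marked as having no task context. */
static const struct pmu_model ibs_op_before = {
	.name = "ibs_op",
	.task_ctx_nr = model_invalid_context,
};

/* After the patch: field omitted, so it is zero-initialized (hw context). */
static const struct pmu_model ibs_op_after = {
	.name = "ibs_op",
};
```

In this model, model_find_context(&ibs_op_before, 1) fails with -EINVAL
while model_find_context(&ibs_op_after, 1) succeeds, which corresponds to
"perf record -e cycles:p <cmd>" being refused before the change and
accepted after it. The CPU-wide case (task_bound == 0) succeeds for both.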
* Re: [PATCH 1/1] Change IBS PMU to use perf_hw_context
  2012-12-14 20:57 [PATCH 1/1] Change IBS PMU to use perf_hw_context suravee.suthikulpanit
@ 2012-12-16  9:04 ` Ingo Molnar
  2012-12-17  9:44   ` Robert Richter
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2012-12-16 9:04 UTC
  To: suravee.suthikulpanit
  Cc: a.p.zijlstra, paulus, mingo, acme, tglx, hpa, rric, x86, linux-kernel, jacob.shin

* suravee.suthikulpanit@amd.com <suravee.suthikulpanit@amd.com> wrote:

> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>
> Currently, the AMD IBS PMU initializes pmu.task_ctx_nr to
> perf_invalid_context, which allows IBS to run only in
> system-wide mode (e.g. perf record -a). IBS hardware is
> available in each core and should be per-context. This patch
> modifies task_ctx_nr to use perf_hw_context (the default)
> instead.

I'm wondering how extensively it was tested/verified that it's
safe to enable IBS in per-context mode as well, and that the
profiling results are precise and accurate?

We never used the IBS hardware in this fashion before, so some
extra care is prudent - and traces of that extra care should be
visible in the changelog as well.

Thanks,

	Ingo
* Re: [PATCH 1/1] Change IBS PMU to use perf_hw_context
  2012-12-16  9:04 ` Ingo Molnar
@ 2012-12-17  9:44   ` Robert Richter
  2012-12-18 22:54     ` Suravee Suthikulpanit
  0 siblings, 1 reply; 6+ messages in thread
From: Robert Richter @ 2012-12-17 9:44 UTC
  To: Ingo Molnar
  Cc: suravee.suthikulpanit, a.p.zijlstra, paulus, mingo, acme, tglx, hpa, x86, linux-kernel, jacob.shin

On 16.12.12 10:04:10, Ingo Molnar wrote:
>
> * suravee.suthikulpanit@amd.com <suravee.suthikulpanit@amd.com> wrote:
>
> > From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> >
> > Currently, the AMD IBS PMU initializes pmu.task_ctx_nr to
> > perf_invalid_context, which allows IBS to run only in
> > system-wide mode (e.g. perf record -a). IBS hardware is
> > available in each core and should be per-context. This patch
> > modifies task_ctx_nr to use perf_hw_context (the default)
> > instead.
>
> I'm wondering how extensively it was tested/verified that it's
> safe to enable IBS in per-context mode as well, and that the
> profiling results are precise and accurate?

From the implementation's point of view this is very similar to hw
perf counters. I wouldn't expect any issues here. Since IBS can be
immediately started/stopped and there is no caching, there won't be any
incoming sample that is not related to that context.

The only potential problem I see could be a security risk, in that
an IBS sample might expose data related to other contexts such as
cache information. This is similar to uncore/northbridge events so I
don't think this is an issue, but we might want to evaluate this.

> We never used the IBS hardware in this fashion before, so some
> extra care is prudent - and traces of that extra care should be
> visible in the changelog as well.

Yeah, a comparison of numbers for IBS and hw counter (-e r076:p,r076
and -e r0C1:p,r0C1) in per-context mode would be useful here.

-Robert
* Re: [PATCH 1/1] Change IBS PMU to use perf_hw_context
  2012-12-17  9:44   ` Robert Richter
@ 2012-12-18 22:54     ` Suravee Suthikulpanit
  2013-01-16 22:19       ` Suravee Suthikulpanit
  0 siblings, 1 reply; 6+ messages in thread
From: Suravee Suthikulpanit @ 2012-12-18 22:54 UTC
  To: Robert Richter, Ingo Molnar
  Cc: a.p.zijlstra, paulus, mingo, acme, tglx, hpa, x86, linux-kernel, jacob.shin, suravee.suthikulpanit

Ingo, Robert

I am including a set of output from "perf report" to help validate IBS in per-process mode.
In this experiment I ran three test cases:

case 1. perf record -e cycles       (baseline per-process mode w/ regular counter)
case 2. perf record -a -e cycles:p  (baseline system-wide mode w/ IBS)
case 3. perf record -e cycles:p     (the proposed per-process mode w/ IBS)

In all 3 test cases, the target application (classic) shows about 27K samples.
I am also including snapshots of the IBS OP MSRs (0xc00110[33-3a]) on all 32 cores
(taken with the rdmsr tool) from cases 2 and 3 above.

------------------------------------------------------------
CASE 1:

# ========
# captured on: Tue Dec 18 16:32:43 2012
# hostname : sos-dev02
# os release : 3.7.0-IBS+
# perf version : 3.7.rc8.g805f38
# arch : x86_64
# nrcpus online : 32
# nrcpus avail : 32
# cpudesc : AMD Eng Sample, 1S228145TGG54_31/22/20_2/16
# cpuid : AuthenticAMD,21,2,0
# total memory : 32863836 kB
# cmdline : /sandbox/kernels/suravee/tools/perf/perf record -e cycles taskset -c 31 src/classic
# event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 1, precise_ip = 0, id = { 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, tracepoint = 2, ibs_fetch = 6, ibs_op = 7, breakpoint = 5
# ========
#
# Samples: 27K of event 'cycles'
# Event count (approx.): 20938245323
#
# Overhead  Samples      Command  Shared Object      Symbol
# ........  ...........  .......  .................  .......................................
#
    99.16%        26927  classic  classic            [.] multiply_matrices()   <--- TARGET APP
     0.32%           78  classic  libc-2.15.so       [.] random
     0.10%           23  classic  libc-2.15.so       [.] random_r
     0.07%           16  classic  classic            [.] initialize_matrices()
     0.04%           10  classic  [kernel.kallsyms]  [k] ttwu_do_wakeup
     0.03%            9  classic  [kernel.kallsyms]  [k] clear_page_c
     0.02%           11  classic  [kernel.kallsyms]  [k] native_write_msr_safe
     0.02%            5  classic  libc-2.15.so       [.] rand
     0.02%            2  classic  ld-2.15.so         [.] 0x000000000000a456

------------------------------------------------------------
CASE 2:

# ========
# captured on: Tue Dec 18 16:11:35 2012
# hostname : sos-dev02
# os release : 3.7.0-IBS+
# perf version : 3.7.rc8.g805f38
# arch : x86_64
# nrcpus online : 32
# nrcpus avail : 32
# cpudesc : AMD Eng Sample, 1S228145TGG54_31/22/20_2/16
# cpuid : AuthenticAMD,21,2,0
# total memory : 32863836 kB
# cmdline : /sandbox/kernels/suravee/tools/perf/perf record -a -e cycles:p taskset -c 31 src/classic
# event : name = cycles:p, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 0, precise_ip = 1, id = { 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, tracepoint = 2, ibs_fetch = 6, ibs_op = 7, breakpoint = 5
# ========
#
# Samples: 189K of event 'cycles:p'
# Event count (approx.): 40504131338
#
# Overhead  Samples      Command          Shared Object                     Symbol
# ........  ...........  ...............  ................................  ...........................................
#
    51.07%        26959  classic          classic                           [.] multiply_matrices()   <------ TARGET APP
    35.39%       131620  swapper          [kernel.kallsyms]                 [k] acpi_idle_do_entry
     2.10%         4673  swapper          [kernel.kallsyms]                 [k] native_safe_halt
     0.71%         1303  rdmsr            ld-2.15.so                        [.] 0x0000000000002a44
     0.33%          639  rdmsr            [kernel.kallsyms]                 [k] irq_return
     0.26%          499  rdmsr            libc-2.15.so                      [.] 0x0000000000131d80
     0.25%          440  rdmsr            [kernel.kallsyms]                 [k] generic_exec_single
     0.25%          470  rdmsr            [kernel.kallsyms]                 [k] __do_fault
     0.24%          478  rdmsr            [kernel.kallsyms]                 [k] unmap_single_vma

------------------------------------------------------------
CASE 3:

# ========
# captured on: Tue Dec 18 16:13:53 2012
# hostname : sos-dev02
# os release : 3.7.0-IBS+
# perf version : 3.7.rc8.g805f38
# arch : x86_64
# nrcpus online : 32
# nrcpus avail : 32
# cpudesc : AMD Eng Sample, 1S228145TGG54_31/22/20_2/16
# cpuid : AuthenticAMD,21,2,0
# total memory : 32863836 kB
# cmdline : /sandbox/kernels/suravee/tools/perf/perf record -e cycles:p taskset -c 31 src/classic
# event : name = cycles:p, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 0, precise_ip = 1, id = { 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131 }
# HEADER_CPU_TOPOLOGY info available, use -I to display
# HEADER_NUMA_TOPOLOGY info available, use -I to display
# pmu mappings: cpu = 4, software = 1, tracepoint = 2, ibs_fetch = 6, ibs_op = 7, breakpoint = 5
# ========
#
# Samples: 27K of event 'cycles:p'
# Event count (approx.): 20851884446
#
# Overhead  Samples      Command  Shared Object      Symbol
# ........  ...........  .......  .................  ..............................
#
    99.37%        27020  classic  classic            [.] multiply_matrices()   <--- TARGET APP
     0.22%           58  classic  libc-2.15.so       [.] random_r
     0.13%           32  classic  classic            [.] initialize_matrices()
     0.10%           26  classic  libc-2.15.so       [.] random
     0.03%            8  classic  libc-2.15.so       [.] rand
     0.03%            7  classic  [kernel.kallsyms]  [k] clear_page_c
     0.01%            2  classic  ld-2.15.so         [.] 0x000000000000a423
     0.01%            2  classic  [kernel.kallsyms]  [k] retint_swapgs
     0.01%            2  classic  [kernel.kallsyms]  [k] ttwu_do_wakeup

------------------------------------------------------------
IBS MSR VALUES FROM CASE 2:

core : 0xc0011033       0xc0011034       0xc0011035       0xc0011036       0xc0011037       0xc0011038       0xc0011039       0xc001103a
   0 : 0000002300040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040002 0000000000000000 000000fdfd300000 0000000000000100
   1 : 0000006200040000 ffffffff813d8c6d 00000058000b0002 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100
   2 : 0000006000040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040002 0000000000000000 000000fdfd300000 0000000000000100
   3 : 0000005000040000 ffffffff813d8c6d 00000040000b0002 0000000000000000 0000000000040001 0000000000000400 000000fdfd300400 0000000000000100
   4 : 0000005700040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100
   5 : 0000000000000000 ffffffff81043ea8 00000000000a0001 0000000000000000 0000000000000000 0000000000000514 000000fdfd300514 0000000000000100
   6 : 0000004200040000 ffffffff813d8c74 00000000000b0006 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100
   7 : 0000000000000000 ffffffff81043ea8 00000000000a0000 0000000000000000 0000000000000000 0000000000000514 000000fdfd300514 0000000000000100
   8 : 0000002300040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100
   9 : 0000004d00040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040002 0000000000000400 000000fdfd300400 0000000000000100
  10 : 00001fe500000000 ffffffff813d8c6d 00000058000b0002 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100
  11 : 0000000000000000 ffffffff81043ea8 00000000000a0001 0000000000000000 0000000000040001 0000000000000514 000000fdfd300514 0000000000000100
  12 : 0000008100040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100
  13 : 0000006900040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040002 0000000000000400 000000fdfd300400 0000000000000100
  14 : 0000004900040000 ffffffff813d8c6d 00000040000b0002 0000000000000000 0000000000040001 0000000000000000 000000fdfd300000 0000000000000100
  15 : 0000002300040000 ffffffff813d8c6d 00000040000b0002 0000000000000000 0000000000040001 0000000000000400 000000fdfd300400 0000000000000100
  16 : 0000000f00040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100
  17 : 0000004b00040000 ffffffff813d8c6d 00000058000b0002 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100
  18 : 0000003d00040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 00000000000002b8 000000fdfd3002b8 0000000000000100
  19 : 0000004400040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100
  20 : 0000001800040000 ffffffff813d8d27 0000000000060001 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100
  21 : 0000002900040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100
  22 : 0000005900040000 ffffffff813d8c6d 00000040000b0002 0000000000000000 0000000000040001 0000000000000000 000000fdfd300000 0000000000000100
  23 : 0000001500040000 ffffffff813d8c6d 00000058000b0002 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100
  24 : 0000006100040000 ffffffff8133e844 00000028001e000f 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100
  25 : 0000005400040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100
  26 : 0000002900040000 ffffffff813d8c6d 00000040000b0002 0000000000000000 0000000000040001 0000000000000000 000000fdfd300000 0000000000000100
  27 : 0000000e00040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100
  28 : 0000007e00040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040042 0000000000000000 000000fdfd300000 0000000000000100
  29 : 0000000000000000 ffffffff81043ea8 00000000000a0001 0000000000000000 0000000000040001 0000000000000514 000000fdfd300514 0000000000000100
  30 : 00003f4a00000000 ffffffff813d8c6d 00000040000b0006 0000000000000000 000000000004000a 0000000000000000 000000fdfd300000 0000000000000100
  31 : 0001147800000000 ffffffff810b9400 00000000003c0001 0000000000000000 0000000000040009 00000000000005dc 000000fdfd3005dc 0000000000000100

------------------------------------------------------------
IBS MSR VALUES FROM CASE 3:

core : 0xc0011033       0xc0011034       0xc0011035       0xc0011036       0xc0011037       0xc0011038       0xc0011039       0xc001103a
   0 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
   1 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
   2 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
   3 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
   4 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
   5 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
   6 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
   7 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
   8 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
   9 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  10 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  11 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  12 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  13 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  14 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  15 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  16 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  17 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  18 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  19 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  20 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  21 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  22 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  23 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  24 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  25 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  26 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  27 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  28 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  29 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  30 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100
  31 : 00034d8900000000 ffffffff811370cc 0000000000120000 0000000000000000 0000000000000008 ffff88082592bc58 000000082592bc58 0000000000000100

Suravee

On Mon, 2012-12-17 at 10:44 +0100, Robert Richter wrote:
> On 16.12.12 10:04:10, Ingo Molnar wrote:
> >
> > * suravee.suthikulpanit@amd.com <suravee.suthikulpanit@amd.com> wrote:
> >
> > > From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> > >
> > > Currently, the AMD IBS PMU initializes pmu.task_ctx_nr to
> > > perf_invalid_context, which allows IBS to run only in
> > > system-wide mode (e.g. perf record -a). IBS hardware is
> > > available in each core and should be per-context. This patch
> > > modifies task_ctx_nr to use perf_hw_context (the default)
> > > instead.
> >
> > I'm wondering how extensively it was tested/verified that it's
> > safe to enable IBS in per-context mode as well, and that the
> > profiling results are precise and accurate?
>
> From the implementation's point of view this is very similar to hw
> perf counters. I wouldn't expect any issues here. Since IBS can be
> immediately started/stopped and there is no caching, there won't be any
> incoming sample that is not related to that context.
>
> The only potential problem I see could be a security risk, in that
> an IBS sample might expose data related to other contexts such as
> cache information. This is similar to uncore/northbridge events so I
> don't think this is an issue, but we might want to evaluate this.
>
> > We never used the IBS hardware in this fashion before, so some
> > extra care is prudent - and traces of that extra care should be
> > visible in the changelog as well.
>
> Yeah, a comparison of numbers for IBS and hw counter (-e r076:p,r076
> and -e r0C1:p,r0C1) in per-context mode would be useful here.
>
> -Robert
>
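
The "about 27K samples in all 3 test cases" observation above can be
checked with a little arithmetic on the multiply_matrices() sample
counts taken from the three reports (26927, 26959 and 27020). The sketch
below is only an illustration; the function name deviation_percent() and
the samples_case* constants are hypothetical names introduced for this
example, and the only data used are the counts quoted in the reports.

```c
/* Sample counts for multiply_matrices(), copied from the three reports. */
static const long samples_case1 = 26927;	/* per-process, regular counter */
static const long samples_case2 = 26959;	/* system-wide, IBS */
static const long samples_case3 = 27020;	/* per-process, IBS (proposed) */

/*
 * Absolute deviation of `value` from `baseline`, expressed as a
 * percentage of `baseline`. The result is never negative.
 */
static double deviation_percent(long baseline, long value)
{
	double diff = (double)(value - baseline);

	if (diff < 0)
		diff = -diff;
	return 100.0 * diff / (double)baseline;
}
```

Calling deviation_percent(samples_case1, samples_case2) and
deviation_percent(samples_case1, samples_case3) gives values well below
0.5% for both IBS runs, i.e. the target function receives essentially
the same number of samples whether IBS runs system-wide or per-process.
This says nothing about the accuracy of the individual samples, only
about their count.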
* Re: [PATCH 1/1] Change IBS PMU to use perf_hw_context
  2012-12-18 22:54     ` Suravee Suthikulpanit
@ 2013-01-16 22:19       ` Suravee Suthikulpanit
  2013-06-18 16:03         ` Suravee Suthikulanit
  0 siblings, 1 reply; 6+ messages in thread
From: Suravee Suthikulpanit @ 2013-01-16 22:19 UTC
  To: Robert Richter; +Cc: Ingo Molnar, x86, linux-kernel, jacob.shin

Hi,

I am following up on this patch. Please let me know if you would like me to provide any more data or verification.

Thank you,

Suravee

On Tue, 2012-12-18 at 16:54 -0600, Suravee Suthikulpanit wrote:
> Ingo, Robert
>
> I am including a set of output from "perf report" to help validate IBS in per-process mode.
> In this experiment I ran three test cases:
>
> case 1. perf record -e cycles       (baseline per-process mode w/ regular counter)
> case 2. perf record -a -e cycles:p  (baseline system-wide mode w/ IBS)
> case 3. perf record -e cycles:p     (the proposed per-process mode w/ IBS)
>
> In all 3 test cases, the target application (classic) shows about 27K samples.
> I am also including snapshots of the IBS OP MSRs (0xc00110[33-3a]) on all 32 cores
> (taken with the rdmsr tool) from cases 2 and 3 above.
0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 21 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 22 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 23 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 24 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 25 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 26 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 27 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 28 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 29 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 30 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 > 31 : 00034d8900000000 ffffffff811370cc 0000000000120000 0000000000000000 0000000000000008 ffff88082592bc58 000000082592bc58 0000000000000100 > > > Suravee > > > On Mon, 2012-12-17 at 10:44 +0100, Robert Richter wrote: > > On 16.12.12 10:04:10, Ingo Molnar wrote: > > > > > > * suravee.suthikulpanit@amd.com <suravee.suthikulpanit@amd.com> wrote: > > > > > > > From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> > > > > > > > > Currently, the AMD IBS PMU 
initializes pmu.task_ctx_nr to > > > > perf_invalid_context, which allows IBS to run only > > > > in system-wide mode (e.g. perf record -a). IBS hardware is > > > > available in each core and should be per-context. This patch > > > > modifies the task_ctx_nr to use the perf_hw_context (default) > > > > instead. > > > > > > I'm wondering how extensively it was tested/verified that it's > > > safe to enable IBS in per-context mode as well, and that the > > > profiling results are precise and accurate? > > > > From the implementation's point of view this is very similar to hw > > perf counters. I wouldn't expect any issues here. Since IBS can be > > immediately started/stopped and there is no caching, there won't be any > > incoming sample that is not related to that context. > > > > The only potential problem I see could be a security risk, in that > > an IBS sample might expose data related to other contexts such as > > cache information. This is similar to uncore/northbridge events so I > > don't think this is an issue, but we might want to evaluate this. > > > > > We never used the IBS hardware in this fashion before, so some > > > extra care is prudent - and traces of that extra care should be > > > visible in the changelog as well. > > > > Yeah, a comparison of numbers for IBS and hw counter (-e r076:p,r076 > > and -e r0C1:p,r0C1) in per-context mode would be useful here. > > > > -Robert
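One way to read the two MSR snapshots quoted above is to ask which cores hold any IBS op state at all. The Python sketch below is not part of the original thread; it assumes a dump laid out like the tables above (core number, a colon, then eight hex values for 0xc0011033 through 0xc001103a) and treats the last column, which reads 0x100 on every core in both tables, as uninformative. The function name and the three-row excerpts are illustrative choices, and a non-zero register only shows that IBS op state was present at the instant of the snapshot.

```python
from typing import List

ZERO = "0" * 16  # one all-zero 64-bit MSR value, as printed in the dumps

# Header plus three rows copied from the CASE 2 table (system-wide, -a).
CASE2_EXCERPT = """
core : 0xc0011033 0xc0011034 0xc0011035 0xc0011036 0xc0011037 0xc0011038 0xc0011039 0xc001103a
0 : 0000002300040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040002 0000000000000000 000000fdfd300000 0000000000000100
5 : 0000000000000000 ffffffff81043ea8 00000000000a0001 0000000000000000 0000000000000000 0000000000000514 000000fdfd300514 0000000000000100
31 : 0001147800000000 ffffffff810b9400 00000000003c0001 0000000000000000 0000000000040009 00000000000005dc 000000fdfd3005dc 0000000000000100
"""

# Three rows copied from the CASE 3 table (per-process, task pinned to core 31).
CASE3_EXCERPT = "\n".join([
    f"0 : {' '.join([ZERO] * 7)} 0000000000000100",
    f"30 : {' '.join([ZERO] * 7)} 0000000000000100",
    "31 : 00034d8900000000 ffffffff811370cc 0000000000120000 0000000000000000 "
    "0000000000000008 ffff88082592bc58 000000082592bc58 0000000000000100",
])


def cores_with_ibs_state(dump: str) -> List[int]:
    """Return the cores whose IBS op MSRs (all columns but the last) are not all zero."""
    cores = []
    for raw in dump.splitlines():
        line = raw.lstrip("> ").strip()      # drop mail quoting, if any
        core, sep, rest = line.partition(":")
        if not sep or not core.strip().isdigit():
            continue                         # header, separator, or blank line
        values = [int(field, 16) for field in rest.split()]
        if any(values[:-1]):                 # ignore the final 0xc001103a column
            cores.append(int(core))
    return cores


print("case 2:", cores_with_ibs_state(CASE2_EXCERPT))
print("case 3:", cores_with_ibs_state(CASE3_EXCERPT))
```

On these excerpts every listed core shows state in case 2, while in case 3 only core 31 does, which is the core the workload was pinned to with taskset -c 31.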
* Re: [PATCH 1/1] Change IBS PMU to use perf_hw_context 2013-01-16 22:19 ` Suravee Suthikulpanit @ 2013-06-18 16:03 ` Suravee Suthikulanit 0 siblings, 0 replies; 6+ messages in thread From: Suravee Suthikulanit @ 2013-06-18 16:03 UTC (permalink / raw) To: suravee.suthikulpanit, Peter Zijlstra Cc: Robert Richter, Ingo Molnar, x86, linux-kernel Peter, I am trying to resurrect this patch. Basically, I have provided the information to show that IBS is supposed to support per-process usage. Would you mind taking a look at this? Thank you, Suravee On 1/16/2013 4:19 PM, Suravee Suthikulpanit wrote: > Hi, > > I am following up on this patch. Please let me know if you would like > me to provide any more data or verification. > > Thank you, > > Suravee > > On Tue, 2012-12-18 at 16:54 -0600, Suravee Suthikulpanit wrote: >> Ingo, Robert >> >> I am including a set of outputs from "perf report" to help validate IBS in per-process mode. >> In this experiment I ran three test cases: >> >> case 1. perf record -e cycles (baseline per-process mode w/ regular counter) >> case 2. perf record -a -e cycles:p (baseline system-wide mode w/ IBS) >> case 3. perf record -e cycles:p (the proposed per-process mode w/ IBS) >> >> In all 3 test cases, the target application (classic) shows about 27K samples. >> I am also including the IBS OP MSRs (0xc00110[33-3a]) snapshots on all 32 cores >> (using rdmsr tools) from cases 2 and 3 above. 
>> >> ------------------------------------------------------------ >> CASE1: >> >> # ======== >> # captured on: Tue Dec 18 16:32:43 2012 >> # hostname : sos-dev02 >> # os release : 3.7.0-IBS+ >> # perf version : 3.7.rc8.g805f38 >> # arch : x86_64 >> # nrcpus online : 32 >> # nrcpus avail : 32 >> # cpudesc : AMD Eng Sample, 1S228145TGG54_31/22/20_2/16 >> # cpuid : AuthenticAMD,21,2,0 >> # total memory : 32863836 kB >> # cmdline : /sandbox/kernels/suravee/tools/perf/perf record -e cycles taskset -c 31 src/classic >> # event : name = cycles, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 1, precise_ip = 0, id = { 198, 199, 200, 201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229 } >> # HEADER_CPU_TOPOLOGY info available, use -I to display >> # HEADER_NUMA_TOPOLOGY info available, use -I to display >> # pmu mappings: cpu = 4, software = 1, tracepoint = 2, ibs_fetch = 6, ibs_op = 7, breakpoint = 5 >> # ======== >> # >> # Samples: 27K of event 'cycles' >> # Event count (approx.): 20938245323 >> # >> # Overhead Samples Command Shared Object Symbol >> # ........ ........... ....... ................. ....................................... >> # >> 99.16% 26927 classic classic [.] multiply_matrices() <--- TARGET APP >> 0.32% 78 classic libc-2.15.so [.] random >> 0.10% 23 classic libc-2.15.so [.] random_r >> 0.07% 16 classic classic [.] initialize_matrices() >> 0.04% 10 classic [kernel.kallsyms] [k] ttwu_do_wakeup >> 0.03% 9 classic [kernel.kallsyms] [k] clear_page_c >> 0.02% 11 classic [kernel.kallsyms] [k] native_write_msr_safe >> 0.02% 5 classic libc-2.15.so [.] rand >> 0.02% 2 classic ld-2.15.so [.] 
0x000000000000a456 >> >> ------------------------------------------------------------ >> CASE 2: >> >> # ======== >> # captured on: Tue Dec 18 16:11:35 2012 >> # hostname : sos-dev02 >> # os release : 3.7.0-IBS+ >> # perf version : 3.7.rc8.g805f38 >> # arch : x86_64 >> # nrcpus online : 32 >> # nrcpus avail : 32 >> # cpudesc : AMD Eng Sample, 1S228145TGG54_31/22/20_2/16 >> # cpuid : AuthenticAMD,21,2,0 >> # total memory : 32863836 kB >> # cmdline : /sandbox/kernels/suravee/tools/perf/perf record -a -e cycles:p taskset -c 31 src/classic >> # event : name = cycles:p, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 0, precise_ip = 1, id = { 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98 } >> # HEADER_CPU_TOPOLOGY info available, use -I to display >> # HEADER_NUMA_TOPOLOGY info available, use -I to display >> # pmu mappings: cpu = 4, software = 1, tracepoint = 2, ibs_fetch = 6, ibs_op = 7, breakpoint = 5 >> # ======== >> # >> # Samples: 189K of event 'cycles:p' >> # Event count (approx.): 40504131338 >> # >> # Overhead Samples Command Shared Object Symbol >> # ........ ........... ............... ................................ ........................................... >> # >> 51.07% 26959 classic classic [.] multiply_matrices() <------ TARGET APP >> 35.39% 131620 swapper [kernel.kallsyms] [k] acpi_idle_do_entry >> 2.10% 4673 swapper [kernel.kallsyms] [k] native_safe_halt >> 0.71% 1303 rdmsr ld-2.15.so [.] 0x0000000000002a44 >> 0.33% 639 rdmsr [kernel.kallsyms] [k] irq_return >> 0.26% 499 rdmsr libc-2.15.so [.] 
0x0000000000131d80 >> 0.25% 440 rdmsr [kernel.kallsyms] [k] generic_exec_single >> 0.25% 470 rdmsr [kernel.kallsyms] [k] __do_fault >> 0.24% 478 rdmsr [kernel.kallsyms] [k] unmap_single_vma >> >> ------------------------------------------------------------ >> CASE 3: >> >> # ======== >> # captured on: Tue Dec 18 16:13:53 2012 >> # hostname : sos-dev02 >> # os release : 3.7.0-IBS+ >> # perf version : 3.7.rc8.g805f38 >> # arch : x86_64 >> # nrcpus online : 32 >> # nrcpus avail : 32 >> # cpudesc : AMD Eng Sample, 1S228145TGG54_31/22/20_2/16 >> # cpuid : AuthenticAMD,21,2,0 >> # total memory : 32863836 kB >> # cmdline : /sandbox/kernels/suravee/tools/perf/perf record -e cycles:p taskset -c 31 src/classic >> # event : name = cycles:p, type = 0, config = 0x0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern = 0, excl_host = 0, excl_guest = 0, precise_ip = 1, id = { 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131 } >> # HEADER_CPU_TOPOLOGY info available, use -I to display >> # HEADER_NUMA_TOPOLOGY info available, use -I to display >> # pmu mappings: cpu = 4, software = 1, tracepoint = 2, ibs_fetch = 6, ibs_op = 7, breakpoint = 5 >> # ======== >> # >> # Samples: 27K of event 'cycles:p' >> # Event count (approx.): 20851884446 >> # >> # Overhead Samples Command Shared Object Symbol >> # ........ ........... ....... ................. .............................. >> # >> 99.37% 27020 classic classic [.] multiply_matrices() <--- TARGET APP >> 0.22% 58 classic libc-2.15.so [.] random_r >> 0.13% 32 classic classic [.] initialize_matrices() >> 0.10% 26 classic libc-2.15.so [.] random >> 0.03% 8 classic libc-2.15.so [.] rand >> 0.03% 7 classic [kernel.kallsyms] [k] clear_page_c >> 0.01% 2 classic ld-2.15.so [.] 
0x000000000000a423 >> 0.01% 2 classic [kernel.kallsyms] [k] retint_swapgs >> 0.01% 2 classic [kernel.kallsyms] [k] ttwu_do_wakeup >> >> ------------------------------------------------------------ >> IBS MSR VALUES FROM CASE 2: >> >> core : 0xc0011033 0xc0011034 0xc0011035 0xc0011036 0xc0011037 0xc0011038 0xc0011039 0xc001103a >> 0 : 0000002300040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040002 0000000000000000 000000fdfd300000 0000000000000100 >> 1 : 0000006200040000 ffffffff813d8c6d 00000058000b0002 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100 >> 2 : 0000006000040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040002 0000000000000000 000000fdfd300000 0000000000000100 >> 3 : 0000005000040000 ffffffff813d8c6d 00000040000b0002 0000000000000000 0000000000040001 0000000000000400 000000fdfd300400 0000000000000100 >> 4 : 0000005700040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100 >> 5 : 0000000000000000 ffffffff81043ea8 00000000000a0001 0000000000000000 0000000000000000 0000000000000514 000000fdfd300514 0000000000000100 >> 6 : 0000004200040000 ffffffff813d8c74 00000000000b0006 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100 >> 7 : 0000000000000000 ffffffff81043ea8 00000000000a0000 0000000000000000 0000000000000000 0000000000000514 000000fdfd300514 0000000000000100 >> 8 : 0000002300040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100 >> 9 : 0000004d00040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040002 0000000000000400 000000fdfd300400 0000000000000100 >> 10 : 00001fe500000000 ffffffff813d8c6d 00000058000b0002 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100 >> 11 : 0000000000000000 ffffffff81043ea8 00000000000a0001 0000000000000000 0000000000040001 
0000000000000514 000000fdfd300514 0000000000000100 >> 12 : 0000008100040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100 >> 13 : 0000006900040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040002 0000000000000400 000000fdfd300400 0000000000000100 >> 14 : 0000004900040000 ffffffff813d8c6d 00000040000b0002 0000000000000000 0000000000040001 0000000000000000 000000fdfd300000 0000000000000100 >> 15 : 0000002300040000 ffffffff813d8c6d 00000040000b0002 0000000000000000 0000000000040001 0000000000000400 000000fdfd300400 0000000000000100 >> 16 : 0000000f00040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100 >> 17 : 0000004b00040000 ffffffff813d8c6d 00000058000b0002 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100 >> 18 : 0000003d00040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 00000000000002b8 000000fdfd3002b8 0000000000000100 >> 19 : 0000004400040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100 >> 20 : 0000001800040000 ffffffff813d8d27 0000000000060001 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100 >> 21 : 0000002900040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100 >> 22 : 0000005900040000 ffffffff813d8c6d 00000040000b0002 0000000000000000 0000000000040001 0000000000000000 000000fdfd300000 0000000000000100 >> 23 : 0000001500040000 ffffffff813d8c6d 00000058000b0002 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100 >> 24 : 0000006100040000 ffffffff8133e844 00000028001e000f 0000000000000000 0000000000000000 0000000000000000 000000fdfd300000 0000000000000100 >> 25 : 0000005400040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 
0000000000000000 0000000000000400 000000fdfd300400 0000000000000100 >> 26 : 0000002900040000 ffffffff813d8c6d 00000040000b0002 0000000000000000 0000000000040001 0000000000000000 000000fdfd300000 0000000000000100 >> 27 : 0000000e00040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000000000 0000000000000400 000000fdfd300400 0000000000000100 >> 28 : 0000007e00040000 ffffffff813d8c6d 00000040000b0006 0000000000000000 0000000000040042 0000000000000000 000000fdfd300000 0000000000000100 >> 29 : 0000000000000000 ffffffff81043ea8 00000000000a0001 0000000000000000 0000000000040001 0000000000000514 000000fdfd300514 0000000000000100 >> 30 : 00003f4a00000000 ffffffff813d8c6d 00000040000b0006 0000000000000000 000000000004000a 0000000000000000 000000fdfd300000 0000000000000100 >> 31 : 0001147800000000 ffffffff810b9400 00000000003c0001 0000000000000000 0000000000040009 00000000000005dc 000000fdfd3005dc 0000000000000100 >> >> ------------------------------------------------------------ >> IBS MSR VALUES FROM CASE 3: >> >> core : 0xc0011033 0xc0011034 0xc0011035 0xc0011036 0xc0011037 0xc0011038 0xc0011039 0xc001103a >> 0 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 1 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 2 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 3 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 4 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 5 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 6 : 
0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 7 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 8 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 9 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 10 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 11 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 12 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 13 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 14 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 15 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 16 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 17 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 18 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 19 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
0000000000000100 >> 20 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 21 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 22 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 23 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 24 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 25 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 26 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 27 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 28 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 29 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 30 : 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000000 0000000000000100 >> 31 : 00034d8900000000 ffffffff811370cc 0000000000120000 0000000000000000 0000000000000008 ffff88082592bc58 000000082592bc58 0000000000000100 >> >> >> Suravee >> >> >> On Mon, 2012-12-17 at 10:44 +0100, Robert Richter wrote: >>> On 16.12.12 10:04:10, Ingo Molnar wrote: >>>> * suravee.suthikulpanit@amd.com <suravee.suthikulpanit@amd.com> wrote: >>>> >>>>> From: Suravee Suthikulpanit 
<suravee.suthikulpanit@amd.com> >>>>> >>>>> Currently, the AMD IBS PMU initializes pmu.task_ctx_nr to >>>>> perf_invalid_context, which allows IBS to run only >>>>> in system-wide mode (e.g. perf record -a). IBS hardware is >>>>> available in each core and should be per-context. This patch >>>>> modifies the task_ctx_nr to use the perf_hw_context (default) >>>>> instead. >>>> I'm wondering how extensively it was tested/verified that it's >>>> safe to enable IBS in per-context mode as well, and that the >>>> profiling results are precise and accurate? >>> From the implementation's point of view this is very similar to hw >>> perf counters. I wouldn't expect any issues here. Since IBS can be >>> immediately started/stopped and there is no caching, there won't be any >>> incoming sample that is not related to that context. >>> >>> The only potential problem I see could be a security risk, in that >>> an IBS sample might expose data related to other contexts such as >>> cache information. This is similar to uncore/northbridge events so I >>> don't think this is an issue, but we might want to evaluate this. >>> >>>> We never used the IBS hardware in this fashion before, so some >>>> extra care is prudent - and traces of that extra care should be >>>> visible in the changelog as well. >>> Yeah, a comparison of numbers for IBS and hw counter (-e r076:p,r076 >>> and -e r0C1:p,r0C1) in per-context mode would be useful here. >>> >>> -Robert
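The claim that all three runs show about 27K samples for the target application can be checked with a little arithmetic. The Python sketch below is not part of the original thread; it takes the multiply_matrices() sample counts copied from the three quoted reports and expresses each as a relative deviation from case 1, the regular-counter baseline. The dictionary keys and the function name are illustrative.

```python
# Samples attributed to multiply_matrices(), copied from the three reports.
SAMPLES = {
    "case 1: cycles (per-process, regular counter)": 26927,
    "case 2: cycles:p with -a (system-wide, IBS)": 26959,
    "case 3: cycles:p (per-process, IBS)": 27020,
}
BASELINE = "case 1: cycles (per-process, regular counter)"


def relative_deviation(samples, baseline_key):
    """Map each case to (count - baseline) / baseline."""
    base = samples[baseline_key]
    return {name: (count - base) / base for name, count in samples.items()}


for name, dev in relative_deviation(SAMPLES, BASELINE).items():
    print(f"{name}: {SAMPLES[name]} samples, {dev:+.2%} vs. case 1")
```

All three counts land within half a percent of each other. This compares sample counts only; it does not by itself address whether individual samples are attributed precisely, which is what the counter-versus-IBS comparison suggested by Robert would probe.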
end of thread, other threads:[~2013-06-18 16:04 UTC | newest] Thread overview: 6+ messages 2012-12-14 20:57 [PATCH 1/1] Change IBS PMU to use perf_hw_context suravee.suthikulpanit 2012-12-16 9:04 ` Ingo Molnar 2012-12-17 9:44 ` Robert Richter 2012-12-18 22:54 ` Suravee Suthikulpanit 2013-01-16 22:19 ` Suravee Suthikulpanit 2013-06-18 16:03 ` Suravee Suthikulanit