* unable to get load latency info
@ 2014-02-28  6:24 Muthusamy
  2014-02-28 10:43 ` Harald Servat
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Muthusamy @ 2014-02-28  6:24 UTC
  To: linux-perf-users

Hi,

I am trying to get the memory load latency info using perf, but I am always getting 0.
Can you please help me in understanding what I am missing. Below are the details

[root@rafa tmp]# ./perf --version
perf version 3.12.11
[root@rafa tmp]# ./perf stat -e r100b pwd ### (I have tried with other data intensive programs too)
/tmp/cms/tmp

 Performance counter stats for 'pwd':

                 0 r100b

       0.000846849 seconds time elapsed

[root@rafa tmp]# uname -a
Linux rafa 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux

[root@rafa tmp]# cat /proc/cpuinfo
processor: 0
vendor_id: GenuineIntel
cpu family: 6
model: 26
model name: Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping: 5
cpu MHz: 1600.000
cache size: 8192 KB
physical id: 0
siblings: 1
core id: 0
cpu cores: 1
apicid: 0
initial apicid: 0
fpu: yes
fpu_exception: yes
cpuid level: 11
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept vpid
bogomips: 5333.09
clflush size: 64
cache_alignment: 64
address sizes: 40 bits physical, 48 bits virtual
power management:

processor: 1
vendor_id: GenuineIntel
cpu family: 6
model: 26
model name: Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping: 5
cpu MHz: 1600.000
cache size: 8192 KB
physical id: 1
siblings: 1
core id: 0
cpu cores: 1
apicid: 16
initial apicid: 16
fpu: yes
fpu_exception: yes
cpuid level: 11
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept vpid
bogomips: 5332.56
clflush size: 64
cache_alignment: 64
address sizes: 40 bits physical, 48 bits virtual
power management:

[root@rafa tmp]#

Let me know if any other details are also required and steps on how to get them.

Thanks,
Muthusamy C
* Re: unable to get load latency info
  2014-02-28  6:24 unable to get load latency info Muthusamy
@ 2014-02-28 10:43 ` Harald Servat
  2014-02-28 11:06 ` Manuel Selva
  2014-02-28 13:42 ` Andi Kleen
  2 siblings, 0 replies; 10+ messages in thread
From: Harald Servat @ 2014-02-28 10:43 UTC
  To: Muthusamy, linux-perf-users

On 28/02/14 07:24, Muthusamy wrote:
> Hi,
>
> I am trying to get the memory load latency info using perf, but I am always getting 0.
> Can you please help me in understanding what I am missing. Below are the details
>
> [root@rafa tmp]# ./perf --version
> perf version 3.12.11
> [root@rafa tmp]# ./perf stat -e r100b pwd ### (I have tried with other data intensive programs too)
> [...]

Hello Muthusamy,

   I have obtained load latency info through perf record + perf report, but never used perf stat to do that. I don't know if this is what you're looking for, but maybe you can give this a try.

   To capture the latency for an app (say /bin/ls)

   # perf mem record /bin/ls

   Then, to show the summary of the results (which part of the cache hierarchy provided the data for the references) you can use:

   # perf mem report

   Additionally, if you want to know the cost in cycles of every memory reference captured by PEBS, execute:

   # perf mem report -D

   in that output, each PERF_RECORD_SAMPLE reflects a sample, and its weight is the latency in cycles that the CPU was waiting for that reference.

Hope that helps.

Regards.

WARNING / LEGAL TEXT: This message is intended only for the use of the
individual or entity to which it is addressed and may contain
information which is privileged, confidential, proprietary, or exempt
from disclosure under applicable law. If you are not the intended
recipient or the person responsible for delivering the message to the
intended recipient, you are strictly prohibited from disclosing,
distributing, copying, or in any way using this message. If you have
received this communication in error, please notify the sender and
destroy and delete any copies you may have received.

http://www.bsc.es/disclaimer
* Re: unable to get load latency info
  2014-02-28  6:24 unable to get load latency info Muthusamy
  2014-02-28 10:43 ` Harald Servat
@ 2014-02-28 11:06 ` Manuel Selva
  2014-02-28 13:42 ` Andi Kleen
  2 siblings, 0 replies; 10+ messages in thread
From: Manuel Selva @ 2014-02-28 11:06 UTC
  To: Muthusamy, linux-perf-users

You must also take care of the fact that PEBS load Latency Measurement has been introduced in kernel 3.10

http://web.eece.maine.edu/~vweaver/projects/perf_events/features.html
http://comments.gmane.org/gmane.linux.kernel/1428651

Manu

On 02/28/2014 07:24 AM, Muthusamy wrote:
> [root@rafa tmp]# ./perf --version
> perf version 3.12.11
> [...]
> [root@rafa tmp]# uname -a
> Linux rafa 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
> [...]
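The kernel requirement Manuel mentions can be checked from the shell. What follows is a minimal sketch and not something posted in the thread: kernel_at_least is a hypothetical helper name, and the sysfs path is where a kernel that provides the mem-loads alias is assumed to export it. Distribution kernels carry backports and may not follow upstream version numbers, so the sysfs check is probably the more telling of the two.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string is at least MAJOR.MINOR.
# Usage: kernel_at_least RELEASE MAJOR MINOR
kernel_at_least() {
    release=$1 want_major=$2 want_minor=$3
    major=${release%%.*}          # text before the first dot, e.g. "3"
    rest=${release#*.}            # text after the first dot, e.g. "11.0-12-generic"
    minor=${rest%%[!0-9]*}        # leading digits of that, e.g. "11"
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Check the running kernel (what uname -r reports), not the perf binary.
if kernel_at_least "$(uname -r)" 3 10; then
    echo "kernel $(uname -r): at least 3.10"
else
    echo "kernel $(uname -r): older than 3.10"
fi

# Assumed location of the event alias that perf mem record asks for.
if [ -e /sys/bus/event_source/devices/cpu/events/mem-loads ]; then
    echo "mem-loads event is exported by this kernel"
else
    echo "mem-loads event not found"
fi
```

With the release strings from this thread, kernel_at_least 2.6.32-431.el6.x86_64 3 10 fails while kernel_at_least 3.11.0-12-generic 3 10 succeeds.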
* Re: unable to get load latency info
  2014-02-28  6:24 unable to get load latency info Muthusamy
  2014-02-28 10:43 ` Harald Servat
  2014-02-28 11:06 ` Manuel Selva
@ 2014-02-28 13:42 ` Andi Kleen
  2014-02-28 13:57   ` Harald Servat
  2 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2014-02-28 13:42 UTC
  To: Muthusamy; +Cc: linux-perf-users

Muthusamy <muthu9283@yahoo.com> writes:

> perf version 3.12.11
> [root@rafa tmp]# ./perf stat -e r100b pwd ### (I have tried with
> other data intensive programs too)

I don't think r100b is a valid event on Westmere. Not sure where you got
that from. As others pointed out you need to use perf mem record/report
with sampling (but note that it reports use-latency, not load-latency)

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only
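For readers unfamiliar with the rNNNN notation used in the quoted command: on x86, perf takes the hex digits as a raw counter configuration, with the event-select code in the low byte and the unit mask in the next byte. The sketch below only does that arithmetic. decode_raw_event is a hypothetical helper name, and what a given event/umask pair means on a particular CPU still has to be looked up in Intel's event tables, which this sketch does not attempt.

```shell
#!/bin/sh
# Hypothetical helper: split a perf raw event code such as "r100b" into the
# x86 event-select byte (bits 0-7) and unit-mask byte (bits 8-15).
decode_raw_event() {
    code=${1#r}                   # drop the leading "r", leaving hex digits
    value=$((0x$code))            # hex string -> integer
    printf 'event=0x%02x umask=0x%02x\n' \
        "$(( $value & 0xff ))" "$(( ($value >> 8) & 0xff ))"
}

decode_raw_event r100b            # prints: event=0x0b umask=0x10
```

The same pair appears to be what the kernel's mem-loads alias expands to on Nehalem-class CPUs. That is an assumption worth checking against /sys/bus/event_source/devices/cpu/events/mem-loads on the machine in question.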
* Re: unable to get load latency info
  2014-02-28 13:42 ` Andi Kleen
@ 2014-02-28 13:57   ` Harald Servat
  2014-02-28 16:45     ` Andi Kleen
  0 siblings, 1 reply; 10+ messages in thread
From: Harald Servat @ 2014-02-28 13:57 UTC
  To: Andi Kleen; +Cc: linux-perf-users

On 28/02/14 14:42, Andi Kleen wrote:
> Muthusamy <muthu9283@yahoo.com> writes:
>> perf version 3.12.11
>> [root@rafa tmp]# ./perf stat -e r100b pwd ### (I have tried with
>> other data intensive programs too)
>
> I don't think r100b is a valid event on Westmere. Not sure where you got
> that from. As others pointed out you need to use perf mem record/report
> with sampling (but note that it reports use-latency, not load-latency)
>
> -Andi
>

Andi,

   what do you mean by use-latency here? I understood that PEBS was able to report the number of core cycles that the load took from some part of the memory hierarchy until it reached the CPU.

Thank you.
* Re: unable to get load latency info
  2014-02-28 13:57   ` Harald Servat
@ 2014-02-28 16:45     ` Andi Kleen
  2014-03-03  5:12       ` Muthusamy
  0 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2014-02-28 16:45 UTC
  To: Harald Servat; +Cc: Andi Kleen, linux-perf-users

>    what do you mean by use-latency here? I understood that PEBS was
> able to report the number of core cycles that the load took from
> some part of the memory hierarchy until it reached the CPU.

PEBS load/store latency reports the cycles from when the instruction
started issuing in the pipeline to the return of the value. That's not
quite the same; it can be much longer than the pure memory hierarchy
cost.

-Andi
* Re: unable to get load latency info
  2014-02-28 16:45     ` Andi Kleen
@ 2014-03-03  5:12       ` Muthusamy
  2014-03-03  7:50         ` Harald Servat
  0 siblings, 1 reply; 10+ messages in thread
From: Muthusamy @ 2014-03-03  5:12 UTC
  To: Harald Servat, Andi Kleen, linux-perf-users

> On 28/02/14 14:42, Andi Kleen wrote:
> I don't think r100b is a valid event on Westmere.

Thanks for the replies.
As Harald Servat and others pointed out, I am now trying "perf mem record":

[root@rafa cms]# ./perf mem record /bin/ls
invalid or unsupported event: 'cpu/mem-loads/pp'

Is this an issue with the (old) kernel version I am on, the CPU version, or both?

Regards,
Muthusamy C
* Re: unable to get load latency info
  2014-03-03  5:12       ` Muthusamy
@ 2014-03-03  7:50         ` Harald Servat
       [not found]           ` <1393851732.26401.YahooMailNeo@web125404.mail.ne1.yahoo.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Harald Servat @ 2014-03-03  7:50 UTC
  To: Muthusamy, linux-perf-users

On 03/03/14 06:12, Muthusamy wrote:
>> On 28/02/14 14:42, Andi Kleen wrote:
>
>> I don't think r100b is a valid event on Westmere.
>
> Thanks for the replies.
> As Harald Servat and others pointed out, I am now trying "perf mem record":
>
> [root@rafa cms]# ./perf mem record /bin/ls
> invalid or unsupported event: 'cpu/mem-loads/pp'
>
> Is this an issue with the (old) kernel version I am on, the CPU version, or both?
>
> Regards,
> Muthusamy C

I'm using

Linux laptop 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

so I imagine that you're ok with respect to OS. Now I remember I had to install some microcode module for my OS, though [2].

WRT hardware, I'm running on top of a

vendor_id  : GenuineIntel
cpu family : 6
model      : 42
model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz
stepping   : 7

Besides that, the Intel Manual [1] says in section 18.7.1.1 that Nehalem (and I'd guess Westmere, as it is a Nehalem successor) supports PEBS, so it should work.

Best regards.

[1] http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdf
[2] http://packages.ubuntu.com/lucid/intel-microcode
[parent not found: <1393851732.26401.YahooMailNeo@web125404.mail.ne1.yahoo.com>]
* Re: unable to get load latency info
       [not found]           ` <1393851732.26401.YahooMailNeo@web125404.mail.ne1.yahoo.com>
@ 2014-03-03 14:24             ` Harald Servat
  2014-03-04 10:43               ` Muthusamy
  0 siblings, 1 reply; 10+ messages in thread
From: Harald Servat @ 2014-03-03 14:24 UTC
  To: Muthusamy, linux-perf-users

On 03/03/14 14:02, Muthusamy wrote:
>> On Monday, March 3, 2014 1:20 PM, Harald Servat <harald.servat@bsc.es> wrote:
>> so I imagine that you're ok with respect to OS.
>> Besides that, the Intel Manual [1] says in section 18.7.1.1 that
>> Nehalem (and I'd guess Westmere, as it is a Nehalem successor)
>> supports PEBS, so it should work.
>
> So, what am I missing? Is there something that needs to be enabled in the kernel for PEBS? If yes, how do I know if it is enabled?
>
> Regards,
> Muthusamy C

I notice that in your earlier output, uname -a showed 2.6.32-xxxx but perf --version showed version 3.12.11; they don't match, but they should. Are you running on 2.6.x or on 3.12.x?

Regards
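The comparison Harald makes by eye can be sketched in shell. This is an illustration under assumptions, not a tool from the thread: major_minor and versions_match are hypothetical helper names, and only the first MAJOR.MINOR pair of each string is compared. A perf binary newer than the kernel is not necessarily an error in itself. What the mismatch signals here is that the running kernel, which has to provide the feature, is much older than the tool.

```shell
#!/bin/sh
# Hypothetical helper: print the first "MAJOR.MINOR" found in a string.
major_minor() {
    printf '%s\n' "$1" |
        sed -n 's/^[^0-9]*\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed if two version strings share MAJOR.MINOR.
versions_match() {
    [ "$(major_minor "$1")" = "$(major_minor "$2")" ]
}

# Values reported earlier in the thread.
if versions_match "2.6.32-431.el6.x86_64" "perf version 3.12.11"; then
    echo "kernel and perf agree"
else
    echo "kernel and perf differ"
fi
```

On a live system the two literals would be replaced by "$(uname -r)" and "$(perf --version)".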
* Re: unable to get load latency info
  2014-03-03 14:24             ` Harald Servat
@ 2014-03-04 10:43               ` Muthusamy
  0 siblings, 0 replies; 10+ messages in thread
From: Muthusamy @ 2014-03-04 10:43 UTC
  To: Harald Servat, linux-perf-users

> On Monday, March 3, 2014 7:54 PM, Harald Servat <harald.servat@bsc.es> wrote:
> I notice that in your earlier output, uname -a showed 2.6.32-xxxx but
> perf --version showed version 3.12.11; they don't match, but they should.
> Are you running on 2.6.x or on 3.12.x?

After upgrading the kernel, it did work.
Thanks a lot for the support.

Regards,
Muthusamy C
end of thread, other threads:[~2014-03-04 10:43 UTC | newest]

Thread overview: 10+ messages
2014-02-28  6:24 unable to get load latency info Muthusamy
2014-02-28 10:43 ` Harald Servat
2014-02-28 11:06 ` Manuel Selva
2014-02-28 13:42 ` Andi Kleen
2014-02-28 13:57   ` Harald Servat
2014-02-28 16:45     ` Andi Kleen
2014-03-03  5:12       ` Muthusamy
2014-03-03  7:50         ` Harald Servat
     [not found]           ` <1393851732.26401.YahooMailNeo@web125404.mail.ne1.yahoo.com>
2014-03-03 14:24             ` Harald Servat
2014-03-04 10:43               ` Muthusamy