* unable to get load latency info
@ 2014-02-28 6:24 Muthusamy
2014-02-28 10:43 ` Harald Servat
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Muthusamy @ 2014-02-28 6:24 UTC
To: linux-perf-users
Hi,
I am trying to get memory load latency information using perf, but I always get 0.
Can you please help me understand what I am missing? The details are below.
[root@rafa tmp]# ./perf --version
perf version 3.12.11
[root@rafa tmp]# ./perf stat -e r100b pwd ### (I have tried with other data intensive programs too)
/tmp/cms/tmp
Performance counter stats for 'pwd':
0 r100b
0.000846849 seconds time elapsed
[root@rafa tmp]# uname -a
Linux rafa 2.6.32-431.el6.x86_64 #1 SMP Sun Nov 10 22:19:54 EST 2013 x86_64 x86_64 x86_64 GNU/Linux
[root@rafa tmp]# cat /proc/cpuinfo
processor: 0
vendor_id: GenuineIntel
cpu family: 6
model: 26
model name: Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping: 5
cpu MHz: 1600.000
cache size: 8192 KB
physical id: 0
siblings: 1
core id: 0
cpu cores: 1
apicid: 0
initial apicid: 0
fpu: yes
fpu_exception: yes
cpuid level: 11
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept vpid
bogomips: 5333.09
clflush size: 64
cache_alignment: 64
address sizes: 40 bits physical, 48 bits virtual
power management:
processor: 1
vendor_id: GenuineIntel
cpu family: 6
model: 26
model name: Intel(R) Xeon(R) CPU X5550 @ 2.67GHz
stepping: 5
cpu MHz: 1600.000
cache size: 8192 KB
physical id: 1
siblings: 1
core id: 0
cpu cores: 1
apicid: 16
initial apicid: 16
fpu: yes
fpu_exception: yes
cpuid level: 11
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 popcnt lahf_lm ida dts tpr_shadow vnmi flexpriority ept vpid
bogomips: 5332.56
clflush size: 64
cache_alignment: 64
address sizes: 40 bits physical, 48 bits virtual
power management:
[root@rafa tmp]#
Let me know if any other details are required, and how to get them.
Thanks,
Muthusamy C
* Re: unable to get load latency info
2014-02-28 6:24 unable to get load latency info Muthusamy
@ 2014-02-28 10:43 ` Harald Servat
2014-02-28 11:06 ` Manuel Selva
2014-02-28 13:42 ` Andi Kleen
2 siblings, 0 replies; 10+ messages in thread
From: Harald Servat @ 2014-02-28 10:43 UTC
To: Muthusamy, linux-perf-users
Hello Muthusamy,
I have obtained load latency info through perf record + perf report,
but never used perf stat to do that. I don't know if this is what you're
looking for, but maybe you can give this a try.
To capture the latency for an app (say /bin/ls):
# perf mem record /bin/ls
Then, to show the summary of the results (which part of the cache
hierarchy provided the data for the references) you can use:
# perf mem report
Additionally, if you want to know the cost in cycles of every memory
reference captured by PEBS, execute:
# perf mem report -D
In that output, each PERF_RECORD_SAMPLE entry is one sample, and its
weight is the latency, in cycles, that the CPU waited for that reference.
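The steps above can be combined into a small shell helper. This is only a sketch: the function name mem_latency_dump is made up for illustration, and it uses nothing beyond the perf mem record and perf mem report -D invocations already shown. It assumes a kernel and CPU that support the mem-loads event, and it usually needs root.

```shell
#!/bin/sh
# Hypothetical helper: record memory-access samples for a command,
# then dump the raw samples (one PERF_RECORD_SAMPLE per sample).
mem_latency_dump() {
    if [ "$#" -eq 0 ]; then
        echo "usage: mem_latency_dump COMMAND [ARGS...]" >&2
        return 2
    fi
    # Writes perf.data in the current directory.
    perf mem record "$@" || return 1
    # -D prints every captured sample instead of the summary.
    perf mem report -D
}

# Example (not run here): mem_latency_dump /bin/ls
```

Because perf mem report -D prints every sample, the output can be long; redirecting it to a file or a pager is usually more practical.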
Hope that helps.
Regards.
WARNING / LEGAL TEXT: This message is intended only for the use of the
individual or entity to which it is addressed and may contain
information which is privileged, confidential, proprietary, or exempt
from disclosure under applicable law. If you are not the intended
recipient or the person responsible for delivering the message to the
intended recipient, you are strictly prohibited from disclosing,
distributing, copying, or in any way using this message. If you have
received this communication in error, please notify the sender and
destroy and delete any copies you may have received.
http://www.bsc.es/disclaimer
* Re: unable to get load latency info
2014-02-28 6:24 unable to get load latency info Muthusamy
2014-02-28 10:43 ` Harald Servat
@ 2014-02-28 11:06 ` Manuel Selva
2014-02-28 13:42 ` Andi Kleen
2 siblings, 0 replies; 10+ messages in thread
From: Manuel Selva @ 2014-02-28 11:06 UTC
To: Muthusamy, linux-perf-users
Also note that PEBS load latency measurement was introduced in kernel 3.10:
http://web.eece.maine.edu/~vweaver/projects/perf_events/features.html
http://comments.gmane.org/gmane.linux.kernel/1428651
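A quick way to check the running kernel against that 3.10 threshold is sketched below. The helper name version_at_least is an assumption made for illustration, not part of perf, and the parsing only looks at the leading MAJOR.MINOR of the release string.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string such as
# "2.6.32-431.el6.x86_64" is at least WANT_MAJOR.WANT_MINOR.
version_at_least() {
    release=$1 want_major=$2 want_minor=$3
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

# Example on a live system (not run here):
# if version_at_least "$(uname -r)" 3 10; then
#     echo "kernel is 3.10 or newer"
# fi
```

Note that distribution kernels sometimes backport features, so a release number below 3.10 is a strong hint rather than proof that the feature is missing.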
Manu
* Re: unable to get load latency info
2014-02-28 6:24 unable to get load latency info Muthusamy
2014-02-28 10:43 ` Harald Servat
2014-02-28 11:06 ` Manuel Selva
@ 2014-02-28 13:42 ` Andi Kleen
2014-02-28 13:57 ` Harald Servat
2 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2014-02-28 13:42 UTC
To: Muthusamy; +Cc: linux-perf-users
Muthusamy <muthu9283@yahoo.com> writes:
> perf version 3.12.11
> [root@rafa tmp]# ./perf stat -e r100b pwd ### (I have tried with
> other data intensive programs too)
I don't think r100b is a valid event on Westmere. Not sure where you got
that from. As others pointed out you need to use perf mem record/report
with sampling (but note that it reports use-latency, not load-latency)
-Andi
--
ak@linux.intel.com -- Speaking for myself only
* Re: unable to get load latency info
2014-02-28 13:42 ` Andi Kleen
@ 2014-02-28 13:57 ` Harald Servat
2014-02-28 16:45 ` Andi Kleen
0 siblings, 1 reply; 10+ messages in thread
From: Harald Servat @ 2014-02-28 13:57 UTC
To: Andi Kleen; +Cc: linux-perf-users
On 28/02/14 14:42, Andi Kleen wrote:
> Muthusamy <muthu9283@yahoo.com> writes:
>> perf version 3.12.11
>> [root@rafa tmp]# ./perf stat -e r100b pwd ### (I have tried with
>> other data intensive programs too)
>
> I don't think r100b is a valid event on Westmere. Not sure where you got
> that from. As others pointed out you need to use perf mem record/report
> with sampling (but note that it reports use-latency, not load-latency)
>
> -Andi
>
Andi,
What do you mean by use-latency here? I understood that PEBS was able
to report the number of core cycles that the load took to get from some part
of the memory hierarchy to the CPU.
Thank you.
* Re: unable to get load latency info
2014-02-28 13:57 ` Harald Servat
@ 2014-02-28 16:45 ` Andi Kleen
2014-03-03 5:12 ` Muthusamy
0 siblings, 1 reply; 10+ messages in thread
From: Andi Kleen @ 2014-02-28 16:45 UTC
To: Harald Servat; +Cc: Andi Kleen, linux-perf-users
> What do you mean by use-latency here? I understood that PEBS was
> able to report the number of core cycles that the load took to get from
> some part of the memory hierarchy to the CPU.
PEBS load/store latency reports the cycles from when the instruction
started issuing in the pipeline to the return of the value. That's not
quite the same; it can be much longer than the pure memory hierarchy
cost.
-Andi
* Re: unable to get load latency info
2014-02-28 16:45 ` Andi Kleen
@ 2014-03-03 5:12 ` Muthusamy
2014-03-03 7:50 ` Harald Servat
0 siblings, 1 reply; 10+ messages in thread
From: Muthusamy @ 2014-03-03 5:12 UTC
To: Harald Servat, Andi Kleen, linux-perf-users
> On 28/02/14 14:42, Andi Kleen wrote:
> I don't think r100b is a valid event on Westmere.
Thanks for the replies.
As Harald Servat and others pointed out, I am now trying "perf mem record":
[root@rafa cms]# ./perf mem record /bin/ls
invalid or unsupported event: 'cpu/mem-loads/pp'
Is this an issue with the (old) kernel version I am on, the CPU version, or both?
Regards,
Muthusamy C
* Re: unable to get load latency info
2014-03-03 5:12 ` Muthusamy
@ 2014-03-03 7:50 ` Harald Servat
[not found] ` <1393851732.26401.YahooMailNeo@web125404.mail.ne1.yahoo.com>
0 siblings, 1 reply; 10+ messages in thread
From: Harald Servat @ 2014-03-03 7:50 UTC
To: Muthusamy, linux-perf-users
On 03/03/14 06:12, Muthusamy wrote:
>> On 28/02/14 14:42, Andi Kleen wrote:
>
>> I don't think r100b is a valid event on Westmere.
>
> Thanks for the replies.
> As Harald Servat and others pointed out, I am now trying "perf mem record":
>
> [root@rafa cms]# ./perf mem record /bin/ls
> invalid or unsupported event: 'cpu/mem-loads/pp'
>
> Is this an issue with the (old) kernel version I am on, the CPU version, or both?
>
> Regards,
> Muthusamy C
I'm using
Linux laptop 3.11.0-12-generic #19-Ubuntu SMP Wed Oct 9 16:20:46 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
so I imagine that you're OK with respect to the OS. I now remember that I had
to install a microcode package for my OS, though [2]. With respect to hardware,
I'm running on:
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i7-2760QM CPU @ 2.40GHz
stepping : 7
Besides that, the Intel manual [1] says in section 18.7.1.1 that
Nehalem (and, I would guess, Westmere, as it is a Nehalem successor)
supports PEBS, so it should work.
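Whether the CPU advertises PEBS at all can be read from the flags line of /proc/cpuinfo, as in the output posted at the start of this thread. The sketch below assumes a hypothetical helper named has_cpu_flag. The pebs flag only indicates that PEBS is present, not that the load-latency facility is usable; the kernel must support it as well.

```shell
#!/bin/sh
# Hypothetical helper: succeed if FLAG appears as a whole word on a
# "flags" line of a cpuinfo-style file (default: /proc/cpuinfo).
has_cpu_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    grep -E '^flags[[:space:]]*:' "$file" | grep -qw -- "$flag"
}

# Example on a live system (not run here):
# if has_cpu_flag pebs; then echo "CPU advertises PEBS"; fi
```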
Best regards.
[1] http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-3b-part-2-manual.pdf
[2] http://packages.ubuntu.com/lucid/intel-microcode
* Re: unable to get load latency info
[not found] ` <1393851732.26401.YahooMailNeo@web125404.mail.ne1.yahoo.com>
@ 2014-03-03 14:24 ` Harald Servat
2014-03-04 10:43 ` Muthusamy
0 siblings, 1 reply; 10+ messages in thread
From: Harald Servat @ 2014-03-03 14:24 UTC
To: Muthusamy, linux-perf-users
On 03/03/14 14:02, Muthusamy wrote:
>> On Monday, March 3, 2014 1:20 PM, Harald Servat <harald.servat@bsc.es> wrote:
>> so I imagine that you're ok with respect to OS.
>> Besides of that, the Intel Manual [1] says in section 18.7.1.1 that
>> Nehalem (and I'd would guess Westmere, as it is a Nehalem successor)
>> supports PEBS, so it should work.
>
>
> So, what am I missing? Is there something that needs to be enabled in the kernel for PEBS? If so, how do I know whether it is enabled?
>
> Regards,
> Muthusamy C
>
I notice that in your earlier output uname -a showed 2.6.32-xxxx but
perf --version showed 3.12.11. They don't match, but they should. Are you
running on 2.6.x or on 3.12.x?
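The comparison described above can be scripted. This is a minimal sketch with made-up helper names (major_minor, versions_match); it compares only the first MAJOR.MINOR found in each string. A mismatch is not an error in itself, but, as in this thread, it can mean the running kernel lacks a feature that the perf tool expects.

```shell
#!/bin/sh
# Hypothetical helper: print the first "N.N" found in the argument,
# e.g. "3.12" from "perf version 3.12.11".
major_minor() {
    printf '%s\n' "$1" |
        sed -n 's/^[^0-9]*\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed only if both strings yield the same,
# non-empty MAJOR.MINOR.
versions_match() {
    a=$(major_minor "$1")
    b=$(major_minor "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Example on a live system (not run here):
# versions_match "$(uname -r)" "$(perf --version)" ||
#     echo "kernel and perf versions differ"
```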
Regards
* Re: unable to get load latency info
2014-03-03 14:24 ` Harald Servat
@ 2014-03-04 10:43 ` Muthusamy
0 siblings, 0 replies; 10+ messages in thread
From: Muthusamy @ 2014-03-04 10:43 UTC
To: Harald Servat, linux-perf-users
> On Monday, March 3, 2014 7:54 PM, Harald Servat <harald.servat@bsc.es> wrote:
> I notice that in your earlier output uname -a showed 2.6.32-xxxx but
> perf --version showed 3.12.11. They don't match, but they should. Are you
> running on 2.6.x or on 3.12.x?
After upgrading the kernel, it did work.
Thanks a lot for the support.
Regards,
Muthusamy C