* question about stalls in perf
@ 2013-02-14 0:08 Yunqi Zhang
2013-02-14 1:44 ` William Cohen
0 siblings, 1 reply; 3+ messages in thread
From: Yunqi Zhang @ 2013-02-14 0:08 UTC (permalink / raw)
To: linux-perf-users
Hi all,
I have recently been using perf for some profiling work on Sandy Bridge.
I find the two events stalled-cycles-frontend and stalled-cycles-backend
very interesting, but I am not sure of their exact definitions.
So my question is: which hardware counters on Sandy Bridge are used to
calculate these two events, and how (an equation would be perfect)?
Furthermore, could someone tell me which file in the perf source code
performs this calculation?
Thanks a lot!
Regards,
Yunqi
* Re: question about stalls in perf
2013-02-14 0:08 question about stalls in perf Yunqi Zhang
@ 2013-02-14 1:44 ` William Cohen
2013-02-14 5:45 ` Yunqi Zhang
0 siblings, 1 reply; 3+ messages in thread
From: William Cohen @ 2013-02-14 1:44 UTC (permalink / raw)
To: Yunqi Zhang; +Cc: linux-perf-users
On 02/13/2013 07:08 PM, Yunqi Zhang wrote:
> Hi all,
>
> I have recently been using perf for some profiling work on Sandy Bridge.
>
> I find the two events stalled-cycles-frontend and stalled-cycles-backend
> very interesting, but I am not sure of their exact definitions.
> So my question is: which hardware counters on Sandy Bridge are used to
> calculate these two events, and how (an equation would be perfect)?
> Furthermore, could someone tell me which file in the perf source code
> performs this calculation?
>
> Thanks a lot!
>
> Regards,
> Yunqi
Hi Yunqi,
It is probably best to look at which specific event codes are used to set up the counters for those events. For Sandy Bridge, they can be found around the following line of the kernel source:
http://lxr.linux.no/#linux+v3.7.7/arch/x86/kernel/cpu/perf_event_intel.c#L2069
2068 /* UOPS_ISSUED.ANY,c=1,i=1 to count stall cycles */
2069 intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =
2070 X86_CONFIG(.event=0x0e, .umask=0x01, .inv=1, .cmask=1);
2071 /* UOPS_DISPATCHED.THREAD,c=1,i=1 to count stall cycles*/
2072 intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
2073 X86_CONFIG(.event=0xb1, .umask=0x01, .inv=1, .cmask=1);
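The first event counts the number of cycles in which no uops are issued to the queue. The second is built the same way (c=1, i=1) and counts the number of cycles in which no uops are dispatched.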
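To make the X86_CONFIG() fields above more concrete, here is a minimal sketch in C. It assumes the usual Intel event-select register layout (event in bits 0-7, umask in bits 8-15, inv in bit 23, cmask in bits 24-31). The helper names raw_config() and count_cycles() are hypothetical and are not part of the kernel or of perf. The second function is only a toy model of what cmask=1 together with inv=1 means: the counter ticks in every cycle where fewer than one uop was seen, i.e. a stall cycle.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Pack the fields into a raw event config value.
 * Assumed layout: event select in bits 0-7, unit mask in bits 8-15,
 * invert flag in bit 23, counter mask in bits 24-31.
 */
static uint64_t raw_config(uint8_t event, uint8_t umask,
                           unsigned inv, uint8_t cmask)
{
    return (uint64_t)event |
           ((uint64_t)umask << 8) |
           ((uint64_t)(inv & 1u) << 23) |
           ((uint64_t)cmask << 24);
}

/*
 * Toy model of cmask/inv. uops_per_cycle[i] is the number of uops the
 * underlying event would report in cycle i. Without inv the counter
 * ticks in cycles where that number is >= cmask; with inv it ticks in
 * cycles where the number is < cmask.
 */
static uint64_t count_cycles(const unsigned *uops_per_cycle, size_t ncycles,
                             unsigned cmask, int inv)
{
    uint64_t ticks = 0;

    for (size_t i = 0; i < ncycles; i++) {
        int reached = uops_per_cycle[i] >= cmask;

        if (inv ? !reached : reached)
            ticks++;
    }
    return ticks;
}
```

Under that assumed layout, raw_config(0x0e, 0x01, 1, 1) evaluates to 0x180010e and raw_config(0xb1, 0x01, 1, 1) to 0x18001b1. Such values could presumably be given to perf as raw events (for example -e r180010e) to cross-check the generic stalled-cycles events on the same machine.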
The events are described in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes 3A, 3B, and 3C: System Programming Guide, Parts 1 and 2. That manual and the Architecture Optimization Reference Manual are both available from:
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html
Table 19-6 of Volume 3 (Non-Architectural Performance Events in the Processor Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E5 Family) describes the events with codes 0x0e and 0xb1.
Section 2.1.1 of the optimization manual describes the Sandy Bridge pipeline, and Section B.3.2, "Hierarchical Top-Down Performance Characterization Methodology and Locating Performance Bottlenecks", describes front-end and back-end stalls.
-Will
* Re: question about stalls in perf
2013-02-14 1:44 ` William Cohen
@ 2013-02-14 5:45 ` Yunqi Zhang
0 siblings, 0 replies; 3+ messages in thread
From: Yunqi Zhang @ 2013-02-14 5:45 UTC (permalink / raw)
To: linux-perf-users
William Cohen <wcohen <at> redhat.com> writes:
>
> Hi Yunqi,
>
> It is probably best to look at which specific event codes are used to set up
> the counters for those events. For Sandy Bridge, they can be found around the
> following line of the kernel source:
>
> http://lxr.linux.no/#linux+v3.7.7/arch/x86/kernel/cpu/perf_event_intel.c#L2069
>
> 2068 /* UOPS_ISSUED.ANY,c=1,i=1 to count stall cycles */
> 2069 intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =
> 2070 X86_CONFIG(.event=0x0e, .umask=0x01, .inv=1, .cmask=1);
> 2071 /* UOPS_DISPATCHED.THREAD,c=1,i=1 to count stall cycles*/
> 2072 intel_perfmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
> 2073 X86_CONFIG(.event=0xb1, .umask=0x01, .inv=1, .cmask=1);
>
> The first event counts the number of cycles in which no uops are issued to
> the queue. The second is built the same way (c=1, i=1) and counts the number
> of cycles in which no uops are dispatched.
>
> The events are described in the Intel® 64 and IA-32 Architectures Software
> Developer's Manual, Combined Volumes 3A, 3B, and 3C: System Programming
> Guide, Parts 1 and 2. That manual and the Architecture Optimization Reference
> Manual are both available from:
>
> http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html
>
> Table 19-6 of Volume 3 (Non-Architectural Performance Events in the Processor
> Core Common to 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx,
> Intel® Core™ i3-2xxx Processor Series and Intel® Xeon® Processors E5 Family)
> describes the events with codes 0x0e and 0xb1.
>
> Section 2.1.1 of the optimization manual describes the Sandy Bridge pipeline,
> and Section B.3.2, "Hierarchical Top-Down Performance Characterization
> Methodology and Locating Performance Bottlenecks", describes front-end and
> back-end stalls.
>
> -Will
>
Thanks a lot Will! Happy Valentine's Day!