* [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data @ 2022-05-22 22:14 Tyson Thomas 2022-05-24 14:58 ` Liang, Kan 0 siblings, 1 reply; 5+ messages in thread From: Tyson Thomas @ 2022-05-22 22:14 UTC (permalink / raw) To: kan.liang, linux-kernel, linux-perf-users Hi Kan, Linux-Perf Team I have observed some odd behaviour within perf when using perf-mem. Specifically the reported latency under the weight column appears to be unreasonably high. Here is a given sample from a recent test, I find that some of the latencies are close to an unsigned short and I cannot seem to be understand why that would be outside of it being an issue with the perf events. This can be replicated using a NAS benchmark, specifically cg.D. I observe the following results in perf mem report (just getting the top 10 results) 0.02% ,62515 ,L1 or L1 hit 0.02% ,54048 ,L1 or L1 hit 0.02% ,52206 ,L1 or L1 hit 0.02% ,49831 ,L1 or L1 hit 0.02% ,49056 ,Local RAM or RAM hit 0.01% ,40666 ,LFB or LFB hit 0.01% ,38080 ,L1 or L1 hit 0.01% ,36772 ,L1 or L1 hit 0.01% ,36729 ,LFB or LFB hit 0.01% ,27101 ,LFB or LFB hit Is it possible for someone to shed some light on this or am I misunderstanding how the weight column is used here? This appears to have been an issue on 5.4, 5.10 and 5.15. I am looking into seeing if it is still present in 5.17 and 5.18. I've also tried this on different Intel CPUs such as Intel Xeon 6230, i5-1135G7, Intel Xeon 6330 Any insight or help would be appreciated, Tyson ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data 2022-05-22 22:14 [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data Tyson Thomas @ 2022-05-24 14:58 ` Liang, Kan 2022-06-01 5:03 ` Tyson Thomas 0 siblings, 1 reply; 5+ messages in thread From: Liang, Kan @ 2022-05-24 14:58 UTC (permalink / raw) To: Tyson Thomas, linux-kernel, linux-perf-users On 5/22/2022 6:14 PM, Tyson Thomas wrote: > Hi Kan, Linux-Perf Team > > I have observed some odd behaviour within perf when using perf-mem. Specifically the reported latency under the weight column appears to be unreasonably high. > > Here is a given sample from a recent test, I find that some of the latencies are close to an unsigned short and I cannot seem to be understand why that would be outside of it being an issue with the perf events. > > This can be replicated using a NAS benchmark, specifically cg.D. > > I observe the following results in perf mem report (just getting the top 10 results) > Could you please show me the exact perf command used? With my perf mem report, The first column is the Overhead. The second column is the number of samples. The third column is the weight. The fourth is the Memory access. Seems like the weight is missed? Could you please check the perf report -D? It will dump the weight for each Sample. Does it look correct? Thanks, Kan > 0.02% ,62515 ,L1 or L1 hit > 0.02% ,54048 ,L1 or L1 hit > 0.02% ,52206 ,L1 or L1 hit > 0.02% ,49831 ,L1 or L1 hit > 0.02% ,49056 ,Local RAM or RAM hit > 0.01% ,40666 ,LFB or LFB hit > 0.01% ,38080 ,L1 or L1 hit > 0.01% ,36772 ,L1 or L1 hit > 0.01% ,36729 ,LFB or LFB hit > 0.01% ,27101 ,LFB or LFB hit > > Is it possible for someone to shed some light on this or am I misunderstanding how the weight column is used here? > This appears to have been an issue on 5.4, 5.10 and 5.15. I am looking into seeing if it is still present in 5.17 and 5.18. > > I've also tried this on different Intel CPUs such as Intel Xeon 6230, i5-1135G7, Intel Xeon 6330 > > Any insight or help would be appreciated, > Tyson ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data 2022-05-24 14:58 ` Liang, Kan @ 2022-06-01 5:03 ` Tyson Thomas 2022-06-03 14:03 ` Liang, Kan 0 siblings, 1 reply; 5+ messages in thread From: Tyson Thomas @ 2022-06-01 5:03 UTC (permalink / raw) To: Liang, Kan, linux-kernel, linux-perf-users Hi Kan, Thanks for the reply. The exact command used was: perf mem report The columns are in order: Overhead, Samples, Local Weight, Memory Access. The results I sent before are: Overhead, Local Weight and Memory. So, to update the table from before, each tuple has a sample of 1, so it is: 0.02%,1 ,62515 ,L1 or L1 hit 0.02%,1 ,54048 ,L1 or L1 hit 0.02%,1 ,52206 ,L1 or L1 hit 0.02%,1 ,49831 ,L1 or L1 hit 0.02%,1 ,49056 ,Local RAM or RAM hit 0.01%,1 ,40666 ,LFB or LFB hit 0.01%,1 ,38080 ,L1 or L1 hit 0.01%,1 ,36772 ,L1 or L1 hit So, the weights are absurdly high here. I'll give a reply on perf report -D shortly. Regards, Tyson From: Liang, Kan <kan.liang@linux.intel.com> Sent: Wednesday, 25 May 2022 12:58 AM To: Tyson Thomas <tyson.thomas@sydney.edu.au>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; linux-perf-users@vger.kernel.org <linux-perf-users@vger.kernel.org> Subject: Re: [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data On 5/22/2022 6:14 PM, Tyson Thomas wrote: > Hi Kan, Linux-Perf Team > > I have observed some odd behaviour within perf when using perf-mem. Specifically the reported latency under the weight column appears to be unreasonably high. > > Here is a given sample from a recent test, I find that some of the latencies are close to an unsigned short and I cannot seem to be understand why that would be outside of it being an issue with the perf events. > > This can be replicated using a NAS benchmark, specifically cg.D. > > I observe the following results in perf mem report (just getting the top 10 results) > Could you please show me the exact perf command used? With my perf mem report, The first column is the Overhead. The second column is the number of samples. The third column is the weight. The fourth is the Memory access. Seems like the weight is missed? Could you please check the perf report -D? It will dump the weight for each Sample. Does it look correct? Thanks, Kan > 0.02% ,62515 ,L1 or L1 hit > 0.02% ,54048 ,L1 or L1 hit > 0.02% ,52206 ,L1 or L1 hit > 0.02% ,49831 ,L1 or L1 hit > 0.02% ,49056 ,Local RAM or RAM hit > 0.01% ,40666 ,LFB or LFB hit > 0.01% ,38080 ,L1 or L1 hit > 0.01% ,36772 ,L1 or L1 hit > 0.01% ,36729 ,LFB or LFB hit > 0.01% ,27101 ,LFB or LFB hit > > Is it possible for someone to shed some light on this or am I misunderstanding how the weight column is used here? > This appears to have been an issue on 5.4, 5.10 and 5.15. I am looking into seeing if it is still present in 5.17 and 5.18. > > I've also tried this on different Intel CPUs such as Intel Xeon 6230, i5-1135G7, Intel Xeon 6330 > > Any insight or help would be appreciated, > Tyson ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data 2022-06-01 5:03 ` Tyson Thomas @ 2022-06-03 14:03 ` Liang, Kan 2022-06-07 20:39 ` Tyson Thomas 0 siblings, 1 reply; 5+ messages in thread From: Liang, Kan @ 2022-06-03 14:03 UTC (permalink / raw) To: Tyson Thomas, linux-kernel, linux-perf-users On 6/1/2022 1:03 AM, Tyson Thomas wrote: > Hi Kan, > > Thanks for the reply. > > The exact command used was: perf mem report > > The columns are in order: Overhead, Samples, Local Weight, Memory Access. > The results I sent before are: Overhead, Local Weight and Memory. > > So, to update the table from before, each tuple has a sample of 1, so it is: > > 0.02%,1 ,62515 ,L1 or L1 hit > 0.02%,1 ,54048 ,L1 or L1 hit > 0.02%,1 ,52206 ,L1 or L1 hit > 0.02%,1 ,49831 ,L1 or L1 hit > 0.02%,1 ,49056 ,Local RAM or RAM hit > 0.01%,1 ,40666 ,LFB or LFB hit > 0.01%,1 ,38080 ,L1 or L1 hit > 0.01%,1 ,36772 ,L1 or L1 hit > > So, the weights are absurdly high here. > > I'll give a reply on perf report -D shortly. > > Regards, > Tyson > > > > From: Liang, Kan <kan.liang@linux.intel.com> > Sent: Wednesday, 25 May 2022 12:58 AM > To: Tyson Thomas <tyson.thomas@sydney.edu.au>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; linux-perf-users@vger.kernel.org <linux-perf-users@vger.kernel.org> > Subject: Re: [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data > > > > On 5/22/2022 6:14 PM, Tyson Thomas wrote: >> Hi Kan, Linux-Perf Team >> >> I have observed some odd behaviour within perf when using perf-mem. Specifically the reported latency under the weight column appears to be unreasonably high. >> >> Here is a given sample from a recent test, I find that some of the latencies are close to an unsigned short and I cannot seem to be understand why that would be outside of it being an issue with the perf events. >> >> This can be replicated using a NAS benchmark, specifically cg.D. I tried the cg.D benchmark on my machine, but I cannot reproduce the issue. The weight looks reasonable. $ perf mem record ./bin/cg.D.x NAS Parallel Benchmarks (NPB3.4-OMP) - CG Benchmark Size: 1500000 Iterations: 100 Number of available threads: 4 ^C[ perf record: Woken up 118 times to write data ] [ perf record: Captured and wrote 30.108 MB perf.data (438307 samples) ] $ perf mem report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 209K of event 'cpu/mem-loads,ldlat=30/Pu' # Total weight : 56835744 # Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,p_stage_cyc # # Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol > # ........ ............ ............ ........................ ................................ .................... ............> # 0.01% 66 53 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.01% 53 58 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.01% 59 52 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 53 53 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 43 65 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 50 55 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 51 50 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 46 55 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 49 51 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 1 2496 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 48 52 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 1 2441 Local RAM or RAM hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2420 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2415 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2403 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2399 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2390 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2385 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2385 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2378 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2358 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> Thanks, Kan >> >> I observe the following results in perf mem report (just getting the top 10 results) >> > > > Could you please show me the exact perf command used? > > With my perf mem report, > The first column is the Overhead. > The second column is the number of samples. > The third column is the weight. > The fourth is the Memory access. > > Seems like the weight is missed? > > Could you please check the perf report -D? > It will dump the weight for each Sample. > Does it look correct? > > Thanks, > Kan > >> 0.02% ,62515 ,L1 or L1 hit >> 0.02% ,54048 ,L1 or L1 hit >> 0.02% ,52206 ,L1 or L1 hit >> 0.02% ,49831 ,L1 or L1 hit >> 0.02% ,49056 ,Local RAM or RAM hit >> 0.01% ,40666 ,LFB or LFB hit >> 0.01% ,38080 ,L1 or L1 hit >> 0.01% ,36772 ,L1 or L1 hit >> 0.01% ,36729 ,LFB or LFB hit >> 0.01% ,27101 ,LFB or LFB hit >> >> Is it possible for someone to shed some light on this or am I misunderstanding how the weight column is used here? >> This appears to have been an issue on 5.4, 5.10 and 5.15. I am looking into seeing if it is still present in 5.17 and 5.18. >> >> I've also tried this on different Intel CPUs such as Intel Xeon 6230, i5-1135G7, Intel Xeon 6330 >> >> Any insight or help would be appreciated, >> Tyson ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data 2022-06-03 14:03 ` Liang, Kan @ 2022-06-07 20:39 ` Tyson Thomas 0 siblings, 0 replies; 5+ messages in thread From: Tyson Thomas @ 2022-06-07 20:39 UTC (permalink / raw) To: Liang, Kan, linux-kernel, linux-perf-users Hi Kan, Thanks for the reply and sharing your output. However, I have a couple of questions since the output is not what I have expected and appears odd. * Are you using the OpenMP version? I ask because I don't observe any openMP symbols in your output * Does your computer have enough ram for cg.D? (Your output does not include the iteration results and the perf file is smaller than expected) I observe a similar issue with cg.C.x (cg.C memory requirements are around 8GB) Regards, Tyson From: Liang, Kan <kan.liang@linux.intel.com> Sent: Saturday, 4 June 2022 12:03 AM To: Tyson Thomas <tyson.thomas@sydney.edu.au>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; linux-perf-users@vger.kernel.org <linux-perf-users@vger.kernel.org> Subject: Re: [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data On 6/1/2022 1:03 AM, Tyson Thomas wrote: > Hi Kan, > > Thanks for the reply. > > The exact command used was: perf mem report > > The columns are in order: Overhead, Samples, Local Weight, Memory Access. > The results I sent before are: Overhead, Local Weight and Memory. > > So, to update the table from before, each tuple has a sample of 1, so it is: > > 0.02%,1 ,62515 ,L1 or L1 hit > 0.02%,1 ,54048 ,L1 or L1 hit > 0.02%,1 ,52206 ,L1 or L1 hit > 0.02%,1 ,49831 ,L1 or L1 hit > 0.02%,1 ,49056 ,Local RAM or RAM hit > 0.01%,1 ,40666 ,LFB or LFB hit > 0.01%,1 ,38080 ,L1 or L1 hit > 0.01%,1 ,36772 ,L1 or L1 hit > > So, the weights are absurdly high here. > > I'll give a reply on perf report -D shortly. > > Regards, > Tyson > > > > From: Liang, Kan <kan.liang@linux.intel.com> > Sent: Wednesday, 25 May 2022 12:58 AM > To: Tyson Thomas <tyson.thomas@sydney.edu.au>; linux-kernel@vger.kernel.org <linux-kernel@vger.kernel.org>; linux-perf-users@vger.kernel.org <linux-perf-users@vger.kernel.org> > Subject: Re: [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data > > > > On 5/22/2022 6:14 PM, Tyson Thomas wrote: >> Hi Kan, Linux-Perf Team >> >> I have observed some odd behaviour within perf when using perf-mem. Specifically the reported latency under the weight column appears to be unreasonably high. >> >> Here is a given sample from a recent test, I find that some of the latencies are close to an unsigned short and I cannot seem to be understand why that would be outside of it being an issue with the perf events. >> >> This can be replicated using a NAS benchmark, specifically cg.D. I tried the cg.D benchmark on my machine, but I cannot reproduce the issue. The weight looks reasonable. $ perf mem record ./bin/cg.D.x NAS Parallel Benchmarks (NPB3.4-OMP) - CG Benchmark Size: 1500000 Iterations: 100 Number of available threads: 4 ^C[ perf record: Woken up 118 times to write data ] [ perf record: Captured and wrote 30.108 MB perf.data (438307 samples) ] $ perf mem report --stdio # To display the perf.data header info, please use --header/--header-only options. # # # Total Lost Samples: 0 # # Samples: 209K of event 'cpu/mem-loads,ldlat=30/Pu' # Total weight : 56835744 # Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,p_stage_cyc # # Overhead Samples Local Weight Memory access Symbol Shared Object Data Symbol > # ........ ............ ............ ........................ ................................ .................... ............> # 0.01% 66 53 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.01% 53 58 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.01% 59 52 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 53 53 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 43 65 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 50 55 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 51 50 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 46 55 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 49 51 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 1 2496 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 48 52 L1 or L1 hit [.] randlc_ cg.D.x [.] 0x00007f> 0.00% 1 2441 Local RAM or RAM hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2420 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2415 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2403 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2399 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2390 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2385 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2385 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2378 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> 0.00% 1 2358 LFB or LFB hit [.] sparse_ cg.D.x [.] 0x00007f> Thanks, Kan >> >> I observe the following results in perf mem report (just getting the top 10 results) >> > > > Could you please show me the exact perf command used? > > With my perf mem report, > The first column is the Overhead. > The second column is the number of samples. > The third column is the weight. > The fourth is the Memory access. > > Seems like the weight is missed? > > Could you please check the perf report -D? > It will dump the weight for each Sample. > Does it look correct? > > Thanks, > Kan > >> 0.02% ,62515 ,L1 or L1 hit >> 0.02% ,54048 ,L1 or L1 hit >> 0.02% ,52206 ,L1 or L1 hit >> 0.02% ,49831 ,L1 or L1 hit >> 0.02% ,49056 ,Local RAM or RAM hit >> 0.01% ,40666 ,LFB or LFB hit >> 0.01% ,38080 ,L1 or L1 hit >> 0.01% ,36772 ,L1 or L1 hit >> 0.01% ,36729 ,LFB or LFB hit >> 0.01% ,27101 ,LFB or LFB hit >> >> Is it possible for someone to shed some light on this or am I misunderstanding how the weight column is used here? >> This appears to have been an issue on 5.4, 5.10 and 5.15. I am looking into seeing if it is still present in 5.17 and 5.18. >> >> I've also tried this on different Intel CPUs such as Intel Xeon 6230, i5-1135G7, Intel Xeon 6330 >> >> Any insight or help would be appreciated, >> Tyson ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2022-06-08 1:09 UTC | newest] Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-05-22 22:14 [BUG REPORT] perf tools: x86_64: weight column displays odd memory latency data Tyson Thomas 2022-05-24 14:58 ` Liang, Kan 2022-06-01 5:03 ` Tyson Thomas 2022-06-03 14:03 ` Liang, Kan 2022-06-07 20:39 ` Tyson Thomas
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.