From: Joe Mario <jmario@redhat.com>
To: Leo Yan <leo.yan@linaro.org>, Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ali Saidi <alisaidi@amazon.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org, german.gomez@arm.com,
	benh@kernel.crashing.org, Nick.Forrington@arm.com,
	alexander.shishkin@linux.intel.com, andrew.kilroy@arm.com,
	james.clark@arm.com, john.garry@huawei.com,
	Jiri Olsa <jolsa@kernel.org>,
	kjain@linux.ibm.com, lihuafei1@huawei.com, mark.rutland@arm.com,
	mathieu.poirier@linaro.org, mingo@redhat.com,
	namhyung@kernel.org, peterz@infradead.org, will@kernel.org
Subject: Re: [PATCH v8 0/4] perf: arm-spe: Decode SPE source and use for perf c2c
Date: Thu, 19 May 2022 11:16:53 -0400
Message-ID: <32e5a3b7-9294-bbd5-0ae4-b5c04eb4e0e6@redhat.com>
In-Reply-To: <20220518041630.GD402837@leoy-ThinkPad-X240s>



On 5/18/22 12:16 AM, Leo Yan wrote:
> Hi Joe,
> 
> On Tue, May 17, 2022 at 06:20:03PM -0300, Arnaldo Carvalho de Melo wrote:
>> Em Tue, May 17, 2022 at 02:03:21AM +0000, Ali Saidi escreveu:
>>> When synthesizing data from SPE, augment the type with source information
>>> for Arm Neoverse cores so we can detect situations like cache line
>>> contention and transfers on Arm platforms.
>>>
>>> This change enables future changes to c2c on a system with SPE where lines that
>>> are shared among multiple cores show up in perf c2c output.
>>>
>>> Changes in v9:
>>>  * Change reporting of remote socket data which should make Leo's upcoming
>>>    patch set for c2c make sense on multi-socket platforms
>>
>> Hey,
>>
>> 	Joe Mario, who is one of the 'perf c2c' authors, asked me about a
>> git tree he could clone from for building both the kernel and
>> tools/perf/ so that he could do tests. Can you please provide that?
> 
> I have uploaded the latest patches for enabling 'perf c2c' on Arm SPE
> on the repo:
> 
> https://git.linaro.org/people/leo.yan/linux-spe.git branch: perf_c2c_arm_spe_peer_v3
> 
> Below are quick notes for building the kernel with Arm SPE enabled:
> 
>   $ git clone -b perf_c2c_arm_spe_peer_v3 https://git.linaro.org/people/leo.yan/linux-spe.git
> 
>   Or
> 
>   $ git clone -b perf_c2c_arm_spe_peer_v3 ssh://git@git.linaro.org/people/leo.yan/linux-spe.git
> 
>   $ cd linux-spe
> 
>   # Build kernel
>   $ make defconfig
>   $ ./scripts/config -e CONFIG_PID_IN_CONTEXTIDR
>   $ ./scripts/config -e CONFIG_ARM_SPE_PMU
>   $ make Image
> 
>   # Build perf
>   $ cd tools/perf
>   $ make VF=1 DEBUG=1
> 
> When booting the kernel, please add the option "kpti=off" to the kernel
> command line; you might need to update the grub menu for this.
> 
> Please feel free to let us know if anything is unclear.
> 
> Thank you,
> Leo
> 

Hi Leo:
Thanks for getting this working on ARM.  I do have a few comments.

I built and ran this on an ARM Neoverse-N1 system with 2 NUMA nodes.

Comment 1:
When I run "perf c2c report", the "Node" field is marked "N/A".  It's supposed to show the NUMA node where the data address for the cacheline resides.  That's important both to see which node the hot data resides on and to see whether that data is getting lots of cross-NUMA accesses.
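
For illustration only, the lookup behind the "Node" column can be thought of as a search over per-node physical address ranges. Below is a minimal Python sketch, not perf's actual code. It assumes each sample carries a physical address and that the per-node memory ranges are known. The ranges, node numbers, and the names build_index and node_of are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical (start, end_exclusive, node) physical memory ranges;
# real values would have to come from the system under test.
RANGES = [
    (0x0000_0000, 0x4000_0000, 0),
    (0x8000_0000, 0xC000_0000, 1),
]

def build_index(ranges):
    """Sort ranges by start address and return (starts, sorted_ranges)."""
    ordered = sorted(ranges)
    return [r[0] for r in ordered], ordered

def node_of(phys_addr, index):
    """Return the NUMA node owning phys_addr, or None if no range matches."""
    starts, ordered = index
    # Find the last range whose start is <= phys_addr.
    i = bisect_right(starts, phys_addr) - 1
    if i < 0:
        return None
    start, end, node = ordered[i]
    return node if start <= phys_addr < end else None

INDEX = build_index(RANGES)
```

In this sketch, a result of None plays the role of the "N/A" seen in the report: the address could not be attributed to any node.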

Comment 2:
I'm assuming you're identifying the contended cachelines using the "peer" load response, which indicates the load was resolved from a "peer" cpu's cacheline.  Please confirm.
If that's true, is it possible to identify whether that "peer" response came from the local or a remote NUMA node?

I ask because being able to identify both local and remote HITMs on Intel x86_64 has been quite valuable.  Remote HITMs are costly, and separating them out helps the viewer see whether they need to optimize their cpu affinity or change which node their hot data resides on.
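
To make the question concrete: if each sample carried both a "peer" indication and a local/remote indication, the Peer count could be split the same way the RmtHitm and LclHitm columns are. Below is a minimal Python sketch under that assumption. The Sample record and tally_peers are hypothetical and are not perf structures. Whether the hardware can supply the remote field at all is the open question here.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Sample:
    """Hypothetical, already-decoded load sample (not a perf structure)."""
    peer: bool    # load was satisfied from a peer cpu's cacheline
    remote: bool  # assumption: the peer sits on a different NUMA node

def tally_peers(samples):
    """Count local and remote peer hits; non-peer samples are ignored."""
    counts = Counter(local_peer=0, remote_peer=0)
    for s in samples:
        if s.peer:
            counts["remote_peer" if s.remote else "local_peer"] += 1
    return counts
```

A high remote_peer count relative to local_peer would point at cpu affinity or data placement as the thing to tune.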

Last Comment:
There's a row in the Pareto table that has incorrect column alignment.
Look at row 80 below in the truncated snippet of output.  It has an extra field inserted at the beginning.
I also show what the corrected output should look like.

Incorrect row 80:
    71	=================================================
    72	      Shared Cache Line Distribution Pareto      
    73	=================================================
    74	#
    75	# ----- HITM -----    Snoop  ------- Store Refs ------  ------- CL --------                      
    76	# RmtHitm  LclHitm     Peer   L1 Hit  L1 Miss      N/A    Off  Node  PA cnt        Code address
    77	# .......  .......  .......  .......  .......  .......  .....  ....  ......  ..................
    78	#
    79	  -------------------------------------------------------------------------------
    80	      0        0        0     4648        0        0    11572            0x422140
    81	  -------------------------------------------------------------------------------
    82	    0.00%    0.00%    0.00%    0.00%    0.00%   44.47%    0x0   N/A       0            0x400ce8
    83	    0.00%    0.00%   10.26%    0.00%    0.00%    0.00%    0x0   N/A       0            0x400e48
    84	    0.00%    0.00%    0.00%    0.00%    0.00%   55.53%    0x0   N/A       0            0x400e54
    85	    0.00%    0.00%   89.74%    0.00%    0.00%    0.00%    0x8   N/A       0            0x401038


Corrected row 80:
    71	=================================================
    72	      Shared Cache Line Distribution Pareto      
    73	=================================================
    74	#
    75	# ----- HITM -----    Snoop  ------- Store Refs -----   ------- CL --------                       
    76	# RmtHitm  LclHitm     Peer   L1 Hit  L1 Miss     N/A     Off  Node  PA cnt        Code address
    77	# .......  .......  .......  .......  .......  ......   .....  ....  ......  ..................
    78	#
    79	  -------------------------------------------------------------------------------
    80	       0        0     4648        0        0    11572            0x422140
    81	  -------------------------------------------------------------------------------
    82	    0.00%    0.00%    0.00%    0.00%    0.00%   44.47%    0x0   N/A       0            0x400ce8
    83	    0.00%    0.00%   10.26%    0.00%    0.00%    0.00%    0x0   N/A       0            0x400e48
    84	    0.00%    0.00%    0.00%    0.00%    0.00%   55.53%    0x0   N/A       0            0x400e54
    85	    0.00%    0.00%   89.74%    0.00%    0.00%    0.00%    0x8   N/A       0            0x401038
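
The misalignment above can also be caught mechanically by comparing field counts. Below is a minimal Python sketch, not part of perf. It assumes the line-number prefix has been stripped from each row, and it simply checks that the per-cacheline totals row has as many leading counters as a detail row has leading percentages. The helper names are hypothetical.

```python
import re

def leading_fields(row, pattern):
    """Count consecutive whitespace-separated fields, from the left, that match pattern."""
    count = 0
    for field in row.split():
        if not re.fullmatch(pattern, field):
            break
        count += 1
    return count

def totals_row_aligned(totals_row, detail_row):
    """True if the totals row has as many counters as the detail row has percentages."""
    counters = leading_fields(totals_row, r"\d+")
    percents = leading_fields(detail_row, r"\d+\.\d+%")
    return counters == percents

# Rows copied from the output above, with the line-number prefix removed.
BAD = "      0        0        0     4648        0        0    11572            0x422140"
GOOD = "       0        0     4648        0        0    11572            0x422140"
DETAIL = "    0.00%    0.00%    0.00%    0.00%    0.00%   44.47%    0x0   N/A       0            0x400ce8"
```

Calling totals_row_aligned(BAD, DETAIL) flags the row with the extra field, while the corrected row passes the same check.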
       
Thanks again for doing this.
Joe

