From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ali Saidi
Subject: Re: [PATCH v2 1/2] perf arm-spe: Use SPE data source for neoverse cores
Date: Sun, 13 Mar 2022 19:06:19 +0000
Message-ID: <20220313190619.18914-1-alisaidi@amazon.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To: <20220313114615.GA143848@leoy-ThinkPad-X240s>
References: <20220313114615.GA143848@leoy-ThinkPad-X240s>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, 13 Mar 2022 11:47:58 +0000 Leo Yan wrote:
> Hi Ali, German,
>
> On Wed, Mar 02, 2022 at 11:59:05AM +0000, German Gomez wrote:
> > Hi Ali,
> >
> > On 21/02/2022 22:47, Ali Saidi wrote:
> > > When synthesizing data from SPE, augment the type with source information
> > > for Arm Neoverse cores. The field is IMPLDEF but the Neoverse cores all use
> > > the same encoding. I can't find encoding information for any other SPE
> > > implementations to unify their choices with Arm's thus that is left for
> > > future work.

[snip]

> > >
> > > +static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *record,
> > > +						union perf_mem_data_src *data_src)
> > > +{
> > > +	switch (record->source) {
> > > +	case ARM_SPE_NV_L1D:
> > > +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> >
> > I understand mem_lvl is deprecated but shouldn't we add the level bits here as well for backwards compat?
>
> Thanks for pointing out this.  Yeah, I think German's suggestion is
> valid, the commit 6ae5fa61d27d ("perf/x86: Fix data source decoding
> for Skylake") introduces new field 'mem_lvl_num', but it also keeps
> backwards compatible for the field 'mem_lvl'.

I thought about that, but then I'm making some assumption about how to fit
this into the old LVL framework, which is perhaps OK (afaik there are no
Neoverse systems with more than 3 cache levels). What stopped me was that
perf_mem__lvl_scnprintf() does the wrong thing when both are set so I assumed
that setting both was not the right course of action.

>
> > > +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_L1;
> > > +		break;
> > > +	case ARM_SPE_NV_L2:
> > > +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> > > +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_L2;
> > > +		break;
> > > +	case ARM_SPE_NV_PEER_CORE:
> > > +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> > > +		data_src->mem_snoop = PERF_MEM_SNOOP_HITM;
> > > +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE;
>
> For PEER_CORE data source, we don't know if it's coming from peer
> core's L1 cache or L2 cache, right?

We don't.

>
> If so, do you think if it's possible to retrieve more accurate info
> from the field "record->type"?

No, we just don't know and it really doesn't matter. The main reason to
understand the source is to understand the penalty of data coming from the
source and that it's coming from a core should be sufficient.
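
To make the backwards-compat question above concrete, here is a minimal sketch, assuming stand-in constants and a hypothetical demo_synth() helper rather than the real perf uapi definitions, of filling both the legacy mem_lvl bits and mem_lvl_num where the level is known. Mapping the system cache onto the legacy L3 bit is an assumption made only for this sketch (the thread does not settle it), and the sketch does nothing about the perf_mem__lvl_scnprintf() behaviour mentioned above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-ins for the PERF_MEM_LVL_* and PERF_MEM_LVLNUM_* constants; the
 * names and values below are assumptions for illustration only.
 */
enum { DEMO_LVL_HIT = 0x02, DEMO_LVL_L1 = 0x08, DEMO_LVL_L2 = 0x20, DEMO_LVL_L3 = 0x40 };
enum { DEMO_LVLNUM_L1 = 0x01, DEMO_LVLNUM_L2 = 0x02, DEMO_LVLNUM_L4 = 0x04 };

/* Hypothetical, reduced version of the data source fields. */
struct demo_data_src {
	unsigned int mem_lvl;     /* legacy bitmask: hit/miss plus level bits */
	unsigned int mem_lvl_num; /* newer field: a single level number */
};

/* Hypothetical subset of the Neoverse sources with a known level. */
enum demo_source { DEMO_SRC_L1D, DEMO_SRC_L2, DEMO_SRC_SYS_CACHE };

void demo_synth(enum demo_source src, struct demo_data_src *ds)
{
	assert(ds != NULL);

	ds->mem_lvl = DEMO_LVL_HIT;
	switch (src) {
	case DEMO_SRC_L1D:
		ds->mem_lvl |= DEMO_LVL_L1;
		ds->mem_lvl_num = DEMO_LVLNUM_L1;
		break;
	case DEMO_SRC_L2:
		ds->mem_lvl |= DEMO_LVL_L2;
		ds->mem_lvl_num = DEMO_LVLNUM_L2;
		break;
	case DEMO_SRC_SYS_CACHE:
		/* Assumption: the legacy field has no L4, so reuse the L3 bit. */
		ds->mem_lvl |= DEMO_LVL_L3;
		ds->mem_lvl_num = DEMO_LVLNUM_L4;
		break;
	}
}
```

The sources whose level is unknown (peer core, cluster, remote) are left out on purpose, since no legacy level bit describes them.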

>
> > > +		break;
> > > +	/*
> > > +	 * We don't know if this is L1, L2, or even L3 (for the cases the system
> > > +	 * has an L3, but we do know it was a cache-2-cache transfer, so set
> > > +	 * SNOOP_HITM
> > > +	 */
> > > +	case ARM_SPE_NV_LCL_CLSTR:
> > > +	case ARM_SPE_NV_PEER_CLSTR:
> > > +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> > > +		data_src->mem_snoop = PERF_MEM_SNOOP_HITM;
> > > +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE;
>
> Seems to me, we need to add attribution to indicate the difference
> between ARM_SPE_NV_PEER_CORE and ARM_SPE_NV_LCL_CLSTR.

I don't think we really do, see my reasoning above.

>
> For ARM_SPE_NV_PEER_CLSTR data source, should we set any "remote"
> attribution as well?

No, we should leave remote for data coming from another chip/socket which is
really impactful.

>
> > > +		break;
> > > +	/*
> > > +	 * System cache is assumed to be L4, as cluster cache (if it exists)
> > > +	 * would be L3 cache on Neoverse platforms
> > > +	 */
> > > +	case ARM_SPE_NV_SYS_CACHE:
> > > +		data_src->mem_lvl = PERF_MEM_LVL_HIT;
> > > +		data_src->mem_lvl_num = PERF_MEM_LVLNUM_L4;
> > > +		break;
> > > +	/*
> > > +	 * We don't know what level it hit in, except it came from the other
> > > +	 * socket
> > > +	 */
> > > +	case ARM_SPE_NV_REMOTE:
> > > +		data_src->mem_snoop = PERF_MEM_SNOOP_HITM;
> > > +		data_src->mem_remote = PERF_MEM_REMOTE_REMOTE;
> > > +		break;
>
> Just curious, is it possible that 'record->source' combines multiple
> bits?  Like we can get a data source value with:
>
>   ARM_SPE_NV_REMOTE | ARM_SPE_NV_REMOTE

source encodes a single value (not bits that represent flags) on Neoverse cores.
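
A small sketch of why an enumerated source field has to be compared as a whole rather than tested bit by bit. The DEMO_SRC_* names and their values are hypothetical and chosen only to show the aliasing; they are not taken from the patch.

```c
#include <assert.h>

/* Hypothetical encodings for an enumerated (non-flag) source field. */
enum demo_source {
	DEMO_SRC_L1D       = 0x0,
	DEMO_SRC_L2        = 0x8,
	DEMO_SRC_PEER_CORE = 0x9,
};

/*
 * OR-ing two enumerators does not produce "both sources"; with these
 * values it simply produces the encoding of a different source.
 */
static_assert((DEMO_SRC_L2 | DEMO_SRC_PEER_CORE) == DEMO_SRC_PEER_CORE,
	      "OR of two sources aliases another source");

/* Correct for an enumerated field: compare the whole value. */
int demo_is_l2(unsigned int source)
{
	return source == DEMO_SRC_L2;
}

/* Misleading for an enumerated field: treats the value as a flag set. */
int demo_is_l2_bitmask(unsigned int source)
{
	return (source & DEMO_SRC_L2) == DEMO_SRC_L2;
}
```

With these assumed values, demo_is_l2_bitmask() also reports a peer-core access as an L2 hit, which is the kind of confusion a switch on the full value avoids.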

[snip]

> > > @@ -796,6 +868,10 @@ static int arm_spe_process_event(struct perf_session *session,
> > >  	u64 timestamp;
> > >  	struct arm_spe *spe = container_of(session->auxtrace,
> > >  			struct arm_spe, auxtrace);
> > > +	const char *cpuid = perf_env__cpuid(session->evlist->env);
> > > +	u64 midr = strtol(cpuid, NULL, 16);
> > > +
> > > +	spe->midr = midr;
> >
> > I think this midr setup belongs in the arm_spe_process_auxtrace_info callback instead.
>
> Yeah, arm_spe_process_event() would be invoked for multiple times for
> processing perf events.  arm_spe_process_auxtrace_info() would be a
> good place to initialize midr.

Will do.

Thanks,
Ali
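
A minimal sketch of parsing the cpuid string into a MIDR value once, during setup, instead of on every event. demo_parse_midr() is a hypothetical helper, not a perf function, and the use of strtoull() with error checking is a variation on the strtol() call in the quoted hunk, not what the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a hexadecimal cpuid string (with or without a "0x" prefix) into
 * *midr. Returns 0 on success and -1 if the string is missing or is not
 * a complete hexadecimal number. Intended to be called once at setup.
 */
int demo_parse_midr(const char *cpuid, uint64_t *midr)
{
	char *end;
	unsigned long long val;

	assert(midr != NULL); /* a NULL output pointer is a caller bug */

	if (cpuid == NULL)
		return -1;

	errno = 0;
	val = strtoull(cpuid, &end, 16);
	if (errno != 0 || end == cpuid || *end != '\0')
		return -1;

	*midr = (uint64_t)val;
	return 0;
}
```

The result would then be stored in the decoder state once, so the per-event path only reads the cached value.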