From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.0 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,HTML_MESSAGE,INCLUDES_PATCH,MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 25C4FC433E0 for ; Mon, 13 Jul 2020 02:44:55 +0000 (UTC) Received: from lists.ozlabs.org (lists.ozlabs.org [203.11.71.2]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3EA122070B for ; Mon, 13 Jul 2020 02:44:54 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3EA122070B Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=linux.vnet.ibm.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Received: from bilbo.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) by lists.ozlabs.org (Postfix) with ESMTP id 4B4p03621YzDqWy for ; Mon, 13 Jul 2020 12:44:51 +1000 (AEST) Authentication-Results: lists.ozlabs.org; spf=none (no SPF record) smtp.mailfrom=linux.vnet.ibm.com (client-ip=148.163.156.1; helo=mx0a-001b2d01.pphosted.com; envelope-from=atrajeev@linux.vnet.ibm.com; receiver=) Authentication-Results: lists.ozlabs.org; dmarc=fail (p=none dis=none) header.from=linux.vnet.ibm.com Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4B4nqv0TmPzDqQs for ; Mon, 13 Jul 2020 12:37:46 +1000 (AEST) Received: from pps.filterd (m0098404.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 06D2WNa9134328; Sun, 12 Jul 2020 22:37:40 -0400 Received: from ppma02fra.de.ibm.com (47.49.7a9f.ip4.static.sl-reverse.com [159.122.73.71]) by mx0a-001b2d01.pphosted.com with ESMTP id 3279gmm14w-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Sun, 12 Jul 2020 22:37:39 -0400 Received: from pps.filterd (ppma02fra.de.ibm.com [127.0.0.1]) by ppma02fra.de.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 06D2aZJ8029216; Mon, 13 Jul 2020 02:37:37 GMT Received: from b06cxnps4076.portsmouth.uk.ibm.com (d06relay13.portsmouth.uk.ibm.com [9.149.109.198]) by ppma02fra.de.ibm.com with ESMTP id 327527spxb-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Mon, 13 Jul 2020 02:37:36 +0000 Received: from d06av23.portsmouth.uk.ibm.com (d06av23.portsmouth.uk.ibm.com [9.149.105.59]) by b06cxnps4076.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 06D2aJur46596156 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 13 Jul 2020 02:36:19 GMT Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C52B3A4053; Mon, 13 Jul 2020 02:36:19 +0000 (GMT) Received: from d06av23.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id C9FD0A4040; Mon, 13 Jul 2020 02:36:16 +0000 (GMT) Received: from [9.79.215.199] (unknown [9.79.215.199]) by d06av23.portsmouth.uk.ibm.com (Postfix) with ESMTPS; Mon, 13 Jul 2020 02:36:16 +0000 (GMT) From: Athira Rajeev Message-Id: <20BF4FCD-A6CF-45FB-8242-AA190FB739C7@linux.vnet.ibm.com> Content-Type: multipart/alternative; boundary="Apple-Mail=_8D00A3FD-A5AD-46AB-917F-8C89296A1ADF" Mime-Version: 1.0 (Mac OS X Mail 13.4 \(3608.80.23.2.2\)) Subject: Re: [PATCH v2 09/10] tools/perf: Add perf tools support for extended register capability in powerpc Date: Mon, 13 Jul 2020 08:06:13 +0530 In-Reply-To: <87pn962owo.fsf@mpe.ellerman.id.au> To: Michael Ellerman References: <1593595262-1433-1-git-send-email-atrajeev@linux.vnet.ibm.com> <1593595262-1433-10-git-send-email-atrajeev@linux.vnet.ibm.com> <87pn962owo.fsf@mpe.ellerman.id.au> X-Mailer: Apple Mail (2.3608.80.23.2.2) X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.235, 18.0.687 definitions=2020-07-12_14:2020-07-10, 2020-07-12 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 suspectscore=0 clxscore=1015 mlxscore=0 malwarescore=0 mlxlogscore=999 bulkscore=0 phishscore=0 lowpriorityscore=0 spamscore=0 impostorscore=0 adultscore=0 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2006250000 definitions=main-2007130014 X-Mailman-Approved-At: Mon, 13 Jul 2020 12:43:13 +1000 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Michael Neuling , maddy@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org Errors-To: linuxppc-dev-bounces+linuxppc-dev=archiver.kernel.org@lists.ozlabs.org Sender: "Linuxppc-dev" --Apple-Mail=_8D00A3FD-A5AD-46AB-917F-8C89296A1ADF Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=us-ascii > On 08-Jul-2020, at 5:34 PM, Michael Ellerman = wrote: >=20 > Athira Rajeev > writes: >> From: Anju T Sudhakar >>=20 >> Add extended regs to sample_reg_mask in the tool side to use >> with `-I?` option. Perf tools side uses extended mask to display >> the platform supported register names (with -I? option) to the user >> and also send this mask to the kernel to capture the extended = registers >> in each sample. Hence decide the mask value based on the processor >> version. >>=20 >> Signed-off-by: Anju T Sudhakar >> [Decide extended mask at run time based on platform] >> Signed-off-by: Athira Rajeev >> Reviewed-by: Madhavan Srinivasan >=20 > Will need an ack from perf tools folks, who are not on Cc by the = looks. >=20 >> diff --git a/tools/arch/powerpc/include/uapi/asm/perf_regs.h = b/tools/arch/powerpc/include/uapi/asm/perf_regs.h >> index f599064..485b1d5 100644 >> --- a/tools/arch/powerpc/include/uapi/asm/perf_regs.h >> +++ b/tools/arch/powerpc/include/uapi/asm/perf_regs.h >> @@ -48,6 +48,18 @@ enum perf_event_powerpc_regs { >> PERF_REG_POWERPC_DSISR, >> PERF_REG_POWERPC_SIER, >> PERF_REG_POWERPC_MMCRA, >> - PERF_REG_POWERPC_MAX, >> + /* Extended registers */ >> + PERF_REG_POWERPC_MMCR0, >> + PERF_REG_POWERPC_MMCR1, >> + PERF_REG_POWERPC_MMCR2, >> + /* Max regs without the extended regs */ >> + PERF_REG_POWERPC_MAX =3D PERF_REG_POWERPC_MMCRA + 1, >=20 > I don't really understand this idea of a max that's not the max. Hi Michael This is the MAX without extended regs. This is mainly used in = `arch/powerpc/perf/perf_regs.c` to define pt_regs_offset ( to get index for other regs ) and also is used to determine whether requested = register is an extended reg while capturing data in sample ( in `perf_reg_value` ) Thanks Athira >=20 >> }; >> + >> +#define PERF_REG_PMU_MASK ((1ULL << PERF_REG_POWERPC_MAX) - 1) >> + >> +/* PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */ >> +#define PERF_REG_PMU_MASK_300 (((1ULL << (PERF_REG_POWERPC_MMCR2 + = 1)) - 1) \ >> + - PERF_REG_PMU_MASK) >> + >> #endif /* _UAPI_ASM_POWERPC_PERF_REGS_H */ >> diff --git a/tools/perf/arch/powerpc/include/perf_regs.h = b/tools/perf/arch/powerpc/include/perf_regs.h >> index e18a355..46ed00d 100644 >> --- a/tools/perf/arch/powerpc/include/perf_regs.h >> +++ b/tools/perf/arch/powerpc/include/perf_regs.h >> @@ -64,7 +64,10 @@ >> [PERF_REG_POWERPC_DAR] =3D "dar", >> [PERF_REG_POWERPC_DSISR] =3D "dsisr", >> [PERF_REG_POWERPC_SIER] =3D "sier", >> - [PERF_REG_POWERPC_MMCRA] =3D "mmcra" >> + [PERF_REG_POWERPC_MMCRA] =3D "mmcra", >> + [PERF_REG_POWERPC_MMCR0] =3D "mmcr0", >> + [PERF_REG_POWERPC_MMCR1] =3D "mmcr1", >> + [PERF_REG_POWERPC_MMCR2] =3D "mmcr2", >> }; >>=20 >> static inline const char *perf_reg_name(int id) >> diff --git a/tools/perf/arch/powerpc/util/perf_regs.c = b/tools/perf/arch/powerpc/util/perf_regs.c >> index 0a52429..9179230 100644 >> --- a/tools/perf/arch/powerpc/util/perf_regs.c >> +++ b/tools/perf/arch/powerpc/util/perf_regs.c >> @@ -6,9 +6,14 @@ >>=20 >> #include "../../../util/perf_regs.h" >> #include "../../../util/debug.h" >> +#include "../../../util/event.h" >> +#include "../../../util/header.h" >> +#include "../../../perf-sys.h" >>=20 >> #include >>=20 >> +#define PVR_POWER9 0x004E >> + >> const struct sample_reg sample_reg_masks[] =3D { >> SMPL_REG(r0, PERF_REG_POWERPC_R0), >> SMPL_REG(r1, PERF_REG_POWERPC_R1), >> @@ -55,6 +60,9 @@ >> SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR), >> SMPL_REG(sier, PERF_REG_POWERPC_SIER), >> SMPL_REG(mmcra, PERF_REG_POWERPC_MMCRA), >> + SMPL_REG(mmcr0, PERF_REG_POWERPC_MMCR0), >> + SMPL_REG(mmcr1, PERF_REG_POWERPC_MMCR1), >> + SMPL_REG(mmcr2, PERF_REG_POWERPC_MMCR2), >> SMPL_REG_END >> }; >>=20 >> @@ -163,3 +171,50 @@ int arch_sdt_arg_parse_op(char *old_op, char = **new_op) >>=20 >> return SDT_ARG_VALID; >> } >> + >> +uint64_t arch__intr_reg_mask(void) >> +{ >> + struct perf_event_attr attr =3D { >> + .type =3D PERF_TYPE_HARDWARE, >> + .config =3D PERF_COUNT_HW_CPU_CYCLES, >> + .sample_type =3D PERF_SAMPLE_REGS_INTR, >> + .precise_ip =3D 1, >> + .disabled =3D 1, >> + .exclude_kernel =3D 1, >> + }; >> + int fd, ret; >> + char buffer[64]; >> + u32 version; >> + u64 extended_mask =3D 0; >> + >> + /* Get the PVR value to set the extended >> + * mask specific to platform >=20 > Comment format is wrong, and punctuation please. >=20 >> + */ >> + get_cpuid(buffer, sizeof(buffer)); >> + ret =3D sscanf(buffer, "%u,", &version); >=20 > This is powerpc specific code, why not just use mfspr(SPRN_PVR), = rather > than redirecting via printf/sscanf. >=20 >> + >> + if (ret !=3D 1) { >> + pr_debug("Failed to get the processor version, unable to = output extended registers\n"); >> + return PERF_REGS_MASK; >> + } >> + >> + if (version =3D=3D PVR_POWER9) >> + extended_mask =3D PERF_REG_PMU_MASK_300; >> + else >> + return PERF_REGS_MASK; >> + >> + attr.sample_regs_intr =3D extended_mask; >> + attr.sample_period =3D 1; >> + event_attr_init(&attr); >> + >> + /* >> + * check if the pmu supports perf extended regs, before >> + * returning the register mask to sample. >> + */ >> + fd =3D sys_perf_event_open(&attr, 0, -1, -1, 0); >> + if (fd !=3D -1) { >> + close(fd); >> + return (extended_mask | PERF_REGS_MASK); >> + } >> + return PERF_REGS_MASK; >=20 > I think this would read a bit better like: >=20 > mask =3D PERF_REGS_MASK; >=20 > if (version =3D=3D PVR_POWER9) > extended_mask =3D PERF_REG_PMU_MASK_300; > else > return mask; >=20 > attr.sample_regs_intr =3D extended_mask; > attr.sample_period =3D 1; > event_attr_init(&attr); >=20 > /* > * check if the pmu supports perf extended regs, before > * returning the register mask to sample. > */ > fd =3D sys_perf_event_open(&attr, 0, -1, -1, 0); > if (fd !=3D -1) { > close(fd); > mask |=3D extended_mask; > } >=20 > return mask; >=20 >=20 > cheers --Apple-Mail=_8D00A3FD-A5AD-46AB-917F-8C89296A1ADF Content-Transfer-Encoding: quoted-printable Content-Type: text/html; charset=us-ascii

On 08-Jul-2020, at 5:34 PM, Michael Ellerman <mpe@ellerman.id.au> = wrote:

Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes:
From: Anju T Sudhakar <anju@linux.vnet.ibm.com>

Add extended regs to sample_reg_mask in the tool side to = use
with `-I?` option. Perf tools side uses extended mask = to display
the platform supported register names (with -I? = option) to the user
and also send this mask to the kernel = to capture the extended registers
in each sample. Hence = decide the mask value based on the processor
version.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
[Decide = extended mask at run time based on platform]
Signed-off-by: = Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Reviewed-by: = Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Will need an ack from perf tools = folks, who are not on Cc by the looks.

diff --git = a/tools/arch/powerpc/include/uapi/asm/perf_regs.h = b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
index = f599064..485b1d5 100644
--- = a/tools/arch/powerpc/include/uapi/asm/perf_regs.h
+++ = b/tools/arch/powerpc/include/uapi/asm/perf_regs.h
@@ -48,6 = +48,18 @@ enum perf_event_powerpc_regs {
= PERF_REG_POWERPC_DSISR,
= PERF_REG_POWERPC_SIER,
PERF_REG_POWERPC_MMCRA,
- = PERF_REG_POWERPC_MAX,
+ /* Extended registers */
+ = PERF_REG_POWERPC_MMCR0,
+ = PERF_REG_POWERPC_MMCR1,
+ = PERF_REG_POWERPC_MMCR2,
+ /* Max = regs without the extended regs */
+ = PERF_REG_POWERPC_MAX =3D PERF_REG_POWERPC_MMCRA + 1,

I don't really understand this idea of a max that's not the = max.

Hi = Michael

This is the MAX without = extended regs. This is mainly used in `arch/powerpc/perf/perf_regs.c` to = define pt_regs_offset ( to get index
for other regs ) and also = is used to determine whether requested register is an extended reg while = capturing data in sample
( in `perf_reg_value` )

Thanks
Athira


};
+
+#define PERF_REG_PMU_MASK ((1ULL = << PERF_REG_POWERPC_MAX) - 1)
+
+/* = PERF_REG_EXTENDED_MASK value for CPU_FTR_ARCH_300 */
+#define PERF_REG_PMU_MASK_300   (((1ULL << = (PERF_REG_POWERPC_MMCR2 + 1)) - 1) \
+ - = PERF_REG_PMU_MASK)
+
#endif /* = _UAPI_ASM_POWERPC_PERF_REGS_H */
diff --git = a/tools/perf/arch/powerpc/include/perf_regs.h = b/tools/perf/arch/powerpc/include/perf_regs.h
index = e18a355..46ed00d 100644
--- = a/tools/perf/arch/powerpc/include/perf_regs.h
+++ = b/tools/perf/arch/powerpc/include/perf_regs.h
@@ -64,7 = +64,10 @@
[PERF_REG_POWERPC_DAR] =3D = "dar",
[PERF_REG_POWERPC_DSISR] =3D "dsisr",
= [PERF_REG_POWERPC_SIER] =3D "sier",
- = [PERF_REG_POWERPC_MMCRA] =3D "mmcra"
+ = [PERF_REG_POWERPC_MMCRA] =3D "mmcra",
+ = [PERF_REG_POWERPC_MMCR0] =3D "mmcr0",
+ = [PERF_REG_POWERPC_MMCR1] =3D "mmcr1",
+ = [PERF_REG_POWERPC_MMCR2] =3D "mmcr2",
};

static inline const char *perf_reg_name(int = id)
diff --git a/tools/perf/arch/powerpc/util/perf_regs.c = b/tools/perf/arch/powerpc/util/perf_regs.c
index = 0a52429..9179230 100644
--- = a/tools/perf/arch/powerpc/util/perf_regs.c
+++ = b/tools/perf/arch/powerpc/util/perf_regs.c
@@ -6,9 +6,14 = @@

#include "../../../util/perf_regs.h"
#include "../../../util/debug.h"
+#include = "../../../util/event.h"
+#include = "../../../util/header.h"
+#include = "../../../perf-sys.h"

#include = <linux/kernel.h>

+#define = PVR_POWER9 = = 0x004E
+
const struct sample_reg = sample_reg_masks[] =3D {
SMPL_REG(r0, = PERF_REG_POWERPC_R0),
SMPL_REG(r1, = PERF_REG_POWERPC_R1),
@@ -55,6 +60,9 @@
= SMPL_REG(dsisr, PERF_REG_POWERPC_DSISR),
= SMPL_REG(sier, PERF_REG_POWERPC_SIER),
= SMPL_REG(mmcra, PERF_REG_POWERPC_MMCRA),
+ = SMPL_REG(mmcr0, PERF_REG_POWERPC_MMCR0),
+ = SMPL_REG(mmcr1, PERF_REG_POWERPC_MMCR1),
+ = SMPL_REG(mmcr2, PERF_REG_POWERPC_MMCR2),
= SMPL_REG_END
};

@@ = -163,3 +171,50 @@ int arch_sdt_arg_parse_op(char *old_op, char = **new_op)

return SDT_ARG_VALID;
}
+
+uint64_t = arch__intr_reg_mask(void)
+{
+ struct = perf_event_attr attr =3D {
+ .type =             &n= bsp;     =3D PERF_TYPE_HARDWARE,
+ = = .config =             &n= bsp;   =3D PERF_COUNT_HW_CPU_CYCLES,
+ = .sample_type =            =3D = PERF_SAMPLE_REGS_INTR,
+ .precise_ip =             =3D= 1,
+ .disabled =             &n= bsp; =3D 1,
+ .exclude_kernel =         =3D 1,
+ = };
+ int fd, ret;
+ char = buffer[64];
+ u32 version;
+ u64 = extended_mask =3D 0;
+
+ /* Get = the PVR value to set the extended
+  * mask specific to = platform

Comment format is wrong, and punctuation please.

+  */
+ = get_cpuid(buffer, sizeof(buffer));
+ ret =3D = sscanf(buffer, "%u,", &version);

This is powerpc specific code, = why not just use mfspr(SPRN_PVR), rather
than redirecting via printf/sscanf.

+
+ = if (ret !=3D 1) {
+ pr_debug("Failed to get the = processor version, unable to output extended registers\n");
+ = = return PERF_REGS_MASK;
+ }
+
+ if (version =3D=3D PVR_POWER9)
+ = = extended_mask =3D PERF_REG_PMU_MASK_300;
+ else
+ = = return PERF_REGS_MASK;
+
+ = attr.sample_regs_intr =3D extended_mask;
+ = attr.sample_period =3D 1;
+ = event_attr_init(&attr);
+
+ /*
+ =  * check if the = pmu supports perf extended regs, before
+  * returning the register = mask to sample.
+  */
+ fd =3D = sys_perf_event_open(&attr, 0, -1, -1, 0);
+ if (fd !=3D= -1) {
+ close(fd);
+ return (extended_mask | = PERF_REGS_MASK);
+ }
+ return = PERF_REGS_MASK;

I think this would read a bit better like:

mask =3D = PERF_REGS_MASK;

= if (version = =3D=3D PVR_POWER9)
= extended_mask = =3D PERF_REG_PMU_MASK_300;
       else
        return = mask;

       attr.sample_regs_intr= =3D extended_mask;
       attr.sample_period = =3D 1;
       event_attr_init(&= attr);

       /*
         * check = if the pmu supports perf extended regs, before
         * = returning the register mask to sample.
         */=
       fd =3D = sys_perf_event_open(&attr, 0, -1, -1, 0);
       if (fd !=3D -1) = {
          &nb= sp;    close(fd);
          &nb= sp;    mask |=3D extended_mask;
       }

return = mask;


cheers

= --Apple-Mail=_8D00A3FD-A5AD-46AB-917F-8C89296A1ADF--