From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <657d2bba-23af-fa74-3efe-cd7558b47ff7@arm.com>
Date: Wed, 11 May 2022 11:03:38 +0100
Subject: Re: [PATCH 1/2] perf: coresight_pmu: Add support for ARM CoreSight PMU driver
To: Besar Wicaksono
Cc: catalin.marinas@arm.com, will@kernel.org, mark.rutland@arm.com,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-tegra@vger.kernel.org, sudeep.holla@arm.com,
 thanu.rangarajan@arm.com, Michael.Williams@arm.com,
 suzuki.poulose@arm.com, Thierry Reding, Jonathan Hunter, Vikram Sethi
References: <20220509002810.12412-1-bwicaksono@nvidia.com> <20220509002810.12412-2-bwicaksono@nvidia.com>
 <756ac2c8-6530-03b0-53d3-ee7493509579@arm.com>
From: Robin Murphy
Content-Type: text/plain; charset=UTF-8; format=flowed
List-ID: linux-tegra@vger.kernel.org

On 2022-05-11 03:46, Besar Wicaksono wrote:
[...]
>>> +config ARM_CORESIGHT_PMU
>>> +	tristate "ARM Coresight PMU"
>>> +	depends on ARM64 && ACPI_APMT
>>
>> There shouldn't be any functional dependency on any CPU architecture
>> here.
>
> The spec is targeted towards ARM based system, shouldn't we explicitly limit it to ARM?

I wouldn't say so. The PMU spec does occasionally make reference to the
Armv8-A and Armv8-M PMU architectures for comparison, but ultimately
it's specifying an MMIO register interface for a system component. If
3rd-party system IP vendors adopt it, who knows what kind of systems
these PMUs might end up in? (And of course a DT binding will inevitably
come along once the rest of the market catches up with the ACPI-focused
early adopters)

In terms of functional dependency plus scope of practical usefulness, I
think something like:

	depends on ACPI
	depends on ACPI_APMT || COMPILE_TEST

would probably fit the bill until DT support comes along.

[...]

>>> +/*
>>> + * Write to 64-bit register as a pair of 32-bit registers.
>>> + *
>>> + * @val : 64-bit value to write.
>>> + * @base : base address of page-0 or page-1 if dual-page ext. is enabled.
>>> + * @offset : register offset.
>>> + *
>>> + */
>>> +static void write_reg64_lohi(u64 val, void __iomem *base, u32 offset)
>>> +{
>>> +	u32 val_lo, val_hi;
>>> +
>>> +	val_hi = upper_32_bits(val);
>>> +	val_lo = lower_32_bits(val);
>>> +
>>> +	write_reg32(val_lo, base, offset);
>>> +	write_reg32(val_hi, base, offset + 4);
>>> +}
>>
>> #include
>
> Thanks for pointing this out. We will replace it with lo_hi_writeq.

The point is more that you can just use writeq() (and readq() where
atomicity isn't important), and the header will make sure it works
wherever.
The significance of not having 64-bit single-copy atomicity should be
that if the processor issues a 64-bit access, the system may
*automatically* split it into a pair of 32-bit accesses, e.g. at an
AXI-to-APB bridge. If making a 64-bit access to a 64-bit register would
actually fail, that's just broken.

[...]

>>> +static inline bool is_cycle_cntr_idx(const struct perf_event *event)
>>> +{
>>> +	struct coresight_pmu *coresight_pmu = to_coresight_pmu(event->pmu);
>>> +	int idx = event->hw.idx;
>>> +
>>> +	return (support_cc(coresight_pmu) && idx == CORESIGHT_PMU_IDX_CCNTR);
>>
>> If we don't support cycle counting, cycles count events should have
>> been rejected in event_init. If they're able to propagate further than
>> that, then something is fundamentally broken.
>
> Not sure I understand, do you mean the check for cycle counter support is unnecessary ?
> This function is actually called by coresight_pmu_start, which is after event_init had passed.
> coresight_pmu_start is not aware if cycle counter is supported or not, so we need to keep checking it.

I mean that the support_cc(coresight_pmu) check should only ever need to
happen *once*, in event_init, so if standard cycles events are not
supported then they are correctly rejected there and then. After that,
if we see one in event_add and later, then we can simply infer that we
*do* have a standard cycle counter and go ahead and allocate it.

>>> +}
>>> +
>>> +bool coresight_pmu_is_cc_event(const struct perf_event *event)
>>> +{
>>> +	struct coresight_pmu *coresight_pmu = to_coresight_pmu(event->pmu);
>>> +	u32 evtype = coresight_pmu->impl.ops->event_type(event);
>>> +
>>> +	return (support_cc(coresight_pmu) &&
>>
>> Ditto.
>
> This function is called by event_init to validate the event and find available counters.

Right, but it also ends up getting called from other places like
event_add as well.
Like I say, if we're still checking whether an event is supported or
not by that point, we're doing something wrong.

[...]

>>> +/**
>>> + * This is the default event number for cycle count, if supported, since the
>>> + * ARM Coresight PMU specification does not define a standard event code
>>> + * for cycle count.
>>> + */
>>> +#define CORESIGHT_PMU_EVT_CYCLES_DEFAULT	(0x1ULL << 31)
>>
>> And what do we do when an implementation defines 0x80000000 as one of
>> its own event specifiers? The standard cycle count is independent of
>> any other events, so it needs to be encoded in a manner which is
>> distinct from *any* potentially-valid PMEVTYPER value.
>
> We were thinking that in such case, the implementor would provide coresight_pmu_impl_ops.
> To avoid it, I guess we can use config[32] for the default cycle count event id.
> The filter value will need to be moved to config1[31:0].
> Does it sound reasonable ?

Sure, you can lay out the config fields however you fancy, but since
the architecture leaves the standard cycles event independent from the
32-bit IMP-DEF PMEVTYPER specifier, logically we need at least 33 bits
in some form or other to encode all possible event types in our
perf_event config.

Thanks,
Robin.