RE: [PATCH v4 4/4] perf/smmuv3_pmu: Enable HiSilicon Erratum 162001800 quirk

From: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
To: Robin Murphy <robin.murphy@arm.com>,
	"lorenzo.pieralisi@arm.com" <lorenzo.pieralisi@arm.com>,
	"jean-philippe.brucker@arm.com" <jean-philippe.brucker@arm.com>
Cc: "mark.rutland@arm.com" <mark.rutland@arm.com>,
	"vkilari@codeaurora.org" <vkilari@codeaurora.org>,
	"neil.m.leeder@gmail.com" <neil.m.leeder@gmail.com>,
	"pabba@codeaurora.org" <pabba@codeaurora.org>,
	"will.deacon@arm.com" <will.deacon@arm.com>,
	"rruigrok@codeaurora.org" <rruigrok@codeaurora.org>,
	Linuxarm <linuxarm@huawei.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-acpi@vger.kernel.org" <linux-acpi@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>
Subject: RE: [PATCH v4 4/4] perf/smmuv3_pmu: Enable HiSilicon Erratum 162001800 quirk
Date: Tue, 27 Nov 2018 13:23:16 +0000	[thread overview]
Message-ID: <5FC3163CFD30C246ABAA99954A238FA83885EDA7@FRAEML521-MBB.china.huawei.com> (raw)
In-Reply-To: <f8e469c3-7ec9-551e-d5bd-6f14b02a8f3f@arm.com>

> -----Original Message-----
> From: Robin Murphy [mailto:robin.murphy@arm.com]
> Sent: 26 November 2018 18:45
> To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>;
> lorenzo.pieralisi@arm.com; jean-philippe.brucker@arm.com
> Cc: mark.rutland@arm.com; vkilari@codeaurora.org;
> neil.m.leeder@gmail.com; pabba@codeaurora.org; will.deacon@arm.com;
> rruigrok@codeaurora.org; Linuxarm <linuxarm@huawei.com>; linux-
> kernel@vger.kernel.org; linux-acpi@vger.kernel.org; linux-arm-
> kernel@lists.infradead.org
> Subject: Re: [PATCH v4 4/4] perf/smmuv3_pmu: Enable HiSilicon Erratum
> 162001800 quirk
> 
> Hi Shameer,
> 
> Sorry for the delay...
> 
> On 18/10/2018 16:27, Shameerali Kolothum Thodi wrote:
> >
> >
> >> -----Original Message-----
> >> From: Linuxarm [mailto:linuxarm-bounces@huawei.com] On Behalf Of
> >> Shameerali Kolothum Thodi
> >> Sent: 18 October 2018 14:34
> >> To: Robin Murphy <robin.murphy@arm.com>; lorenzo.pieralisi@arm.com;
> >> jean-philippe.brucker@arm.com
> >> Cc: mark.rutland@arm.com; vkilari@codeaurora.org;
> >> neil.m.leeder@gmail.com; pabba@codeaurora.org; will.deacon@arm.com;
> >> rruigrok@codeaurora.org; Linuxarm <linuxarm@huawei.com>; linux-
> >> kernel@vger.kernel.org; linux-acpi@vger.kernel.org; linux-arm-
> >> kernel@lists.infradead.org
> >> Subject: RE: [PATCH v4 4/4] perf/smmuv3_pmu: Enable HiSilicon Erratum
> >> 162001800 quirk
> >>
> >> Hi Robin,
> >>
> >>> -----Original Message-----
> >>> From: Robin Murphy [mailto:robin.murphy@arm.com]
> >>> Sent: 18 October 2018 12:44
> >>> To: Shameerali Kolothum Thodi
> <shameerali.kolothum.thodi@huawei.com>;
> >>> lorenzo.pieralisi@arm.com; jean-philippe.brucker@arm.com
> >>> Cc: will.deacon@arm.com; mark.rutland@arm.com; Guohanjun (Hanjun
> Guo)
> >>> <guohanjun@huawei.com>; John Garry <john.garry@huawei.com>;
> >>> pabba@codeaurora.org; vkilari@codeaurora.org;
> rruigrok@codeaurora.org;
> >>> linux-acpi@vger.kernel.org; linux-kernel@vger.kernel.org; linux-arm-
> >>> kernel@lists.infradead.org; Linuxarm <linuxarm@huawei.com>;
> >>> neil.m.leeder@gmail.com
> >>> Subject: Re: [PATCH v4 4/4] perf/smmuv3_pmu: Enable HiSilicon Erratum
> >>> 162001800 quirk
> >
> > [...]
> >
> >>>> +static const struct smmu_pmu_erratum_wa smmu_pmu_wa[] = {
> >>>> +	{
> >>>> +		.match_type = se_match_acpi_oem,
> >>>> +		.id = hisi_162001800_oem_info,
> >>>> +		.desc_str = "HiSilicon erratum 162001800",
> >>>> +		.enable = hisi_erratum_evcntr_rdonly,
> >>>> +	},
> >>>> +};
> >>>> +
> >>>
> >>> There's an awful lot of raw ACPI internals splashed about here -
> >>> couldn't at least some of it be abstracted behind the IORT code? In
> >>> fact, can't IORT just set all this stuff up in advance like it does for
> >>> SMMUs?
> >>
> >> Hmmm.. Sorry, not clear to me. You mean to say associate the IORT node
> >> with platform device and retrieve it in driver just like smmu does for
> >> "model" checks? Not sure that works here if that’s what the above meant.
> 
> I don't think there's much of interest in the actual IORT node itself,
> but I can't see that there would be any particular problem with passing
> either some implementation identifier or just a ready-made set of quirk
> flags to the PMCG driver via platdata.

Ok.

> >>>>    #define to_smmu_pmu(p) (container_of(p, struct smmu_pmu, pmu))
> >>>>
> >>>>    #define SMMU_PMU_EVENT_ATTR_EXTRACTOR(_name, _config,
> _start,
> >>> _end)        \
> >>>> @@ -224,15 +271,20 @@ static void smmu_pmu_set_period(struct
> >>> smmu_pmu *smmu_pmu,
> >>>>    	u32 idx = hwc->idx;
> >>>>    	u64 new;
> >>>>
> >>>> -	/*
> >>>> -	 * We limit the max period to half the max counter value of the
> >>> counter
> >>>> -	 * size, so that even in the case of extreme interrupt latency the
> >>>> -	 * counter will (hopefully) not wrap past its initial value.
> >>>> -	 */
> >>>> -	new = smmu_pmu->counter_mask >> 1;
> >>>> +	if (smmu_pmu->options & SMMU_PMU_OPT_EVCNTR_RDONLY) {
> >>>> +		new = smmu_pmu_counter_get_value(smmu_pmu, idx);
> >>>
> >>> Something's clearly missing, because if this happens to start at 0, the
> >>> current overflow handling code cannot possibly give the correct count.
> >>> Much as I hate the reset-to-half-period idiom for being impossible to
> >>> make sense of, it does make various aspects appear a lot simpler than
> >>> they really are. Wait, maybe that's yet another reason to hate it...
> >>
> >> Yes,  if the counter starts at 0 and overflow happens, it won't possibly give
> >> the correct count compared to the reset-to-half-period logic. Since this is a
> >> 64 bit counter, just hope that, it won't necessarily happen that often.
> 
> OK, if the full 64 counter bits are implemented, then I suppose we're
> probably OK to assume nobody's going to run a single profiling session
> over 4+ years or so. It might be worth a comment just to remind
> ourselves that we're (currently) relying on the counter size to mostly
> mitigate overflow-related issues in this case.

Sure, I will add a comment to make it clear.

> >
> > [...]
> >
> >>>> +static void smmu_pmu_enable_errata(struct smmu_pmu *smmu_pmu,
> >>>> +				enum smmu_pmu_erratum_match_type type,
> >>>> +				se_match_fn_t match_fn,
> >>>> +				void *arg)
> >>>> +{
> >>>> +	const struct smmu_pmu_erratum_wa *wa = smmu_pmu_wa;
> >>>> +
> >>>> +	for (; wa->desc_str; wa++) {
> >>>> +		if (wa->match_type != type)
> >>>> +			continue;
> >>>> +
> >>>> +		if (match_fn(wa, arg)) {
> >>>> +			if (wa->enable) {
> >>>> +				wa->enable(smmu_pmu);
> >>>> +				dev_info(smmu_pmu->dev,
> >>>> +					"Enabling workaround for %s\n",
> >>>> +					 wa->desc_str);
> >>>> +			}
> >>>
> >>> Just how many kinds of broken are we expecting here? Is this lifted from
> >>> the arm64 cpufeature framework, because it seems like absolute overkill
> >>> for a simple PMU driver which in all reality is only ever going to
> >>> wiggle a few flags in some data structure.
> >>
> >> Yes, this erratum framework is based on the arm_arch_timer code. Agree
> that
> >> this is an overkill if it is just to support this hardware. I am not sure this can
> be
> >> extended to add the IMPLEMENTATION DEFINED events in future(I haven't
> >> looked into that now). If this is not that useful in the near future, I will
> remove
> >> the
> >> framework part and use the OEM info directly to set the flag. Please let me
> >> know
> >> your thoughts..
> >
> > Below is another take on this patch. Please let me know if this makes any
> sense..
> >
> > Thanks,
> > Shameer
> >
> > ----8----
> > diff --git a/drivers/perf/arm_smmuv3_pmu.c
> b/drivers/perf/arm_smmuv3_pmu.c
> > index ef94b90..6f81b94 100644
> > --- a/drivers/perf/arm_smmuv3_pmu.c
> > +++ b/drivers/perf/arm_smmuv3_pmu.c
> > @@ -96,6 +96,8 @@
> >
> >   #define SMMU_PA_SHIFT                   12
> >
> > +#define SMMU_PMU_OPT_EVCNTR_RDONLY	(1 << 0)
> > +
> >   static int cpuhp_state_num;
> >
> >   struct smmu_pmu {
> > @@ -111,10 +113,38 @@ struct smmu_pmu {
> >   	struct device *dev;
> >   	void __iomem *reg_base;
> >   	void __iomem *reloc_base;
> > +	u32 options;
> >   	u64 counter_present_mask;
> >   	u64 counter_mask;
> >   };
> >
> > +struct erratum_acpi_oem_info {
> > +	char oem_id[ACPI_OEM_ID_SIZE + 1];
> > +	char oem_table_id[ACPI_OEM_TABLE_ID_SIZE + 1];
> > +	u32 oem_revision;
> > +	void (*enable)(struct smmu_pmu *smmu_pmu);
> > +};
> > +
> > +void hisi_erratum_evcntr_rdonly(struct smmu_pmu *smmu_pmu)
> > +{
> > +	smmu_pmu->options |= SMMU_PMU_OPT_EVCNTR_RDONLY;
> > +	dev_info(smmu_pmu->dev, "Enabling HiSilicon erratum
> 162001800\n");
> > +}
> > +
> > +static struct erratum_acpi_oem_info acpi_oem_info[] = {
> > +	/*
> > +	 * Note that trailing spaces are required to properly match
> > +	 * the OEM table information.
> > +	 */
> > +	{
> > +		.oem_id         = "HISI  ",
> > +		.oem_table_id   = "HIP08   ",
> > +		.oem_revision   = 0,
> > +		.enable = hisi_erratum_evcntr_rdonly,
> > +	},
> > +	{ /* Sentinel indicating the end of the OEM array */ },
> > +};
> > +
> >   #define to_smmu_pmu(p) (container_of(p, struct smmu_pmu, pmu))
> >
> >   #define SMMU_PMU_EVENT_ATTR_EXTRACTOR(_name, _config, _start,
> _end)        \
> > @@ -224,15 +254,20 @@ static void smmu_pmu_set_period(struct
> smmu_pmu *smmu_pmu,
> >   	u32 idx = hwc->idx;
> >   	u64 new;
> >
> > -	/*
> > -	 * We limit the max period to half the max counter value of the
> counter
> > -	 * size, so that even in the case of extreme interrupt latency the
> > -	 * counter will (hopefully) not wrap past its initial value.
> > -	 */
> > -	new = smmu_pmu->counter_mask >> 1;
> > +	if (smmu_pmu->options & SMMU_PMU_OPT_EVCNTR_RDONLY) {
> > +		new = smmu_pmu_counter_get_value(smmu_pmu, idx);
> > +	} else {
> > +		/*
> > +		 * We limit the max period to half the max counter value
> > +		 * of the counter size, so that even in the case of extreme
> > +		 * interrupt latency the counter will (hopefully) not wrap
> > +		 * past its initial value.
> > +		 */
> > +		new = smmu_pmu->counter_mask >> 1;
> > +		smmu_pmu_counter_set_value(smmu_pmu, idx, new);
> > +	}
> >
> >   	local64_set(&hwc->prev_count, new);
> > -	smmu_pmu_counter_set_value(smmu_pmu, idx, new);
> >   }
> >
> >   static void smmu_pmu_get_event_filter(struct perf_event *event, u32
> *span,
> > @@ -670,6 +705,28 @@ static void smmu_pmu_reset(struct smmu_pmu
> *smmu_pmu)
> >   		       smmu_pmu->reloc_base + SMMU_PMCG_OVSCLR0);
> >   }
> >
> > +static void smmu_pmu_check_acpi_workarounds(struct smmu_pmu
> *smmu_pmu)
> > +{
> > +	static const struct erratum_acpi_oem_info empty_oem_info = {};
> > +	const struct erratum_acpi_oem_info *info = acpi_oem_info;
> > +	struct acpi_table_header *hdr;
> > +
> > +	if (ACPI_FAILURE(acpi_get_table(ACPI_SIG_IORT, 0, &hdr))) {
> > +		dev_err(smmu_pmu->dev, "failed to get IORT\n");
> > +		return;
> > +	}
> > +
> > +	/* Iterate over the ACPI OEM info array, looking for a match */
> > +	while (memcmp(info, &empty_oem_info, sizeof(*info))) {
> > +		if (!memcmp(info->oem_id, hdr->oem_id, ACPI_OEM_ID_SIZE)
> &&
> > +		    !memcmp(info->oem_table_id, hdr->oem_table_id,
> ACPI_OEM_TABLE_ID_SIZE) &&
> > +			info->oem_revision == hdr->oem_revision)
> > +			info->enable(smmu_pmu);
> > +
> > +		info++;
> > +	}
> > +}
> 
> FWIW, this looks awfully like acpi_match_platform_list()...
> 
> However, I still think that any parsing of IORT fields belongs in
> iort.c, not in every driver which might ever need to detect a quirk. For
> starters, that code has iort_table to hand, full knowledge of all the
> other identifiable components, and a already contains a bunch of
> system-specific quirk detection which could potentially be shared.

Ok. I will take a look into moving this into IORT code and sharing
through platform data.

> [ side note: do you know if 1620 still has the same ITS quirk as 161x,
> or is it just the SMMU's MSI output that didn't get updated? ]

We don’t have the MSI reserve region issue for 1620. But I think it still
uses the upper 4 bytes of GITS_TRANSLATER and requires the patch from
Lei Zhen that takes care of sync_count overwrite memory corruption issue.

Thanks,
Shameer

> 
> > +
> >   static int smmu_pmu_probe(struct platform_device *pdev)
> >   {
> >   	struct smmu_pmu *smmu_pmu;
> > @@ -749,6 +806,8 @@ static int smmu_pmu_probe(struct platform_device
> *pdev)
> >   		return -EINVAL;
> >   	}
> >
> > +	smmu_pmu_check_acpi_workarounds(smmu_pmu);
> > +
> >   	/* Pick one CPU to be the preferred one to use */
> >   	smmu_pmu->on_cpu = get_cpu();
> >   	WARN_ON(irq_set_affinity(smmu_pmu->irq, cpumask_of(smmu_pmu-
> >on_cpu)));
> >