From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA2ACC48BE5 for ; Tue, 15 Jun 2021 16:46:05 +0000 (UTC) Received: from bombadil.infradead.org (unknown [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 910F76191A for ; Tue, 15 Jun 2021 16:46:05 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 910F76191A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=huawei.com Authentication-Results: mail.kernel.org; spf=none smtp.mailfrom=linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:Content-Type: Content-Transfer-Encoding:List-Subscribe:List-Help:List-Post:List-Archive: List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:Date:Message-ID:From: References:CC:To:Subject:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=lmaUKQFOKjcgE29hzRNitSh0zW4DjeUXoGD8CTdJvPA=; b=zD2qLvOj5M5rYJ3JMQao0ZUaKe PB1FIHPJ5BI7f+YbJATCUgp3rY6TsL683ibRla/uBKx4pueuoeQwHg5Xcnpc2frgIXxELbnRIErZX 1OiNqm3xXqi8kO/4uOufUEfEWu+YSvonso8sAJwpBvWSQP6NdftAcgrTKxgo1YpCLYVOSIvGY39xn DyYA3zwTAEAQR6urNN8XZQYM/MqR8m+jrg5q1cHXQss2si+u+n8EcyNW0vz56DzD29D//zXJjbpDe uvtMgq/5jcZbHYfTF1OUlaE5b0PyIuQwoe01NX4UF9mVy1Vr3t5XK0z39SBm+97rsyjjVROm/k9wy mEfriwhg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1ltC5I-001Q71-DO; Tue, 15 Jun 2021 16:38:42 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1lt9Ic-000E8e-66 for linux-arm-kernel@bombadil.infradead.org; Tue, 15 Jun 2021 13:40:14 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Transfer-Encoding:Content-Type :In-Reply-To:MIME-Version:Date:Message-ID:From:References:CC:To:Subject: Sender:Reply-To:Content-ID:Content-Description; bh=kLmJ89w7jejNHo8GxOuNGL9R3T38U4VV15U1a+g40G4=; b=jntCwqOJ4qYU9XsdnngNhQFDyp xdeSJopPEapCk1kqY3PbWhQE850aWzLt0ScnKbs1YcXm0jYNYKkkajsc2BZAx/WRvwExDMa5oY82S y23jAEPiopYvU9MjvdOaTMgFyinNfNBo38lxwPOT6h8kFk0ONAyoTL02PMf7I14Ou/5CFYt+nJC27 Sdan/sD3RI0+qVUsHG57dC3NfTuoX9tM0blxE3j0C4eaRNxURj864Y3ePDqdnUIwZeCsha7E18fwb 2xUqY51zvmD+s1JhD3I5uW2pgcac6tv5TwF2z4nyaco6aBub6HX9roSoxKMfizQ6reY7Aixc/+qri ACv/m/ag==; Received: from szxga01-in.huawei.com ([45.249.212.187]) by desiato.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1lt4sq-007XKT-Qv for linux-arm-kernel@lists.infradead.org; Tue, 15 Jun 2021 08:57:30 +0000 Received: from dggemv703-chm.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4G42BM0sBNzXfwW; Tue, 15 Jun 2021 16:52:11 +0800 (CST) Received: from dggema757-chm.china.huawei.com (10.1.198.199) by dggemv703-chm.china.huawei.com (10.3.19.46) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256) id 15.1.2176.2; Tue, 15 Jun 2021 16:57:11 +0800 Received: from [127.0.0.1] (10.69.38.203) by dggema757-chm.china.huawei.com (10.1.198.199) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2176.2; Tue, 15 Jun 2021 16:57:10 +0800 Subject: Re: [PATCH v6 2/2] drivers/perf: hisi: Add driver for HiSilicon PCIe PMU To: Will Deacon CC: , , , , , , References: <1622467951-32114-1-git-send-email-liuqi115@huawei.com> <1622467951-32114-3-git-send-email-liuqi115@huawei.com> <20210611162347.GA16284@willie-the-truck> From: "liuqi (BA)" Message-ID: Date: Tue, 15 Jun 2021 16:57:09 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.6.0 MIME-Version: 1.0 In-Reply-To: <20210611162347.GA16284@willie-the-truck> Content-Language: en-GB X-Originating-IP: [10.69.38.203] X-ClientProxiedBy: dggems701-chm.china.huawei.com (10.3.19.178) To dggema757-chm.china.huawei.com (10.1.198.199) X-CFilter-Loop: Reflected X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20210615_095728_364906_3B7DFC69 X-CRM114-Status: GOOD ( 36.20 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset="us-ascii"; Format="flowed" Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Hi Will, Thanks for your reviewing. On 2021/6/12 0:23, Will Deacon wrote: > On Mon, May 31, 2021 at 09:32:31PM +0800, Qi Liu wrote: >> PCIe PMU Root Complex Integrated End Point(RCiEP) device is supported >> to sample bandwidth, latency, buffer occupation etc. >> >> Each PMU RCiEP device monitors multiple Root Ports, and each RCiEP is >> registered as a PMU in /sys/bus/event_source/devices, so users can >> select target PMU, and use filter to do further sets. >> >> Filtering options contains: >> event - select the event. >> subevent - select the subevent. >> port - select target Root Ports. Information of Root Ports >> are shown under sysfs. >> bdf - select requester_id of target EP device. >> trig_len - set trigger condition for starting event statistics. >> trigger_mode - set trigger mode. 0 means starting to statistic when >> bigger than trigger condition, and 1 means smaller. >> thr_len - set threshold for statistics. >> thr_mode - set threshold mode. 0 means count when bigger than >> threshold, and 1 means smaller. >> >> Reviewed-by: John Garry >> Signed-off-by: Qi Liu >> --- >> MAINTAINERS | 6 + >> drivers/perf/Kconfig | 2 + >> drivers/perf/Makefile | 1 + >> drivers/perf/pci/Kconfig | 16 + >> drivers/perf/pci/Makefile | 2 + >> drivers/perf/pci/hisilicon/Makefile | 3 + >> drivers/perf/pci/hisilicon/hisi_pcie_pmu.c | 1019 ++++++++++++++++++++++++++++ > > Can we keep this under drivers/perf/hisilicon/ please? I don't see the > need to create a 'pci' directory here. > So how about drivers/perf/hisilicon/pci? as hisi_pcie_pmu.c do not use hisi_uncore_pmu framework. thanks >> include/linux/cpuhotplug.h | 1 + >> 8 files changed, 1050 insertions(+) >> create mode 100644 drivers/perf/pci/Kconfig >> create mode 100644 drivers/perf/pci/Makefile >> create mode 100644 drivers/perf/pci/hisilicon/Makefile >> create mode 100644 drivers/perf/pci/hisilicon/hisi_pcie_pmu.c >> [...] >> + >> +#define HISI_PCIE_PMU_ATTR(_name, _func, _config) \ >> + (&((struct dev_ext_attribute[]) { \ >> + { __ATTR(_name, 0444, _func, NULL), (void *)_config } \ >> + })[0].attr.attr) > > If you rebase onto my patch queue, then you can use PMU_EVENT_ATTR_ID to > define this. > ok, will fix this, thanks. >> +#define HISI_PCIE_PMU_FORMAT_ATTR(_name, _format) \ >> + HISI_PCIE_PMU_ATTR(_name, hisi_pcie_format_sysfs_show, (void *)_format) >> +#define HISI_PCIE_PMU_EVENT_ATTR(_name, _event) \ >> + HISI_PCIE_PMU_ATTR(_name, hisi_pcie_event_sysfs_show, (void *)_event) >> + >> +static ssize_t hisi_pcie_cpumask_show(struct device *dev, >> + struct device_attribute *attr, char *buf) >> +{ >> + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev)); >> + >> + return sysfs_emit(buf, "%d\n", pcie_pmu->on_cpu); >> +} > > This isn't a cpumask. > got it, I'll use cpumask_of(pcie_pmu->on_cpu) next time, thanks. >> + >> +static ssize_t hisi_pcie_identifier_show(struct device *dev, >> + struct device_attribute *attr, >> + char *buf) >> +{ >> + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev)); >> + >> + return sysfs_emit(buf, "0x%x\n", pcie_pmu->identifier); >> +} >> + >> +static ssize_t hisi_pcie_bus_show(struct device *dev, >> + struct device_attribute *attr, char *buf) >> +{ >> + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(dev_get_drvdata(dev)); >> + >> + return sysfs_emit(buf, "0x%02x\n", PCI_BUS_NUM(pcie_pmu->bdf_min)); >> +} >> + >> +static void hisi_pcie_parse_reg_value(struct hisi_pcie_pmu *pcie_pmu, >> + u32 reg_off, u16 *arg0, u16 *arg1) >> +{ >> + u32 val = readl(pcie_pmu->base + reg_off); >> + >> + *arg0 = val & 0xffff; >> + *arg1 = (val & 0xffff0000) >> 16; >> +} > > Define a new type for the pair of values and return that directly? > Sorry, I'm not sure about how to fix this, do you mean add a union like this? union reg_val { struct { u16 arg0; u16 arg1; } u32 val; } [...] >> +static void hisi_pcie_pmu_writeq(struct hisi_pcie_pmu *pcie_pmu, u32 reg_offset, >> + u32 idx, u64 val) >> +{ >> + u32 offset = hisi_pcie_pmu_get_offset(reg_offset, idx); >> + >> + writeq(val, pcie_pmu->base + offset); >> +} > > I'm guessing most (all?) of these IO access can be _relaxed() ? > ok, will change this. >> + >> +static void hisi_pcie_pmu_config_filter(struct perf_event *event) >> +{ >> + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); >> + struct hw_perf_event *hwc = &event->hw; >> + u64 reg = HISI_PCIE_DEFAULT_SET; >> + u64 port, trig_len, thr_len; >> + u32 idx = hwc->idx; >> + >> + /* Config HISI_PCIE_EVENT_CTRL according to event and subevent. */ >> + reg |= FIELD_PREP(HISI_PCIE_EVENT_M, hisi_pcie_get_event(event)) | >> + FIELD_PREP(HISI_PCIE_SUBEVENT_M, hisi_pcie_get_subevent(event)); >> + >> + /* Config HISI_PCIE_EVENT_CTRL according to ROOT PORT or EP device. */ >> + port = hisi_pcie_get_port(event); >> + if (port) >> + reg |= FIELD_PREP(HISI_PCIE_TARGET_M, port); >> + else >> + reg |= HISI_PCIE_TARGET_EN | >> + FIELD_PREP(HISI_PCIE_TARGET_M, hisi_pcie_get_bdf(event)); > > Please use braces for multi-line conditional expressions (same elsewhere). > It is single-line here, this line is more than 80 words so wrap here. >> + >> + /* Config HISI_PCIE_EVENT_CTRL according to trigger condition. */ >> + trig_len = hisi_pcie_get_trig_len(event); >> + if (trig_len) >> + reg |= FIELD_PREP(HISI_PCIE_TRIG_M, trig_len) | >> + FIELD_PREP(HISI_PCIE_TRIG_MODE_M, >> + hisi_pcie_get_trig_mode(event)) | HISI_PCIE_TRIG_EN; > > The formatting is very weird here. > will fix this. >> + >> + /* Config HISI_PCIE_EVENT_CTRL according to threshold condition. */ >> + thr_len = hisi_pcie_get_thr_len(event); >> + if (thr_len) >> + reg |= FIELD_PREP(HISI_PCIE_THR_M, thr_len) | >> + FIELD_PREP(HISI_PCIE_THR_MODE_M, >> + hisi_pcie_get_thr_mode(event)) | HISI_PCIE_THR_EN; > > and here. > will fix this, thanks. [...] >> + >> +/* >> + * The bandwidth, latency, bus utilization and buffer occupancy features are >> + * calculated from data in HISI_PCIE_CNT and extended data in HISI_PCIE_EXT_CNT. >> + * Other features are obtained only by HISI_PCIE_CNT. >> + * So data and data_ext are processed in this function to get performanace >> + * value like, bandwidth, latency, etc. >> + */ >> +static u64 hisi_pcie_pmu_get_performance(struct perf_event *event, u64 data, >> + u64 data_ext) >> +{ >> +#define CONVERT_DW_TO_BYTE(x) (sizeof(u32) * (x)) > > I don't know what a "DW" is, but this macro adds nothing... DW means double words, and 1DW = 4Bytes, value in hardware counter means DW so I wanna change it into Byte. So how about using 4*data here and adding code comment to explain it. > >> + struct hisi_pcie_pmu *pcie_pmu = to_pcie_pmu(event->pmu); >> + u64 us_per_cycle = readl(pcie_pmu->base + HISI_PCIE_REG_FREQ); >> + u32 idx = hisi_pcie_get_event(event); >> + >> + if (!data_ext) >> + return 0; >> + >> + /* Process data to set unit of bandwidth as "Byte/ms". */ >> + if (is_bw_event(idx)) { >> + >> + if (!div64_u64(data_ext, 1000)) >> + return 0; >> + >> + return div64_u64(CONVERT_DW_TO_BYTE(data), > > ... especially as this is the only use of it. > > >> + div64_u64(data_ext, 1000)); >> + } >> + >> + /* Process data to set unit of latency as "us". */ >> + if (is_latency_event(idx)) >> + return div64_u64(data * us_per_cycle, data_ext); >> + >> + if (is_bus_util_event(idx)) >> + return div64_u64(data * us_per_cycle, data_ext); >> + >> + if (is_buf_util_event(idx)) >> + return div64_u64(data, data_ext * us_per_cycle); > > Why do we need to do all this division in the kernel? Can't we just expose > the underlying values and let userspace figure out what it wants to do with > the numbers? > > Will > Our PMU hardware support 8 sets of counters to count bandwidth, latency and utilization events. For example, when users set latency event, common counter will count delay cycles, and extern counter count number of PCIe packets automaticly. And we do not have a event number for counting number of PCIe packets. So this division cannot move to userspace tool. Thanks, Qi > . > _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel