From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 13363C433F5 for ; Tue, 10 May 2022 12:54:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242078AbiEJM62 (ORCPT ); Tue, 10 May 2022 08:58:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57252 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242012AbiEJM6I (ORCPT ); Tue, 10 May 2022 08:58:08 -0400 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 695742A8075; Tue, 10 May 2022 05:54:10 -0700 (PDT) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C65CE12FC; Tue, 10 May 2022 05:54:09 -0700 (PDT) Received: from [10.57.2.65] (unknown [10.57.2.65]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 53FEA3F73D; Tue, 10 May 2022 05:54:05 -0700 (PDT) Message-ID: <30ad6a53-1f4a-a88e-c239-fdd6d80ca4e1@arm.com> Date: Tue, 10 May 2022 13:54:03 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0 Subject: Re: [PATCH v7 2/7] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device Content-Language: en-US To: Yicong Yang , Leo Yan Cc: prime.zeng@huawei.com, liuqi115@huawei.com, zhangshaokun@hisilicon.com, linuxarm@huawei.com, gregkh@linuxfoundation.org, helgaas@kernel.org, alexander.shishkin@linux.intel.com, lorenzo.pieralisi@arm.com, will@kernel.org, mark.rutland@arm.com, mathieu.poirier@linaro.org, suzuki.poulose@arm.com, mike.leach@linaro.org, leo.yan@linaro.org, jonathan.cameron@huawei.com, daniel.thompson@linaro.org, joro@8bytes.org, john.garry@huawei.com, shameerali.kolothum.thodi@huawei.com, robin.murphy@arm.com, peterz@infradead.org, mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, coresight@lists.linaro.org, linux-pci@vger.kernel.org, linux-perf-users@vger.kernel.org, iommu@lists.linux-foundation.org References: <20220407125841.3678-1-yangyicong@hisilicon.com> <20220407125841.3678-3-yangyicong@hisilicon.com> From: James Clark In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/05/2022 12:18, Yicong Yang wrote: > On 2022/5/10 17:46, James Clark wrote: >> >> >> On 07/04/2022 13:58, Yicong Yang wrote: >>> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex integrated >>> Endpoint(RCiEP) device, providing the capability to dynamically monitor and >>> tune the PCIe traffic, and trace the TLP headers. >>> >>> Add the driver for the device to enable the trace function. Register PMU >>> device of PTT trace, then users can use trace through perf command. The >>> driver makes use of perf AUX trace and support following events to >>> configure the trace: >>> >>> - filter: select Root port or Endpoint to trace >>> - type: select the type of traced TLP headers >>> - direction: select the direction of traced TLP headers >>> - format: select the data format of the traced TLP headers >>> >>> This patch adds the driver part of PTT trace. The perf command support of >>> PTT trace is added in the following patch. >>> >>> Signed-off-by: Yicong Yang >>> Reviewed-by: Jonathan Cameron >>> --- >>> drivers/Makefile | 1 + >>> drivers/hwtracing/Kconfig | 2 + >>> drivers/hwtracing/ptt/Kconfig | 12 + >>> drivers/hwtracing/ptt/Makefile | 2 + >>> drivers/hwtracing/ptt/hisi_ptt.c | 874 +++++++++++++++++++++++++++++++ >>> drivers/hwtracing/ptt/hisi_ptt.h | 166 ++++++ >>> 6 files changed, 1057 insertions(+) >>> create mode 100644 drivers/hwtracing/ptt/Kconfig >>> create mode 100644 drivers/hwtracing/ptt/Makefile >>> create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c >>> create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h >>> >>> diff --git a/drivers/Makefile b/drivers/Makefile >>> index 020780b6b4d2..662d50599467 100644 >>> --- a/drivers/Makefile >>> +++ b/drivers/Makefile >>> @@ -175,6 +175,7 @@ obj-$(CONFIG_USB4) += thunderbolt/ >>> obj-$(CONFIG_CORESIGHT) += hwtracing/coresight/ >>> obj-y += hwtracing/intel_th/ >>> obj-$(CONFIG_STM) += hwtracing/stm/ >>> +obj-$(CONFIG_HISI_PTT) += hwtracing/ptt/ >>> obj-$(CONFIG_ANDROID) += android/ >>> obj-$(CONFIG_NVMEM) += nvmem/ >>> obj-$(CONFIG_FPGA) += fpga/ >>> diff --git a/drivers/hwtracing/Kconfig b/drivers/hwtracing/Kconfig >>> index 13085835a636..911ee977103c 100644 >>> --- a/drivers/hwtracing/Kconfig >>> +++ b/drivers/hwtracing/Kconfig >>> @@ -5,4 +5,6 @@ source "drivers/hwtracing/stm/Kconfig" >>> >>> source "drivers/hwtracing/intel_th/Kconfig" >>> >>> +source "drivers/hwtracing/ptt/Kconfig" >>> + >>> endmenu >>> diff --git a/drivers/hwtracing/ptt/Kconfig b/drivers/hwtracing/ptt/Kconfig >>> new file mode 100644 >>> index 000000000000..8902a6f27563 >>> --- /dev/null >>> +++ b/drivers/hwtracing/ptt/Kconfig >>> @@ -0,0 +1,12 @@ >>> +# SPDX-License-Identifier: GPL-2.0-only >>> +config HISI_PTT >>> + tristate "HiSilicon PCIe Tune and Trace Device" >>> + depends on ARM64 || (COMPILE_TEST && 64BIT) >>> + depends on PCI && HAS_DMA && HAS_IOMEM && PERF_EVENTS >>> + help >>> + HiSilicon PCIe Tune and Trace Device exists as a PCIe RCiEP >>> + device, and it provides support for PCIe traffic tuning and >>> + tracing TLP headers to the memory. >>> + >>> + This driver can also be built as a module. If so, the module >>> + will be called hisi_ptt. >>> diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile >>> new file mode 100644 >>> index 000000000000..908c09a98161 >>> --- /dev/null >>> +++ b/drivers/hwtracing/ptt/Makefile >>> @@ -0,0 +1,2 @@ >>> +# SPDX-License-Identifier: GPL-2.0 >>> +obj-$(CONFIG_HISI_PTT) += hisi_ptt.o >>> diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c >>> new file mode 100644 >>> index 000000000000..242b41870380 >>> --- /dev/null >>> +++ b/drivers/hwtracing/ptt/hisi_ptt.c >>> @@ -0,0 +1,874 @@ >>> +// SPDX-License-Identifier: GPL-2.0 >>> +/* >>> + * Driver for HiSilicon PCIe tune and trace device >>> + * >>> + * Copyright (c) 2022 HiSilicon Technologies Co., Ltd. >>> + * Author: Yicong Yang >>> + */ >>> + >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> + >>> +#include "hisi_ptt.h" >>> + >>> +static u16 hisi_ptt_get_filter_val(struct pci_dev *pdev) >>> +{ >>> + if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT) >>> + return BIT(HISI_PCIE_CORE_PORT_ID(PCI_SLOT(pdev->devfn))); >>> + >>> + return PCI_DEVID(pdev->bus->number, pdev->devfn); >>> +} >>> + >>> +static bool hisi_ptt_wait_trace_hw_idle(struct hisi_ptt *hisi_ptt) >>> +{ >>> + u32 val; >>> + >>> + return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_STS, >>> + val, val & HISI_PTT_TRACE_IDLE, >>> + HISI_PTT_WAIT_POLL_INTERVAL_US, >>> + HISI_PTT_WAIT_TRACE_TIMEOUT_US); >>> +} >>> + >>> +static bool hisi_ptt_wait_dma_reset_done(struct hisi_ptt *hisi_ptt) >>> +{ >>> + u32 val; >>> + >>> + return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_WR_STS, >>> + val, !val, HISI_PTT_RESET_POLL_INTERVAL_US, >>> + HISI_PTT_RESET_TIMEOUT_US); >>> +} >>> + >>> +static void hisi_ptt_free_trace_buf(struct hisi_ptt *hisi_ptt) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + struct device *dev = &hisi_ptt->pdev->dev; >>> + int i; >>> + >>> + if (!ctrl->trace_buf) >>> + return; >>> + >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; i++) { >>> + if (ctrl->trace_buf[i].addr) >>> + dmam_free_coherent(dev, HISI_PTT_TRACE_BUF_SIZE, >>> + ctrl->trace_buf[i].addr, >>> + ctrl->trace_buf[i].dma); >>> + } >>> + >>> + devm_kfree(dev, ctrl->trace_buf); >>> + ctrl->trace_buf = NULL; >>> +} >>> + >>> +static int hisi_ptt_alloc_trace_buf(struct hisi_ptt *hisi_ptt) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + struct device *dev = &hisi_ptt->pdev->dev; >>> + int i; >>> + >>> + hisi_ptt->trace_ctrl.buf_index = 0; >>> + >>> + /* If the trace buffer has already been allocated, zero it. */ >>> + if (ctrl->trace_buf) { >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; i++) >>> + memset(ctrl->trace_buf[i].addr, 0, HISI_PTT_TRACE_BUF_SIZE); >>> + return 0; >>> + } >>> + >>> + ctrl->trace_buf = devm_kcalloc(dev, HISI_PTT_TRACE_BUF_CNT, >>> + sizeof(struct hisi_ptt_dma_buffer), GFP_KERNEL); >>> + if (!ctrl->trace_buf) >>> + return -ENOMEM; >>> + >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; ++i) { >>> + ctrl->trace_buf[i].addr = dmam_alloc_coherent(dev, HISI_PTT_TRACE_BUF_SIZE, >>> + &ctrl->trace_buf[i].dma, >>> + GFP_KERNEL); >>> + if (!ctrl->trace_buf[i].addr) { >>> + hisi_ptt_free_trace_buf(hisi_ptt); >>> + return -ENOMEM; >>> + } >>> + } >>> + >>> + return 0; >>> +} >>> + >>> +static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt) >>> +{ >>> + writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + hisi_ptt->trace_ctrl.started = false; >>> +} >>> + >>> +static int hisi_ptt_trace_start(struct hisi_ptt *hisi_ptt) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + u32 val; >>> + int i; >>> + >>> + /* Check device idle before start trace */ >>> + if (!hisi_ptt_wait_trace_hw_idle(hisi_ptt)) { >>> + pci_err(hisi_ptt->pdev, "Failed to start trace, the device is still busy\n"); >>> + return -EBUSY; >>> + } >>> + >>> + ctrl->started = true; >>> + >>> + /* Reset the DMA before start tracing */ >>> + val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + val |= HISI_PTT_TRACE_CTRL_RST; >>> + writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + >>> + hisi_ptt_wait_dma_reset_done(hisi_ptt); >>> + >>> + val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + val &= ~HISI_PTT_TRACE_CTRL_RST; >>> + writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + >>> + /* Clear the interrupt status */ >>> + writel(HISI_PTT_TRACE_INT_STAT_MASK, hisi_ptt->iobase + HISI_PTT_TRACE_INT_STAT); >>> + writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_INT_MASK); >>> + >>> + /* Configure the trace DMA buffer */ >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; i++) { >>> + writel(lower_32_bits(ctrl->trace_buf[i].dma), >>> + hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_BASE_LO_0 + >>> + i * HISI_PTT_TRACE_ADDR_STRIDE); >>> + writel(upper_32_bits(ctrl->trace_buf[i].dma), >>> + hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_BASE_HI_0 + >>> + i * HISI_PTT_TRACE_ADDR_STRIDE); >>> + } >>> + writel(HISI_PTT_TRACE_BUF_SIZE, hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_SIZE); >>> + >>> + /* Set the trace control register */ >>> + val = FIELD_PREP(HISI_PTT_TRACE_CTRL_TYPE_SEL, ctrl->type); >>> + val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_RXTX_SEL, ctrl->direction); >>> + val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_DATA_FORMAT, ctrl->format); >>> + val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_TARGET_SEL, hisi_ptt->trace_ctrl.filter); >>> + if (!hisi_ptt->trace_ctrl.is_port) >>> + val |= HISI_PTT_TRACE_CTRL_FILTER_MODE; >>> + >>> + /* Start the Trace */ >>> + val |= HISI_PTT_TRACE_CTRL_EN; >>> + writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + >>> + return 0; >>> +} >>> + >>> +static int hisi_ptt_update_aux(struct hisi_ptt *hisi_ptt, int index, bool stop) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + struct perf_output_handle *handle = &ctrl->handle; >>> + struct perf_event *event = handle->event; >>> + struct hisi_ptt_pmu_buf *buf; >>> + void *addr; >>> + >>> + buf = perf_get_aux(handle); >>> + if (!buf || !handle->size) >>> + return -EINVAL; >>> + >>> + addr = ctrl->trace_buf[ctrl->buf_index].addr; >>> + >>> + memcpy(buf->base + buf->pos, addr, HISI_PTT_TRACE_BUF_SIZE); >>> + memset(addr, 0, HISI_PTT_TRACE_BUF_SIZE); >> >> Hi Kicong, >> >> I also have the same comment as Leo here, I don't think the memset is >> required. >> > > It's necessary in the current approach as we always commit HISI_PTT_TRACE_BUF_SIZE > data but the buffer maybe partly filled (called when perf going to stopp, not by the > interrupt). The buffer is cleared so the unfilled part of the buffer will have > empty data (normal traced TLP headers won't be all 0), then the user can distinguish > the valid part of the data. > > I'm trying to only copy the traced data rather than the whole buffer then the > clear operation here will be unnecessary. The hardware provide a register indicating > which offset of which buffer it's currently writing to and it canbe used here. If only the traced data is copied rather than the full buffer, isn't that what perf_aux_output_end() is for? Perf will only read up to the point where you say the buffer is filled to, it won't go and read the zeros if you didn't tell it to by emitting perf_aux_output_end() for more data than was written. If you are having to write zeros to detect which bits of the buffer is filled or not it sounds like those zero parts are making it into the perf file and are wasting disk space and CPU cycles to copy them. > >>> + buf->pos += HISI_PTT_TRACE_BUF_SIZE; >>> + >>> + if (stop) { >>> + perf_aux_output_end(handle, buf->pos); >>> + } else if (buf->length - buf->pos < HISI_PTT_TRACE_BUF_SIZE) { >>> + perf_aux_output_skip(handle, buf->length - buf->pos); >> >> perf_aux_output_skip() can also return an error so should probably also >> be checked like perf_aux_output_begin() >> > > ok it should be checked. > >> I'm also wondering why there is a skip for every output_end()? Is that >> to avoid having two memcpy calls to handle the wrap around if the data >> to be copied goes past the end of the aux buffer? >> >> For example if your buffers are 4MB each and the aux buffer that the >> user picked isn't a multiple of 4 I can see you needing to write the >> first part of the 4MB to the end of the aux buffer and then the last >> part to the beginning which would be two memcpy() calls. And then a >> skip wouldn't be required. >> > > I intended to handle the case that AUX buffer is not a multiple of 4 MiB. > When the resident AUX buffer size is less than 4MiB, we're not going to > commit data to it and will apply a new AUX buffer instead. I think you're > right that the perf_aux_output_skip() is unnecessary here. Thanks for > catching this. > >> I looked at all the other uses of perf_output_end() and perf_output_skip() >> in the kernel and didn't see a pattern like yours so it seems suspicous to >> me. Maybe at least some comments around this section are needed. >> > > Will add some comments of the handling here. > > Regards, > Yicong From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DA2E6C433F5 for ; Tue, 10 May 2022 13:00:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:From:References:Cc:To: Subject:MIME-Version:Date:Message-ID:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=lpPUx62bjQewOE9eAc3lPWt8vlHRQfIIkQnr7J0V7HM=; b=p5760gSP2KL6IL MFceNkpYkUWneZlbqQa2DJm/jGE32pvN2Ppphg96tAYDeA3TdtzlKEA9tGFB1Y6Urf50RhkkNyhtq vTaqgZo/fA43Ut6VPu/FljkI4gnq0jpQaUY8TZwV0aqUrpJ0UrnjEykK2DotoWv0dbisnmcnis/V8 BjirxFyG3FoW+0lx6S0qNyIHvh7eTW9Py/TioB36gsMon+LXFkhyA/9rM7HPenFwEWyfAB/rHEi3n OZ+OYlYAXwxMQnIe7xp2ZjnaAWz6W1u4AB++OAbaYbc5KqjOf71fF8TEuJJP65ueePtkLVrrYzwb2 64ozhfS/7zxmkOXP6/lQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1noPSD-0024ML-Q6; Tue, 10 May 2022 12:59:06 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1noPNS-0022DK-S7 for linux-arm-kernel@lists.infradead.org; Tue, 10 May 2022 12:54:14 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C65CE12FC; Tue, 10 May 2022 05:54:09 -0700 (PDT) Received: from [10.57.2.65] (unknown [10.57.2.65]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 53FEA3F73D; Tue, 10 May 2022 05:54:05 -0700 (PDT) Message-ID: <30ad6a53-1f4a-a88e-c239-fdd6d80ca4e1@arm.com> Date: Tue, 10 May 2022 13:54:03 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0 Subject: Re: [PATCH v7 2/7] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device Content-Language: en-US To: Yicong Yang , Leo Yan Cc: prime.zeng@huawei.com, liuqi115@huawei.com, zhangshaokun@hisilicon.com, linuxarm@huawei.com, gregkh@linuxfoundation.org, helgaas@kernel.org, alexander.shishkin@linux.intel.com, lorenzo.pieralisi@arm.com, will@kernel.org, mark.rutland@arm.com, mathieu.poirier@linaro.org, suzuki.poulose@arm.com, mike.leach@linaro.org, leo.yan@linaro.org, jonathan.cameron@huawei.com, daniel.thompson@linaro.org, joro@8bytes.org, john.garry@huawei.com, shameerali.kolothum.thodi@huawei.com, robin.murphy@arm.com, peterz@infradead.org, mingo@redhat.com, acme@kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, coresight@lists.linaro.org, linux-pci@vger.kernel.org, linux-perf-users@vger.kernel.org, iommu@lists.linux-foundation.org References: <20220407125841.3678-1-yangyicong@hisilicon.com> <20220407125841.3678-3-yangyicong@hisilicon.com> From: James Clark In-Reply-To: X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220510_055411_091709_AF066DCA X-CRM114-Status: GOOD ( 48.93 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On 10/05/2022 12:18, Yicong Yang wrote: > On 2022/5/10 17:46, James Clark wrote: >> >> >> On 07/04/2022 13:58, Yicong Yang wrote: >>> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex integrated >>> Endpoint(RCiEP) device, providing the capability to dynamically monitor and >>> tune the PCIe traffic, and trace the TLP headers. >>> >>> Add the driver for the device to enable the trace function. Register PMU >>> device of PTT trace, then users can use trace through perf command. The >>> driver makes use of perf AUX trace and support following events to >>> configure the trace: >>> >>> - filter: select Root port or Endpoint to trace >>> - type: select the type of traced TLP headers >>> - direction: select the direction of traced TLP headers >>> - format: select the data format of the traced TLP headers >>> >>> This patch adds the driver part of PTT trace. The perf command support of >>> PTT trace is added in the following patch. >>> >>> Signed-off-by: Yicong Yang >>> Reviewed-by: Jonathan Cameron >>> --- >>> drivers/Makefile | 1 + >>> drivers/hwtracing/Kconfig | 2 + >>> drivers/hwtracing/ptt/Kconfig | 12 + >>> drivers/hwtracing/ptt/Makefile | 2 + >>> drivers/hwtracing/ptt/hisi_ptt.c | 874 +++++++++++++++++++++++++++++++ >>> drivers/hwtracing/ptt/hisi_ptt.h | 166 ++++++ >>> 6 files changed, 1057 insertions(+) >>> create mode 100644 drivers/hwtracing/ptt/Kconfig >>> create mode 100644 drivers/hwtracing/ptt/Makefile >>> create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c >>> create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h >>> >>> diff --git a/drivers/Makefile b/drivers/Makefile >>> index 020780b6b4d2..662d50599467 100644 >>> --- a/drivers/Makefile >>> +++ b/drivers/Makefile >>> @@ -175,6 +175,7 @@ obj-$(CONFIG_USB4) += thunderbolt/ >>> obj-$(CONFIG_CORESIGHT) += hwtracing/coresight/ >>> obj-y += hwtracing/intel_th/ >>> obj-$(CONFIG_STM) += hwtracing/stm/ >>> +obj-$(CONFIG_HISI_PTT) += hwtracing/ptt/ >>> obj-$(CONFIG_ANDROID) += android/ >>> obj-$(CONFIG_NVMEM) += nvmem/ >>> obj-$(CONFIG_FPGA) += fpga/ >>> diff --git a/drivers/hwtracing/Kconfig b/drivers/hwtracing/Kconfig >>> index 13085835a636..911ee977103c 100644 >>> --- a/drivers/hwtracing/Kconfig >>> +++ b/drivers/hwtracing/Kconfig >>> @@ -5,4 +5,6 @@ source "drivers/hwtracing/stm/Kconfig" >>> >>> source "drivers/hwtracing/intel_th/Kconfig" >>> >>> +source "drivers/hwtracing/ptt/Kconfig" >>> + >>> endmenu >>> diff --git a/drivers/hwtracing/ptt/Kconfig b/drivers/hwtracing/ptt/Kconfig >>> new file mode 100644 >>> index 000000000000..8902a6f27563 >>> --- /dev/null >>> +++ b/drivers/hwtracing/ptt/Kconfig >>> @@ -0,0 +1,12 @@ >>> +# SPDX-License-Identifier: GPL-2.0-only >>> +config HISI_PTT >>> + tristate "HiSilicon PCIe Tune and Trace Device" >>> + depends on ARM64 || (COMPILE_TEST && 64BIT) >>> + depends on PCI && HAS_DMA && HAS_IOMEM && PERF_EVENTS >>> + help >>> + HiSilicon PCIe Tune and Trace Device exists as a PCIe RCiEP >>> + device, and it provides support for PCIe traffic tuning and >>> + tracing TLP headers to the memory. >>> + >>> + This driver can also be built as a module. If so, the module >>> + will be called hisi_ptt. >>> diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile >>> new file mode 100644 >>> index 000000000000..908c09a98161 >>> --- /dev/null >>> +++ b/drivers/hwtracing/ptt/Makefile >>> @@ -0,0 +1,2 @@ >>> +# SPDX-License-Identifier: GPL-2.0 >>> +obj-$(CONFIG_HISI_PTT) += hisi_ptt.o >>> diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c >>> new file mode 100644 >>> index 000000000000..242b41870380 >>> --- /dev/null >>> +++ b/drivers/hwtracing/ptt/hisi_ptt.c >>> @@ -0,0 +1,874 @@ >>> +// SPDX-License-Identifier: GPL-2.0 >>> +/* >>> + * Driver for HiSilicon PCIe tune and trace device >>> + * >>> + * Copyright (c) 2022 HiSilicon Technologies Co., Ltd. >>> + * Author: Yicong Yang >>> + */ >>> + >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> + >>> +#include "hisi_ptt.h" >>> + >>> +static u16 hisi_ptt_get_filter_val(struct pci_dev *pdev) >>> +{ >>> + if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT) >>> + return BIT(HISI_PCIE_CORE_PORT_ID(PCI_SLOT(pdev->devfn))); >>> + >>> + return PCI_DEVID(pdev->bus->number, pdev->devfn); >>> +} >>> + >>> +static bool hisi_ptt_wait_trace_hw_idle(struct hisi_ptt *hisi_ptt) >>> +{ >>> + u32 val; >>> + >>> + return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_STS, >>> + val, val & HISI_PTT_TRACE_IDLE, >>> + HISI_PTT_WAIT_POLL_INTERVAL_US, >>> + HISI_PTT_WAIT_TRACE_TIMEOUT_US); >>> +} >>> + >>> +static bool hisi_ptt_wait_dma_reset_done(struct hisi_ptt *hisi_ptt) >>> +{ >>> + u32 val; >>> + >>> + return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_WR_STS, >>> + val, !val, HISI_PTT_RESET_POLL_INTERVAL_US, >>> + HISI_PTT_RESET_TIMEOUT_US); >>> +} >>> + >>> +static void hisi_ptt_free_trace_buf(struct hisi_ptt *hisi_ptt) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + struct device *dev = &hisi_ptt->pdev->dev; >>> + int i; >>> + >>> + if (!ctrl->trace_buf) >>> + return; >>> + >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; i++) { >>> + if (ctrl->trace_buf[i].addr) >>> + dmam_free_coherent(dev, HISI_PTT_TRACE_BUF_SIZE, >>> + ctrl->trace_buf[i].addr, >>> + ctrl->trace_buf[i].dma); >>> + } >>> + >>> + devm_kfree(dev, ctrl->trace_buf); >>> + ctrl->trace_buf = NULL; >>> +} >>> + >>> +static int hisi_ptt_alloc_trace_buf(struct hisi_ptt *hisi_ptt) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + struct device *dev = &hisi_ptt->pdev->dev; >>> + int i; >>> + >>> + hisi_ptt->trace_ctrl.buf_index = 0; >>> + >>> + /* If the trace buffer has already been allocated, zero it. */ >>> + if (ctrl->trace_buf) { >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; i++) >>> + memset(ctrl->trace_buf[i].addr, 0, HISI_PTT_TRACE_BUF_SIZE); >>> + return 0; >>> + } >>> + >>> + ctrl->trace_buf = devm_kcalloc(dev, HISI_PTT_TRACE_BUF_CNT, >>> + sizeof(struct hisi_ptt_dma_buffer), GFP_KERNEL); >>> + if (!ctrl->trace_buf) >>> + return -ENOMEM; >>> + >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; ++i) { >>> + ctrl->trace_buf[i].addr = dmam_alloc_coherent(dev, HISI_PTT_TRACE_BUF_SIZE, >>> + &ctrl->trace_buf[i].dma, >>> + GFP_KERNEL); >>> + if (!ctrl->trace_buf[i].addr) { >>> + hisi_ptt_free_trace_buf(hisi_ptt); >>> + return -ENOMEM; >>> + } >>> + } >>> + >>> + return 0; >>> +} >>> + >>> +static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt) >>> +{ >>> + writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + hisi_ptt->trace_ctrl.started = false; >>> +} >>> + >>> +static int hisi_ptt_trace_start(struct hisi_ptt *hisi_ptt) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + u32 val; >>> + int i; >>> + >>> + /* Check device idle before start trace */ >>> + if (!hisi_ptt_wait_trace_hw_idle(hisi_ptt)) { >>> + pci_err(hisi_ptt->pdev, "Failed to start trace, the device is still busy\n"); >>> + return -EBUSY; >>> + } >>> + >>> + ctrl->started = true; >>> + >>> + /* Reset the DMA before start tracing */ >>> + val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + val |= HISI_PTT_TRACE_CTRL_RST; >>> + writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + >>> + hisi_ptt_wait_dma_reset_done(hisi_ptt); >>> + >>> + val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + val &= ~HISI_PTT_TRACE_CTRL_RST; >>> + writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + >>> + /* Clear the interrupt status */ >>> + writel(HISI_PTT_TRACE_INT_STAT_MASK, hisi_ptt->iobase + HISI_PTT_TRACE_INT_STAT); >>> + writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_INT_MASK); >>> + >>> + /* Configure the trace DMA buffer */ >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; i++) { >>> + writel(lower_32_bits(ctrl->trace_buf[i].dma), >>> + hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_BASE_LO_0 + >>> + i * HISI_PTT_TRACE_ADDR_STRIDE); >>> + writel(upper_32_bits(ctrl->trace_buf[i].dma), >>> + hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_BASE_HI_0 + >>> + i * HISI_PTT_TRACE_ADDR_STRIDE); >>> + } >>> + writel(HISI_PTT_TRACE_BUF_SIZE, hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_SIZE); >>> + >>> + /* Set the trace control register */ >>> + val = FIELD_PREP(HISI_PTT_TRACE_CTRL_TYPE_SEL, ctrl->type); >>> + val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_RXTX_SEL, ctrl->direction); >>> + val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_DATA_FORMAT, ctrl->format); >>> + val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_TARGET_SEL, hisi_ptt->trace_ctrl.filter); >>> + if (!hisi_ptt->trace_ctrl.is_port) >>> + val |= HISI_PTT_TRACE_CTRL_FILTER_MODE; >>> + >>> + /* Start the Trace */ >>> + val |= HISI_PTT_TRACE_CTRL_EN; >>> + writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + >>> + return 0; >>> +} >>> + >>> +static int hisi_ptt_update_aux(struct hisi_ptt *hisi_ptt, int index, bool stop) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + struct perf_output_handle *handle = &ctrl->handle; >>> + struct perf_event *event = handle->event; >>> + struct hisi_ptt_pmu_buf *buf; >>> + void *addr; >>> + >>> + buf = perf_get_aux(handle); >>> + if (!buf || !handle->size) >>> + return -EINVAL; >>> + >>> + addr = ctrl->trace_buf[ctrl->buf_index].addr; >>> + >>> + memcpy(buf->base + buf->pos, addr, HISI_PTT_TRACE_BUF_SIZE); >>> + memset(addr, 0, HISI_PTT_TRACE_BUF_SIZE); >> >> Hi Kicong, >> >> I also have the same comment as Leo here, I don't think the memset is >> required. >> > > It's necessary in the current approach as we always commit HISI_PTT_TRACE_BUF_SIZE > data but the buffer maybe partly filled (called when perf going to stopp, not by the > interrupt). The buffer is cleared so the unfilled part of the buffer will have > empty data (normal traced TLP headers won't be all 0), then the user can distinguish > the valid part of the data. > > I'm trying to only copy the traced data rather than the whole buffer then the > clear operation here will be unnecessary. The hardware provide a register indicating > which offset of which buffer it's currently writing to and it canbe used here. If only the traced data is copied rather than the full buffer, isn't that what perf_aux_output_end() is for? Perf will only read up to the point where you say the buffer is filled to, it won't go and read the zeros if you didn't tell it to by emitting perf_aux_output_end() for more data than was written. If you are having to write zeros to detect which bits of the buffer is filled or not it sounds like those zero parts are making it into the perf file and are wasting disk space and CPU cycles to copy them. > >>> + buf->pos += HISI_PTT_TRACE_BUF_SIZE; >>> + >>> + if (stop) { >>> + perf_aux_output_end(handle, buf->pos); >>> + } else if (buf->length - buf->pos < HISI_PTT_TRACE_BUF_SIZE) { >>> + perf_aux_output_skip(handle, buf->length - buf->pos); >> >> perf_aux_output_skip() can also return an error so should probably also >> be checked like perf_aux_output_begin() >> > > ok it should be checked. > >> I'm also wondering why there is a skip for every output_end()? Is that >> to avoid having two memcpy calls to handle the wrap around if the data >> to be copied goes past the end of the aux buffer? >> >> For example if your buffers are 4MB each and the aux buffer that the >> user picked isn't a multiple of 4 I can see you needing to write the >> first part of the 4MB to the end of the aux buffer and then the last >> part to the beginning which would be two memcpy() calls. And then a >> skip wouldn't be required. >> > > I intended to handle the case that AUX buffer is not a multiple of 4 MiB. > When the resident AUX buffer size is less than 4MiB, we're not going to > commit data to it and will apply a new AUX buffer instead. I think you're > right that the perf_aux_output_skip() is unnecessary here. Thanks for > catching this. > >> I looked at all the other uses of perf_output_end() and perf_output_skip() >> in the kernel and didn't see a pattern like yours so it seems suspicous to >> me. Maybe at least some comments around this section are needed. >> > > Will add some comments of the handling here. > > Regards, > Yicong _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id AEE46C433EF for ; Tue, 10 May 2022 13:53:27 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 3A244416EF; Tue, 10 May 2022 13:53:27 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 3q3uQdObWBNt; Tue, 10 May 2022 13:53:25 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp4.osuosl.org (Postfix) with ESMTPS id 424F4416EA; Tue, 10 May 2022 13:53:25 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 61A68C0086; Tue, 10 May 2022 13:53:24 +0000 (UTC) Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 2E9A8C002D for ; Tue, 10 May 2022 12:54:12 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 15F1540327 for ; Tue, 10 May 2022 12:54:12 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id vPGJWHHVS11v for ; Tue, 10 May 2022 12:54:10 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.8.0 Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by smtp2.osuosl.org (Postfix) with ESMTP id B6E2C400FB for ; Tue, 10 May 2022 12:54:10 +0000 (UTC) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C65CE12FC; Tue, 10 May 2022 05:54:09 -0700 (PDT) Received: from [10.57.2.65] (unknown [10.57.2.65]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 53FEA3F73D; Tue, 10 May 2022 05:54:05 -0700 (PDT) Message-ID: <30ad6a53-1f4a-a88e-c239-fdd6d80ca4e1@arm.com> Date: Tue, 10 May 2022 13:54:03 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0 Subject: Re: [PATCH v7 2/7] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device Content-Language: en-US To: Yicong Yang , Leo Yan References: <20220407125841.3678-1-yangyicong@hisilicon.com> <20220407125841.3678-3-yangyicong@hisilicon.com> From: James Clark In-Reply-To: X-Mailman-Approved-At: Tue, 10 May 2022 13:53:23 +0000 Cc: mark.rutland@arm.com, prime.zeng@huawei.com, alexander.shishkin@linux.intel.com, linux-pci@vger.kernel.org, linuxarm@huawei.com, will@kernel.org, daniel.thompson@linaro.org, peterz@infradead.org, mingo@redhat.com, helgaas@kernel.org, liuqi115@huawei.com, mike.leach@linaro.org, suzuki.poulose@arm.com, coresight@lists.linaro.org, acme@kernel.org, zhangshaokun@hisilicon.com, linux-arm-kernel@lists.infradead.org, mathieu.poirier@linaro.org, gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, iommu@lists.linux-foundation.org, leo.yan@linaro.org, robin.murphy@arm.com X-BeenThere: iommu@lists.linux-foundation.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: Development issues for Linux IOMMU support List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Errors-To: iommu-bounces@lists.linux-foundation.org Sender: "iommu" On 10/05/2022 12:18, Yicong Yang wrote: > On 2022/5/10 17:46, James Clark wrote: >> >> >> On 07/04/2022 13:58, Yicong Yang wrote: >>> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex integrated >>> Endpoint(RCiEP) device, providing the capability to dynamically monitor and >>> tune the PCIe traffic, and trace the TLP headers. >>> >>> Add the driver for the device to enable the trace function. Register PMU >>> device of PTT trace, then users can use trace through perf command. The >>> driver makes use of perf AUX trace and support following events to >>> configure the trace: >>> >>> - filter: select Root port or Endpoint to trace >>> - type: select the type of traced TLP headers >>> - direction: select the direction of traced TLP headers >>> - format: select the data format of the traced TLP headers >>> >>> This patch adds the driver part of PTT trace. The perf command support of >>> PTT trace is added in the following patch. >>> >>> Signed-off-by: Yicong Yang >>> Reviewed-by: Jonathan Cameron >>> --- >>> drivers/Makefile | 1 + >>> drivers/hwtracing/Kconfig | 2 + >>> drivers/hwtracing/ptt/Kconfig | 12 + >>> drivers/hwtracing/ptt/Makefile | 2 + >>> drivers/hwtracing/ptt/hisi_ptt.c | 874 +++++++++++++++++++++++++++++++ >>> drivers/hwtracing/ptt/hisi_ptt.h | 166 ++++++ >>> 6 files changed, 1057 insertions(+) >>> create mode 100644 drivers/hwtracing/ptt/Kconfig >>> create mode 100644 drivers/hwtracing/ptt/Makefile >>> create mode 100644 drivers/hwtracing/ptt/hisi_ptt.c >>> create mode 100644 drivers/hwtracing/ptt/hisi_ptt.h >>> >>> diff --git a/drivers/Makefile b/drivers/Makefile >>> index 020780b6b4d2..662d50599467 100644 >>> --- a/drivers/Makefile >>> +++ b/drivers/Makefile >>> @@ -175,6 +175,7 @@ obj-$(CONFIG_USB4) += thunderbolt/ >>> obj-$(CONFIG_CORESIGHT) += hwtracing/coresight/ >>> obj-y += hwtracing/intel_th/ >>> obj-$(CONFIG_STM) += hwtracing/stm/ >>> +obj-$(CONFIG_HISI_PTT) += hwtracing/ptt/ >>> obj-$(CONFIG_ANDROID) += android/ >>> obj-$(CONFIG_NVMEM) += nvmem/ >>> obj-$(CONFIG_FPGA) += fpga/ >>> diff --git a/drivers/hwtracing/Kconfig b/drivers/hwtracing/Kconfig >>> index 13085835a636..911ee977103c 100644 >>> --- a/drivers/hwtracing/Kconfig >>> +++ b/drivers/hwtracing/Kconfig >>> @@ -5,4 +5,6 @@ source "drivers/hwtracing/stm/Kconfig" >>> >>> source "drivers/hwtracing/intel_th/Kconfig" >>> >>> +source "drivers/hwtracing/ptt/Kconfig" >>> + >>> endmenu >>> diff --git a/drivers/hwtracing/ptt/Kconfig b/drivers/hwtracing/ptt/Kconfig >>> new file mode 100644 >>> index 000000000000..8902a6f27563 >>> --- /dev/null >>> +++ b/drivers/hwtracing/ptt/Kconfig >>> @@ -0,0 +1,12 @@ >>> +# SPDX-License-Identifier: GPL-2.0-only >>> +config HISI_PTT >>> + tristate "HiSilicon PCIe Tune and Trace Device" >>> + depends on ARM64 || (COMPILE_TEST && 64BIT) >>> + depends on PCI && HAS_DMA && HAS_IOMEM && PERF_EVENTS >>> + help >>> + HiSilicon PCIe Tune and Trace Device exists as a PCIe RCiEP >>> + device, and it provides support for PCIe traffic tuning and >>> + tracing TLP headers to the memory. >>> + >>> + This driver can also be built as a module. If so, the module >>> + will be called hisi_ptt. >>> diff --git a/drivers/hwtracing/ptt/Makefile b/drivers/hwtracing/ptt/Makefile >>> new file mode 100644 >>> index 000000000000..908c09a98161 >>> --- /dev/null >>> +++ b/drivers/hwtracing/ptt/Makefile >>> @@ -0,0 +1,2 @@ >>> +# SPDX-License-Identifier: GPL-2.0 >>> +obj-$(CONFIG_HISI_PTT) += hisi_ptt.o >>> diff --git a/drivers/hwtracing/ptt/hisi_ptt.c b/drivers/hwtracing/ptt/hisi_ptt.c >>> new file mode 100644 >>> index 000000000000..242b41870380 >>> --- /dev/null >>> +++ b/drivers/hwtracing/ptt/hisi_ptt.c >>> @@ -0,0 +1,874 @@ >>> +// SPDX-License-Identifier: GPL-2.0 >>> +/* >>> + * Driver for HiSilicon PCIe tune and trace device >>> + * >>> + * Copyright (c) 2022 HiSilicon Technologies Co., Ltd. >>> + * Author: Yicong Yang >>> + */ >>> + >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> +#include >>> + >>> +#include "hisi_ptt.h" >>> + >>> +static u16 hisi_ptt_get_filter_val(struct pci_dev *pdev) >>> +{ >>> + if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT) >>> + return BIT(HISI_PCIE_CORE_PORT_ID(PCI_SLOT(pdev->devfn))); >>> + >>> + return PCI_DEVID(pdev->bus->number, pdev->devfn); >>> +} >>> + >>> +static bool hisi_ptt_wait_trace_hw_idle(struct hisi_ptt *hisi_ptt) >>> +{ >>> + u32 val; >>> + >>> + return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_STS, >>> + val, val & HISI_PTT_TRACE_IDLE, >>> + HISI_PTT_WAIT_POLL_INTERVAL_US, >>> + HISI_PTT_WAIT_TRACE_TIMEOUT_US); >>> +} >>> + >>> +static bool hisi_ptt_wait_dma_reset_done(struct hisi_ptt *hisi_ptt) >>> +{ >>> + u32 val; >>> + >>> + return !readl_poll_timeout_atomic(hisi_ptt->iobase + HISI_PTT_TRACE_WR_STS, >>> + val, !val, HISI_PTT_RESET_POLL_INTERVAL_US, >>> + HISI_PTT_RESET_TIMEOUT_US); >>> +} >>> + >>> +static void hisi_ptt_free_trace_buf(struct hisi_ptt *hisi_ptt) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + struct device *dev = &hisi_ptt->pdev->dev; >>> + int i; >>> + >>> + if (!ctrl->trace_buf) >>> + return; >>> + >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; i++) { >>> + if (ctrl->trace_buf[i].addr) >>> + dmam_free_coherent(dev, HISI_PTT_TRACE_BUF_SIZE, >>> + ctrl->trace_buf[i].addr, >>> + ctrl->trace_buf[i].dma); >>> + } >>> + >>> + devm_kfree(dev, ctrl->trace_buf); >>> + ctrl->trace_buf = NULL; >>> +} >>> + >>> +static int hisi_ptt_alloc_trace_buf(struct hisi_ptt *hisi_ptt) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + struct device *dev = &hisi_ptt->pdev->dev; >>> + int i; >>> + >>> + hisi_ptt->trace_ctrl.buf_index = 0; >>> + >>> + /* If the trace buffer has already been allocated, zero it. */ >>> + if (ctrl->trace_buf) { >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; i++) >>> + memset(ctrl->trace_buf[i].addr, 0, HISI_PTT_TRACE_BUF_SIZE); >>> + return 0; >>> + } >>> + >>> + ctrl->trace_buf = devm_kcalloc(dev, HISI_PTT_TRACE_BUF_CNT, >>> + sizeof(struct hisi_ptt_dma_buffer), GFP_KERNEL); >>> + if (!ctrl->trace_buf) >>> + return -ENOMEM; >>> + >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; ++i) { >>> + ctrl->trace_buf[i].addr = dmam_alloc_coherent(dev, HISI_PTT_TRACE_BUF_SIZE, >>> + &ctrl->trace_buf[i].dma, >>> + GFP_KERNEL); >>> + if (!ctrl->trace_buf[i].addr) { >>> + hisi_ptt_free_trace_buf(hisi_ptt); >>> + return -ENOMEM; >>> + } >>> + } >>> + >>> + return 0; >>> +} >>> + >>> +static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt) >>> +{ >>> + writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + hisi_ptt->trace_ctrl.started = false; >>> +} >>> + >>> +static int hisi_ptt_trace_start(struct hisi_ptt *hisi_ptt) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + u32 val; >>> + int i; >>> + >>> + /* Check device idle before start trace */ >>> + if (!hisi_ptt_wait_trace_hw_idle(hisi_ptt)) { >>> + pci_err(hisi_ptt->pdev, "Failed to start trace, the device is still busy\n"); >>> + return -EBUSY; >>> + } >>> + >>> + ctrl->started = true; >>> + >>> + /* Reset the DMA before start tracing */ >>> + val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + val |= HISI_PTT_TRACE_CTRL_RST; >>> + writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + >>> + hisi_ptt_wait_dma_reset_done(hisi_ptt); >>> + >>> + val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + val &= ~HISI_PTT_TRACE_CTRL_RST; >>> + writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + >>> + /* Clear the interrupt status */ >>> + writel(HISI_PTT_TRACE_INT_STAT_MASK, hisi_ptt->iobase + HISI_PTT_TRACE_INT_STAT); >>> + writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_INT_MASK); >>> + >>> + /* Configure the trace DMA buffer */ >>> + for (i = 0; i < HISI_PTT_TRACE_BUF_CNT; i++) { >>> + writel(lower_32_bits(ctrl->trace_buf[i].dma), >>> + hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_BASE_LO_0 + >>> + i * HISI_PTT_TRACE_ADDR_STRIDE); >>> + writel(upper_32_bits(ctrl->trace_buf[i].dma), >>> + hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_BASE_HI_0 + >>> + i * HISI_PTT_TRACE_ADDR_STRIDE); >>> + } >>> + writel(HISI_PTT_TRACE_BUF_SIZE, hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_SIZE); >>> + >>> + /* Set the trace control register */ >>> + val = FIELD_PREP(HISI_PTT_TRACE_CTRL_TYPE_SEL, ctrl->type); >>> + val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_RXTX_SEL, ctrl->direction); >>> + val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_DATA_FORMAT, ctrl->format); >>> + val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_TARGET_SEL, hisi_ptt->trace_ctrl.filter); >>> + if (!hisi_ptt->trace_ctrl.is_port) >>> + val |= HISI_PTT_TRACE_CTRL_FILTER_MODE; >>> + >>> + /* Start the Trace */ >>> + val |= HISI_PTT_TRACE_CTRL_EN; >>> + writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL); >>> + >>> + return 0; >>> +} >>> + >>> +static int hisi_ptt_update_aux(struct hisi_ptt *hisi_ptt, int index, bool stop) >>> +{ >>> + struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl; >>> + struct perf_output_handle *handle = &ctrl->handle; >>> + struct perf_event *event = handle->event; >>> + struct hisi_ptt_pmu_buf *buf; >>> + void *addr; >>> + >>> + buf = perf_get_aux(handle); >>> + if (!buf || !handle->size) >>> + return -EINVAL; >>> + >>> + addr = ctrl->trace_buf[ctrl->buf_index].addr; >>> + >>> + memcpy(buf->base + buf->pos, addr, HISI_PTT_TRACE_BUF_SIZE); >>> + memset(addr, 0, HISI_PTT_TRACE_BUF_SIZE); >> >> Hi Kicong, >> >> I also have the same comment as Leo here, I don't think the memset is >> required. >> > > It's necessary in the current approach as we always commit HISI_PTT_TRACE_BUF_SIZE > data but the buffer maybe partly filled (called when perf going to stopp, not by the > interrupt). The buffer is cleared so the unfilled part of the buffer will have > empty data (normal traced TLP headers won't be all 0), then the user can distinguish > the valid part of the data. > > I'm trying to only copy the traced data rather than the whole buffer then the > clear operation here will be unnecessary. The hardware provide a register indicating > which offset of which buffer it's currently writing to and it canbe used here. If only the traced data is copied rather than the full buffer, isn't that what perf_aux_output_end() is for? Perf will only read up to the point where you say the buffer is filled to, it won't go and read the zeros if you didn't tell it to by emitting perf_aux_output_end() for more data than was written. If you are having to write zeros to detect which bits of the buffer is filled or not it sounds like those zero parts are making it into the perf file and are wasting disk space and CPU cycles to copy them. > >>> + buf->pos += HISI_PTT_TRACE_BUF_SIZE; >>> + >>> + if (stop) { >>> + perf_aux_output_end(handle, buf->pos); >>> + } else if (buf->length - buf->pos < HISI_PTT_TRACE_BUF_SIZE) { >>> + perf_aux_output_skip(handle, buf->length - buf->pos); >> >> perf_aux_output_skip() can also return an error so should probably also >> be checked like perf_aux_output_begin() >> > > ok it should be checked. > >> I'm also wondering why there is a skip for every output_end()? Is that >> to avoid having two memcpy calls to handle the wrap around if the data >> to be copied goes past the end of the aux buffer? >> >> For example if your buffers are 4MB each and the aux buffer that the >> user picked isn't a multiple of 4 I can see you needing to write the >> first part of the 4MB to the end of the aux buffer and then the last >> part to the beginning which would be two memcpy() calls. And then a >> skip wouldn't be required. >> > > I intended to handle the case that AUX buffer is not a multiple of 4 MiB. > When the resident AUX buffer size is less than 4MiB, we're not going to > commit data to it and will apply a new AUX buffer instead. I think you're > right that the perf_aux_output_skip() is unnecessary here. Thanks for > catching this. > >> I looked at all the other uses of perf_output_end() and perf_output_skip() >> in the kernel and didn't see a pattern like yours so it seems suspicous to >> me. Maybe at least some comments around this section are needed. >> > > Will add some comments of the handling here. > > Regards, > Yicong _______________________________________________ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu