linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device
@ 2021-04-06 12:45 Yicong Yang
  2021-04-06 12:45 ` [PATCH 1/4] hwtracing: Add trace function " Yicong Yang
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Yicong Yang @ 2021-04-06 12:45 UTC (permalink / raw)
  To: alexander.shishkin, helgaas, linux-kernel, linux-pci
  Cc: lorenzo.pieralisi, gregkh, jonathan.cameron, song.bao.hua,
	prime.zeng, yangyicong, linux-doc, linuxarm

HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex
integrated Endpoint(RCiEP) device, providing the capability
to dynamically monitor and tune the PCIe traffic(tune),
and trace the TLP headers(trace). The driver exposes the user
interface through debugfs, so no need for extra user space tools.
The usage is described in the document.

Yicong Yang (4):
  hwtracing: Add trace function support for HiSilicon PCIe Tune and
    Trace device
  hwtracing: Add tune function support for HiSilicon PCIe Tune and Trace
    device
  docs: Add documentation for HiSilicon PTT device driver
  MAINTAINERS: Add maintainer for HiSilicon PTT driver

 Documentation/trace/hisi-ptt.rst       |  316 ++++++
 MAINTAINERS                            |    7 +
 drivers/Makefile                       |    1 +
 drivers/hwtracing/Kconfig              |    2 +
 drivers/hwtracing/hisilicon/Kconfig    |    8 +
 drivers/hwtracing/hisilicon/Makefile   |    2 +
 drivers/hwtracing/hisilicon/hisi_ptt.c | 1636 ++++++++++++++++++++++++++++++++
 7 files changed, 1972 insertions(+)
 create mode 100644 Documentation/trace/hisi-ptt.rst
 create mode 100644 drivers/hwtracing/hisilicon/Kconfig
 create mode 100644 drivers/hwtracing/hisilicon/Makefile
 create mode 100644 drivers/hwtracing/hisilicon/hisi_ptt.c

-- 
2.8.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device
  2021-04-06 12:45 [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device Yicong Yang
@ 2021-04-06 12:45 ` Yicong Yang
  2021-04-06 13:51   ` Greg KH
                     ` (2 more replies)
  2021-04-06 12:45 ` [PATCH 2/4] hwtracing: Add tune " Yicong Yang
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 16+ messages in thread
From: Yicong Yang @ 2021-04-06 12:45 UTC (permalink / raw)
  To: alexander.shishkin, helgaas, linux-kernel, linux-pci
  Cc: lorenzo.pieralisi, gregkh, jonathan.cameron, song.bao.hua,
	prime.zeng, yangyicong, linux-doc, linuxarm

HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex
integrated Endpoint(RCiEP) device, providing the capability
to dynamically monitor and tune the PCIe traffic(tune),
and trace the TLP headers(trace).

Add the driver for the device to enable the trace function. The driver
will create debugfs directory for each PTT device, and users can
operate the device through the files under its directory.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 drivers/Makefile                       |    1 +
 drivers/hwtracing/Kconfig              |    2 +
 drivers/hwtracing/hisilicon/Kconfig    |    8 +
 drivers/hwtracing/hisilicon/Makefile   |    2 +
 drivers/hwtracing/hisilicon/hisi_ptt.c | 1449 ++++++++++++++++++++++++++++++++
 5 files changed, 1462 insertions(+)
 create mode 100644 drivers/hwtracing/hisilicon/Kconfig
 create mode 100644 drivers/hwtracing/hisilicon/Makefile
 create mode 100644 drivers/hwtracing/hisilicon/hisi_ptt.c

diff --git a/drivers/Makefile b/drivers/Makefile
index 6fba7da..9a0f76d 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -174,6 +174,7 @@ obj-$(CONFIG_PERF_EVENTS)	+= perf/
 obj-$(CONFIG_RAS)		+= ras/
 obj-$(CONFIG_USB4)		+= thunderbolt/
 obj-$(CONFIG_CORESIGHT)		+= hwtracing/coresight/
+obj-$(CONFIG_HISI_PTT)		+= hwtracing/hisilicon/
 obj-y				+= hwtracing/intel_th/
 obj-$(CONFIG_STM)		+= hwtracing/stm/
 obj-$(CONFIG_ANDROID)		+= android/
diff --git a/drivers/hwtracing/Kconfig b/drivers/hwtracing/Kconfig
index 1308583..e3796b1 100644
--- a/drivers/hwtracing/Kconfig
+++ b/drivers/hwtracing/Kconfig
@@ -5,4 +5,6 @@ source "drivers/hwtracing/stm/Kconfig"
 
 source "drivers/hwtracing/intel_th/Kconfig"
 
+source "drivers/hwtracing/hisilicon/Kconfig"
+
 endmenu
diff --git a/drivers/hwtracing/hisilicon/Kconfig b/drivers/hwtracing/hisilicon/Kconfig
new file mode 100644
index 0000000..e4c2379
--- /dev/null
+++ b/drivers/hwtracing/hisilicon/Kconfig
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0-only
+config HISI_PTT
+	tristate "HiSilicon PCIe Tune and Trace Device"
+	depends on (ARM64 && PCI && HAS_DMA && HAS_IOMEM && DEBUG_FS) || COMPILE_TEST
+	help
+	  HiSilicon PCIe Tune and Trace Device exist as a PCIe iEP
+	  device, provides support for PCIe traffic tuning and
+	  tracing TLP headers to the memory.
diff --git a/drivers/hwtracing/hisilicon/Makefile b/drivers/hwtracing/hisilicon/Makefile
new file mode 100644
index 0000000..908c09a
--- /dev/null
+++ b/drivers/hwtracing/hisilicon/Makefile
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_HISI_PTT) += hisi_ptt.o
diff --git a/drivers/hwtracing/hisilicon/hisi_ptt.c b/drivers/hwtracing/hisilicon/hisi_ptt.c
new file mode 100644
index 0000000..a1feece
--- /dev/null
+++ b/drivers/hwtracing/hisilicon/hisi_ptt.c
@@ -0,0 +1,1449 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for HiSilicon PCIe tune and trace device
+ *
+ * Copyright (c) 2021 HiSilicon Limited.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+#include <linux/debugfs.h>
+#include <linux/delay.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iopoll.h>
+#include <linux/kfifo.h>
+#include <linux/module.h>
+#include <linux/notifier.h>
+#include <linux/seq_file.h>
+#include <linux/pci.h>
+#include <linux/workqueue.h>
+
+#define HISI_PTT_CTRL_STR_LEN			40
+#define HISI_PTT_DEFAULT_TRACE_BUF_CNT		64
+#define HISI_PTT_FILTER_UPDATE_FIFO_SIZE	16
+
+#define HISI_PTT_RESET_WAIT_MS		1000UL
+#define HISI_PTT_WORK_DELAY_MS		100UL
+#define HISI_PTT_WAIT_TIMEOUT_US	1000000UL
+#define HISI_PTT_WAIT_POLL_INTERVAL_US	100UL
+
+#define HISI_PTT_IRQ_NUMS	1
+#define HISI_PTT_DMA_IRQ	0
+#define HISI_PTT_DMA_NUMS	4
+
+#define HISI_PTT_TUNING_CTRL		0x0000
+#define   HISI_PTT_TUNING_CTRL_CODE	GENMASK(15, 0)
+#define   HISI_PTT_TUNING_CTRL_SUB	GENMASK(23, 16)
+#define   HISI_PTT_TUNING_CTRL_CHN	GENMASK(31, 24)
+#define HISI_PTT_TUNING_DATA		0x0004
+#define   HISI_PTT_TUNING_DATA_VAL_MASK	GENMASK(15, 0)
+#define HISI_PTT_TRACE_ADDR_SIZE	0x0800
+#define HISI_PTT_TRACE_ADDR_BASE_LO_0	0x0810
+#define HISI_PTT_TRACE_ADDR_BASE_HI_0	0x0814
+#define HISI_PTT_TRACE_ADDR_STRIDE	0x8
+#define HISI_PTT_TRACE_CTRL		0x0850
+#define   HISI_PTT_TRACE_CTRL_EN	BIT(0)
+#define   HISI_PTT_TRACE_CTRL_RST	BIT(1)
+#define   HISI_PTT_TRACE_CTRL_RXTX_SEL	GENMASK(3, 2)
+#define   HISI_PTT_TRACE_CTRL_TYPE_SEL	GENMASK(7, 4)
+#define   HISI_PTT_TRACE_CTRL_DATA_FORMAT	BIT(14)
+#define   HISI_PTT_TRACE_CTRL_FILTER_MODE	BIT(15)
+#define   HISI_PTT_TRACE_CTRL_TARGET_SEL	GENMASK(31, 16)
+#define HISI_PTT_TRACE_INT_STAT		0x0890
+#define   HISI_PTT_TRACE_INT_STAT_MASK	GENMASK(3, 0)
+#define HISI_PTT_TRACE_INT_MASK_REG	0x0894
+#define HISI_PTT_TUNING_INT_STAT	0x0898
+#define   HISI_PTT_TUNING_INT_STAT_MASK	BIT(0)
+#define HISI_PTT_TUNING_INT_MASK	0x089c
+#define HISI_PTT_TRACE_WR_STS		0x08a0
+#define   HISI_PTT_TRACE_WR_STS_WRITE	GENMASK(27, 0)
+#define   HISI_PTT_TRACE_WR_STS_BUFFER	GENMASK(29, 28)
+#define HISI_PTT_TRACE_STS		0x08b0
+#define   HISI_PTT_TRACE_IDLE		BIT(0)
+#define HISI_PTT_DEVICE_RANGE		0x0fe0
+
+#define HISI_PCIE_CORE_PORT_ID(v)	(((v) & 7) << 1)
+
+static struct dentry *hisi_ptt_debugfs_root;
+
+struct hisi_ptt_debugfs_file_desc {
+	struct hisi_ptt *hisi_ptt;
+	const char *name;
+	const struct file_operations *fops;
+};
+
+#define PTT_FILE_INIT(__name, __fops)	\
+	{ .name = __name, .fops = __fops }
+
+struct hisi_ptt_trace_event_desc {
+	const char *name;
+	u32 event_code;
+};
+
+#define PTT_TRACE_EVENT_INIT(__name, __event_code) \
+	{ .name = __name, .event_code = __event_code }
+
+enum hisi_ptt_trace_rxtx {
+	HISI_PTT_TRACE_RX = 0,
+	HISI_PTT_TRACE_TX,
+	HISI_PTT_TRACE_RXTX,
+	HISI_PTT_TRACE_RXTX_NO_DMA_P2P,
+};
+
+static const struct hisi_ptt_trace_event_desc trace_rxtx[] = {
+	PTT_TRACE_EVENT_INIT("rx",
+			      HISI_PTT_TRACE_RX),
+	PTT_TRACE_EVENT_INIT("tx",
+			      HISI_PTT_TRACE_TX),
+	PTT_TRACE_EVENT_INIT("rxtx",
+			      HISI_PTT_TRACE_RXTX),
+	PTT_TRACE_EVENT_INIT("rxtx_no_dma_p2p",
+			      HISI_PTT_TRACE_RXTX_NO_DMA_P2P),
+};
+#define HISI_PTT_TRACE_DEFAULT_RXTX	trace_rxtx[2]
+
+
+enum hisi_ptt_trace_event {
+	HISI_PTT_TRACE_POSTED_REQUEST		= 1,
+	HISI_PTT_TRACE_NON_POSTED_REQUEST	= 2,
+	HISI_PTT_TRACE_COMPLETION		= 4,
+	HISI_PTT_TRACE_ALL			= 7,
+};
+
+static const struct hisi_ptt_trace_event_desc trace_events[] = {
+	PTT_TRACE_EVENT_INIT("all",
+			      HISI_PTT_TRACE_POSTED_REQUEST),
+	PTT_TRACE_EVENT_INIT("posted_request",
+			      HISI_PTT_TRACE_NON_POSTED_REQUEST),
+	PTT_TRACE_EVENT_INIT("non-posted_request",
+			      HISI_PTT_TRACE_COMPLETION),
+	PTT_TRACE_EVENT_INIT("completion",
+			      HISI_PTT_TRACE_ALL),
+};
+#define HISI_PTT_TRACE_DEFAULT_EVENT	trace_events[0]
+
+static const unsigned int available_buflet_size[] = {
+	0x00200000,	/* 2  MiB */
+	0x00400000,	/* 4  MiB */
+};
+#define HISI_PTT_TRACE_DEFAULT_BUFLET_SIZE	available_buflet_size[0]
+
+struct hisi_ptt_dma_buflet {
+	struct list_head list;
+	unsigned int size;
+	dma_addr_t dma;
+
+	/*
+	 * The address of the buflet holding the trace data.
+	 * See Documentation/trace/hisi-ptt.rst for the details
+	 * of the data format.
+	 */
+	u32 *addr;
+	int index;
+};
+
+enum hisi_ptt_trace_data_format {
+	HISI_PTT_TRACE_DATA_4DW = 0,
+	HISI_PTT_TRACE_DATA_8DW,
+};
+
+struct hisi_ptt_trace_ctrl {
+	struct list_head trace_buf;
+	struct hisi_ptt_dma_buflet *cur;
+	u64 buflet_nums;
+	u32 buflet_size;
+	unsigned int status; /* 0:idle, 1:tracing */
+#define	HISI_PTT_TRACE_STATUS_OFF	0
+#define	HISI_PTT_TRACE_STATUS_ON	1
+
+	enum hisi_ptt_trace_data_format tr_format;
+	enum hisi_ptt_trace_event tr_event;
+	enum hisi_ptt_trace_rxtx rxtx;
+};
+
+struct hisi_ptt_filter_desc {
+	struct list_head list;
+	struct pci_dev *pdev;
+	u16 val;
+};
+
+/* Structure containing the information for filter updating */
+struct hisi_ptt_filter_update_info {
+	struct pci_dev *pdev;
+	bool is_port;
+	bool is_add;
+	u32 port_devid;
+	u16 val;
+};
+
+struct hisi_ptt {
+	struct notifier_block hisi_ptt_nb;
+	struct dentry *debugfs_dir;
+	struct pci_dev *pdev;
+	void __iomem *iobase;
+	/* protects the hisi_ptt structure */
+	struct mutex mutex;
+	const char *name;
+
+	/*
+	 * We use a delayed work here to avoid indefinitely waiting for
+	 * the hisi_ptt->mutex which protecting the filter list. The
+	 * work will be delayed only if the mutex can not be held,
+	 * otherwise no delay will be applied.
+	 */
+	struct delayed_work work;
+	spinlock_t filter_update_lock;
+	DECLARE_KFIFO(filter_update_kfifo, struct hisi_ptt_filter_update_info,
+		      HISI_PTT_FILTER_UPDATE_FIFO_SIZE);
+
+	/*
+	 * The upper and lower BDF numbers of the root port,
+	 * which PTT device can control and trace.
+	 */
+	u16 upper;
+	u16 lower;
+
+	/*
+	 * The trace TLP headers can either be filtered by certain
+	 * root port, or by the requester ID. Organize the filters
+	 * by @port_filters and @req_filters here.
+	 */
+	struct list_head port_filters;
+	struct list_head req_filters;
+	struct hisi_ptt_trace_ctrl trace_ctrl;
+	struct {
+		bool is_port;
+		u16 val;
+	} cur_filter;
+};
+
+static int hisi_ptt_write_to_buffer(void *buf, size_t len, loff_t *pos,
+				    const void __user *ubuf, size_t count)
+{
+	char *cp;
+
+	if (simple_write_to_buffer(buf, len, pos, ubuf, count) < 0)
+		return -EINVAL;
+
+	cp = strchr(buf, '\n');
+	if (cp)
+		*cp = '\0';
+
+	return 0;
+}
+
+static u16 hisi_ptt_get_filter_val(struct pci_dev *pdev)
+{
+	if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT)
+		return BIT(HISI_PCIE_CORE_PORT_ID(PCI_SLOT(pdev->devfn)));
+	else
+		return PCI_DEVID(pdev->bus->number, pdev->devfn);
+}
+
+static int hisi_ptt_wait_trace_hw_idle(struct hisi_ptt *hisi_ptt)
+{
+	u32 val;
+
+	return readl_poll_timeout(hisi_ptt->iobase + HISI_PTT_TRACE_STS, val,
+				  val & HISI_PTT_TRACE_IDLE,
+				  HISI_PTT_WAIT_POLL_INTERVAL_US,
+				  HISI_PTT_WAIT_TIMEOUT_US);
+}
+
+static void hisi_ptt_free_trace_buf(struct hisi_ptt *hisi_ptt)
+{
+	struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl;
+	struct device *dev = &hisi_ptt->pdev->dev;
+	struct hisi_ptt_dma_buflet *buflet, *tbuflet;
+
+	if (list_empty(&ctrl->trace_buf))
+		return;
+
+	list_for_each_entry_safe(buflet, tbuflet, &ctrl->trace_buf, list) {
+		dma_free_coherent(dev, buflet->size, buflet->addr, buflet->dma);
+		list_del(&buflet->list);
+		kfree(buflet);
+	}
+}
+
+static int hisi_ptt_alloc_trace_buf(struct hisi_ptt *hisi_ptt)
+{
+	struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl;
+	struct device *dev = &hisi_ptt->pdev->dev;
+	struct hisi_ptt_dma_buflet *buflet;
+	int i, ret;
+
+	/* Make sure the trace buffer is empty before allocating */
+	if (!list_empty(&ctrl->trace_buf))
+		hisi_ptt_free_trace_buf(hisi_ptt);
+
+	for (i = 0; i < ctrl->buflet_nums; ++i) {
+		buflet = kzalloc(sizeof(*buflet), GFP_KERNEL);
+		if (!buflet) {
+			ret = -ENOMEM;
+			goto err;
+		}
+		buflet->addr = dma_alloc_coherent(dev, ctrl->buflet_size,
+						  &buflet->dma, GFP_KERNEL);
+		if (!buflet->addr) {
+			kfree(buflet);
+			ret = -ENOMEM;
+			goto err;
+		}
+
+		buflet->index = i;
+		buflet->size = ctrl->buflet_size;
+		list_add_tail(&buflet->list, &ctrl->trace_buf);
+	}
+
+	return 0;
+
+err:
+	hisi_ptt_free_trace_buf(hisi_ptt);
+	return ret;
+}
+
+static void hisi_ptt_trace_end(struct hisi_ptt *hisi_ptt)
+{
+	writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
+	hisi_ptt->trace_ctrl.status = HISI_PTT_TRACE_STATUS_OFF;
+	hisi_ptt->trace_ctrl.cur = NULL;
+}
+
+static int hisi_ptt_trace_start(struct hisi_ptt *hisi_ptt)
+{
+	struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl;
+	u32 val;
+	int i;
+
+	if (!hisi_ptt->cur_filter.val) {
+		pci_err(hisi_ptt->pdev, "no filter specified\n");
+		return -ENODEV;
+	}
+
+	if (!ctrl->tr_event) {
+		pci_err(hisi_ptt->pdev, "no trace type specified\n");
+		return -EINVAL;
+	}
+
+	/* Check device idle before start trace */
+	if (hisi_ptt_wait_trace_hw_idle(hisi_ptt)) {
+		pci_err(hisi_ptt->pdev, "the device is still busy\n");
+		return -EBUSY;
+	}
+
+	/* Reset the DMA before start tracing */
+	val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
+	val |= HISI_PTT_TRACE_CTRL_RST;
+	writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
+
+	/* Wait for the reset finished */
+	msleep(HISI_PTT_RESET_WAIT_MS);
+
+	val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
+	val &= ~HISI_PTT_TRACE_CTRL_RST;
+	writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
+
+	/* clear the interrupt status */
+	writel(HISI_PTT_TRACE_INT_STAT_MASK,
+	       hisi_ptt->iobase + HISI_PTT_TRACE_INT_STAT);
+	writel(0, hisi_ptt->iobase + HISI_PTT_TRACE_INT_MASK_REG);
+
+	for (i = 0; i < HISI_PTT_DMA_NUMS; ++i) {
+		if (!ctrl->cur)
+			ctrl->cur = list_first_entry(&ctrl->trace_buf,
+						     struct hisi_ptt_dma_buflet,
+						     list);
+		else
+			ctrl->cur = list_next_entry(ctrl->cur, list);
+
+		writel(lower_32_bits(ctrl->cur->dma),
+		       hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_BASE_LO_0 +
+		       i * HISI_PTT_TRACE_ADDR_STRIDE);
+		writel(upper_32_bits(ctrl->cur->dma),
+		       hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_BASE_HI_0 +
+		       i * HISI_PTT_TRACE_ADDR_STRIDE);
+	}
+	writel(ctrl->buflet_size, hisi_ptt->iobase + HISI_PTT_TRACE_ADDR_SIZE);
+
+	/* set the trace control register */
+	val = 0;
+	val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_TYPE_SEL, ctrl->tr_event);
+	val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_RXTX_SEL, ctrl->rxtx);
+	val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_DATA_FORMAT, ctrl->tr_format);
+	val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_FILTER_MODE,
+			  hisi_ptt->cur_filter.is_port ? 0 : 1);
+	val |= FIELD_PREP(HISI_PTT_TRACE_CTRL_TARGET_SEL,
+			  hisi_ptt->cur_filter.val);
+
+	val |= HISI_PTT_TRACE_CTRL_EN;
+	writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_CTRL);
+
+	ctrl->status = HISI_PTT_TRACE_STATUS_ON;
+
+	return 0;
+}
+
+#define HISI_PTT_TRACE_ATTR(__name)					\
+static int __name ## _open(struct inode *inode, struct file *filp)	\
+{									\
+	struct hisi_ptt_debugfs_file_desc *desc = inode->i_private;	\
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;			\
+	if (!mutex_trylock(&hisi_ptt->mutex))				\
+		return -EBUSY;						\
+	return single_open(filp, __name ## _show, hisi_ptt);		\
+}									\
+static int __name ## _release(struct inode *inode, struct file *filp)	\
+{									\
+	struct hisi_ptt_debugfs_file_desc *desc = inode->i_private;	\
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;			\
+	mutex_unlock(&hisi_ptt->mutex);					\
+	return seq_release(inode, filp);				\
+}									\
+static const struct file_operations __name ## _fops = {			\
+	.owner		= THIS_MODULE,					\
+	.open		= __name ## _open,				\
+	.read		= seq_read,					\
+	.write		= __name ## _write,				\
+	.llseek		= seq_lseek,					\
+	.release	= __name ## _release,				\
+}
+
+static void hisi_ptt_print_filter_list(struct seq_file *m,
+				       struct hisi_ptt *hisi_ptt,
+				       struct list_head *filter_list,
+				       bool port_filter)
+{
+	struct hisi_ptt_filter_desc *filter;
+
+	list_for_each_entry(filter, filter_list, list) {
+		struct pci_dev *pdev = filter->pdev;
+		bool selected = false;
+
+		if (port_filter && hisi_ptt->cur_filter.is_port)
+			selected = filter->val & hisi_ptt->cur_filter.val;
+		else
+			selected = filter->val == hisi_ptt->cur_filter.val;
+
+		seq_printf(m, "%s%04x:%02x:%02x.%u%s\n",
+			   selected ? "[" : "",
+			   pci_domain_nr(pdev->bus),
+			   pdev->bus->number, PCI_SLOT(pdev->devfn),
+			   PCI_FUNC(pdev->devfn),
+			   selected ? "]" : "");
+	}
+}
+
+static int hisi_ptt_set_filter_show(struct seq_file *m, void *v)
+{
+	struct hisi_ptt *hisi_ptt = m->private;
+
+	if (list_empty(&hisi_ptt->port_filters)) {
+		seq_puts(m, "#### No available filter ####\n");
+		return 0;
+	}
+
+	seq_puts(m, "#### Root Ports ####\n");
+	hisi_ptt_print_filter_list(m, hisi_ptt, &hisi_ptt->port_filters, true);
+
+	if (list_empty(&hisi_ptt->req_filters))
+		return 0;
+
+	seq_puts(m, "#### Functions ####\n");
+	hisi_ptt_print_filter_list(m, hisi_ptt, &hisi_ptt->req_filters, false);
+
+	return 0;
+}
+
+static int hisi_ptt_parse_set_filter(struct hisi_ptt *hisi_ptt, const char *buf)
+{
+	struct hisi_ptt_filter_desc *filter;
+	u32 domain, bus, dev, fn, devfn;
+	struct list_head *filter_list;
+	struct pci_dev *pdev;
+	u16 filter_val;
+	int num;
+
+	/*
+	 * the input should be like 0000:80:01.1, etc. Parse it
+	 * and check whether it's in the available func list.
+	 */
+	num = sscanf(buf, "%04x:%02x:%02x.%u", &domain, &bus, &dev, &fn);
+	if (num != 4)
+		return -EINVAL;
+
+	devfn = PCI_DEVFN(dev, fn);
+	pdev = pci_get_domain_bus_and_slot(domain, bus, devfn);
+	if (!pdev)
+		return -EINVAL;
+
+	filter_val = hisi_ptt_get_filter_val(pdev);
+	if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT)
+		filter_list = &hisi_ptt->port_filters;
+	else
+		filter_list = &hisi_ptt->req_filters;
+
+	list_for_each_entry(filter, filter_list, list) {
+		if (filter_val != filter->val)
+			continue;
+
+		if (hisi_ptt->cur_filter.is_port &&
+		    pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT)
+			hisi_ptt->cur_filter.val |= filter_val;
+		else
+			hisi_ptt->cur_filter.val = filter_val;
+
+		hisi_ptt->cur_filter.is_port = pci_pcie_type(pdev) ==
+					       PCI_EXP_TYPE_ROOT_PORT;
+
+		pci_dev_put(pdev);
+		return 0;
+	}
+
+	pci_dev_put(pdev);
+	return -EINVAL;
+}
+
+static ssize_t hisi_ptt_set_filter_write(struct file *filp,
+					 const char __user *buf,
+					 size_t count, loff_t *pos)
+{
+	struct seq_file *m = filp->private_data;
+	struct hisi_ptt *hisi_ptt = m->private;
+	char tbuf[HISI_PTT_CTRL_STR_LEN];
+
+	if (list_empty(&hisi_ptt->port_filters))
+		return -EINVAL;
+
+	if (hisi_ptt_write_to_buffer(tbuf, HISI_PTT_CTRL_STR_LEN - 1,
+				     pos, buf, count))
+		return -EINVAL;
+
+	if (!strlen(tbuf)) {
+		hisi_ptt->cur_filter.val = 0;
+		hisi_ptt->cur_filter.is_port = false;
+		return count;
+	}
+
+	if (hisi_ptt_parse_set_filter(hisi_ptt, tbuf) < 0)
+		return -EINVAL;
+
+	return count;
+}
+HISI_PTT_TRACE_ATTR(hisi_ptt_set_filter);
+
+static int hisi_ptt_set_direction_show(struct seq_file *m, void *v)
+{
+	struct hisi_ptt *hisi_ptt = m->private;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(trace_rxtx); ++i) {
+		bool selected = false;
+
+		if (hisi_ptt->trace_ctrl.rxtx == trace_rxtx[i].event_code)
+			selected = true;
+
+		seq_printf(m, "%s%s%s  ",
+			   selected ? "[" : "", trace_rxtx[i].name,
+			   selected ? "]" : "");
+	}
+	seq_putc(m, '\n');
+
+	return 0;
+}
+
+static ssize_t hisi_ptt_set_direction_write(struct file *filp,
+					    const char __user *buf,
+					    size_t count, loff_t *pos)
+{
+	struct seq_file *m = filp->private_data;
+	struct hisi_ptt *hisi_ptt = m->private;
+	char tbuf[HISI_PTT_CTRL_STR_LEN];
+	int i;
+
+	if (hisi_ptt_write_to_buffer(tbuf, HISI_PTT_CTRL_STR_LEN - 1,
+				     pos, buf, count))
+		return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(trace_rxtx); ++i) {
+		if (!strcmp(tbuf, trace_rxtx[i].name)) {
+			hisi_ptt->trace_ctrl.rxtx = trace_rxtx[i].event_code;
+			return count;
+		}
+	}
+
+	return -EINVAL;
+}
+HISI_PTT_TRACE_ATTR(hisi_ptt_set_direction);
+
+static int hisi_ptt_set_trace_type_show(struct seq_file *m, void *v)
+{
+	struct hisi_ptt *hisi_ptt = m->private;
+	bool select_all = false;
+	int i;
+
+	if (hisi_ptt->trace_ctrl.tr_event == trace_events[0].event_code)
+		select_all = true;
+
+	for (i = 0; i < ARRAY_SIZE(trace_events); ++i) {
+		bool selected = false;
+
+		if (((hisi_ptt->trace_ctrl.tr_event &
+		      trace_events[i].event_code) &&
+		     !select_all && i) || (i == 0 && select_all))
+			selected = true;
+
+		seq_printf(m, "%s%s%s  ",
+			   selected ? "[" : "", trace_events[i].name,
+			   selected ? "]" : "");
+	}
+	seq_putc(m, '\n');
+
+	return 0;
+}
+
+static ssize_t hisi_ptt_set_trace_type_write(struct file *filp,
+					     const char __user *buf,
+					     size_t count, loff_t *pos)
+{
+	struct seq_file *m = filp->private_data;
+	struct hisi_ptt *hisi_ptt = m->private;
+	char tbuf[HISI_PTT_CTRL_STR_LEN];
+	int i;
+
+	if (hisi_ptt_write_to_buffer(tbuf, HISI_PTT_CTRL_STR_LEN - 1,
+				     pos, buf, count))
+		return -EINVAL;
+
+	if (!strlen(tbuf)) {
+		hisi_ptt->trace_ctrl.tr_event = 0;
+		return count;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(trace_events); ++i) {
+		if (!strcmp(tbuf, trace_events[i].name)) {
+			hisi_ptt->trace_ctrl.tr_event |= trace_events[i].event_code;
+			return count;
+		}
+	}
+
+	return -EINVAL;
+}
+HISI_PTT_TRACE_ATTR(hisi_ptt_set_trace_type);
+
+static int hisi_ptt_trace_on_get(void *data, u64 *val)
+{
+	struct hisi_ptt_debugfs_file_desc *desc = data;
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;
+
+	if (!mutex_trylock(&hisi_ptt->mutex))
+		return -EBUSY;
+
+	*val = hisi_ptt->trace_ctrl.status;
+
+	mutex_unlock(&hisi_ptt->mutex);
+	return 0;
+}
+
+static int hisi_ptt_trace_on_set(void *data, u64 val)
+{
+	struct hisi_ptt_debugfs_file_desc *desc = data;
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;
+	int ret = 0;
+
+	if (!mutex_trylock(&hisi_ptt->mutex))
+		return -EBUSY;
+
+	if (val) {
+		if (hisi_ptt->trace_ctrl.status == HISI_PTT_TRACE_STATUS_ON)
+			goto out;
+
+		ret = hisi_ptt_alloc_trace_buf(hisi_ptt);
+		if (ret)
+			goto out;
+
+		ret = hisi_ptt_trace_start(hisi_ptt);
+		if (ret) {
+			hisi_ptt_free_trace_buf(hisi_ptt);
+			goto out;
+		}
+	} else {
+		if (hisi_ptt->trace_ctrl.status == HISI_PTT_TRACE_STATUS_OFF)
+			goto out;
+
+		hisi_ptt_trace_end(hisi_ptt);
+	}
+
+out:
+	mutex_unlock(&hisi_ptt->mutex);
+	return ret;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(hisi_ptt_trace_on_fops, hisi_ptt_trace_on_get,
+			 hisi_ptt_trace_on_set, "%llu\n");
+
+static int hisi_ptt_set_trace_buf_nums_get(void *data, u64 *val)
+{
+	struct hisi_ptt_debugfs_file_desc *desc = data;
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;
+
+	if (!mutex_trylock(&hisi_ptt->mutex))
+		return -EBUSY;
+
+	*val = hisi_ptt->trace_ctrl.buflet_nums;
+
+	mutex_unlock(&hisi_ptt->mutex);
+	return 0;
+}
+
+static int hisi_ptt_set_trace_buf_nums_set(void *data, u64 val)
+{
+	struct hisi_ptt_debugfs_file_desc *desc = data;
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;
+
+	if (val < HISI_PTT_DMA_NUMS) {
+		pci_warn(hisi_ptt->pdev, "buflet numbers should be more than %d\n",
+			 HISI_PTT_DMA_NUMS - 1);
+		return -EINVAL;
+	}
+
+	if (!mutex_trylock(&hisi_ptt->mutex))
+		return -EBUSY;
+
+	hisi_ptt->trace_ctrl.buflet_nums = val;
+
+	mutex_unlock(&hisi_ptt->mutex);
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(hisi_ptt_set_trace_buf_nums_fops,
+			 hisi_ptt_set_trace_buf_nums_get,
+			 hisi_ptt_set_trace_buf_nums_set, "%llu\n");
+
+static int hisi_ptt_set_trace_buflet_size_show(struct seq_file *m, void *v)
+{
+	struct hisi_ptt *hisi_ptt = m->private;
+	struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(available_buflet_size); ++i) {
+		bool selected = false;
+
+		if (ctrl->buflet_size == available_buflet_size[i])
+			selected = true;
+
+		seq_printf(m, "%s%uMiB%s  ",
+			   selected ? "[" : "", available_buflet_size[i] >> 20,
+			   selected ? "]" : "");
+	}
+	seq_putc(m, '\n');
+
+	return 0;
+}
+
+static ssize_t hisi_ptt_set_trace_buflet_size_write(struct file *filp,
+						    const char __user *buf,
+						    size_t count, loff_t *pos)
+{
+	struct seq_file *m = filp->private_data;
+	struct hisi_ptt *hisi_ptt = m->private;
+	struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl;
+	char tbuf[HISI_PTT_CTRL_STR_LEN];
+	unsigned int size;
+	int i;
+
+	if (hisi_ptt_write_to_buffer(tbuf, HISI_PTT_CTRL_STR_LEN - 1,
+				     pos, buf, count))
+		return -EINVAL;
+
+	if (kstrtouint(tbuf, 0, &size))
+		return -EINVAL;
+
+	size <<= 20;
+	for (i = 0; i < ARRAY_SIZE(available_buflet_size); ++i) {
+		if (available_buflet_size[i] == size) {
+			ctrl->buflet_size = size;
+			return count;
+		}
+	}
+
+	return -EINVAL;
+}
+HISI_PTT_TRACE_ATTR(hisi_ptt_set_trace_buflet_size);
+
+static int hisi_ptt_free_trace_buf_get(void *data, u64 *val)
+{
+	struct hisi_ptt_debugfs_file_desc *desc = data;
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;
+
+	if (!mutex_trylock(&hisi_ptt->mutex))
+		return -EBUSY;
+
+	*val = list_empty(&hisi_ptt->trace_ctrl.trace_buf);
+
+	mutex_unlock(&hisi_ptt->mutex);
+	return 0;
+}
+
+static int hisi_ptt_free_trace_buf_set(void *data, u64 val)
+{
+	struct hisi_ptt_debugfs_file_desc *desc = data;
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;
+	int ret = 0;
+
+	if (!mutex_trylock(&hisi_ptt->mutex))
+		return -EBUSY;
+
+	if (!list_empty(&hisi_ptt->trace_ctrl.trace_buf)) {
+		if (hisi_ptt->trace_ctrl.status == HISI_PTT_TRACE_STATUS_ON)
+			ret = -EBUSY;
+		else
+			hisi_ptt_free_trace_buf(hisi_ptt);
+	}
+
+	mutex_unlock(&hisi_ptt->mutex);
+	return ret;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(hisi_ptt_free_trace_buf_fops,
+			 hisi_ptt_free_trace_buf_get,
+			 hisi_ptt_free_trace_buf_set, "%llu\n");
+
+static int hisi_ptt_trace_data_format_show(struct seq_file *m, void *v)
+{
+	struct hisi_ptt *hisi_ptt = m->private;
+	struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl;
+
+	seq_printf(m, "%s\n", ctrl->tr_format ? "4DW [8DW]" : "[4DW] 8DW");
+
+	return 0;
+}
+
+static ssize_t hisi_ptt_trace_data_format_write(struct file *filp,
+						const char __user *buf,
+						size_t count, loff_t *pos)
+{
+	struct seq_file *m = filp->private_data;
+	struct hisi_ptt *hisi_ptt = m->private;
+	struct hisi_ptt_trace_ctrl *ctrl = &hisi_ptt->trace_ctrl;
+	char tbuf[HISI_PTT_CTRL_STR_LEN];
+	u32 format;
+
+	if (hisi_ptt_write_to_buffer(tbuf, HISI_PTT_CTRL_STR_LEN - 1,
+				     pos, buf, count))
+		return -EINVAL;
+
+	if (kstrtouint(tbuf, 0, &format))
+		return -EINVAL;
+
+	/* The data format can only be 4DW or 8DW */
+	if (format == 4)
+		ctrl->tr_format = HISI_PTT_TRACE_DATA_4DW;
+	else if (format == 8)
+		ctrl->tr_format = HISI_PTT_TRACE_DATA_8DW;
+	else
+		return -EINVAL;
+
+	return count;
+}
+HISI_PTT_TRACE_ATTR(hisi_ptt_trace_data_format);
+
+static void *hisi_ptt_trace_data_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	struct hisi_ptt *hisi_ptt = m->private;
+	struct hisi_ptt_dma_buflet *buflet = v;
+
+	(*pos)++;
+
+	if (!list_is_last(&buflet->list, &hisi_ptt->trace_ctrl.trace_buf))
+		return list_next_entry(buflet, list);
+
+	return NULL;
+}
+
+static void *hisi_ptt_trace_data_start(struct seq_file *m, loff_t *pos)
+{
+	struct hisi_ptt *hisi_ptt = m->private;
+	struct hisi_ptt_dma_buflet *buflet;
+	loff_t off = 0;
+
+	buflet = list_first_entry(&hisi_ptt->trace_ctrl.trace_buf,
+				  struct hisi_ptt_dma_buflet, list);
+	while (off < *pos)
+		buflet = hisi_ptt_trace_data_next(m, buflet, &off);
+
+	return buflet;
+}
+
+static void hisi_ptt_trace_data_stop(struct seq_file *m, void *p)
+{
+	/* Nothing to do, only a stub */
+}
+
+static int hisi_ptt_trace_data_show(struct seq_file *m, void *v)
+{
+	struct hisi_ptt_dma_buflet *buflet = v;
+
+	if (buflet)
+		seq_write(m, buflet->addr, buflet->size);
+
+	return 0;
+}
+
+/*
+ * The operations of the seq_file will exist when the mutex of
+ * hisi_ptt being held in the open(), and the mutex will be
+ * released in the release().
+ */
+static const struct seq_operations hisi_ptt_trace_data_seq_ops = {
+	.start	= hisi_ptt_trace_data_start,
+	.next	= hisi_ptt_trace_data_next,
+	.stop	= hisi_ptt_trace_data_stop,
+	.show	= hisi_ptt_trace_data_show,
+};
+
+static int hisi_ptt_trace_data_open(struct inode *inode, struct file *filep)
+{
+	struct hisi_ptt_debugfs_file_desc *desc = inode->i_private;
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;
+	struct seq_file *m;
+	int ret;
+
+	if (!mutex_trylock(&hisi_ptt->mutex))
+		return -EBUSY;
+
+	/*
+	 * Check the trace status, we cannot read the
+	 * data if the trace is still on. Then hold the
+	 * lock when reading the traced data.
+	 */
+	if (hisi_ptt->trace_ctrl.status == HISI_PTT_TRACE_STATUS_ON ||
+	    hisi_ptt_wait_trace_hw_idle(hisi_ptt)) {
+		ret = -EBUSY;
+		goto err;
+	}
+
+	if (list_empty(&hisi_ptt->trace_ctrl.trace_buf)) {
+		pci_info(hisi_ptt->pdev, "the trace buffer is empty\n");
+		ret = -ENOTTY;
+		goto err;
+	}
+
+	ret = seq_open(filep, &hisi_ptt_trace_data_seq_ops);
+	if (ret)
+		goto err;
+
+	m = filep->private_data;
+	m->private = hisi_ptt;
+
+	return 0;
+
+err:
+	mutex_unlock(&hisi_ptt->mutex);
+	return ret;
+}
+
+static ssize_t hisi_ptt_trace_data_write(struct file *filp,
+					 const char __user *buf,
+					 size_t count, loff_t *off)
+{
+	struct seq_file *m = filp->private_data;
+	struct hisi_ptt *hisi_ptt = m->private;
+	char tbuf[HISI_PTT_CTRL_STR_LEN];
+
+	if (hisi_ptt_write_to_buffer(tbuf, HISI_PTT_CTRL_STR_LEN - 1,
+				     off, buf, count))
+		return -EINVAL;
+
+	/* Free the trace buffer when `echo "" > trace_data` */
+	if (!strlen(tbuf) && !list_empty(&hisi_ptt->trace_ctrl.trace_buf)) {
+		if (hisi_ptt->trace_ctrl.status == HISI_PTT_TRACE_STATUS_ON)
+			return -EBUSY;
+		hisi_ptt_free_trace_buf(hisi_ptt);
+	}
+
+	return count;
+}
+
+static int hisi_ptt_trace_data_release(struct inode *inode, struct file *filp)
+{
+	struct hisi_ptt_debugfs_file_desc *desc = inode->i_private;
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;
+
+	mutex_unlock(&hisi_ptt->mutex);
+
+	return seq_release(inode, filp);
+}
+
+static const struct file_operations hisi_ptt_trace_data_fops = {
+	.owner		= THIS_MODULE,
+	.open		= hisi_ptt_trace_data_open,
+	.read		= seq_read,
+	.write		= hisi_ptt_trace_data_write,
+	.llseek		= no_llseek,
+	.release	= hisi_ptt_trace_data_release,
+};
+
+static struct hisi_ptt_debugfs_file_desc trace_entries[] = {
+	PTT_FILE_INIT("filter",		&hisi_ptt_set_filter_fops),
+	PTT_FILE_INIT("direction",	&hisi_ptt_set_direction_fops),
+	PTT_FILE_INIT("type",		&hisi_ptt_set_trace_type_fops),
+	PTT_FILE_INIT("data_format",	&hisi_ptt_trace_data_format_fops),
+	PTT_FILE_INIT("trace_on",	&hisi_ptt_trace_on_fops),
+	PTT_FILE_INIT("buf_nums",	&hisi_ptt_set_trace_buf_nums_fops),
+	PTT_FILE_INIT("buflet_size",	&hisi_ptt_set_trace_buflet_size_fops),
+	PTT_FILE_INIT("free_buffer",	&hisi_ptt_free_trace_buf_fops),
+	PTT_FILE_INIT("data",		&hisi_ptt_trace_data_fops),
+};
+
+static irqreturn_t hisi_ptt_isr(int irq, void *context)
+{
+	struct hisi_ptt *hisi_ptt = context;
+	struct hisi_ptt_dma_buflet *next, *cur;
+	u32 status, val, buf_idx;
+
+	status = readl(hisi_ptt->iobase + HISI_PTT_TRACE_INT_STAT);
+	buf_idx = fls(status) - 1;
+
+	/*
+	 * Check whether the trace buffer is full. Stop tracing
+	 * when the last DMA buffer is finished. Otherwise, assign
+	 * the address of next buflet to the DMA register.
+	 */
+	cur = hisi_ptt->trace_ctrl.cur;
+	if (list_is_last(&cur->list, &hisi_ptt->trace_ctrl.trace_buf)) {
+		if ((status & HISI_PTT_TRACE_INT_STAT_MASK) ==
+		    HISI_PTT_TRACE_INT_STAT_MASK)
+			hisi_ptt_trace_end(hisi_ptt);
+
+		val = readl(hisi_ptt->iobase + HISI_PTT_TRACE_INT_MASK_REG);
+		val |= BIT(buf_idx);
+		writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_INT_MASK_REG);
+	} else {
+		next = list_next_entry(cur, list);
+		writel(lower_32_bits(next->dma), hisi_ptt->iobase +
+				  HISI_PTT_TRACE_ADDR_BASE_LO_0 +
+				  buf_idx * HISI_PTT_TRACE_ADDR_STRIDE);
+		writel(upper_32_bits(next->dma), hisi_ptt->iobase +
+					HISI_PTT_TRACE_ADDR_BASE_HI_0 +
+					buf_idx * HISI_PTT_TRACE_ADDR_STRIDE);
+		hisi_ptt->trace_ctrl.cur = next;
+		val = BIT(buf_idx);
+		writel(val, hisi_ptt->iobase + HISI_PTT_TRACE_INT_STAT);
+	}
+
+	return IRQ_HANDLED;
+}
+
+static irqreturn_t hisi_ptt_irq(int irq, void *context)
+{
+	struct hisi_ptt *hisi_ptt = context;
+	u32 status;
+
+	status = readl(hisi_ptt->iobase + HISI_PTT_TRACE_INT_STAT);
+	if (status & HISI_PTT_TRACE_INT_STAT_MASK)
+		return IRQ_WAKE_THREAD;
+
+	return IRQ_NONE;
+}
+
+static int hisi_ptt_irq_register(struct hisi_ptt *hisi_ptt)
+{
+	struct pci_dev *pdev = hisi_ptt->pdev;
+	int ret;
+
+	ret = pci_alloc_irq_vectors(pdev, HISI_PTT_IRQ_NUMS, HISI_PTT_IRQ_NUMS,
+				    PCI_IRQ_MSI);
+	if (ret < 0) {
+		pci_err(pdev, "failed to allocate irq vector, ret = %d\n", ret);
+		return ret;
+	}
+
+	ret = request_threaded_irq(pci_irq_vector(pdev, HISI_PTT_DMA_IRQ),
+				   hisi_ptt_irq, hisi_ptt_isr, IRQF_SHARED,
+				   "hisi-ptt", hisi_ptt);
+	if (ret) {
+		pci_err(pdev, "failed to request irq %d, ret = %d\n",
+			pci_irq_vector(pdev, HISI_PTT_DMA_IRQ), ret);
+		pci_free_irq_vectors(pdev);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void hisi_ptt_irq_unregister(struct hisi_ptt *hisi_ptt)
+{
+	struct pci_dev *pdev = hisi_ptt->pdev;
+
+	free_irq(pci_irq_vector(pdev, HISI_PTT_DMA_IRQ), hisi_ptt);
+	pci_free_irq_vectors(pdev);
+}
+
+static void hisi_ptt_update_filters(struct work_struct *work)
+{
+	struct delayed_work *delayed_work = to_delayed_work(work);
+	struct hisi_ptt_filter_update_info info;
+	struct hisi_ptt_filter_desc *filter;
+	struct list_head *target_list;
+	struct hisi_ptt *hisi_ptt;
+
+	hisi_ptt = container_of(delayed_work, struct hisi_ptt, work);
+
+	if (!mutex_trylock(&hisi_ptt->mutex)) {
+		schedule_delayed_work(&hisi_ptt->work, HISI_PTT_WORK_DELAY_MS);
+		return;
+	}
+
+	while (kfifo_get(&hisi_ptt->filter_update_kfifo, &info)) {
+		target_list = info.is_port ? &hisi_ptt->port_filters :
+			      &hisi_ptt->req_filters;
+
+		if (info.is_add) {
+			filter = kzalloc(sizeof(*filter), GFP_KERNEL);
+			if (!filter) {
+				pci_err(hisi_ptt->pdev,
+					"failed to update the filters\n");
+				continue;
+			}
+
+			filter->pdev = info.pdev;
+			filter->val = info.val;
+
+			list_add_tail(&filter->list, target_list);
+		} else {
+			list_for_each_entry(filter, target_list, list)
+				if (filter->val == info.val) {
+					list_del(&filter->list);
+					kfree(filter);
+					break;
+				}
+		}
+	}
+
+	mutex_unlock(&hisi_ptt->mutex);
+}
+
+static void hisi_ptt_update_fifo_in(struct hisi_ptt *hisi_ptt,
+				    struct hisi_ptt_filter_update_info *info)
+{
+	struct pci_dev *root_port = pcie_find_root_port(info->pdev);
+
+	if (!root_port)
+		return;
+
+	info->port_devid = PCI_DEVID(root_port->bus->number, root_port->devfn);
+	if (info->port_devid < hisi_ptt->lower ||
+	    info->port_devid > hisi_ptt->upper)
+			return;
+
+	info->is_port = pci_pcie_type(info->pdev) == PCI_EXP_TYPE_ROOT_PORT;
+	info->val = hisi_ptt_get_filter_val(info->pdev);
+
+	if (kfifo_in_spinlocked(&hisi_ptt->filter_update_kfifo, info, 1,
+				&hisi_ptt->filter_update_lock))
+		schedule_delayed_work(&hisi_ptt->work, 0);
+	else
+		pci_warn(hisi_ptt->pdev,
+			 "filter update fifo overflow for target %s\n",
+			 pci_name(info->pdev));
+}
+
+/*
+ * A PCI bus notifier is used here for dynamically updating the filter
+ * list.
+ */
+static int hisi_ptt_notifier_call(struct notifier_block *nb,
+				  unsigned long action,
+				  void *data)
+{
+	struct hisi_ptt *hisi_ptt = container_of(nb, struct hisi_ptt, hisi_ptt_nb);
+	struct hisi_ptt_filter_update_info info;
+	struct device *dev = data;
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	info.pdev = pdev;
+
+	switch (action) {
+	case BUS_NOTIFY_ADD_DEVICE:
+		info.is_add = true;
+		break;
+	case BUS_NOTIFY_DEL_DEVICE:
+		info.is_add = false;
+		break;
+	default:
+		return 0;
+	}
+
+	hisi_ptt_update_fifo_in(hisi_ptt, &info);
+
+	return 0;
+}
+
+static int hisi_ptt_init_filters(struct pci_dev *pdev, void *data)
+{
+	struct hisi_ptt_filter_update_info info = {
+		.pdev = pdev,
+		.is_add = true,
+	};
+	struct hisi_ptt *hisi_ptt = data;
+
+	hisi_ptt_update_fifo_in(hisi_ptt, &info);
+
+	return 0;
+}
+
+static void hisi_ptt_release_filters(struct hisi_ptt *hisi_ptt)
+{
+	struct hisi_ptt_filter_desc *filter, *tfilter;
+
+	list_for_each_entry_safe(filter, tfilter, &hisi_ptt->req_filters, list) {
+		list_del(&filter->list);
+		kfree(filter);
+	}
+
+	list_for_each_entry_safe(filter, tfilter, &hisi_ptt->port_filters, list) {
+		list_del(&filter->list);
+		kfree(filter);
+	}
+}
+
+static void hisi_ptt_init_ctrls(struct hisi_ptt *hisi_ptt)
+{
+	struct pci_bus *bus;
+	u32 reg;
+
+	INIT_LIST_HEAD(&hisi_ptt->port_filters);
+	INIT_LIST_HEAD(&hisi_ptt->req_filters);
+
+	/*
+	 * The device range register provides the information about the
+	 * root ports which the RCiEP can control and trace. The RCiEP
+	 * and the root ports it support are on the same PCIe core, with
+	 * same domain number but maybe different bus number. The device
+	 * range register will tell us which root ports we can support,
+	 * Bit[31:16] indicates the upper BDF numbers of the root port,
+	 * while Bit[15:0] indicates the lower.
+	 */
+	reg = readl(hisi_ptt->iobase + HISI_PTT_DEVICE_RANGE);
+	hisi_ptt->upper = reg >> 16;
+	hisi_ptt->lower = reg & 0xffff;
+
+	bus = pci_find_bus(pci_domain_nr(hisi_ptt->pdev->bus),
+			   PCI_BUS_NUM(hisi_ptt->upper));
+	if (bus)
+		pci_walk_bus(bus, hisi_ptt_init_filters, hisi_ptt);
+
+	/* Initialize trace controls */
+	INIT_LIST_HEAD(&hisi_ptt->trace_ctrl.trace_buf);
+	hisi_ptt->trace_ctrl.buflet_nums = HISI_PTT_DEFAULT_TRACE_BUF_CNT;
+	hisi_ptt->trace_ctrl.buflet_size = HISI_PTT_TRACE_DEFAULT_BUFLET_SIZE;
+	hisi_ptt->trace_ctrl.rxtx = HISI_PTT_TRACE_DEFAULT_RXTX.event_code;
+	hisi_ptt->trace_ctrl.tr_event = HISI_PTT_TRACE_DEFAULT_EVENT.event_code;
+}
+
+static int hisi_ptt_create_trace_entries(struct hisi_ptt *hisi_ptt)
+{
+	struct hisi_ptt_debugfs_file_desc *trace_files;
+	struct dentry *dir;
+	int i, ret = 0;
+
+	dir = debugfs_create_dir("trace", hisi_ptt->debugfs_dir);
+	if (IS_ERR(dir))
+		return PTR_ERR(dir);
+
+	trace_files = devm_kmemdup(&hisi_ptt->pdev->dev, trace_entries,
+				   sizeof(trace_entries), GFP_KERNEL);
+	if (IS_ERR(trace_files)) {
+		ret = PTR_ERR(trace_files);
+		goto err;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(trace_entries); ++i) {
+		struct dentry *file;
+
+		trace_files[i].hisi_ptt = hisi_ptt;
+		file = debugfs_create_file(trace_files[i].name, 0600,
+					   dir, &trace_files[i],
+					   trace_files[i].fops);
+		if (IS_ERR(file)) {
+			ret = PTR_ERR(file);
+			goto err;
+		}
+	}
+
+	return 0;
+
+err:
+	debugfs_remove_recursive(dir);
+	return ret;
+}
+
+static int hisi_ptt_create_debugfs_entries(struct hisi_ptt *hisi_ptt)
+{
+	int ret;
+
+	hisi_ptt->debugfs_dir = debugfs_create_dir(hisi_ptt->name,
+						   hisi_ptt_debugfs_root);
+	if (IS_ERR(hisi_ptt->debugfs_dir))
+		return PTR_ERR(hisi_ptt->debugfs_dir);
+
+	ret = hisi_ptt_create_trace_entries(hisi_ptt);
+	if (ret) {
+		pci_err(hisi_ptt->pdev, "failed to create debugfs entries\n");
+		debugfs_remove_recursive(hisi_ptt->debugfs_dir);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int hisi_ptt_probe(struct pci_dev *pdev,
+			  const struct pci_device_id *id)
+{
+	struct hisi_ptt *hisi_ptt;
+	int ret;
+
+	hisi_ptt = devm_kzalloc(&pdev->dev, sizeof(*hisi_ptt), GFP_KERNEL);
+	if (!hisi_ptt)
+		return -ENOMEM;
+
+	mutex_init(&hisi_ptt->mutex);
+	hisi_ptt->pdev = pdev;
+
+	/*
+	 * Lifetime of pci_dev is longer than hisi_ptt,
+	 * so directly reference to the pci name string.
+	 */
+	hisi_ptt->name = pci_name(pdev);
+	pci_set_drvdata(pdev, hisi_ptt);
+
+	ret = pcim_enable_device(pdev);
+	if (ret) {
+		pci_err(pdev, "failed to enable device, ret = %d\n", ret);
+		return ret;
+	}
+
+	ret = pcim_iomap_regions(pdev, BIT(2), hisi_ptt->name);
+	if (ret) {
+		pci_err(pdev, "failed to remap io memory, ret = %d\n", ret);
+		return ret;
+	}
+
+	hisi_ptt->iobase = pcim_iomap_table(pdev)[2];
+
+	ret = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(64));
+	if (ret) {
+		pci_err(pdev, "failed to set 64 bit dma mask, ret = %d\n", ret);
+		return ret;
+	}
+
+	pci_set_master(pdev);
+	ret = hisi_ptt_irq_register(hisi_ptt);
+	if (ret)
+		return ret;
+
+	spin_lock_init(&hisi_ptt->filter_update_lock);
+	INIT_DELAYED_WORK(&hisi_ptt->work, hisi_ptt_update_filters);
+	INIT_KFIFO(hisi_ptt->filter_update_kfifo);
+
+	hisi_ptt_init_ctrls(hisi_ptt);
+
+	ret = hisi_ptt_create_debugfs_entries(hisi_ptt);
+	if (ret) {
+		hisi_ptt_irq_unregister(hisi_ptt);
+		return ret;
+	}
+
+	/* Register the bus notifier for dynamically updating the filter list */
+	hisi_ptt->hisi_ptt_nb.notifier_call = hisi_ptt_notifier_call;
+	ret = bus_register_notifier(&pci_bus_type, &hisi_ptt->hisi_ptt_nb);
+	if (ret)
+		pci_warn(pdev, "failed to register notifier block, filter list cannot be updated dynamically\n");
+
+	return 0;
+}
+
+void hisi_ptt_remove(struct pci_dev *pdev)
+{
+	struct hisi_ptt *hisi_ptt = pci_get_drvdata(pdev);
+
+	bus_unregister_notifier(&pci_bus_type, &hisi_ptt->hisi_ptt_nb);
+	debugfs_remove_recursive(hisi_ptt->debugfs_dir);
+
+	/* Cancel any work that has been queued */
+	cancel_delayed_work_sync(&hisi_ptt->work);
+
+	/* Make sure trace is ended before free trace buffer */
+	if (hisi_ptt->trace_ctrl.status == HISI_PTT_TRACE_STATUS_ON)
+		hisi_ptt_trace_end(hisi_ptt);
+
+	hisi_ptt_release_filters(hisi_ptt);
+	hisi_ptt_free_trace_buf(hisi_ptt);
+	hisi_ptt_irq_unregister(hisi_ptt);
+}
+
+static const struct pci_device_id hisi_ptt_id_tbl[] = {
+	{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, 0xa12e) },
+	{ }
+};
+MODULE_DEVICE_TABLE(pci, hisi_ptt_id_tbl);
+
+static struct pci_driver hisi_ptt_driver = {
+	.name = "hisi_ptt",
+	.id_table = hisi_ptt_id_tbl,
+	.probe = hisi_ptt_probe,
+	.remove = hisi_ptt_remove,
+};
+
+static int hisi_ptt_register_debugfs(void)
+{
+	if (!debugfs_initialized()) {
+		pr_err("failed to create debugfs directory: debugfs uninitialized\n");
+		return -ENOENT;
+	}
+
+	hisi_ptt_debugfs_root = debugfs_create_dir("hisi_ptt", NULL);
+	if (IS_ERR(hisi_ptt_debugfs_root)) {
+		pr_err("failed to create debugfs directory.\n");
+		return PTR_ERR(hisi_ptt_debugfs_root);
+	}
+
+	return 0;
+}
+
+static void hisi_ptt_unregister_debugfs(void)
+{
+	debugfs_remove_recursive(hisi_ptt_debugfs_root);
+}
+
+static int __init hisi_ptt_module_init(void)
+{
+	int ret;
+
+	/* The driver cannot work without debugfs entry */
+	ret = hisi_ptt_register_debugfs();
+	if (ret)
+		return ret;
+
+	ret = pci_register_driver(&hisi_ptt_driver);
+	if (ret) {
+		pr_err("failed to register hisi ptt driver, ret = %d.\n", ret);
+		hisi_ptt_unregister_debugfs();
+		return ret;
+	}
+
+	return 0;
+}
+
+static void __exit hisi_ptt_module_exit(void)
+{
+	pci_unregister_driver(&hisi_ptt_driver);
+	hisi_ptt_unregister_debugfs();
+}
+
+module_init(hisi_ptt_module_init);
+module_exit(hisi_ptt_module_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Yicong Yang <yangyicong@hisilicon.com>");
+MODULE_DESCRIPTION("Driver for HiSilicon PCIe tune and trace device");
-- 
2.8.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 2/4] hwtracing: Add tune function support for HiSilicon PCIe Tune and Trace device
  2021-04-06 12:45 [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device Yicong Yang
  2021-04-06 12:45 ` [PATCH 1/4] hwtracing: Add trace function " Yicong Yang
@ 2021-04-06 12:45 ` Yicong Yang
  2021-04-06 12:45 ` [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver Yicong Yang
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2021-04-06 12:45 UTC (permalink / raw)
  To: alexander.shishkin, helgaas, linux-kernel, linux-pci
  Cc: lorenzo.pieralisi, gregkh, jonathan.cameron, song.bao.hua,
	prime.zeng, yangyicong, linux-doc, linuxarm

Add tune function for the HiSilicon Tune and Trace device. The interface
of tune is also exposed through debugfs.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 drivers/hwtracing/hisilicon/hisi_ptt.c | 187 +++++++++++++++++++++++++++++++++
 1 file changed, 187 insertions(+)

diff --git a/drivers/hwtracing/hisilicon/hisi_ptt.c b/drivers/hwtracing/hisilicon/hisi_ptt.c
index a1feece..2e3631b 100644
--- a/drivers/hwtracing/hisilicon/hisi_ptt.c
+++ b/drivers/hwtracing/hisilicon/hisi_ptt.c
@@ -72,6 +72,7 @@ struct hisi_ptt_debugfs_file_desc {
 	struct hisi_ptt *hisi_ptt;
 	const char *name;
 	const struct file_operations *fops;
+	struct hisi_ptt_tune_event_desc *private;
 };
 
 #define PTT_FILE_INIT(__name, __fops)	\
@@ -85,6 +86,15 @@ struct hisi_ptt_trace_event_desc {
 #define PTT_TRACE_EVENT_INIT(__name, __event_code) \
 	{ .name = __name, .event_code = __event_code }
 
+struct hisi_ptt_tune_event_desc {
+	const char *name;
+	u32 event_code;
+	int (*set)(struct hisi_ptt *hisi_ptt, void *data);
+	int (*get)(struct hisi_ptt *hisi_ptt, void *data);
+};
+#define PTT_TUNE_EVENT_INIT(__name, __event_code, __set, __get)	\
+	{ .name = __name, .event_code = __event_code, .set = __set, .get = __get }
+
 enum hisi_ptt_trace_rxtx {
 	HISI_PTT_TRACE_RX = 0,
 	HISI_PTT_TRACE_TX,
@@ -217,6 +227,8 @@ struct hisi_ptt {
 		bool is_port;
 		u16 val;
 	} cur_filter;
+
+	u32 tune_event;
 };
 
 static int hisi_ptt_write_to_buffer(void *buf, size_t len, loff_t *pos,
@@ -234,6 +246,131 @@ static int hisi_ptt_write_to_buffer(void *buf, size_t len, loff_t *pos,
 	return 0;
 }
 
+static int hisi_ptt_wait_tuning_finish(struct hisi_ptt *hisi_ptt)
+{
+	u32 val;
+
+	return readl_poll_timeout(hisi_ptt->iobase + HISI_PTT_TUNING_INT_STAT,
+				  val, !(val & HISI_PTT_TUNING_INT_STAT_MASK),
+				  HISI_PTT_WAIT_POLL_INTERVAL_US,
+				  HISI_PTT_WAIT_TIMEOUT_US);
+}
+
+static int hisi_ptt_tune_data_get(struct hisi_ptt *hisi_ptt, void *data)
+{
+	u32 *val = data, reg;
+
+	reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+	reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB);
+	reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB,
+			  hisi_ptt->tune_event);
+	writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+	/* Write all 1 to indicates it's the read process */
+	writel(HISI_PTT_TUNING_DATA_VAL_MASK,
+	       hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+
+	if (hisi_ptt_wait_tuning_finish(hisi_ptt)) {
+		pci_err(hisi_ptt->pdev, "failed to read tune data, device timeout.\n");
+		return -ETIMEDOUT;
+	}
+
+	*val = readl(hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+	*val &= HISI_PTT_TUNING_DATA_VAL_MASK;
+
+	return 0;
+}
+
+static int hisi_ptt_tune_data_set(struct hisi_ptt *hisi_ptt, void *data)
+{
+	u32 *val = data, reg;
+
+	reg = readl(hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+	reg &= ~(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB);
+	reg |= FIELD_PREP(HISI_PTT_TUNING_CTRL_CODE | HISI_PTT_TUNING_CTRL_SUB,
+			  hisi_ptt->tune_event);
+	writel(reg, hisi_ptt->iobase + HISI_PTT_TUNING_CTRL);
+	*val &= HISI_PTT_TUNING_DATA_VAL_MASK;
+	writel(*val, hisi_ptt->iobase + HISI_PTT_TUNING_DATA);
+
+	if (hisi_ptt_wait_tuning_finish(hisi_ptt)) {
+		pci_err(hisi_ptt->pdev, "failed to write tune data, device timeout.\n");
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static ssize_t hisi_ptt_tune_common_read(struct file *filp, char __user *buf,
+					 size_t count, loff_t *pos)
+{
+	struct hisi_ptt_debugfs_file_desc *desc = filp->private_data;
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;
+	struct hisi_ptt_tune_event_desc *e_desc = desc->private;
+	char tbuf[HISI_PTT_CTRL_STR_LEN];
+	int len, ret;
+	u32 val;
+
+	if (!mutex_trylock(&hisi_ptt->mutex))
+		return -EBUSY;
+
+	hisi_ptt->tune_event = e_desc->event_code;
+	ret = e_desc->get(hisi_ptt, &val);
+	mutex_unlock(&hisi_ptt->mutex);
+	if (ret)
+		return ret;
+
+	len = snprintf(tbuf, HISI_PTT_CTRL_STR_LEN, "%d\n", val);
+
+	return simple_read_from_buffer(buf, count, pos, tbuf, len);
+}
+
+static ssize_t hisi_ptt_tune_common_write(struct file *filp,
+					  const char __user *buf,
+					  size_t count, loff_t *pos)
+{
+	struct hisi_ptt_debugfs_file_desc *desc = filp->private_data;
+	struct hisi_ptt *hisi_ptt = desc->hisi_ptt;
+	struct hisi_ptt_tune_event_desc *e_desc = desc->private;
+	char tbuf[HISI_PTT_CTRL_STR_LEN];
+	int ret;
+	u16 val;
+
+	if (hisi_ptt_write_to_buffer(tbuf, HISI_PTT_CTRL_STR_LEN - 1,
+				     pos, buf, count))
+		return -EINVAL;
+
+	if (kstrtou16(tbuf, 0, &val))
+		return -EINVAL;
+
+	if (!mutex_trylock(&hisi_ptt->mutex))
+		return -EBUSY;
+
+	hisi_ptt->tune_event = e_desc->event_code;
+	ret = e_desc->set(hisi_ptt, &val);
+
+	mutex_unlock(&hisi_ptt->mutex);
+	return ret ? ret : count;
+}
+
+static const struct file_operations tune_common_fops = {
+	.owner		= THIS_MODULE,
+	.open		= simple_open,
+	.read		= hisi_ptt_tune_common_read,
+	.write		= hisi_ptt_tune_common_write,
+	.llseek		= no_llseek,
+};
+
+#define PTT_SIMPLE_TUNE_EVENTS(__name, __event_code)	\
+	PTT_TUNE_EVENT_INIT(__name, __event_code, hisi_ptt_tune_data_set, hisi_ptt_tune_data_get)
+
+static struct hisi_ptt_tune_event_desc tune_events[] = {
+	PTT_SIMPLE_TUNE_EVENTS("qos_tx_cpl",			(0x4 | (3 << 16))),
+	PTT_SIMPLE_TUNE_EVENTS("qos_tx_np",			(0x4 | (4 << 16))),
+	PTT_SIMPLE_TUNE_EVENTS("qos_tx_p",			(0x4 | (5 << 16))),
+	PTT_SIMPLE_TUNE_EVENTS("tx_path_rx_req_alloc_buf_level",	(0x5 | (6 << 16))),
+	PTT_SIMPLE_TUNE_EVENTS("tx_path_tx_req_alloc_buf_level",	(0x5 | (7 << 16))),
+};
+
 static u16 hisi_ptt_get_filter_val(struct pci_dev *pdev)
 {
 	if (pci_pcie_type(pdev) == PCI_EXP_TYPE_ROOT_PORT)
@@ -1241,6 +1378,49 @@ static void hisi_ptt_init_ctrls(struct hisi_ptt *hisi_ptt)
 	hisi_ptt->trace_ctrl.tr_event = HISI_PTT_TRACE_DEFAULT_EVENT.event_code;
 }
 
+static int hisi_ptt_create_tune_entries(struct hisi_ptt *hisi_ptt)
+{
+	struct hisi_ptt_debugfs_file_desc *tune_files;
+	struct dentry *dir;
+	int i, ret = 0;
+
+	dir = debugfs_create_dir("tune", hisi_ptt->debugfs_dir);
+	if (IS_ERR(dir))
+		return PTR_ERR(dir);
+
+	tune_files = devm_kcalloc(&hisi_ptt->pdev->dev, ARRAY_SIZE(tune_events),
+				  sizeof(struct hisi_ptt_debugfs_file_desc),
+				  GFP_KERNEL);
+	if (IS_ERR(tune_files)) {
+		ret = PTR_ERR(tune_files);
+		goto err;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(tune_events); ++i) {
+		struct dentry *file;
+
+		tune_files[i].hisi_ptt = hisi_ptt;
+
+		/* We use tune event string as control file name. */
+		tune_files[i].name = tune_events[i].name;
+		tune_files[i].fops = &tune_common_fops;
+		tune_files[i].private = &tune_events[i];
+		file = debugfs_create_file(tune_files[i].name, 0600,
+					   dir, &tune_files[i],
+					   &tune_common_fops);
+		if (IS_ERR(file)) {
+			ret = PTR_ERR(file);
+			goto err;
+		}
+	}
+
+	return 0;
+
+err:
+	debugfs_remove_recursive(dir);
+	return ret;
+}
+
 static int hisi_ptt_create_trace_entries(struct hisi_ptt *hisi_ptt)
 {
 	struct hisi_ptt_debugfs_file_desc *trace_files;
@@ -1294,6 +1474,13 @@ static int hisi_ptt_create_debugfs_entries(struct hisi_ptt *hisi_ptt)
 		return ret;
 	}
 
+	ret = hisi_ptt_create_tune_entries(hisi_ptt);
+	if (ret) {
+		pci_err(hisi_ptt->pdev, "failed to create tune entries.\n");
+		debugfs_remove_recursive(hisi_ptt->debugfs_dir);
+		return ret;
+	}
+
 	return 0;
 }
 
-- 
2.8.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver
  2021-04-06 12:45 [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device Yicong Yang
  2021-04-06 12:45 ` [PATCH 1/4] hwtracing: Add trace function " Yicong Yang
  2021-04-06 12:45 ` [PATCH 2/4] hwtracing: Add tune " Yicong Yang
@ 2021-04-06 12:45 ` Yicong Yang
  2021-04-07 18:55   ` Bjorn Helgaas
  2021-04-06 12:45 ` [PATCH 4/4] MAINTAINERS: Add maintainer for HiSilicon PTT driver Yicong Yang
  2021-04-06 13:49 ` [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device Greg KH
  4 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2021-04-06 12:45 UTC (permalink / raw)
  To: alexander.shishkin, helgaas, linux-kernel, linux-pci
  Cc: lorenzo.pieralisi, gregkh, jonathan.cameron, song.bao.hua,
	prime.zeng, yangyicong, linux-doc, linuxarm

Document the introduction and usage of HiSilicon PTT device driver.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 Documentation/trace/hisi-ptt.rst | 316 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 316 insertions(+)
 create mode 100644 Documentation/trace/hisi-ptt.rst

diff --git a/Documentation/trace/hisi-ptt.rst b/Documentation/trace/hisi-ptt.rst
new file mode 100644
index 0000000..215676f
--- /dev/null
+++ b/Documentation/trace/hisi-ptt.rst
@@ -0,0 +1,316 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================
+HiSilicon PCIe Tune and Trace device
+======================================
+
+Introduction
+============
+
+HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
+integrated Endpoint (RCiEP) device, providing the capability
+to dynamically monitor and tune the PCIe link's events (tune),
+and trace the TLP headers (trace). The two functions are independent,
+but is recommended to use them together to analyze and enhance the
+PCIe link's performance.
+
+On Kunpeng 930 SoC, the PCIe root complex is composed of several
+PCIe cores.
+Each core is composed of several root ports, RCiEPs, and one
+PTT device, like below. The PTT device is capable of tuning and
+tracing the link of the PCIe core.
+::
+          +--------------Core 0-------+
+          |       |       [   PTT   ] |
+          |       |       [Root Port]---[Endpoint]
+          |       |       [Root Port]---[Endpoint]
+          |       |       [Root Port]---[Endpoint]
+    Root Complex  |------Core 1-------+
+          |       |       [   PTT   ] |
+          |       |       [Root Port]---[ Switch ]---[Endpoint]
+          |       |       [Root Port]---[Endpoint] `-[Endpoint]
+          |       |       [Root Port]---[Endpoint]
+          +---------------------------+
+
+The PTT device driver cannot be loaded if debugfs is not mounted.
+Each PTT device will be presented under /sys/kernel/debugfs/hisi_ptt
+as its root directory, with name of its BDF number.
+::
+
+    /sys/kernel/debug/hisi_ptt/<domain>:<bus>:<device>.<function>
+
+Tune
+====
+
+PTT tune is designed for monitoring and adjusting PCIe link parameters(events).
+Currently we support events in 4 classes. The scope of the events
+covers the PCIe core with which the PTT device belongs to.
+
+Each event is presented as a file under $(PTT root dir)/$(BDF)/tune, and
+mostly a simple open/read/write/close cycle will be used to tune
+the event.
+::
+    $ cd /sys/kernel/debug/hisi_ptt/$(BDF)/tune
+    $ ls
+    qos_tx_cpl    qos_tx_np    qos_tx_p
+    tx_path_rx_req_alloc_buf_level
+    tx_path_tx_req_alloc_buf_level
+    $ cat qos_tx_dp
+    1
+    $ echo 2 > qos_tx_dp
+    $ cat qos_tx_dp
+    2
+
+Current value(numerical value) of the event can be simply read
+from the file, and the desired value written to the file to tune.
+Tuning multiple events at the same time is not permitted, which means
+you cannot read or write more than one tune file at one time.
+
+1. Tx path QoS control
+------------------------
+
+Following files are provided to tune the QoS of the tx path of the PCIe core.
+
+- qos_tx_cpl: weight of tx completion TLPs
+- qos_tx_np: weight of tx non-posted TLPs
+- qos_tx_p: weight of tx posted TLPs
+
+The weight influences the proportion of certain packets on the PCIe link.
+For example, for the storage scenario, increase the proportion
+of the completion packets on the link to enhance the performance as
+more completions are consumed.
+
+The available tune data of these events is [0, 1, 2].
+Writing a negative value will return an error, and out of range
+values will be converted to 2. Note that the event value just
+indicates a probable level, but is not precise.
+
+2. Tx path buffer control
+-------------------------
+
+Following files are provided to tune the buffer of tx path of the PCIe core.
+
+- tx_path_rx_req_alloc_buf_level: watermark of RX requested
+- tx_path_tx_req_alloc_buf_level: watermark of TX requested
+
+These events influence the watermark of the buffer allocated for each
+type. RX means the inbound while Tx means outbound. For a busy
+direction, you should increase the related buffer watermark to enhance
+the performance.
+
+The available tune data of above events is [0, 1, 2].
+Writing a negative value will return an error, and out of range
+values will be converted to 2. Note that the event value just
+indicates a probable level, but is not precise.
+
+Trace
+=====
+
+PTT trace is designed for dumping the TLP headers to the memory, which
+can be used to analyze the transactions and usage condition of the PCIe
+Link. You can chose to filter the traced headers by either requester ID,
+or those downstream of a set of root ports on the same core of the PTT
+device. It's also support to trace the headers of certain type and of
+certain direction.
+
+In order to start trace, you need to configure the parameters first.
+The parameters files is provided under $(PTT root dir)/$(BDF)/trace.
+::
+    $ cd /sys/kernel/debug/hisi_ptt/$(BDF)/trace
+    $ ls
+    free_buffer     filter      buflet_nums     buflet_size
+    direction       type        data            trace_on
+    data_format
+
+1. filter
+---------
+
+You can configure the filter of TLP headers through the file. The filter
+is provided as BDF numbers of either root port or subordinates, which
+belong to the same PCIe core. You can get the filters available and
+currently configured by read the file, and write the desired BDF to the
+file to set the filters. There is no default filter, which means you
+must specifiy at least one filter before start tracing.
+Write invalid BDF(not in the available list) will return
+a failure.
+::
+    $ echo 0000:80:04.0 > filter
+    $ cat filter
+    #### Root Ports ####
+    0000:80:00.0
+    [0000:80:04.0]
+    #### Functions ####
+    0000:81:00.0
+    0000:81:00.1
+    0000:82:00.0
+
+Note that multiple root ports can be specified at one time, but only
+one Endpoint function can be specified in one trace.
+Specifying both root port and function at the same time is not supported.
+
+If no filter is available, read the filter will get the hint.
+::
+    $ cat filter
+    #### No available filter ####
+
+The filter can be dynamically updated, which means you can always
+get correct filter information when hotplug events happens, or
+manually remove/rescan the devices.
+
+2. type
+-------
+
+You can trace the TLP headers of certain types by configure the file.
+Read the file will get available types and current setting, and write
+the desired type to the file to configure. The default type is
+`posted_request` and write types not in the available list will return
+a failure.
+::
+    $ echo completion > type
+    $ cat type
+    all  posted_request  non-posted_request  [completion]
+
+3. direction
+------------
+
+You can trace the TLP headers from certain direction, which is relative
+to the root port or the PCIe core. Read the file to get available
+directions and current configurition, and write the desired direction
+to configure. The default value is `rx` and any invalid direction will
+return a failure. Note `rxtx_no_dma_p2p` means the headers of both
+directions, but not include P2P DMA access.
+::
+    $ echo rxtx > direction
+    $ cat direction
+    rx  tx  [rxtx]  rxtx_no_dma_p2p
+
+4. buflet_size
+--------------
+
+The traced TLP headers will be written to the memory allocated
+by the driver. The hardware accept 4 DMA address with same size,
+and write the buflet sequentially like below. If DMA addr 3 is
+finished and the trace is still on, it will return to addr 0.
+Driver will allocated each DMA buffer (we call it buflet).
+The finished buflet will be replaced with a new one, so
+a long time trace can be achieved.
+::
+    +->[DMA addr 0]->[DMA addr 1]->[DMA addr 2]->[DMA addr 3]-+
+    +---------------------------------------------------------+
+
+You should both configure the buflet_size and buflet_nums to
+configure the `trace buffer` to receive the TLP headers. The
+total trace buffer size is buflet_size * buflet_nums. Note
+that the trace buffer will not be allocated immediately after you
+configure the parameters, but will be allocated right before
+the trace starts.
+
+This file configures the buflet size. Read the file will get
+available buflet size and size set currently, write the desired
+size to the file to configure. The default size is 2 MiB and any
+invalid size written will return a failure.
+::
+    $ cat buflet_size
+    [2 MiB]     4 MiB
+    $ echo 4 > buflet_size
+    $ cat buflet_size
+    2 MiB     [4 MiB]
+
+5. buflet_nums
+--------------
+
+You can write the desired buflet count to the file to configure,
+and read the file to get current buflet count. The default
+value is 64. And any positive value is valid. Note that big value
+may lead to DMA memory allocation failure, and you will not be
+able to start tracing. If it happens, you should consider adjusting
+buflet_nums or buflet_size.
+::
+    $ cat buflet_nums
+    64
+    $ echo 128 > buflet_nums
+    $ cat buflet_nums
+    128
+
+6. data
+-------
+
+The file to access the traced data. You can read the file to get the
+binary blob of traced TLP headers. The format of the headers is
+4 Dword length and is just as defined by the PCIe Spec r4.0,
+Sec 2.2.4.1, or 8 Dword length with additional 4 Dword extra
+information.
+
+echo "" > data will free all the trace buffers allocated as well as
+the traced datas.
+
+7. trace_on
+-----------
+
+Start or end the trace by simple writing to the file, and monitor the
+trace status by reading the file.
+::
+    $ echo 1 > trace_on     # start trace
+    $ cat trace_on          # get the trace status
+    1
+    $ echo 0 > trace_on     # manually end trace
+
+The read value of the trace_on will be auto cleared if the buffer
+allocated is full. 1 indicates the trace is running and 0 for
+stopped. Write any non-zero value to the file can start trace.
+
+8. free_buffer
+--------------
+
+File to indicate the trace buffer status and to manually free the
+trace buffer. The read value of 1 indicates the trace buffer has
+been allocated and exists in the memory, while 0 indicates there
+is no buffer allocated. Write 1 to the file to free the trace
+buffer as well as the traced datas.
+::
+    $ cat free_buffer
+    1                       # indicate the buffer exists
+    $ echo 1 > free_buffer  # free the trace buffer
+    $ cat free_buffer
+    0
+
+9. data_format
+--------------
+
+File to indicate the format of the traced TLP headers. User can also
+specify the desired format of traced TLP headers. Available formats
+are 4DW, 8DW which indicates the length of each TLP headers traced.
+::
+    $ cat data_format
+    [4DW]    8DW
+    $ echo 8 > data_format
+    $ cat data_format
+    4DW     [8DW]
+
+The traced TLP header format is different from the PCIe standard.
+4DW format is like
+::
+    bits [31:30] [ 29:25 ][24][23][22][21][    20:11   ][    10:0    ]
+         |-----|---------|---|---|---|---|-------------|-------------|
+     DW0 [ Fmt ][  Type  ][T9][T8][TH][SO][   Length   ][    Time    ]
+     DW1 [                     Header DW1                            ]
+     DW2 [                     Header DW2                            ]
+     DW3 [                     Header DW3                            ]
+
+For 8DW format, the bit[31:11] of DW0 is always 0x1fffff, which can be
+used to distinguish the data format. 8DW format is like
+::
+    bits [                 31:11                 ][       10:0       ]
+         |---------------------------------------|-------------------|
+     DW0 [                0x1fffff               ][ Reserved (0x7ff) ]
+     DW1 [                       Prefix                              ]
+     DW2 [                     Header DW0                            ]
+     DW3 [                     Header DW1                            ]
+     DW4 [                     Header DW2                            ]
+     DW5 [                     Header DW3                            ]
+     DW6 [                   Reserved (0x0)                          ]
+     DW7 [                        Time                               ]
+
+All the fields of the traced TLP header is defined by the PCIe Specification.
+While 'Header DWx' means standard TLP header DWord x, and 'Time' is the
+timestamp of the traced header.
-- 
2.8.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 4/4] MAINTAINERS: Add maintainer for HiSilicon PTT driver
  2021-04-06 12:45 [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device Yicong Yang
                   ` (2 preceding siblings ...)
  2021-04-06 12:45 ` [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver Yicong Yang
@ 2021-04-06 12:45 ` Yicong Yang
  2021-04-06 13:49 ` [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device Greg KH
  4 siblings, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2021-04-06 12:45 UTC (permalink / raw)
  To: alexander.shishkin, helgaas, linux-kernel, linux-pci
  Cc: lorenzo.pieralisi, gregkh, jonathan.cameron, song.bao.hua,
	prime.zeng, yangyicong, linux-doc, linuxarm

Add maintainer for driver and documentation of HiSilicon PTT device.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 MAINTAINERS | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 8d23b0e..61b793b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8078,6 +8078,13 @@ W:	http://www.hisilicon.com
 F:	Documentation/admin-guide/perf/hisi-pmu.rst
 F:	drivers/perf/hisilicon
 
+HISILICON PTT DRIVER
+M:	Yicong Yang <yangyicong@hisilicon.com>
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	Documentation/trace/hisi-ptt.rst
+F:	drivers/hwtracing/hisilicon/
+
 HISILICON QM AND ZIP Controller DRIVER
 M:	Zhou Wang <wangzhou1@hisilicon.com>
 L:	linux-crypto@vger.kernel.org
-- 
2.8.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device
  2021-04-06 12:45 [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device Yicong Yang
                   ` (3 preceding siblings ...)
  2021-04-06 12:45 ` [PATCH 4/4] MAINTAINERS: Add maintainer for HiSilicon PTT driver Yicong Yang
@ 2021-04-06 13:49 ` Greg KH
  2021-04-07 10:03   ` Yicong Yang
  4 siblings, 1 reply; 16+ messages in thread
From: Greg KH @ 2021-04-06 13:49 UTC (permalink / raw)
  To: Yicong Yang
  Cc: alexander.shishkin, helgaas, linux-kernel, linux-pci,
	lorenzo.pieralisi, jonathan.cameron, song.bao.hua, prime.zeng,
	linux-doc, linuxarm

On Tue, Apr 06, 2021 at 08:45:50PM +0800, Yicong Yang wrote:
> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex
> integrated Endpoint(RCiEP) device, providing the capability
> to dynamically monitor and tune the PCIe traffic(tune),
> and trace the TLP headers(trace). The driver exposes the user
> interface through debugfs, so no need for extra user space tools.
> The usage is described in the document.

Why use debugfs and not the existing perf tools for debugging?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device
  2021-04-06 12:45 ` [PATCH 1/4] hwtracing: Add trace function " Yicong Yang
@ 2021-04-06 13:51   ` Greg KH
  2021-04-06 16:14   ` kernel test robot
  2021-04-06 22:48   ` kernel test robot
  2 siblings, 0 replies; 16+ messages in thread
From: Greg KH @ 2021-04-06 13:51 UTC (permalink / raw)
  To: Yicong Yang
  Cc: alexander.shishkin, helgaas, linux-kernel, linux-pci,
	lorenzo.pieralisi, jonathan.cameron, song.bao.hua, prime.zeng,
	linux-doc, linuxarm

On Tue, Apr 06, 2021 at 08:45:51PM +0800, Yicong Yang wrote:
> +static int hisi_ptt_create_trace_entries(struct hisi_ptt *hisi_ptt)
> +{
> +	struct hisi_ptt_debugfs_file_desc *trace_files;
> +	struct dentry *dir;
> +	int i, ret = 0;
> +
> +	dir = debugfs_create_dir("trace", hisi_ptt->debugfs_dir);
> +	if (IS_ERR(dir))
> +		return PTR_ERR(dir);

No need to care about this, please do not check, code should not do
different things based on if debugfs is working or not.

> +
> +	trace_files = devm_kmemdup(&hisi_ptt->pdev->dev, trace_entries,
> +				   sizeof(trace_entries), GFP_KERNEL);
> +	if (IS_ERR(trace_files)) {
> +		ret = PTR_ERR(trace_files);
> +		goto err;
> +	}
> +
> +	for (i = 0; i < ARRAY_SIZE(trace_entries); ++i) {
> +		struct dentry *file;
> +
> +		trace_files[i].hisi_ptt = hisi_ptt;
> +		file = debugfs_create_file(trace_files[i].name, 0600,
> +					   dir, &trace_files[i],
> +					   trace_files[i].fops);
> +		if (IS_ERR(file)) {
> +			ret = PTR_ERR(file);

Same here, why check?

> +static int hisi_ptt_register_debugfs(void)
> +{
> +	if (!debugfs_initialized()) {
> +		pr_err("failed to create debugfs directory: debugfs uninitialized\n");

Why do you care?  How can this happen?

> +		return -ENOENT;
> +	}
> +
> +	hisi_ptt_debugfs_root = debugfs_create_dir("hisi_ptt", NULL);
> +	if (IS_ERR(hisi_ptt_debugfs_root)) {

Again, no need to check.

If you are building the whole functionality of your code on if debugfs
is working or not, that feels really wrong.  Debugfs is for random
kernel debug type things, not a whole driver infrastructure that somehow
relies on it working or not.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device
  2021-04-06 12:45 ` [PATCH 1/4] hwtracing: Add trace function " Yicong Yang
  2021-04-06 13:51   ` Greg KH
@ 2021-04-06 16:14   ` kernel test robot
  2021-04-06 22:48   ` kernel test robot
  2 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2021-04-06 16:14 UTC (permalink / raw)
  To: Yicong Yang, alexander.shishkin, helgaas, linux-kernel, linux-pci
  Cc: kbuild-all, lorenzo.pieralisi, gregkh, jonathan.cameron,
	song.bao.hua, prime.zeng, yangyicong

[-- Attachment #1: Type: text/plain, Size: 5682 bytes --]

Hi Yicong,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.12-rc6 next-20210406]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yicong-Yang/Add-support-for-HiSilicon-PCIe-Tune-and-Trace-device/20210406-204959
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 0a50438c84363bd37fe18fe432888ae9a074dcab
config: nios2-allyesconfig (attached as .config)
compiler: nios2-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/8d755179573b25c8c165509321a32c3c04b10ab5
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yicong-Yang/Add-support-for-HiSilicon-PCIe-Tune-and-Trace-device/20210406-204959
        git checkout 8d755179573b25c8c165509321a32c3c04b10ab5
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nios2 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/hwtracing/hisilicon/hisi_ptt.c: In function 'hisi_ptt_irq_register':
   drivers/hwtracing/hisilicon/hisi_ptt.c:1067:3: error: implicit declaration of function 'pci_free_irq_vectors'; did you mean 'pci_alloc_irq_vectors'? [-Werror=implicit-function-declaration]
    1067 |   pci_free_irq_vectors(pdev);
         |   ^~~~~~~~~~~~~~~~~~~~
         |   pci_alloc_irq_vectors
   drivers/hwtracing/hisilicon/hisi_ptt.c: In function 'hisi_ptt_init_ctrls':
   drivers/hwtracing/hisilicon/hisi_ptt.c:1231:8: error: implicit declaration of function 'pci_find_bus'; did you mean 'pci_find_next_bus'? [-Werror=implicit-function-declaration]
    1231 |  bus = pci_find_bus(pci_domain_nr(hisi_ptt->pdev->bus),
         |        ^~~~~~~~~~~~
         |        pci_find_next_bus
>> drivers/hwtracing/hisilicon/hisi_ptt.c:1231:6: warning: assignment to 'struct pci_bus *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
    1231 |  bus = pci_find_bus(pci_domain_nr(hisi_ptt->pdev->bus),
         |      ^
   drivers/hwtracing/hisilicon/hisi_ptt.c:1234:3: error: implicit declaration of function 'pci_walk_bus' [-Werror=implicit-function-declaration]
    1234 |   pci_walk_bus(bus, hisi_ptt_init_filters, hisi_ptt);
         |   ^~~~~~~~~~~~
   drivers/hwtracing/hisilicon/hisi_ptt.c: In function 'hisi_ptt_probe':
   drivers/hwtracing/hisilicon/hisi_ptt.c:1359:31: error: 'pci_bus_type' undeclared (first use in this function); did you mean 'pci_pcie_type'?
    1359 |  ret = bus_register_notifier(&pci_bus_type, &hisi_ptt->hisi_ptt_nb);
         |                               ^~~~~~~~~~~~
         |                               pci_pcie_type
   drivers/hwtracing/hisilicon/hisi_ptt.c:1359:31: note: each undeclared identifier is reported only once for each function it appears in
   drivers/hwtracing/hisilicon/hisi_ptt.c: At top level:
   drivers/hwtracing/hisilicon/hisi_ptt.c:1366:6: warning: no previous prototype for 'hisi_ptt_remove' [-Wmissing-prototypes]
    1366 | void hisi_ptt_remove(struct pci_dev *pdev)
         |      ^~~~~~~~~~~~~~~
   drivers/hwtracing/hisilicon/hisi_ptt.c: In function 'hisi_ptt_remove':
   drivers/hwtracing/hisilicon/hisi_ptt.c:1370:27: error: 'pci_bus_type' undeclared (first use in this function); did you mean 'pci_pcie_type'?
    1370 |  bus_unregister_notifier(&pci_bus_type, &hisi_ptt->hisi_ptt_nb);
         |                           ^~~~~~~~~~~~
         |                           pci_pcie_type
   cc1: some warnings being treated as errors


vim +1231 drivers/hwtracing/hisilicon/hisi_ptt.c

  1209	
  1210	static void hisi_ptt_init_ctrls(struct hisi_ptt *hisi_ptt)
  1211	{
  1212		struct pci_bus *bus;
  1213		u32 reg;
  1214	
  1215		INIT_LIST_HEAD(&hisi_ptt->port_filters);
  1216		INIT_LIST_HEAD(&hisi_ptt->req_filters);
  1217	
  1218		/*
  1219		 * The device range register provides the information about the
  1220		 * root ports which the RCiEP can control and trace. The RCiEP
  1221		 * and the root ports it support are on the same PCIe core, with
  1222		 * same domain number but maybe different bus number. The device
  1223		 * range register will tell us which root ports we can support,
  1224		 * Bit[31:16] indicates the upper BDF numbers of the root port,
  1225		 * while Bit[15:0] indicates the lower.
  1226		 */
  1227		reg = readl(hisi_ptt->iobase + HISI_PTT_DEVICE_RANGE);
  1228		hisi_ptt->upper = reg >> 16;
  1229		hisi_ptt->lower = reg & 0xffff;
  1230	
> 1231		bus = pci_find_bus(pci_domain_nr(hisi_ptt->pdev->bus),
  1232				   PCI_BUS_NUM(hisi_ptt->upper));
  1233		if (bus)
  1234			pci_walk_bus(bus, hisi_ptt_init_filters, hisi_ptt);
  1235	
  1236		/* Initialize trace controls */
  1237		INIT_LIST_HEAD(&hisi_ptt->trace_ctrl.trace_buf);
  1238		hisi_ptt->trace_ctrl.buflet_nums = HISI_PTT_DEFAULT_TRACE_BUF_CNT;
  1239		hisi_ptt->trace_ctrl.buflet_size = HISI_PTT_TRACE_DEFAULT_BUFLET_SIZE;
  1240		hisi_ptt->trace_ctrl.rxtx = HISI_PTT_TRACE_DEFAULT_RXTX.event_code;
  1241		hisi_ptt->trace_ctrl.tr_event = HISI_PTT_TRACE_DEFAULT_EVENT.event_code;
  1242	}
  1243	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 58991 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/4] hwtracing: Add trace function support for HiSilicon PCIe Tune and Trace device
  2021-04-06 12:45 ` [PATCH 1/4] hwtracing: Add trace function " Yicong Yang
  2021-04-06 13:51   ` Greg KH
  2021-04-06 16:14   ` kernel test robot
@ 2021-04-06 22:48   ` kernel test robot
  2 siblings, 0 replies; 16+ messages in thread
From: kernel test robot @ 2021-04-06 22:48 UTC (permalink / raw)
  To: Yicong Yang, alexander.shishkin, helgaas, linux-kernel, linux-pci
  Cc: kbuild-all, lorenzo.pieralisi, gregkh, jonathan.cameron,
	song.bao.hua, prime.zeng, yangyicong

[-- Attachment #1: Type: text/plain, Size: 10717 bytes --]

Hi Yicong,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v5.12-rc6 next-20210406]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yicong-Yang/Add-support-for-HiSilicon-PCIe-Tune-and-Trace-device/20210406-204959
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 0a50438c84363bd37fe18fe432888ae9a074dcab
config: nios2-allyesconfig (attached as .config)
compiler: nios2-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/8d755179573b25c8c165509321a32c3c04b10ab5
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yicong-Yang/Add-support-for-HiSilicon-PCIe-Tune-and-Trace-device/20210406-204959
        git checkout 8d755179573b25c8c165509321a32c3c04b10ab5
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nios2 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   drivers/hwtracing/hisilicon/hisi_ptt.c: In function 'hisi_ptt_irq_register':
>> drivers/hwtracing/hisilicon/hisi_ptt.c:1067:3: error: implicit declaration of function 'pci_free_irq_vectors'; did you mean 'pci_alloc_irq_vectors'? [-Werror=implicit-function-declaration]
    1067 |   pci_free_irq_vectors(pdev);
         |   ^~~~~~~~~~~~~~~~~~~~
         |   pci_alloc_irq_vectors
   drivers/hwtracing/hisilicon/hisi_ptt.c: In function 'hisi_ptt_init_ctrls':
>> drivers/hwtracing/hisilicon/hisi_ptt.c:1231:8: error: implicit declaration of function 'pci_find_bus'; did you mean 'pci_find_next_bus'? [-Werror=implicit-function-declaration]
    1231 |  bus = pci_find_bus(pci_domain_nr(hisi_ptt->pdev->bus),
         |        ^~~~~~~~~~~~
         |        pci_find_next_bus
   drivers/hwtracing/hisilicon/hisi_ptt.c:1231:6: warning: assignment to 'struct pci_bus *' from 'int' makes pointer from integer without a cast [-Wint-conversion]
    1231 |  bus = pci_find_bus(pci_domain_nr(hisi_ptt->pdev->bus),
         |      ^
>> drivers/hwtracing/hisilicon/hisi_ptt.c:1234:3: error: implicit declaration of function 'pci_walk_bus' [-Werror=implicit-function-declaration]
    1234 |   pci_walk_bus(bus, hisi_ptt_init_filters, hisi_ptt);
         |   ^~~~~~~~~~~~
   drivers/hwtracing/hisilicon/hisi_ptt.c: In function 'hisi_ptt_probe':
>> drivers/hwtracing/hisilicon/hisi_ptt.c:1359:31: error: 'pci_bus_type' undeclared (first use in this function); did you mean 'pci_pcie_type'?
    1359 |  ret = bus_register_notifier(&pci_bus_type, &hisi_ptt->hisi_ptt_nb);
         |                               ^~~~~~~~~~~~
         |                               pci_pcie_type
   drivers/hwtracing/hisilicon/hisi_ptt.c:1359:31: note: each undeclared identifier is reported only once for each function it appears in
   drivers/hwtracing/hisilicon/hisi_ptt.c: At top level:
   drivers/hwtracing/hisilicon/hisi_ptt.c:1366:6: warning: no previous prototype for 'hisi_ptt_remove' [-Wmissing-prototypes]
    1366 | void hisi_ptt_remove(struct pci_dev *pdev)
         |      ^~~~~~~~~~~~~~~
   drivers/hwtracing/hisilicon/hisi_ptt.c: In function 'hisi_ptt_remove':
   drivers/hwtracing/hisilicon/hisi_ptt.c:1370:27: error: 'pci_bus_type' undeclared (first use in this function); did you mean 'pci_pcie_type'?
    1370 |  bus_unregister_notifier(&pci_bus_type, &hisi_ptt->hisi_ptt_nb);
         |                           ^~~~~~~~~~~~
         |                           pci_pcie_type
   cc1: some warnings being treated as errors


vim +1067 drivers/hwtracing/hisilicon/hisi_ptt.c

  1048	
  1049	static int hisi_ptt_irq_register(struct hisi_ptt *hisi_ptt)
  1050	{
  1051		struct pci_dev *pdev = hisi_ptt->pdev;
  1052		int ret;
  1053	
  1054		ret = pci_alloc_irq_vectors(pdev, HISI_PTT_IRQ_NUMS, HISI_PTT_IRQ_NUMS,
  1055					    PCI_IRQ_MSI);
  1056		if (ret < 0) {
  1057			pci_err(pdev, "failed to allocate irq vector, ret = %d\n", ret);
  1058			return ret;
  1059		}
  1060	
  1061		ret = request_threaded_irq(pci_irq_vector(pdev, HISI_PTT_DMA_IRQ),
  1062					   hisi_ptt_irq, hisi_ptt_isr, IRQF_SHARED,
  1063					   "hisi-ptt", hisi_ptt);
  1064		if (ret) {
  1065			pci_err(pdev, "failed to request irq %d, ret = %d\n",
  1066				pci_irq_vector(pdev, HISI_PTT_DMA_IRQ), ret);
> 1067			pci_free_irq_vectors(pdev);
  1068			return ret;
  1069		}
  1070	
  1071		return 0;
  1072	}
  1073	
  1074	static void hisi_ptt_irq_unregister(struct hisi_ptt *hisi_ptt)
  1075	{
  1076		struct pci_dev *pdev = hisi_ptt->pdev;
  1077	
  1078		free_irq(pci_irq_vector(pdev, HISI_PTT_DMA_IRQ), hisi_ptt);
  1079		pci_free_irq_vectors(pdev);
  1080	}
  1081	
  1082	static void hisi_ptt_update_filters(struct work_struct *work)
  1083	{
  1084		struct delayed_work *delayed_work = to_delayed_work(work);
  1085		struct hisi_ptt_filter_update_info info;
  1086		struct hisi_ptt_filter_desc *filter;
  1087		struct list_head *target_list;
  1088		struct hisi_ptt *hisi_ptt;
  1089	
  1090		hisi_ptt = container_of(delayed_work, struct hisi_ptt, work);
  1091	
  1092		if (!mutex_trylock(&hisi_ptt->mutex)) {
  1093			schedule_delayed_work(&hisi_ptt->work, HISI_PTT_WORK_DELAY_MS);
  1094			return;
  1095		}
  1096	
  1097		while (kfifo_get(&hisi_ptt->filter_update_kfifo, &info)) {
  1098			target_list = info.is_port ? &hisi_ptt->port_filters :
  1099				      &hisi_ptt->req_filters;
  1100	
  1101			if (info.is_add) {
  1102				filter = kzalloc(sizeof(*filter), GFP_KERNEL);
  1103				if (!filter) {
  1104					pci_err(hisi_ptt->pdev,
  1105						"failed to update the filters\n");
  1106					continue;
  1107				}
  1108	
  1109				filter->pdev = info.pdev;
  1110				filter->val = info.val;
  1111	
  1112				list_add_tail(&filter->list, target_list);
  1113			} else {
  1114				list_for_each_entry(filter, target_list, list)
  1115					if (filter->val == info.val) {
  1116						list_del(&filter->list);
  1117						kfree(filter);
  1118						break;
  1119					}
  1120			}
  1121		}
  1122	
  1123		mutex_unlock(&hisi_ptt->mutex);
  1124	}
  1125	
  1126	static void hisi_ptt_update_fifo_in(struct hisi_ptt *hisi_ptt,
  1127					    struct hisi_ptt_filter_update_info *info)
  1128	{
  1129		struct pci_dev *root_port = pcie_find_root_port(info->pdev);
  1130	
  1131		if (!root_port)
  1132			return;
  1133	
  1134		info->port_devid = PCI_DEVID(root_port->bus->number, root_port->devfn);
  1135		if (info->port_devid < hisi_ptt->lower ||
  1136		    info->port_devid > hisi_ptt->upper)
  1137				return;
  1138	
  1139		info->is_port = pci_pcie_type(info->pdev) == PCI_EXP_TYPE_ROOT_PORT;
  1140		info->val = hisi_ptt_get_filter_val(info->pdev);
  1141	
  1142		if (kfifo_in_spinlocked(&hisi_ptt->filter_update_kfifo, info, 1,
  1143					&hisi_ptt->filter_update_lock))
  1144			schedule_delayed_work(&hisi_ptt->work, 0);
  1145		else
  1146			pci_warn(hisi_ptt->pdev,
  1147				 "filter update fifo overflow for target %s\n",
  1148				 pci_name(info->pdev));
  1149	}
  1150	
  1151	/*
  1152	 * A PCI bus notifier is used here for dynamically updating the filter
  1153	 * list.
  1154	 */
  1155	static int hisi_ptt_notifier_call(struct notifier_block *nb,
  1156					  unsigned long action,
  1157					  void *data)
  1158	{
  1159		struct hisi_ptt *hisi_ptt = container_of(nb, struct hisi_ptt, hisi_ptt_nb);
  1160		struct hisi_ptt_filter_update_info info;
  1161		struct device *dev = data;
  1162		struct pci_dev *pdev = to_pci_dev(dev);
  1163	
  1164		info.pdev = pdev;
  1165	
  1166		switch (action) {
  1167		case BUS_NOTIFY_ADD_DEVICE:
  1168			info.is_add = true;
  1169			break;
  1170		case BUS_NOTIFY_DEL_DEVICE:
  1171			info.is_add = false;
  1172			break;
  1173		default:
  1174			return 0;
  1175		}
  1176	
  1177		hisi_ptt_update_fifo_in(hisi_ptt, &info);
  1178	
  1179		return 0;
  1180	}
  1181	
  1182	static int hisi_ptt_init_filters(struct pci_dev *pdev, void *data)
  1183	{
  1184		struct hisi_ptt_filter_update_info info = {
  1185			.pdev = pdev,
  1186			.is_add = true,
  1187		};
  1188		struct hisi_ptt *hisi_ptt = data;
  1189	
  1190		hisi_ptt_update_fifo_in(hisi_ptt, &info);
  1191	
  1192		return 0;
  1193	}
  1194	
  1195	static void hisi_ptt_release_filters(struct hisi_ptt *hisi_ptt)
  1196	{
  1197		struct hisi_ptt_filter_desc *filter, *tfilter;
  1198	
  1199		list_for_each_entry_safe(filter, tfilter, &hisi_ptt->req_filters, list) {
  1200			list_del(&filter->list);
  1201			kfree(filter);
  1202		}
  1203	
  1204		list_for_each_entry_safe(filter, tfilter, &hisi_ptt->port_filters, list) {
  1205			list_del(&filter->list);
  1206			kfree(filter);
  1207		}
  1208	}
  1209	
  1210	static void hisi_ptt_init_ctrls(struct hisi_ptt *hisi_ptt)
  1211	{
  1212		struct pci_bus *bus;
  1213		u32 reg;
  1214	
  1215		INIT_LIST_HEAD(&hisi_ptt->port_filters);
  1216		INIT_LIST_HEAD(&hisi_ptt->req_filters);
  1217	
  1218		/*
  1219		 * The device range register provides the information about the
  1220		 * root ports which the RCiEP can control and trace. The RCiEP
  1221		 * and the root ports it support are on the same PCIe core, with
  1222		 * same domain number but maybe different bus number. The device
  1223		 * range register will tell us which root ports we can support,
  1224		 * Bit[31:16] indicates the upper BDF numbers of the root port,
  1225		 * while Bit[15:0] indicates the lower.
  1226		 */
  1227		reg = readl(hisi_ptt->iobase + HISI_PTT_DEVICE_RANGE);
  1228		hisi_ptt->upper = reg >> 16;
  1229		hisi_ptt->lower = reg & 0xffff;
  1230	
> 1231		bus = pci_find_bus(pci_domain_nr(hisi_ptt->pdev->bus),
  1232				   PCI_BUS_NUM(hisi_ptt->upper));
  1233		if (bus)
> 1234			pci_walk_bus(bus, hisi_ptt_init_filters, hisi_ptt);
  1235	
  1236		/* Initialize trace controls */
  1237		INIT_LIST_HEAD(&hisi_ptt->trace_ctrl.trace_buf);
  1238		hisi_ptt->trace_ctrl.buflet_nums = HISI_PTT_DEFAULT_TRACE_BUF_CNT;
  1239		hisi_ptt->trace_ctrl.buflet_size = HISI_PTT_TRACE_DEFAULT_BUFLET_SIZE;
  1240		hisi_ptt->trace_ctrl.rxtx = HISI_PTT_TRACE_DEFAULT_RXTX.event_code;
  1241		hisi_ptt->trace_ctrl.tr_event = HISI_PTT_TRACE_DEFAULT_EVENT.event_code;
  1242	}
  1243	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 58991 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device
  2021-04-06 13:49 ` [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device Greg KH
@ 2021-04-07 10:03   ` Yicong Yang
  2021-04-07 10:25     ` Greg KH
  0 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2021-04-07 10:03 UTC (permalink / raw)
  To: Greg KH
  Cc: alexander.shishkin, helgaas, linux-kernel, linux-pci,
	lorenzo.pieralisi, jonathan.cameron, song.bao.hua, prime.zeng,
	linux-doc, linuxarm, liuqi (BA)

On 2021/4/6 21:49, Greg KH wrote:
> On Tue, Apr 06, 2021 at 08:45:50PM +0800, Yicong Yang wrote:
>> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex
>> integrated Endpoint(RCiEP) device, providing the capability
>> to dynamically monitor and tune the PCIe traffic(tune),
>> and trace the TLP headers(trace). The driver exposes the user
>> interface through debugfs, so no need for extra user space tools.
>> The usage is described in the document.
> 
> Why use debugfs and not the existing perf tools for debugging?
> 

The perf doesn't match our device as we've analyzed.

For the tune function it doesn't do the sampling at all.
User specifys one link parameter and reads its current value or set
the desired one. The process is static. We didn't find a
way to adapt to perf.

For the trace function, we may barely adapt to the perf framework
but it doesn't seems like a better choice. We have our own format
of data and don't need perf doing the parsing, and we'll get extra
information added by perf as well. The settings through perf tools
won't satisfy our needs, we cannot present available settings
(filter BDF number, TLP types, buffer controls) to
the user and user cannot set in a friendly way. For example,
we cannot count on perf to decode the usual format BDF number like
<domain>:<bus>:<dev>.<fn>, which user can use filter the TLP
headers.

So we intended to make the operation of this driver a bit like
ftrace. user sets the control settings through control files
and get the result through files as well. No additional tools
is necessay. A user space tool is necessary if we use a character
device or misc device for implementing this. The trace data maybe
hundreds of megabytes, and debugfs file can just satisfy it.

Thanks,
Yicong


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device
  2021-04-07 10:03   ` Yicong Yang
@ 2021-04-07 10:25     ` Greg KH
  2021-04-08 13:25       ` Yicong Yang
  0 siblings, 1 reply; 16+ messages in thread
From: Greg KH @ 2021-04-07 10:25 UTC (permalink / raw)
  To: Yicong Yang
  Cc: alexander.shishkin, helgaas, linux-kernel, linux-pci,
	lorenzo.pieralisi, jonathan.cameron, song.bao.hua, prime.zeng,
	linux-doc, linuxarm, liuqi (BA)

On Wed, Apr 07, 2021 at 06:03:11PM +0800, Yicong Yang wrote:
> On 2021/4/6 21:49, Greg KH wrote:
> > On Tue, Apr 06, 2021 at 08:45:50PM +0800, Yicong Yang wrote:
> >> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex
> >> integrated Endpoint(RCiEP) device, providing the capability
> >> to dynamically monitor and tune the PCIe traffic(tune),
> >> and trace the TLP headers(trace). The driver exposes the user
> >> interface through debugfs, so no need for extra user space tools.
> >> The usage is described in the document.
> > 
> > Why use debugfs and not the existing perf tools for debugging?
> > 
> 
> The perf doesn't match our device as we've analyzed.
> 
> For the tune function it doesn't do the sampling at all.
> User specifys one link parameter and reads its current value or set
> the desired one. The process is static. We didn't find a
> way to adapt to perf.
> 
> For the trace function, we may barely adapt to the perf framework
> but it doesn't seems like a better choice. We have our own format
> of data and don't need perf doing the parsing, and we'll get extra
> information added by perf as well. The settings through perf tools
> won't satisfy our needs, we cannot present available settings
> (filter BDF number, TLP types, buffer controls) to
> the user and user cannot set in a friendly way. For example,
> we cannot count on perf to decode the usual format BDF number like
> <domain>:<bus>:<dev>.<fn>, which user can use filter the TLP
> headers.

Please work with the perf developers to come up with a solution.  I find
it hard to believe that your hardware is so different than all the other
hardware that perf currently supports.  I would need their agreement
that you can not use perf before accepting this patchset.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver
  2021-04-06 12:45 ` [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver Yicong Yang
@ 2021-04-07 18:55   ` Bjorn Helgaas
  2021-04-08 13:22     ` Yicong Yang
  0 siblings, 1 reply; 16+ messages in thread
From: Bjorn Helgaas @ 2021-04-07 18:55 UTC (permalink / raw)
  To: Yicong Yang
  Cc: alexander.shishkin, linux-kernel, linux-pci, lorenzo.pieralisi,
	gregkh, jonathan.cameron, song.bao.hua, prime.zeng, linux-doc,
	linuxarm

Move important info in the subject earlier, e.g.,

  docs: Add HiSilicon PTT device documentation

On Tue, Apr 06, 2021 at 08:45:53PM +0800, Yicong Yang wrote:
> Document the introduction and usage of HiSilicon PTT device driver.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  Documentation/trace/hisi-ptt.rst | 316 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 316 insertions(+)
>  create mode 100644 Documentation/trace/hisi-ptt.rst
> 
> diff --git a/Documentation/trace/hisi-ptt.rst b/Documentation/trace/hisi-ptt.rst
> new file mode 100644
> index 0000000..215676f
> --- /dev/null
> +++ b/Documentation/trace/hisi-ptt.rst
> @@ -0,0 +1,316 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +======================================
> +HiSilicon PCIe Tune and Trace device
> +======================================
> +
> +Introduction
> +============
> +
> +HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
> +integrated Endpoint (RCiEP) device, providing the capability
> +to dynamically monitor and tune the PCIe link's events (tune),
> +and trace the TLP headers (trace). The two functions are independent,
> +but is recommended to use them together to analyze and enhance the
> +PCIe link's performance.

> +On Kunpeng 930 SoC, the PCIe root complex is composed of several
> +PCIe cores.
> +Each core is composed of several root ports, RCiEPs, and one
> +PTT device, like below. The PTT device is capable of tuning and
> +tracing the link of the PCIe core.

s/root complex/Root Complex/ to match spec, diagram, RCiEP above
s/root ports/Root Ports/ to match spec, etc (also below)

Can you connect "Kunpeng 930" to something in the kernel tree?
"git grep -i kunpeng" shows nothing that's obviously relevant.
I assume there's a related driver in drivers/pci/controller/?

Is this one paragraph or two?  If one, reflow.  If two, add blank line
between.

IIUC, the diagram below shows two PCIe cores, each with three Root
Ports and a PTT RCiEP.  Your text mentions "RCiEPs, and one PTT" which
suggests RCiEPs in addition to the PTT, but the diagram doesn't show
any, and if there are other RCiEPs, they don't seem relevant to this
doc.  Maybe something like this?

  Each PCIe core includes several Root Ports and a PTT RCiEP ...

> +::
> +          +--------------Core 0-------+
> +          |       |       [   PTT   ] |
> +          |       |       [Root Port]---[Endpoint]
> +          |       |       [Root Port]---[Endpoint]
> +          |       |       [Root Port]---[Endpoint]
> +    Root Complex  |------Core 1-------+
> +          |       |       [   PTT   ] |
> +          |       |       [Root Port]---[ Switch ]---[Endpoint]
> +          |       |       [Root Port]---[Endpoint] `-[Endpoint]
> +          |       |       [Root Port]---[Endpoint]
> +          +---------------------------+
> +
> +The PTT device driver cannot be loaded if debugfs is not mounted.
> +Each PTT device will be presented under /sys/kernel/debugfs/hisi_ptt
> +as its root directory, with name of its BDF number.
> +::
> +
> +    /sys/kernel/debug/hisi_ptt/<domain>:<bus>:<device>.<function>
> +
> +Tune
> +====
> +
> +PTT tune is designed for monitoring and adjusting PCIe link parameters(events).

Add a space before "(".

> +Currently we support events in 4 classes. The scope of the events
> +covers the PCIe core with which the PTT device belongs to.

... the PCIe core to which the PTT device belongs.
> +
> +Each event is presented as a file under $(PTT root dir)/$(BDF)/tune, and
> +mostly a simple open/read/write/close cycle will be used to tune
> +the event.
> +::
> +    $ cd /sys/kernel/debug/hisi_ptt/$(BDF)/tune
> +    $ ls
> +    qos_tx_cpl    qos_tx_np    qos_tx_p
> +    tx_path_rx_req_alloc_buf_level
> +    tx_path_tx_req_alloc_buf_level
> +    $ cat qos_tx_dp
> +    1
> +    $ echo 2 > qos_tx_dp
> +    $ cat qos_tx_dp
> +    2
> +
> +Current value(numerical value) of the event can be simply read

Add space before "(".

> +from the file, and the desired value written to the file to tune.

> +Tuning multiple events at the same time is not permitted, which means
> +you cannot read or write more than one tune file at one time.

I think this is obvious from the model, so the sentence doesn't really
add anything.  Each event is a separate file, and it's obvious that
there's no way to write to multiple files simultaneously.

> +1. Tx path QoS control
> +------------------------
> +
> +Following files are provided to tune the QoS of the tx path of the PCIe core.

"The following ..."

> +- qos_tx_cpl: weight of tx completion TLPs
> +- qos_tx_np: weight of tx non-posted TLPs
> +- qos_tx_p: weight of tx posted TLPs
> +
> +The weight influences the proportion of certain packets on the PCIe link.
> +For example, for the storage scenario, increase the proportion
> +of the completion packets on the link to enhance the performance as
> +more completions are consumed.

I don't believe you can directly influence the *proportions* of packet
types.  The number and types of TLPs are determined by device driver
MMIO accesses and device DMAs.  Maybe you can influence the
*priority*?  I assume that regardless of these settings, the device
always respects the transaction ordering rules in PCIe r5.0, sec 2.4,
right?

> +The available tune data of these events is [0, 1, 2].
> +Writing a negative value will return an error, and out of range
> +values will be converted to 2. Note that the event value just
> +indicates a probable level, but is not precise.
> +
> +2. Tx path buffer control
> +-------------------------
> +
> +Following files are provided to tune the buffer of tx path of the PCIe core.
> +
> +- tx_path_rx_req_alloc_buf_level: watermark of RX requested
> +- tx_path_tx_req_alloc_buf_level: watermark of TX requested
> +
> +These events influence the watermark of the buffer allocated for each
> +type. RX means the inbound while Tx means outbound. For a busy
> +direction, you should increase the related buffer watermark to enhance
> +the performance.

Based on what you have written here, I would just write 2 to both
files to enhance the performance in both directions.  But obviously
there must be some tradeoff here, e.g., increasing Rx performance
comes at the cost of Tx performane.

Use "Rx" or "RX" (and "Tx" or "TX") consistently.  So far we have
"tx", "TX", "Tx", as well as "RX" and "Tx" in the same sentence.

> +The available tune data of above events is [0, 1, 2].
> +Writing a negative value will return an error, and out of range
> +values will be converted to 2. Note that the event value just
> +indicates a probable level, but is not precise.
> +
> +Trace
> +=====
> +
> +PTT trace is designed for dumping the TLP headers to the memory, which
> +can be used to analyze the transactions and usage condition of the PCIe
> +Link. You can chose to filter the traced headers by either requester ID,

s/chose/choose/

> +or those downstream of a set of root ports on the same core of the PTT
> +device. It's also support to trace the headers of certain type and of
> +certain direction.

s/support/supported/

> +In order to start trace, you need to configure the parameters first.
> +The parameters files is provided under $(PTT root dir)/$(BDF)/trace.

s/files is/files are/

> +::
> +    $ cd /sys/kernel/debug/hisi_ptt/$(BDF)/trace
> +    $ ls
> +    free_buffer     filter      buflet_nums     buflet_size
> +    direction       type        data            trace_on
> +    data_format
> +
> +1. filter
> +---------
> +
> +You can configure the filter of TLP headers through the file. The filter
> +is provided as BDF numbers of either root port or subordinates, which
> +belong to the same PCIe core. You can get the filters available and
> +currently configured by read the file, and write the desired BDF to the
> +file to set the filters. There is no default filter, which means you
> +must specifiy at least one filter before start tracing.
> +Write invalid BDF(not in the available list) will return
> +a failure.

s/by read/by reading/
s/specifiy/specify/
s/before start/before starting/
s/Write invalid/Writing an invalid/
s/BDF(not/BDF (not/

Reflow or separate paragraphs with blank lines.

> +::
> +    $ echo 0000:80:04.0 > filter
> +    $ cat filter
> +    #### Root Ports ####
> +    0000:80:00.0
> +    [0000:80:04.0]
> +    #### Functions ####
> +    0000:81:00.0
> +    0000:81:00.1
> +    0000:82:00.0
> +
> +Note that multiple root ports can be specified at one time, but only
> +one Endpoint function can be specified in one trace.
> +Specifying both root port and function at the same time is not supported.
> +
> +If no filter is available, read the filter will get the hint.

s/read the/reading the/

> +::
> +    $ cat filter
> +    #### No available filter ####
> +
> +The filter can be dynamically updated, which means you can always
> +get correct filter information when hotplug events happens, or
> +manually remove/rescan the devices.

s/events happens/events happen/
s/or manually remove/or when you manually remove/

> +2. type
> +-------
> +
> +You can trace the TLP headers of certain types by configure the file.
> +Read the file will get available types and current setting, and write
> +the desired type to the file to configure. The default type is
> +`posted_request` and write types not in the available list will return
> +a failure.

s/by configure/by configuring/
s/Read the file/Reading the file/
s/, and write the/. Write the/

> +::
> +    $ echo completion > type
> +    $ cat type
> +    all  posted_request  non-posted_request  [completion]
> +
> +3. direction
> +------------
> +
> +You can trace the TLP headers from certain direction, which is relative
> +to the root port or the PCIe core. Read the file to get available
> +directions and current configurition, and write the desired direction
> +to configure. The default value is `rx` and any invalid direction will
> +return a failure. Note `rxtx_no_dma_p2p` means the headers of both
> +directions, but not include P2P DMA access.
> +::
> +    $ echo rxtx > direction
> +    $ cat direction
> +    rx  tx  [rxtx]  rxtx_no_dma_p2p
> +
> +4. buflet_size
> +--------------
> +
> +The traced TLP headers will be written to the memory allocated
> +by the driver. The hardware accept 4 DMA address with same size,
> +and write the buflet sequentially like below. If DMA addr 3 is
> +finished and the trace is still on, it will return to addr 0.
> +Driver will allocated each DMA buffer (we call it buflet).
> +The finished buflet will be replaced with a new one, so
> +a long time trace can be achieved.

s/hardware accept/hardware accepts/
s/and write the/and writes the/
s/will allocated/will allocate/

> +::
> +    +->[DMA addr 0]->[DMA addr 1]->[DMA addr 2]->[DMA addr 3]-+
> +    +---------------------------------------------------------+
> +
> +You should both configure the buflet_size and buflet_nums to
> +configure the `trace buffer` to receive the TLP headers. The
> +total trace buffer size is buflet_size * buflet_nums. Note
> +that the trace buffer will not be allocated immediately after you
> +configure the parameters, but will be allocated right before
> +the trace starts.
> +
> +This file configures the buflet size. Read the file will get
> +available buflet size and size set currently, write the desired
> +size to the file to configure. The default size is 2 MiB and any
> +invalid size written will return a failure.

s/Read the file/Reading the file/
s/currently, write the/currently; write the/

> +::
> +    $ cat buflet_size
> +    [2 MiB]     4 MiB
> +    $ echo 4 > buflet_size
> +    $ cat buflet_size
> +    2 MiB     [4 MiB]
> +
> +5. buflet_nums
> +--------------
> +
> +You can write the desired buflet count to the file to configure,
> +and read the file to get current buflet count. The default
> +value is 64. And any positive value is valid. Note that big value
> +may lead to DMA memory allocation failure, and you will not be
> +able to start tracing. If it happens, you should consider adjusting
> +buflet_nums or buflet_size.

s/And any positive/Any positive/

> +::
> +    $ cat buflet_nums
> +    64
> +    $ echo 128 > buflet_nums
> +    $ cat buflet_nums
> +    128
> +
> +6. data
> +-------
> +
> +The file to access the traced data. You can read the file to get the
> +binary blob of traced TLP headers. The format of the headers is
> +4 Dword length and is just as defined by the PCIe Spec r4.0,
> +Sec 2.2.4.1, or 8 Dword length with additional 4 Dword extra
> +information.
> +
> +echo "" > data will free all the trace buffers allocated as well as
> +the traced datas.
> +
> +7. trace_on
> +-----------
> +
> +Start or end the trace by simple writing to the file, and monitor the
> +trace status by reading the file.

s/by simple writing/by writing/

> +::
> +    $ echo 1 > trace_on     # start trace
> +    $ cat trace_on          # get the trace status
> +    1
> +    $ echo 0 > trace_on     # manually end trace
> +
> +The read value of the trace_on will be auto cleared if the buffer
> +allocated is full. 1 indicates the trace is running and 0 for
> +stopped. Write any non-zero value to the file can start trace.

"Writing any non-zero value to the file starts tracing."

> +8. free_buffer
> +--------------
> +
> +File to indicate the trace buffer status and to manually free the
> +trace buffer. The read value of 1 indicates the trace buffer has
> +been allocated and exists in the memory, while 0 indicates there
> +is no buffer allocated. Write 1 to the file to free the trace
> +buffer as well as the traced datas.

s/datas/data/

> +::
> +    $ cat free_buffer
> +    1                       # indicate the buffer exists
> +    $ echo 1 > free_buffer  # free the trace buffer
> +    $ cat free_buffer
> +    0
> +
> +9. data_format
> +--------------
> +
> +File to indicate the format of the traced TLP headers. User can also
> +specify the desired format of traced TLP headers. Available formats
> +are 4DW, 8DW which indicates the length of each TLP headers traced.
> +::
> +    $ cat data_format
> +    [4DW]    8DW
> +    $ echo 8 > data_format
> +    $ cat data_format
> +    4DW     [8DW]
> +
> +The traced TLP header format is different from the PCIe standard.

I'm confused.  Below you say the fields of the traced TLP header are
defined by the PCIe spec.  But here you say the format is *different*.
What exactly is different?

> +4DW format is like
> +::
> +    bits [31:30] [ 29:25 ][24][23][22][21][    20:11   ][    10:0    ]
> +         |-----|---------|---|---|---|---|-------------|-------------|
> +     DW0 [ Fmt ][  Type  ][T9][T8][TH][SO][   Length   ][    Time    ]
> +     DW1 [                     Header DW1                            ]
> +     DW2 [                     Header DW2                            ]
> +     DW3 [                     Header DW3                            ]
> +
> +For 8DW format, the bit[31:11] of DW0 is always 0x1fffff, which can be
> +used to distinguish the data format. 8DW format is like
> +::
> +    bits [                 31:11                 ][       10:0       ]
> +         |---------------------------------------|-------------------|
> +     DW0 [                0x1fffff               ][ Reserved (0x7ff) ]
> +     DW1 [                       Prefix                              ]
> +     DW2 [                     Header DW0                            ]
> +     DW3 [                     Header DW1                            ]
> +     DW4 [                     Header DW2                            ]
> +     DW5 [                     Header DW3                            ]
> +     DW6 [                   Reserved (0x0)                          ]
> +     DW7 [                        Time                               ]
> +
> +All the fields of the traced TLP header is defined by the PCIe Specification.
> +While 'Header DWx' means standard TLP header DWord x, and 'Time' is the
> +timestamp of the traced header.
> -- 
> 2.8.1
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver
  2021-04-07 18:55   ` Bjorn Helgaas
@ 2021-04-08 13:22     ` Yicong Yang
  2021-04-08 16:57       ` Bjorn Helgaas
  0 siblings, 1 reply; 16+ messages in thread
From: Yicong Yang @ 2021-04-08 13:22 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alexander.shishkin, linux-kernel, linux-pci, lorenzo.pieralisi,
	gregkh, jonathan.cameron, song.bao.hua, prime.zeng, linux-doc,
	linuxarm

On 2021/4/8 2:55, Bjorn Helgaas wrote:
> Move important info in the subject earlier, e.g.,
> 
>   docs: Add HiSilicon PTT device documentation
> 
> On Tue, Apr 06, 2021 at 08:45:53PM +0800, Yicong Yang wrote:
>> Document the introduction and usage of HiSilicon PTT device driver.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>  Documentation/trace/hisi-ptt.rst | 316 +++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 316 insertions(+)
>>  create mode 100644 Documentation/trace/hisi-ptt.rst
>>
>> diff --git a/Documentation/trace/hisi-ptt.rst b/Documentation/trace/hisi-ptt.rst
>> new file mode 100644
>> index 0000000..215676f
>> --- /dev/null
>> +++ b/Documentation/trace/hisi-ptt.rst
>> @@ -0,0 +1,316 @@
>> +.. SPDX-License-Identifier: GPL-2.0
>> +
>> +======================================
>> +HiSilicon PCIe Tune and Trace device
>> +======================================
>> +
>> +Introduction
>> +============
>> +
>> +HiSilicon PCIe tune and trace device (PTT) is a PCIe Root Complex
>> +integrated Endpoint (RCiEP) device, providing the capability
>> +to dynamically monitor and tune the PCIe link's events (tune),
>> +and trace the TLP headers (trace). The two functions are independent,
>> +but is recommended to use them together to analyze and enhance the
>> +PCIe link's performance.
> 
>> +On Kunpeng 930 SoC, the PCIe root complex is composed of several
>> +PCIe cores.
>> +Each core is composed of several root ports, RCiEPs, and one
>> +PTT device, like below. The PTT device is capable of tuning and
>> +tracing the link of the PCIe core.
> 
> s/root complex/Root Complex/ to match spec, diagram, RCiEP above
> s/root ports/Root Ports/ to match spec, etc (also below)
> 

thanks. will fix here and below in this doc.

> Can you connect "Kunpeng 930" to something in the kernel tree?
> "git grep -i kunpeng" shows nothing that's obviously relevant.
> I assume there's a related driver in drivers/pci/controller/?
> 

Kunpeng 930 is the product name of Hip09 platform. The PCIe
controller uses the generic PCIe driver based on ACPI.

> Is this one paragraph or two?  If one, reflow.  If two, add blank line
> between.
> 

will reflow here and below. it's one paragraph.

> IIUC, the diagram below shows two PCIe cores, each with three Root
> Ports and a PTT RCiEP.  Your text mentions "RCiEPs, and one PTT" which
> suggests RCiEPs in addition to the PTT, but the diagram doesn't show
> any, and if there are other RCiEPs, they don't seem relevant to this
> doc.  Maybe something like this?
> 
>   Each PCIe core includes several Root Ports and a PTT RCiEP ...
> 

will fix.

>> +::
>> +          +--------------Core 0-------+
>> +          |       |       [   PTT   ] |
>> +          |       |       [Root Port]---[Endpoint]
>> +          |       |       [Root Port]---[Endpoint]
>> +          |       |       [Root Port]---[Endpoint]
>> +    Root Complex  |------Core 1-------+
>> +          |       |       [   PTT   ] |
>> +          |       |       [Root Port]---[ Switch ]---[Endpoint]
>> +          |       |       [Root Port]---[Endpoint] `-[Endpoint]
>> +          |       |       [Root Port]---[Endpoint]
>> +          +---------------------------+
>> +
>> +The PTT device driver cannot be loaded if debugfs is not mounted.
>> +Each PTT device will be presented under /sys/kernel/debugfs/hisi_ptt
>> +as its root directory, with name of its BDF number.
>> +::
>> +
>> +    /sys/kernel/debug/hisi_ptt/<domain>:<bus>:<device>.<function>
>> +
>> +Tune
>> +====
>> +
>> +PTT tune is designed for monitoring and adjusting PCIe link parameters(events).
> 
> Add a space before "(".
> 

will add here and below.

>> +Currently we support events in 4 classes. The scope of the events
>> +covers the PCIe core with which the PTT device belongs to.
> 
> ... the PCIe core to which the PTT device belongs.

will fix.

>> +
>> +Each event is presented as a file under $(PTT root dir)/$(BDF)/tune, and
>> +mostly a simple open/read/write/close cycle will be used to tune
>> +the event.
>> +::
>> +    $ cd /sys/kernel/debug/hisi_ptt/$(BDF)/tune
>> +    $ ls
>> +    qos_tx_cpl    qos_tx_np    qos_tx_p
>> +    tx_path_rx_req_alloc_buf_level
>> +    tx_path_tx_req_alloc_buf_level
>> +    $ cat qos_tx_dp
>> +    1
>> +    $ echo 2 > qos_tx_dp
>> +    $ cat qos_tx_dp
>> +    2
>> +
>> +Current value(numerical value) of the event can be simply read
> 
> Add space before "(".
> 
>> +from the file, and the desired value written to the file to tune.
> 
>> +Tuning multiple events at the same time is not permitted, which means
>> +you cannot read or write more than one tune file at one time.
> 
> I think this is obvious from the model, so the sentence doesn't really
> add anything.  Each event is a separate file, and it's obvious that
> there's no way to write to multiple files simultaneously.
> 

from the usage we shown below this situation won't happen. I just worry
that users may have a program to open multiple files at the same time and
read/write simultaneously, so add this line here to mention the restriction.

>> +1. Tx path QoS control
>> +------------------------
>> +
>> +Following files are provided to tune the QoS of the tx path of the PCIe core.
> 
> "The following ..."
> will fix.
>> +- qos_tx_cpl: weight of tx completion TLPs
>> +- qos_tx_np: weight of tx non-posted TLPs
>> +- qos_tx_p: weight of tx posted TLPs
>> +
>> +The weight influences the proportion of certain packets on the PCIe link.
>> +For example, for the storage scenario, increase the proportion
>> +of the completion packets on the link to enhance the performance as
>> +more completions are consumed.
> 
> I don't believe you can directly influence the *proportions* of packet
> types.  The number and types of TLPs are determined by device driver
> MMIO accesses and device DMAs.  Maybe you can influence the
> *priority*?  I assume that regardless of these settings, the device
> always respects the transaction ordering rules in PCIe r5.0, sec 2.4,
> right?
> 

yes you're right. the word I used here is misleading and 'priority' is
precise. what we achieved won't violate the PCIe spec. the ordering
rules are always kept. When the ordering is kept, the packets with
larger weight will have more priority to be posted.

>> +The available tune data of these events is [0, 1, 2].
>> +Writing a negative value will return an error, and out of range
>> +values will be converted to 2. Note that the event value just
>> +indicates a probable level, but is not precise.
>> +
>> +2. Tx path buffer control
>> +-------------------------
>> +
>> +Following files are provided to tune the buffer of tx path of the PCIe core.
>> +
>> +- tx_path_rx_req_alloc_buf_level: watermark of RX requested
>> +- tx_path_tx_req_alloc_buf_level: watermark of TX requested
>> +
>> +These events influence the watermark of the buffer allocated for each
>> +type. RX means the inbound while Tx means outbound. For a busy
>> +direction, you should increase the related buffer watermark to enhance
>> +the performance.
> 
> Based on what you have written here, I would just write 2 to both
> files to enhance the performance in both directions.  But obviously
> there must be some tradeoff here, e.g., increasing Rx performance
> comes at the cost of Tx performane.
> 

the Rx buffer and Tx buffer are separate, so they won't influence
each other.

> Use "Rx" or "RX" (and "Tx" or "TX") consistently.  So far we have
> "tx", "TX", "Tx", as well as "RX" and "Tx" in the same sentence.
> 

will fix here and at other places in the doc.

>> +The available tune data of above events is [0, 1, 2].
>> +Writing a negative value will return an error, and out of range
>> +values will be converted to 2. Note that the event value just
>> +indicates a probable level, but is not precise.
>> +
>> +Trace
>> +=====
>> +
>> +PTT trace is designed for dumping the TLP headers to the memory, which
>> +can be used to analyze the transactions and usage condition of the PCIe
>> +Link. You can chose to filter the traced headers by either requester ID,
> 
> s/chose/choose/
> will fix.
>> +or those downstream of a set of root ports on the same core of the PTT
>> +device. It's also support to trace the headers of certain type and of
>> +certain direction.
> 
> s/support/supported/
> 
will fix.
>> +In order to start trace, you need to configure the parameters first.
>> +The parameters files is provided under $(PTT root dir)/$(BDF)/trace.
> 
> s/files is/files are/
> 
will fix.
>> +::
>> +    $ cd /sys/kernel/debug/hisi_ptt/$(BDF)/trace
>> +    $ ls
>> +    free_buffer     filter      buflet_nums     buflet_size
>> +    direction       type        data            trace_on
>> +    data_format
>> +
>> +1. filter
>> +---------
>> +
>> +You can configure the filter of TLP headers through the file. The filter
>> +is provided as BDF numbers of either root port or subordinates, which
>> +belong to the same PCIe core. You can get the filters available and
>> +currently configured by read the file, and write the desired BDF to the
>> +file to set the filters. There is no default filter, which means you
>> +must specifiy at least one filter before start tracing.
>> +Write invalid BDF(not in the available list) will return
>> +a failure.
> 
> s/by read/by reading/
> s/specifiy/specify/
> s/before start/before starting/
> s/Write invalid/Writing an invalid/
> s/BDF(not/BDF (not/
> 
> Reflow or separate paragraphs with blank lines.
> 
will fix these and reflow here.
>> +::
>> +    $ echo 0000:80:04.0 > filter
>> +    $ cat filter
>> +    #### Root Ports ####
>> +    0000:80:00.0
>> +    [0000:80:04.0]
>> +    #### Functions ####
>> +    0000:81:00.0
>> +    0000:81:00.1
>> +    0000:82:00.0
>> +
>> +Note that multiple root ports can be specified at one time, but only
>> +one Endpoint function can be specified in one trace.
>> +Specifying both root port and function at the same time is not supported.
>> +
>> +If no filter is available, read the filter will get the hint.
> 
> s/read the/reading the/
> 
will fix.
>> +::
>> +    $ cat filter
>> +    #### No available filter ####
>> +
>> +The filter can be dynamically updated, which means you can always
>> +get correct filter information when hotplug events happens, or
>> +manually remove/rescan the devices.
> 
> s/events happens/events happen/
> s/or manually remove/or when you manually remove/
> 
will fix.
>> +2. type
>> +-------
>> +
>> +You can trace the TLP headers of certain types by configure the file.
>> +Read the file will get available types and current setting, and write
>> +the desired type to the file to configure. The default type is
>> +`posted_request` and write types not in the available list will return
>> +a failure.
> 
> s/by configure/by configuring/
> s/Read the file/Reading the file/
> s/, and write the/. Write the/
> 
will fix.
>> +::
>> +    $ echo completion > type
>> +    $ cat type
>> +    all  posted_request  non-posted_request  [completion]
>> +
>> +3. direction
>> +------------
>> +
>> +You can trace the TLP headers from certain direction, which is relative
>> +to the root port or the PCIe core. Read the file to get available
>> +directions and current configurition, and write the desired direction
>> +to configure. The default value is `rx` and any invalid direction will
>> +return a failure. Note `rxtx_no_dma_p2p` means the headers of both
>> +directions, but not include P2P DMA access.
>> +::
>> +    $ echo rxtx > direction
>> +    $ cat direction
>> +    rx  tx  [rxtx]  rxtx_no_dma_p2p
>> +
>> +4. buflet_size
>> +--------------
>> +
>> +The traced TLP headers will be written to the memory allocated
>> +by the driver. The hardware accept 4 DMA address with same size,
>> +and write the buflet sequentially like below. If DMA addr 3 is
>> +finished and the trace is still on, it will return to addr 0.
>> +Driver will allocated each DMA buffer (we call it buflet).
>> +The finished buflet will be replaced with a new one, so
>> +a long time trace can be achieved.
> 
> s/hardware accept/hardware accepts/
> s/and write the/and writes the/
> s/will allocated/will allocate/
> 
will fix.
>> +::
>> +    +->[DMA addr 0]->[DMA addr 1]->[DMA addr 2]->[DMA addr 3]-+
>> +    +---------------------------------------------------------+
>> +
>> +You should both configure the buflet_size and buflet_nums to
>> +configure the `trace buffer` to receive the TLP headers. The
>> +total trace buffer size is buflet_size * buflet_nums. Note
>> +that the trace buffer will not be allocated immediately after you
>> +configure the parameters, but will be allocated right before
>> +the trace starts.
>> +
>> +This file configures the buflet size. Read the file will get
>> +available buflet size and size set currently, write the desired
>> +size to the file to configure. The default size is 2 MiB and any
>> +invalid size written will return a failure.
> 
> s/Read the file/Reading the file/
> s/currently, write the/currently; write the/
> 
will fix.
>> +::
>> +    $ cat buflet_size
>> +    [2 MiB]     4 MiB
>> +    $ echo 4 > buflet_size
>> +    $ cat buflet_size
>> +    2 MiB     [4 MiB]
>> +
>> +5. buflet_nums
>> +--------------
>> +
>> +You can write the desired buflet count to the file to configure,
>> +and read the file to get current buflet count. The default
>> +value is 64. And any positive value is valid. Note that big value
>> +may lead to DMA memory allocation failure, and you will not be
>> +able to start tracing. If it happens, you should consider adjusting
>> +buflet_nums or buflet_size.
> 
> s/And any positive/Any positive/
> 
will fix.
>> +::
>> +    $ cat buflet_nums
>> +    64
>> +    $ echo 128 > buflet_nums
>> +    $ cat buflet_nums
>> +    128
>> +
>> +6. data
>> +-------
>> +
>> +The file to access the traced data. You can read the file to get the
>> +binary blob of traced TLP headers. The format of the headers is
>> +4 Dword length and is just as defined by the PCIe Spec r4.0,
>> +Sec 2.2.4.1, or 8 Dword length with additional 4 Dword extra
>> +information.
>> +
>> +echo "" > data will free all the trace buffers allocated as well as
>> +the traced datas.
>> +
>> +7. trace_on
>> +-----------
>> +
>> +Start or end the trace by simple writing to the file, and monitor the
>> +trace status by reading the file.
> 
> s/by simple writing/by writing/
> 
will fix.
>> +::
>> +    $ echo 1 > trace_on     # start trace
>> +    $ cat trace_on          # get the trace status
>> +    1
>> +    $ echo 0 > trace_on     # manually end trace
>> +
>> +The read value of the trace_on will be auto cleared if the buffer
>> +allocated is full. 1 indicates the trace is running and 0 for
>> +stopped. Write any non-zero value to the file can start trace.
> 
> "Writing any non-zero value to the file starts tracing."
> 
will fix.
>> +8. free_buffer
>> +--------------
>> +
>> +File to indicate the trace buffer status and to manually free the
>> +trace buffer. The read value of 1 indicates the trace buffer has
>> +been allocated and exists in the memory, while 0 indicates there
>> +is no buffer allocated. Write 1 to the file to free the trace
>> +buffer as well as the traced datas.
> 
> s/datas/data/
> 
will fix.
>> +::
>> +    $ cat free_buffer
>> +    1                       # indicate the buffer exists
>> +    $ echo 1 > free_buffer  # free the trace buffer
>> +    $ cat free_buffer
>> +    0
>> +
>> +9. data_format
>> +--------------
>> +
>> +File to indicate the format of the traced TLP headers. User can also
>> +specify the desired format of traced TLP headers. Available formats
>> +are 4DW, 8DW which indicates the length of each TLP headers traced.
>> +::
>> +    $ cat data_format
>> +    [4DW]    8DW
>> +    $ echo 8 > data_format
>> +    $ cat data_format
>> +    4DW     [8DW]
>> +
>> +The traced TLP header format is different from the PCIe standard.
> 
> I'm confused.  Below you say the fields of the traced TLP header are
> defined by the PCIe spec.  But here you say the format is *different*.
> What exactly is different?
> 

For the Request Header Format for 64-bit addressing of Memory, defind in
PCIe spec 4.0, Figure 2-15, the 1st DW is like:

Byte 0 > [Fmt] [Type] [T9] [Tc] [T8] [Attr] [LN] [TH] ... [Length]

some are recorded in our traced header like below, which some are not.
that's what I mean the format of the header are different. But for a
certain field like 'Fmt', the meaning keeps same with what Spec defined.
that's what I mean the fields definition of our traced header keep same
with the Spec.

Seem what I described in the doc is a little ambigious.

Thanks for the review.

Thanks,
Yicong

>> +4DW format is like
>> +::
>> +    bits [31:30] [ 29:25 ][24][23][22][21][    20:11   ][    10:0    ]
>> +         |-----|---------|---|---|---|---|-------------|-------------|
>> +     DW0 [ Fmt ][  Type  ][T9][T8][TH][SO][   Length   ][    Time    ]
>> +     DW1 [                     Header DW1                            ]
>> +     DW2 [                     Header DW2                            ]
>> +     DW3 [                     Header DW3                            ]
>> +
>> +For 8DW format, the bit[31:11] of DW0 is always 0x1fffff, which can be
>> +used to distinguish the data format. 8DW format is like
>> +::
>> +    bits [                 31:11                 ][       10:0       ]
>> +         |---------------------------------------|-------------------|
>> +     DW0 [                0x1fffff               ][ Reserved (0x7ff) ]
>> +     DW1 [                       Prefix                              ]
>> +     DW2 [                     Header DW0                            ]
>> +     DW3 [                     Header DW1                            ]
>> +     DW4 [                     Header DW2                            ]
>> +     DW5 [                     Header DW3                            ]
>> +     DW6 [                   Reserved (0x0)                          ]
>> +     DW7 [                        Time                               ]
>> +
>> +All the fields of the traced TLP header is defined by the PCIe Specification.
>> +While 'Header DWx' means standard TLP header DWord x, and 'Time' is the
>> +timestamp of the traced header.
>> -- 
>> 2.8.1
>>
> 
> .
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device
  2021-04-07 10:25     ` Greg KH
@ 2021-04-08 13:25       ` Yicong Yang
  0 siblings, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2021-04-08 13:25 UTC (permalink / raw)
  To: Greg KH
  Cc: alexander.shishkin, helgaas, linux-kernel, linux-pci,
	lorenzo.pieralisi, jonathan.cameron, song.bao.hua, prime.zeng,
	linux-doc, linuxarm, liuqi (BA)

On 2021/4/7 18:25, Greg KH wrote:
> On Wed, Apr 07, 2021 at 06:03:11PM +0800, Yicong Yang wrote:
>> On 2021/4/6 21:49, Greg KH wrote:
>>> On Tue, Apr 06, 2021 at 08:45:50PM +0800, Yicong Yang wrote:
>>>> HiSilicon PCIe tune and trace device(PTT) is a PCIe Root Complex
>>>> integrated Endpoint(RCiEP) device, providing the capability
>>>> to dynamically monitor and tune the PCIe traffic(tune),
>>>> and trace the TLP headers(trace). The driver exposes the user
>>>> interface through debugfs, so no need for extra user space tools.
>>>> The usage is described in the document.
>>>
>>> Why use debugfs and not the existing perf tools for debugging?
>>>
>>
>> The perf doesn't match our device as we've analyzed.
>>
>> For the tune function it doesn't do the sampling at all.
>> User specifys one link parameter and reads its current value or set
>> the desired one. The process is static. We didn't find a
>> way to adapt to perf.
>>
>> For the trace function, we may barely adapt to the perf framework
>> but it doesn't seems like a better choice. We have our own format
>> of data and don't need perf doing the parsing, and we'll get extra
>> information added by perf as well. The settings through perf tools
>> won't satisfy our needs, we cannot present available settings
>> (filter BDF number, TLP types, buffer controls) to
>> the user and user cannot set in a friendly way. For example,
>> we cannot count on perf to decode the usual format BDF number like
>> <domain>:<bus>:<dev>.<fn>, which user can use filter the TLP
>> headers.
> 
> Please work with the perf developers to come up with a solution.  I find
> it hard to believe that your hardware is so different than all the other
> hardware that perf currently supports.  I would need their agreement
> that you can not use perf before accepting this patchset.
> 

Sure. I'll resend this series with more detailed information and with perf list
and developers cc'ed to collect more suggestions on this device and driver.

Thanks,
Yicong





^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver
  2021-04-08 13:22     ` Yicong Yang
@ 2021-04-08 16:57       ` Bjorn Helgaas
  2021-04-09 14:09         ` Yicong Yang
  0 siblings, 1 reply; 16+ messages in thread
From: Bjorn Helgaas @ 2021-04-08 16:57 UTC (permalink / raw)
  To: Yicong Yang
  Cc: alexander.shishkin, linux-kernel, linux-pci, lorenzo.pieralisi,
	gregkh, jonathan.cameron, song.bao.hua, prime.zeng, linux-doc,
	linuxarm

On Thu, Apr 08, 2021 at 09:22:52PM +0800, Yicong Yang wrote:
> On 2021/4/8 2:55, Bjorn Helgaas wrote:
> > On Tue, Apr 06, 2021 at 08:45:53PM +0800, Yicong Yang wrote:

> >> +On Kunpeng 930 SoC, the PCIe root complex is composed of several
> >> +PCIe cores.
> 
> > Can you connect "Kunpeng 930" to something in the kernel tree?
> > "git grep -i kunpeng" shows nothing that's obviously relevant.
> > I assume there's a related driver in drivers/pci/controller/?
> 
> Kunpeng 930 is the product name of Hip09 platform. The PCIe
> controller uses the generic PCIe driver based on ACPI.

I guess I'm just looking for a hint to help users know when to enable
the Kconfig for this.  Maybe the "HiSilicon" in the Kconfig help is
enough?  Maybe "Kunpeng 930" is not even necessary?  If "Kunpeng 930"
*is* necessary, there should be some way to relate it to something
else.

> >> +from the file, and the desired value written to the file to tune.
> > 
> >> +Tuning multiple events at the same time is not permitted, which means
> >> +you cannot read or write more than one tune file at one time.
> > 
> > I think this is obvious from the model, so the sentence doesn't really
> > add anything.  Each event is a separate file, and it's obvious that
> > there's no way to write to multiple files simultaneously.
> 
> from the usage we shown below this situation won't happen. I just worry
> that users may have a program to open multiple files at the same time and
> read/write simultaneously, so add this line here to mention the restriction.

How is this possible?  I don't think "writing multiple files
simultaneously" is even possible in the Linux syscall model.  I don't
think a user will do anything differently after reading "you cannot
read or write more than one tune file at one time."

> >> +- tx_path_rx_req_alloc_buf_level: watermark of RX requested
> >> +- tx_path_tx_req_alloc_buf_level: watermark of TX requested
> >> +
> >> +These events influence the watermark of the buffer allocated for each
> >> +type. RX means the inbound while Tx means outbound. For a busy
> >> +direction, you should increase the related buffer watermark to enhance
> >> +the performance.
> > 
> > Based on what you have written here, I would just write 2 to both
> > files to enhance the performance in both directions.  But obviously
> > there must be some tradeoff here, e.g., increasing Rx performance
> > comes at the cost of Tx performane.
> 
> the Rx buffer and Tx buffer are separate, so they won't influence
> each other.

Why would I write anything other than 2 to these files?  That's the
question I think this paragraph should answer.

> >> +9. data_format
> >> +--------------
> >> +
> >> +File to indicate the format of the traced TLP headers. User can also
> >> +specify the desired format of traced TLP headers. Available formats
> >> +are 4DW, 8DW which indicates the length of each TLP headers traced.
> >> +::
> >> +    $ cat data_format
> >> +    [4DW]    8DW
> >> +    $ echo 8 > data_format
> >> +    $ cat data_format
> >> +    4DW     [8DW]
> >> +
> >> +The traced TLP header format is different from the PCIe standard.
> > 
> > I'm confused.  Below you say the fields of the traced TLP header are
> > defined by the PCIe spec.  But here you say the format is *different*.
> > What exactly is different?
> 
> For the Request Header Format for 64-bit addressing of Memory, defind in
> PCIe spec 4.0, Figure 2-15, the 1st DW is like:
> 
> Byte 0 > [Fmt] [Type] [T9] [Tc] [T8] [Attr] [LN] [TH] ... [Length]
> 
> some are recorded in our traced header like below, which some are not.
> that's what I mean the format of the header are different. But for a
> certain field like 'Fmt', the meaning keeps same with what Spec defined.
> that's what I mean the fields definition of our traced header keep same
> with the Spec.

Ah, that helps a lot, thank you.  Maybe you could say something along
the lines of this:

  When using the 8DW data format, the entire TLP header is logged.
  For example, the TLP header for Memory Reads with 64-bit addresses
  is shown in PCIe r5.0, Figure 2-17; the header for Configuration
  Requests is shown in Figure 2.20, etc.

  In addition, 8DW trace buffer entries contain a timestamp and
  possibly a prefix, e.g., a PASID TLP prefix (see Figure 6-20).  TLPs
  may include more than one prefix, but only one can be logged in
  trace buffer entries.

  When using the 4DW data format, DW0 of the trace buffer entry
  contains selected fields of DW0 of the TLP, together with a
  timestamp.  DW1-DW3 of the trace buffer entry contain DW1-DW3
  directly from the TLP header.

This looks like a really cool device.  I wish we had this for more
platforms.

Bjorn

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver
  2021-04-08 16:57       ` Bjorn Helgaas
@ 2021-04-09 14:09         ` Yicong Yang
  0 siblings, 0 replies; 16+ messages in thread
From: Yicong Yang @ 2021-04-09 14:09 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: alexander.shishkin, linux-kernel, linux-pci, lorenzo.pieralisi,
	gregkh, jonathan.cameron, song.bao.hua, prime.zeng, linux-doc,
	linuxarm

On 2021/4/9 0:57, Bjorn Helgaas wrote:
> On Thu, Apr 08, 2021 at 09:22:52PM +0800, Yicong Yang wrote:
>> On 2021/4/8 2:55, Bjorn Helgaas wrote:
>>> On Tue, Apr 06, 2021 at 08:45:53PM +0800, Yicong Yang wrote:
> 
>>>> +On Kunpeng 930 SoC, the PCIe root complex is composed of several
>>>> +PCIe cores.
>>
>>> Can you connect "Kunpeng 930" to something in the kernel tree?
>>> "git grep -i kunpeng" shows nothing that's obviously relevant.
>>> I assume there's a related driver in drivers/pci/controller/?
>>
>> Kunpeng 930 is the product name of Hip09 platform. The PCIe
>> controller uses the generic PCIe driver based on ACPI.
> 
> I guess I'm just looking for a hint to help users know when to enable
> the Kconfig for this.  Maybe the "HiSilicon" in the Kconfig help is
> enough?  Maybe "Kunpeng 930" is not even necessary?  If "Kunpeng 930"
> *is* necessary, there should be some way to relate it to something
> else.
> 

since it's added in Kunpeng 930. otherwise users maybe confused why they
don't find it on Kunpeng 920 (Hip 08) or older platforms. The Kunpeng is
the product name, users should have known it when they have such a platform.

>>>> +from the file, and the desired value written to the file to tune.
>>>
>>>> +Tuning multiple events at the same time is not permitted, which means
>>>> +you cannot read or write more than one tune file at one time.
>>>
>>> I think this is obvious from the model, so the sentence doesn't really
>>> add anything.  Each event is a separate file, and it's obvious that
>>> there's no way to write to multiple files simultaneously.
>>
>> from the usage we shown below this situation won't happen. I just worry
>> that users may have a program to open multiple files at the same time and
>> read/write simultaneously, so add this line here to mention the restriction.
> 
> How is this possible?  I don't think "writing multiple files
> simultaneously" is even possible in the Linux syscall model.  I don't
> think a user will do anything differently after reading "you cannot
> read or write more than one tune file at one time."
> 
then I'll remove this line. thanks.
>>>> +- tx_path_rx_req_alloc_buf_level: watermark of RX requested
>>>> +- tx_path_tx_req_alloc_buf_level: watermark of TX requested
>>>> +
>>>> +These events influence the watermark of the buffer allocated for each
>>>> +type. RX means the inbound while Tx means outbound. For a busy
>>>> +direction, you should increase the related buffer watermark to enhance
>>>> +the performance.
>>>
>>> Based on what you have written here, I would just write 2 to both
>>> files to enhance the performance in both directions.  But obviously
>>> there must be some tradeoff here, e.g., increasing Rx performance
>>> comes at the cost of Tx performane.
>>
>> the Rx buffer and Tx buffer are separate, so they won't influence
>> each other.
> 
> Why would I write anything other than 2 to these files?  That's the
> question I think this paragraph should answer.
> 

In most cases just keep the normal level.

the data in the buffer will be posted when reaching the watermark
or timed out. for some situation you have an idle traffic but you
want a quick response, then set the watermark in lower level.

>>>> +9. data_format
>>>> +--------------
>>>> +
>>>> +File to indicate the format of the traced TLP headers. User can also
>>>> +specify the desired format of traced TLP headers. Available formats
>>>> +are 4DW, 8DW which indicates the length of each TLP headers traced.
>>>> +::
>>>> +    $ cat data_format
>>>> +    [4DW]    8DW
>>>> +    $ echo 8 > data_format
>>>> +    $ cat data_format
>>>> +    4DW     [8DW]
>>>> +
>>>> +The traced TLP header format is different from the PCIe standard.
>>>
>>> I'm confused.  Below you say the fields of the traced TLP header are
>>> defined by the PCIe spec.  But here you say the format is *different*.
>>> What exactly is different?
>>
>> For the Request Header Format for 64-bit addressing of Memory, defind in
>> PCIe spec 4.0, Figure 2-15, the 1st DW is like:
>>
>> Byte 0 > [Fmt] [Type] [T9] [Tc] [T8] [Attr] [LN] [TH] ... [Length]
>>
>> some are recorded in our traced header like below, which some are not.
>> that's what I mean the format of the header are different. But for a
>> certain field like 'Fmt', the meaning keeps same with what Spec defined.
>> that's what I mean the fields definition of our traced header keep same
>> with the Spec.
> 
> Ah, that helps a lot, thank you.  Maybe you could say something along
> the lines of this:
> 
>   When using the 8DW data format, the entire TLP header is logged.
>   For example, the TLP header for Memory Reads with 64-bit addresses
>   is shown in PCIe r5.0, Figure 2-17; the header for Configuration
>   Requests is shown in Figure 2.20, etc.
> 
>   In addition, 8DW trace buffer entries contain a timestamp and
>   possibly a prefix, e.g., a PASID TLP prefix (see Figure 6-20).  TLPs
>   may include more than one prefix, but only one can be logged in
>   trace buffer entries.
> 

yes. but currently we'll only trace the PASID TLP prefix. This field
will be all 0 for a packet with no PASID TLP prefix. I'll mention
it in the doc.

>   When using the 4DW data format, DW0 of the trace buffer entry
>   contains selected fields of DW0 of the TLP, together with a
>   timestamp.  DW1-DW3 of the trace buffer entry contain DW1-DW3
>   directly from the TLP header.
> 

thanks a lot. it does look more clearer. will update the doc
as suggested.

> This looks like a really cool device.  I wish we had this for more
> platforms.

thanks! glad to hear that!

Regards,
Yicong


^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-04-09 14:10 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-06 12:45 [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device Yicong Yang
2021-04-06 12:45 ` [PATCH 1/4] hwtracing: Add trace function " Yicong Yang
2021-04-06 13:51   ` Greg KH
2021-04-06 16:14   ` kernel test robot
2021-04-06 22:48   ` kernel test robot
2021-04-06 12:45 ` [PATCH 2/4] hwtracing: Add tune " Yicong Yang
2021-04-06 12:45 ` [PATCH 3/4] docs: Add documentation for HiSilicon PTT device driver Yicong Yang
2021-04-07 18:55   ` Bjorn Helgaas
2021-04-08 13:22     ` Yicong Yang
2021-04-08 16:57       ` Bjorn Helgaas
2021-04-09 14:09         ` Yicong Yang
2021-04-06 12:45 ` [PATCH 4/4] MAINTAINERS: Add maintainer for HiSilicon PTT driver Yicong Yang
2021-04-06 13:49 ` [PATCH 0/4] Add support for HiSilicon PCIe Tune and Trace device Greg KH
2021-04-07 10:03   ` Yicong Yang
2021-04-07 10:25     ` Greg KH
2021-04-08 13:25       ` Yicong Yang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).