linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V11 0/8] PCI: Enable 10-Bit tag support for PCIe devices
@ 2021-10-30 13:53 Dongdong Liu
  2021-10-30 13:53 ` [PATCH V11 1/8] PCI: Use cached devcap in more places Dongdong Liu
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: Dongdong Liu @ 2021-10-30 13:53 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
field size from 8 bits to 10 bits.

This patchset is to enable 10-Bit tag for PCIe EP devices (include VF).

V10->V11:
- Rebased on V5.15-rc7.
- Rename sysfs file 10bit_tag to tags.
- Fix some other comments suggested by Bjorn.

V9->V10:
- Rebased on V5.15-rc4.
- Fix some commets suggested by Krzysztof.

V8->V9:
- Rebased on V5.15-rc2.
- Rename pcie_devcap to devcap, pcie_devcap2 to devcap2 to keep the same
  style with commit 691392448065 ("PCI: Cache PCIe Device Capabilities
  register").

V7->V8:
- Add a kernel parameter pcie_tag_peer2peer to disable 10-bit tags.
- Provide sysfs file to enable 10-bit tags.
- Remove [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices.
- Rebased on v5.14-rc6.
- Fix some other comments. Thanks to Bjorn who gave a lot of review comments.

V6->V7:
- Rebased on v5.14-rc3.
- Change the "pci=disable_10bit_tag=" parameter to sysfs file to disable
  10-Bit Tag Requester when need for p2pdma suggested by Leon.
- Fix comment for p2pdma 10-bit tag check.

V5->V6:
- Rebased on v5.14-rc2.
- Add Reviewed-by: Christoph Hellwig <hch@lst.de> in [PATCH V6 2/8].
- PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support.
- Add a 10-bit tag check in P2PDMA.
- Simplified implementation in [PATCH V6 6/8].
- Fix some comments in [PATCH V6 4/8].

V4->V5:
- Fix warning variable 'capa' is uninitialized.
- Fix warning unused variable 'pchild'.

V3->V4:
- Get the value of pcie_devcap2 in set_pcie_port_type().
- Add Reviewed-by: Christoph Hellwig <hch@lst.de> in [PATCH V4 1/6],
  [PATCH V4 3/6], [PATCH V4 4/6], [PATCH V4 5/6].
- Fix some code style.
- Rebased on v5.13-rc6.

V2->V3:
- Use cached Device Capabilities Register suggested by Christoph.
- Fix code style to avoid > 80 char lines.
- Rename devcap2 to pcie_devcap2.

V1->V2: Fix some comments by Christoph.
- Store the devcap2 value in the pci_dev instead of reading it multiple
  times.
- Change pci_info to pci_dbg to avoid the noisy log.
- Rename ext_10bit_tag_comp_path to ext_10bit_tag.
- Fix the compile error.
- Rebased on v5.13-rc1.

Dongdong Liu (8):
  PCI: Use cached devcap in more places
  PCI: Cache Device Capabilities 2 Register
  PCI: Add 10-Bit Tag register definitions
  PCI/sysfs: Add a tags sysfs file for PCIe Endpoint devices
  PCI/IOV: Add tags sysfs files for VF devices
  PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices

 Documentation/ABI/testing/sysfs-bus-pci       | 56 +++++++++++-
 .../admin-guide/kernel-parameters.txt         |  5 +
 drivers/media/pci/cobalt/cobalt-driver.c      |  4 +-
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  4 +-
 drivers/pci/iov.c                             | 84 +++++++++++++++++
 drivers/pci/p2pdma.c                          | 48 ++++++++++
 drivers/pci/pci-sysfs.c                       | 91 +++++++++++++++++++
 drivers/pci/pci.c                             | 12 ++-
 drivers/pci/pci.h                             |  9 ++
 drivers/pci/pcie/aspm.c                       | 11 +--
 drivers/pci/probe.c                           | 75 ++++++++++++---
 drivers/pci/quirks.c                          |  3 +-
 include/linux/pci.h                           |  1 +
 include/uapi/linux/pci_regs.h                 |  5 +
 14 files changed, 377 insertions(+), 31 deletions(-)

--
2.22.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH V11 1/8] PCI: Use cached devcap in more places
  2021-10-30 13:53 [PATCH V11 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
@ 2021-10-30 13:53 ` Dongdong Liu
  2021-10-30 13:53 ` [PATCH V11 2/8] PCI: Cache Device Capabilities 2 Register Dongdong Liu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Dongdong Liu @ 2021-10-30 13:53 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Since commit 691392448065 ("PCI: Cache PCIe Device Capabilities register")
has already added a new member called devcap in struct pci_dev for
caching the PCIe Device Capabilities register to avoid reading
PCI_EXP_DEVCAP multiple times. Use devcap in more needed places.

Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/media/pci/cobalt/cobalt-driver.c |  4 ++--
 drivers/pci/pcie/aspm.c                  | 11 ++++-------
 drivers/pci/probe.c                      |  7 +------
 drivers/pci/quirks.c                     |  3 +--
 4 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/media/pci/cobalt/cobalt-driver.c b/drivers/media/pci/cobalt/cobalt-driver.c
index 16af58f2f93c..bc04184f1f74 100644
--- a/drivers/media/pci/cobalt/cobalt-driver.c
+++ b/drivers/media/pci/cobalt/cobalt-driver.c
@@ -193,11 +193,11 @@ void cobalt_pcie_status_show(struct cobalt *cobalt)
 		return;
 
 	/* Device */
-	pcie_capability_read_dword(pci_dev, PCI_EXP_DEVCAP, &capa);
 	pcie_capability_read_word(pci_dev, PCI_EXP_DEVCTL, &ctrl);
 	pcie_capability_read_word(pci_dev, PCI_EXP_DEVSTA, &stat);
 	cobalt_info("PCIe device capability 0x%08x: Max payload %d\n",
-		    capa, get_payload_size(capa & PCI_EXP_DEVCAP_PAYLOAD));
+		    pci_dev->devcap,
+		    get_payload_size(pci_dev->devcap & PCI_EXP_DEVCAP_PAYLOAD));
 	cobalt_info("PCIe device control 0x%04x: Max payload %d. Max read request %d\n",
 		    ctrl,
 		    get_payload_size((ctrl & PCI_EXP_DEVCTL_PAYLOAD) >> 5),
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 013a47f587ce..82d6234a4aa5 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -660,7 +660,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 
 	/* Get and check endpoint acceptable latencies */
 	list_for_each_entry(child, &linkbus->devices, bus_list) {
-		u32 reg32, encoding;
+		u32 encoding;
 		struct aspm_latency *acceptable =
 			&link->acceptable[PCI_FUNC(child->devfn)];
 
@@ -668,12 +668,11 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 		    pci_pcie_type(child) != PCI_EXP_TYPE_LEG_END)
 			continue;
 
-		pcie_capability_read_dword(child, PCI_EXP_DEVCAP, &reg32);
 		/* Calculate endpoint L0s acceptable latency */
-		encoding = (reg32 & PCI_EXP_DEVCAP_L0S) >> 6;
+		encoding = (child->devcap & PCI_EXP_DEVCAP_L0S) >> 6;
 		acceptable->l0s = calc_l0s_acceptable(encoding);
 		/* Calculate endpoint L1 acceptable latency */
-		encoding = (reg32 & PCI_EXP_DEVCAP_L1) >> 9;
+		encoding = (child->devcap & PCI_EXP_DEVCAP_L1) >> 9;
 		acceptable->l1 = calc_l1_acceptable(encoding);
 
 		pcie_aspm_check_latency(child);
@@ -808,7 +807,6 @@ static void free_link_state(struct pcie_link_state *link)
 static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 {
 	struct pci_dev *child;
-	u32 reg32;
 
 	/*
 	 * Some functions in a slot might not all be PCIe functions,
@@ -831,8 +829,7 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
 		 * RBER bit to determine if a function is 1.1 version device
 		 */
-		pcie_capability_read_dword(child, PCI_EXP_DEVCAP, &reg32);
-		if (!(reg32 & PCI_EXP_DEVCAP_RBER) && !aspm_force) {
+		if (!(child->devcap & PCI_EXP_DEVCAP_RBER) && !aspm_force) {
 			pci_info(child, "disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'\n");
 			return -EINVAL;
 		}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index d9fc02a71baa..96ecdf34f931 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2044,18 +2044,13 @@ static void pci_configure_mps(struct pci_dev *dev)
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-	u32 cap;
 	u16 ctl;
 	int ret;
 
 	if (!pci_is_pcie(dev))
 		return 0;
 
-	ret = pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
-	if (ret)
-		return 0;
-
-	if (!(cap & PCI_EXP_DEVCAP_EXT_TAG))
+	if (!(dev->devcap & PCI_EXP_DEVCAP_EXT_TAG))
 		return 0;
 
 	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL, &ctl);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 4537d1ea14fd..1bd0d610f3e0 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5260,8 +5260,7 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
 		pdev->pcie_cap = pos;
 		pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
 		pdev->pcie_flags_reg = reg16;
-		pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, &reg16);
-		pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD;
+		pdev->pcie_mpss = pdev->devcap & PCI_EXP_DEVCAP_PAYLOAD;
 
 		pdev->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
 		if (pci_read_config_dword(pdev, PCI_CFG_SPACE_SIZE, &status) !=
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V11 2/8] PCI: Cache Device Capabilities 2 Register
  2021-10-30 13:53 [PATCH V11 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
  2021-10-30 13:53 ` [PATCH V11 1/8] PCI: Use cached devcap in more places Dongdong Liu
@ 2021-10-30 13:53 ` Dongdong Liu
  2021-10-30 13:53 ` [PATCH V11 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Dongdong Liu @ 2021-10-30 13:53 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add a new member called devcap2 in struct pci_dev for caching the PCIe
Device Capabilities 2 register to avoid reading PCI_EXP_DEVCAP2 multiple
times.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |  4 +---
 drivers/pci/pci.c                               |  8 +++-----
 drivers/pci/probe.c                             | 10 ++++------
 include/linux/pci.h                             |  1 +
 4 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 0d9cda4ab303..ae0b6def994b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -6304,7 +6304,6 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs)
 		struct pci_dev *pbridge;
 		struct port_info *pi;
 		char name[IFNAMSIZ];
-		u32 devcap2;
 		u16 flags;
 
 		/* If we want to instantiate Virtual Functions, then our
@@ -6314,10 +6313,9 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs)
 		 */
 		pbridge = pdev->bus->self;
 		pcie_capability_read_word(pbridge, PCI_EXP_FLAGS, &flags);
-		pcie_capability_read_dword(pbridge, PCI_EXP_DEVCAP2, &devcap2);
 
 		if ((flags & PCI_EXP_FLAGS_VERS) < 2 ||
-		    !(devcap2 & PCI_EXP_DEVCAP2_ARI)) {
+		    !(pbridge->devcap2 & PCI_EXP_DEVCAP2_ARI)) {
 			/* Our parent bridge does not support ARI so issue a
 			 * warning and skip instantiating the VFs.  They
 			 * won't be reachable.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ce2ab62b64cf..64138a83b0f7 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3717,7 +3717,7 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
 {
 	struct pci_bus *bus = dev->bus;
 	struct pci_dev *bridge;
-	u32 cap, ctl2;
+	u32 ctl2;
 
 	if (!pci_is_pcie(dev))
 		return -EINVAL;
@@ -3741,19 +3741,17 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
 	while (bus->parent) {
 		bridge = bus->self;
 
-		pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
-
 		switch (pci_pcie_type(bridge)) {
 		/* Ensure switch ports support AtomicOp routing */
 		case PCI_EXP_TYPE_UPSTREAM:
 		case PCI_EXP_TYPE_DOWNSTREAM:
-			if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
+			if (!(bridge->devcap2 & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
 				return -EINVAL;
 			break;
 
 		/* Ensure root port supports all the sizes we care about */
 		case PCI_EXP_TYPE_ROOT_PORT:
-			if ((cap & cap_mask) != cap_mask)
+			if ((bridge->devcap2 & cap_mask) != cap_mask)
 				return -EINVAL;
 			break;
 		}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 96ecdf34f931..7259ad774ac8 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1509,6 +1509,7 @@ void set_pcie_port_type(struct pci_dev *pdev)
 	pdev->pcie_flags_reg = reg16;
 	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->devcap);
 	pdev->pcie_mpss = FIELD_GET(PCI_EXP_DEVCAP_PAYLOAD, pdev->devcap);
+	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP2, &pdev->devcap2);
 
 	parent = pci_upstream_bridge(pdev);
 	if (!parent)
@@ -2129,7 +2130,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 #ifdef CONFIG_PCIEASPM
 	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
 	struct pci_dev *bridge;
-	u32 cap, ctl;
+	u32 ctl;
 
 	if (!pci_is_pcie(dev))
 		return;
@@ -2137,8 +2138,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 	/* Read L1 PM substate capabilities */
 	dev->l1ss = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_L1SS);
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
-	if (!(cap & PCI_EXP_DEVCAP2_LTR))
+	if (!(dev->devcap2 & PCI_EXP_DEVCAP2_LTR))
 		return;
 
 	pcie_capability_read_dword(dev, PCI_EXP_DEVCTL2, &ctl);
@@ -2178,13 +2178,11 @@ static void pci_configure_eetlp_prefix(struct pci_dev *dev)
 #ifdef CONFIG_PCI_PASID
 	struct pci_dev *bridge;
 	int pcie_type;
-	u32 cap;
 
 	if (!pci_is_pcie(dev))
 		return;
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
-	if (!(cap & PCI_EXP_DEVCAP2_EE_PREFIX))
+	if (!(dev->devcap2 & PCI_EXP_DEVCAP2_EE_PREFIX))
 		return;
 
 	pcie_type = pci_pcie_type(dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index cd8aa6fce204..286d89e22738 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -333,6 +333,7 @@ struct pci_dev {
 	struct pci_dev  *rcec;          /* Associated RCEC device */
 #endif
 	u32		devcap;		/* PCIe Device Capabilities */
+	u32		devcap2;	/* PCIe Device Capabilities 2 */
 	u8		pcie_cap;	/* PCIe capability offset */
 	u8		msi_cap;	/* MSI capability offset */
 	u8		msix_cap;	/* MSI-X capability offset */
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V11 3/8] PCI: Add 10-Bit Tag register definitions
  2021-10-30 13:53 [PATCH V11 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
  2021-10-30 13:53 ` [PATCH V11 1/8] PCI: Use cached devcap in more places Dongdong Liu
  2021-10-30 13:53 ` [PATCH V11 2/8] PCI: Cache Device Capabilities 2 Register Dongdong Liu
@ 2021-10-30 13:53 ` Dongdong Liu
  2021-10-30 13:53 ` [PATCH V11 4/8] PCI/sysfs: Add a tags sysfs file for PCIe Endpoint devices Dongdong Liu
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Dongdong Liu @ 2021-10-30 13:53 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add 10-Bit Tag register definitions for use in subsequen patches.
See the PCIe 5.0 spec section 7.5.3.15 and 9.3.3.2.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 include/uapi/linux/pci_regs.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index e709ae8235e7..cf1ddb82a6b9 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -648,6 +648,8 @@
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* 64b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200 /* 128b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_LTR		0x00000800 /* Latency tolerance reporting */
+#define  PCI_EXP_DEVCAP2_10BIT_TAG_COMP 0x00010000 /* 10-Bit Tag Completer Supported */
+#define  PCI_EXP_DEVCAP2_10BIT_TAG_REQ  0x00020000 /* 10-Bit Tag Requester Supported */
 #define  PCI_EXP_DEVCAP2_OBFF_MASK	0x000c0000 /* OBFF support mechanism */
 #define  PCI_EXP_DEVCAP2_OBFF_MSG	0x00040000 /* New message signaling */
 #define  PCI_EXP_DEVCAP2_OBFF_WAKE	0x00080000 /* Re-use WAKE# for OBFF */
@@ -661,6 +663,7 @@
 #define  PCI_EXP_DEVCTL2_IDO_REQ_EN	0x0100	/* Allow IDO for requests */
 #define  PCI_EXP_DEVCTL2_IDO_CMP_EN	0x0200	/* Allow IDO for completions */
 #define  PCI_EXP_DEVCTL2_LTR_EN		0x0400	/* Enable LTR mechanism */
+#define  PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN 0x1000 /* 10-Bit Tag Requester Enable */
 #define  PCI_EXP_DEVCTL2_OBFF_MSGA_EN	0x2000	/* Enable OBFF Message type A */
 #define  PCI_EXP_DEVCTL2_OBFF_MSGB_EN	0x4000	/* Enable OBFF Message type B */
 #define  PCI_EXP_DEVCTL2_OBFF_WAKE_EN	0x6000	/* OBFF using WAKE# signaling */
@@ -931,6 +934,7 @@
 /* Single Root I/O Virtualization */
 #define PCI_SRIOV_CAP		0x04	/* SR-IOV Capabilities */
 #define  PCI_SRIOV_CAP_VFM	0x00000001  /* VF Migration Capable */
+#define  PCI_SRIOV_CAP_VF_10BIT_TAG_REQ	0x00000004 /* VF 10-Bit Tag Requester Supported */
 #define  PCI_SRIOV_CAP_INTR(x)	((x) >> 21) /* Interrupt Message Number */
 #define PCI_SRIOV_CTRL		0x08	/* SR-IOV Control */
 #define  PCI_SRIOV_CTRL_VFE	0x0001	/* VF Enable */
@@ -938,6 +942,7 @@
 #define  PCI_SRIOV_CTRL_INTR	0x0004	/* VF Migration Interrupt Enable */
 #define  PCI_SRIOV_CTRL_MSE	0x0008	/* VF Memory Space Enable */
 #define  PCI_SRIOV_CTRL_ARI	0x0010	/* ARI Capable Hierarchy */
+#define  PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN 0x0020 /* VF 10-Bit Tag Requester Enable */
 #define PCI_SRIOV_STATUS	0x0a	/* SR-IOV Status */
 #define  PCI_SRIOV_STATUS_VFM	0x0001	/* VF Migration Status */
 #define PCI_SRIOV_INITIAL_VF	0x0c	/* Initial VFs */
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V11 4/8] PCI/sysfs: Add a tags sysfs file for PCIe Endpoint devices
  2021-10-30 13:53 [PATCH V11 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (2 preceding siblings ...)
  2021-10-30 13:53 ` [PATCH V11 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
@ 2021-10-30 13:53 ` Dongdong Liu
  2021-11-01 20:54   ` Bjorn Helgaas
  2021-10-30 13:53 ` [PATCH V11 5/8] PCI/IOV: Add tags sysfs files for VF devices Dongdong Liu
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Dongdong Liu @ 2021-10-30 13:53 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as
  opposed to host memory), the Endpoint must not send 10-Bit Tag
  Requests to another given Endpoint unless an implementation-specific
  mechanism determines that the Endpoint supports 10-Bit Tag Completer
  capability.

Add a tags sysfs file, write 0 to disable 10-Bit Tag Requester
when the driver does not bind the device. The typical use case is for
p2pdma when the peer device does not support 10-Bit Tag Completer.
Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
Completer capability. The typical use case is for host memory targeted
by DMA Requests. The tags file content indicate current status of Tags
Enable.

PCIe r5.0, sec 2.2.6.2 says:

  Receivers/Completers must handle 8-bit Tag values correctly regardless
  of the setting of their Extended Tag Field Enable bit (see Section
  7.5.3.4).

Add this comment in pci_configure_extended_tags(). As all PCIe completers
are required to support 8-bit tags, so we do not use tags sysfs file
to manage 8-bit tags.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 24 ++++++-
 drivers/pci/pci-sysfs.c                 | 88 +++++++++++++++++++++++++
 drivers/pci/pci.h                       |  2 +
 drivers/pci/probe.c                     | 20 ++++++
 4 files changed, 133 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index d4ae03296861..c16bb31486d2 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -156,7 +156,7 @@ Description:
 		binary file containing the Vital Product Data for the
 		device.  It should follow the VPD format defined in
 		PCI Specification 2.1 or 2.2, but users should consider
-		that some devices may have incorrectly formatted data.  
+		that some devices may have incorrectly formatted data.
 		If the underlying VPD has a writable section then the
 		corresponding section of this file will be writable.
 
@@ -424,3 +424,25 @@ Description:
 
 		The file is writable if the PF is bound to a driver that
 		implements ->sriov_set_msix_vec_count().
+
+What:		/sys/bus/pci/devices/.../tags
+Date:		September 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		The file will be visible when the device supports 10-Bit Tag
+		Requester. The file is readable, the value indicate current
+		status of Tags Enable(5-Bit, 8-Bit, 10-Bit).
+
+		The file is also writable, The values accepted are:
+		* > 0 - this number will be reported as tags bit to be
+			enabled. current only 10 is accepted
+		* < 0 - not valid
+		* = 0 - disable 10-Bit Tag, use Extended Tags(8-Bit or 5-Bit)
+
+		write 0 to disable 10-Bit Tag Requester when the driver does
+		not bind the device. The typical use case is for p2pdma when
+		the peer device does not support 10-Bit Tag Completer.
+
+		Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit
+		Tag Completer capability. The typical use case is for host
+		memory targeted by DMA Requests.
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 7fb5cd17cc98..04fd9a8b9d4e 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -306,6 +306,65 @@ static ssize_t enable_show(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RW(enable);
 
+static ssize_t tags_store(struct device *dev,
+			  struct device_attribute *attr,
+			  const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	unsigned long val;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (kstrtoul(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	/* 10 - enable 10-Bit Tag, 0 - disable 10-Bit Tag */
+	if (val != 10 && val != 0)
+		return -EINVAL;
+
+	if (pdev->driver)
+		return -EBUSY;
+
+	if (!pcie_rp_10bit_tag_cmp_supported(pdev))
+		return -EPERM;
+
+	if (val == 10)
+		pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2,
+					 PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	else
+		pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
+					   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+
+	return count;
+}
+
+static ssize_t tags_show(struct device *dev,
+			 struct device_attribute *attr,
+			 char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u16 ctl;
+	int ret;
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &ctl);
+	if (ret)
+		return -EINVAL;
+
+	if (ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN)
+		return sysfs_emit(buf, "%s\n", "10-Bit");
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &ctl);
+	if (ret)
+		return -EINVAL;
+
+	if (ctl & PCI_EXP_DEVCTL_EXT_TAG)
+		return sysfs_emit(buf, "%s\n", "8-Bit");
+
+	return sysfs_emit(buf, "%s\n", "5-Bit");
+}
+static DEVICE_ATTR_RW(tags);
+
 #ifdef CONFIG_NUMA
 static ssize_t numa_node_store(struct device *dev,
 			       struct device_attribute *attr, const char *buf,
@@ -635,6 +694,11 @@ static struct attribute *pcie_dev_attrs[] = {
 	NULL,
 };
 
+static struct attribute *pcie_dev_tags_attrs[] = {
+	&dev_attr_tags.attr,
+	NULL,
+};
+
 static struct attribute *pcibus_attrs[] = {
 	&dev_attr_bus_rescan.attr,
 	&dev_attr_cpuaffinity.attr,
@@ -1482,6 +1546,24 @@ static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
 	return 0;
 }
 
+static umode_t pcie_dev_tags_attrs_is_visible(struct kobject *kobj,
+					      struct attribute *a, int n)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (pdev->is_virtfn)
+		return 0;
+
+	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ENDPOINT)
+		return 0;
+
+	if (!(pdev->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
+		return 0;
+
+	return a->mode;
+}
+
 static const struct attribute_group pci_dev_group = {
 	.attrs = pci_dev_attrs,
 };
@@ -1522,6 +1604,11 @@ static const struct attribute_group pcie_dev_attr_group = {
 	.is_visible = pcie_dev_attrs_are_visible,
 };
 
+static const struct attribute_group pcie_dev_tags_attr_group = {
+	.attrs = pcie_dev_tags_attrs,
+	.is_visible = pcie_dev_tags_attrs_is_visible,
+};
+
 static const struct attribute_group *pci_dev_attr_groups[] = {
 	&pci_dev_attr_group,
 	&pci_dev_hp_attr_group,
@@ -1531,6 +1618,7 @@ static const struct attribute_group *pci_dev_attr_groups[] = {
 #endif
 	&pci_bridge_attr_group,
 	&pcie_dev_attr_group,
+	&pcie_dev_tags_attr_group,
 #ifdef CONFIG_PCIEAER
 	&aer_stats_attr_group,
 #endif
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1cce56c2aea0..f719a41dfc7f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -264,6 +264,8 @@ struct device *pci_get_host_bridge_device(struct pci_dev *dev);
 void pci_put_host_bridge_device(struct device *dev);
 
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
+bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev);
+
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 7259ad774ac8..8f5372c7c737 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2042,6 +2042,20 @@ static void pci_configure_mps(struct pci_dev *dev)
 		 p_mps, mps, mpss);
 }
 
+bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev)
+{
+	struct pci_dev *root;
+
+	root = pcie_find_root_port(dev);
+	if (!root)
+		return false;
+
+	if (!(root->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
+		return false;
+
+	return true;
+}
+
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
@@ -2075,6 +2089,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 		return 0;
 	}
 
+	/*
+	 * PCIe r5.0, sec 2.2.6.2 says "Receivers/Completers must handle 8-bit
+	 * Tag values correctly regardless of the setting of their Extended Tag
+	 * Field Enable bit (see Section 7.5.3.4)", so it is safe to enable
+	 * Extented Tags.
+	 */
 	if (!(ctl & PCI_EXP_DEVCTL_EXT_TAG)) {
 		pci_info(dev, "enabling Extended Tags\n");
 		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V11 5/8] PCI/IOV: Add tags sysfs files for VF devices
  2021-10-30 13:53 [PATCH V11 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (3 preceding siblings ...)
  2021-10-30 13:53 ` [PATCH V11 4/8] PCI/sysfs: Add a tags sysfs file for PCIe Endpoint devices Dongdong Liu
@ 2021-10-30 13:53 ` Dongdong Liu
  2021-10-30 13:53 ` [PATCH V11 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Dongdong Liu @ 2021-10-30 13:53 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as
  opposed to host memory), the Endpoint must not send 10-Bit Tag
  Requests to another given Endpoint unless an implementation-specific
  mechanism determines that the Endpoint supports 10-Bit Tag Completer
  capability.

Add sriov_vf_tags file to query the status of VF 10-Bit Tag
Requester Enable.

Add a sriov_vf_tags_ctl sysfs file, write 0 to disable the VF
10-Bit Tag Requester. The typical use case is for p2pdma when the peer
device does not support 10-Bit Tag Completer. Write 10 to enable 10-Bit
Tag Requester when RC supports 10-Bit Tag Completer capability. The
typical use case is for host memory targeted by DMA Requests.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 32 +++++++++++
 drivers/pci/iov.c                       | 70 +++++++++++++++++++++++++
 2 files changed, 102 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index c16bb31486d2..9343941969b6 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -446,3 +446,35 @@ Description:
 		Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit
 		Tag Completer capability. The typical use case is for host
 		memory targeted by DMA Requests.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_tags
+Date:		September 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		This file is associated with a SR-IOV physical function (PF).
+		It is visible when the device supports VF 10-Bit Tag
+		Requester. It contains the status of VF Tags Enable.
+		The file is read-only.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_tags_ctl
+Date:		September 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		This file is associated with a SR-IOV virtual function (VF).
+		It is visible when the device supports VF 10-Bit Tag
+		Requester. The file is only writeable when the VF driver
+		does not bind to a device.
+
+		The values accepted are:
+		* > 0 - this number will be reported as tags bit to be
+			enabled. current only 10 is accepted
+		* < 0 - not valid
+		* = 0 - disable 10-Bit Tag, use Extended Tags(8-Bit or 5-Bit)
+
+		Write 0 to any VF's file disables 10-Bit Tag Requester for all
+		VFs. The typical use case is for p2pdma when the peer device
+		does not support 10-Bit Tag Completer.
+
+		Write 10 to enable 10-Bit Tag Requester for all VFs when RC
+		supports 10-Bit Tag Completer capability. The typical use case
+		is for host memory targeted by DMA Requests.
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index dafdc652fcd0..3627e495d7af 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -220,10 +220,47 @@ static ssize_t sriov_vf_msix_count_store(struct device *dev,
 static DEVICE_ATTR_WO(sriov_vf_msix_count);
 #endif
 
+static ssize_t sriov_vf_tags_ctl_store(struct device *dev,
+				       struct device_attribute *attr,
+				       const char *buf, size_t count)
+{
+	struct pci_dev *vf_dev = to_pci_dev(dev);
+	struct pci_dev *pdev = pci_physfn(vf_dev);
+	unsigned long val;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	if (kstrtoul(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	/* 10 - enable 10-Bit Tag, 0 - disable 10-Bit Tag */
+	if (val != 10 && val != 0)
+		return -EINVAL;
+
+	if (vf_dev->driver)
+		return -EBUSY;
+
+	if (!pcie_rp_10bit_tag_cmp_supported(pdev))
+		return -EPERM;
+
+	if (val == 10)
+		pdev->sriov->ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+	else
+		pdev->sriov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+
+	pci_write_config_word(pdev, pdev->sriov->pos + PCI_SRIOV_CTRL,
+			      pdev->sriov->ctrl);
+
+	return count;
+}
+static DEVICE_ATTR_WO(sriov_vf_tags_ctl);
+
 static struct attribute *sriov_vf_dev_attrs[] = {
 #ifdef CONFIG_PCI_MSI
 	&dev_attr_sriov_vf_msix_count.attr,
 #endif
+	&dev_attr_sriov_vf_tags_ctl.attr,
 	NULL,
 };
 
@@ -236,6 +273,11 @@ static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
 	if (!pdev->is_virtfn)
 		return 0;
 
+	pdev = pci_physfn(pdev);
+	if ((a == &dev_attr_sriov_vf_tags_ctl.attr) &&
+	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
+		return 0;
+
 	return a->mode;
 }
 
@@ -487,12 +529,34 @@ static ssize_t sriov_drivers_autoprobe_store(struct device *dev,
 	return count;
 }
 
+static ssize_t sriov_vf_tags_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u16 ctl;
+	int ret;
+
+	if (pdev->sriov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN)
+		return sysfs_emit(buf, "%s\n", "10-Bit");
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &ctl);
+	if (ret)
+		return -EINVAL;
+
+	if (ctl & PCI_EXP_DEVCTL_EXT_TAG)
+		return sysfs_emit(buf, "%s\n", "8-Bit");
+
+	return sysfs_emit(buf, "%s\n", "5-Bit");
+}
+
 static DEVICE_ATTR_RO(sriov_totalvfs);
 static DEVICE_ATTR_RW(sriov_numvfs);
 static DEVICE_ATTR_RO(sriov_offset);
 static DEVICE_ATTR_RO(sriov_stride);
 static DEVICE_ATTR_RO(sriov_vf_device);
 static DEVICE_ATTR_RW(sriov_drivers_autoprobe);
+static DEVICE_ATTR_RO(sriov_vf_tags);
 
 static struct attribute *sriov_pf_dev_attrs[] = {
 	&dev_attr_sriov_totalvfs.attr,
@@ -501,6 +565,7 @@ static struct attribute *sriov_pf_dev_attrs[] = {
 	&dev_attr_sriov_stride.attr,
 	&dev_attr_sriov_vf_device.attr,
 	&dev_attr_sriov_drivers_autoprobe.attr,
+	&dev_attr_sriov_vf_tags.attr,
 #ifdef CONFIG_PCI_MSI
 	&dev_attr_sriov_vf_total_msix.attr,
 #endif
@@ -511,10 +576,15 @@ static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
 					  struct attribute *a, int n)
 {
 	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
 
 	if (!dev_is_pf(dev))
 		return 0;
 
+	if ((a == &dev_attr_sriov_vf_tags.attr) &&
+	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
+		return 0;
+
 	return a->mode;
 }
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V11 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-10-30 13:53 [PATCH V11 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (4 preceding siblings ...)
  2021-10-30 13:53 ` [PATCH V11 5/8] PCI/IOV: Add tags sysfs files for VF devices Dongdong Liu
@ 2021-10-30 13:53 ` Dongdong Liu
  2021-10-30 13:53 ` [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device Dongdong Liu
  2021-10-30 13:53 ` [PATCH V11 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices Dongdong Liu
  7 siblings, 0 replies; 17+ messages in thread
From: Dongdong Liu @ 2021-10-30 13:53 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

The P2PDMA mapping should fail if a device with 10-Bit Tag Requester
interact with a device that does not support 10-Bit Tag Completer.

Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
10-Bit Tag Requester doesn't interact with a device that does not
support 10-Bit Tag Completer. Before that happens, the kernel should
emit a warning.

"echo 0 > /sys/bus/pci/devices/.../tags" to disable 10-Bit Tag
Requester for PF device.

"echo 0 > /sys/bus/pci/devices/.../sriov_vf_tags_ctl" to disable
10-Bit Tag Requester for VF device.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/pci/p2pdma.c | 48 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 50cdde3e9a8b..1565a31183af 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -19,6 +19,7 @@
 #include <linux/random.h>
 #include <linux/seq_buf.h>
 #include <linux/xarray.h>
+#include "pci.h"
 
 enum pci_p2pdma_map_type {
 	PCI_P2PDMA_MAP_UNKNOWN = 0,
@@ -410,6 +411,50 @@ static unsigned long map_types_idx(struct pci_dev *client)
 		(client->bus->number << 8) | client->devfn;
 }
 
+static bool pci_10bit_tags_unsupported(struct pci_dev *a,
+				       struct pci_dev *b,
+				       bool verbose)
+{
+	bool req;
+	bool comp;
+	u16 ctl;
+	const char *str = "tags";
+
+	if (a->is_virtfn) {
+#ifdef CONFIG_PCI_IOV
+		req = !!(a->physfn->sriov->ctrl &
+			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
+#endif
+	} else {
+		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl);
+		req = !!(ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	}
+
+	comp = !!(b->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
+
+	/* 10-bit tags not enabled on requester */
+	if (!req)
+		return false;
+
+	 /* Completer can handle anything */
+	if (comp)
+		return false;
+
+	if (!verbose)
+		return true;
+
+	pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set for this device, but peer device (%s) does not support the 10-Bit Tag Completer\n",
+		 pci_name(b));
+
+	if (a->is_virtfn)
+		str = "sriov_vf_tags_ctl";
+
+	pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/%s\n",
+		 pci_name(a), str);
+
+	return true;
+}
+
 /*
  * Calculate the P2PDMA mapping type and distance between two PCI devices.
  *
@@ -532,6 +577,9 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
 		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
 	}
 done:
+	if (pci_10bit_tags_unsupported(client, provider, verbose))
+		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
+
 	rcu_read_lock();
 	p2pdma = rcu_dereference(provider->p2pdma);
 	if (p2pdma)
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  2021-10-30 13:53 [PATCH V11 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (5 preceding siblings ...)
  2021-10-30 13:53 ` [PATCH V11 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
@ 2021-10-30 13:53 ` Dongdong Liu
  2021-11-01 22:02   ` Bjorn Helgaas
  2021-10-30 13:53 ` [PATCH V11 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices Dongdong Liu
  7 siblings, 1 reply; 17+ messages in thread
From: Dongdong Liu @ 2021-10-30 13:53 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
field size from 8 bits to 10 bits.

PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
10-Bit Tag Capabilities" Implementation Note:

  For platforms where the RC supports 10-Bit Tag Completer capability,
  it is highly recommended for platform firmware or operating software
  that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
  bit automatically in Endpoints with 10-Bit Tag Requester capability.
  This enables the important class of 10-Bit Tag capable adapters that
  send Memory Read Requests only to host memory.

It's safe to enable 10-bit tags for all devices below a Root Port that
supports them. Switches that lack 10-Bit Tag Completer capability are
still able to forward NPRs and Completions carrying 10-Bit Tags correctly,
since the two new Tag bits are in TLP Header bits that were formerly
Reserved.

PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as opposed
  to host memory), the Endpoint must not send 10-Bit Tag Requests to
  another given Endpoint unless an implementation-specific mechanism
  determines that the Endpoint supports 10-Bit Tag Completer capability.

It is not safe for P2P traffic if an Endpoint send 10-Bit Tag Requesters
to another Endpoint that does not support 10-Bit Tag Completer capability,
so we provide sysfs file to disable 10-Bit Tag Requester. Unbind the
device driver, set the sysfs file and then rebind the driver.

Add a kernel parameter pcie_tag_peer2peer that disables 10-Bit Tag
Requester for all PCIe devices. This configuration allows peer-to-peer
DMA between any pair of devices, possibly at the cost of reduced
performance.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 .../admin-guide/kernel-parameters.txt         |  5 +++
 drivers/pci/iov.c                             |  3 ++
 drivers/pci/pci-sysfs.c                       |  3 ++
 drivers/pci/pci.c                             |  4 ++
 drivers/pci/pci.h                             |  7 ++++
 drivers/pci/probe.c                           | 42 ++++++++++++++++++-
 6 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 43dc35fe5bc0..4ab2c46a1af7 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3979,6 +3979,11 @@
 				any pair of devices, possibly at the cost of
 				reduced performance.  This also guarantees
 				that hot-added devices will work.
+		pcie_tag_peer2peer	Disable 10-Bit Tag Requester for all
+				PCIe devices. This configuration allows
+				peer-to-peer DMA between any pair of devices,
+				possibly at the cost of reduced performance.
+
 		cbiosize=nn[KMG]	The fixed amount of bus space which is
 				reserved for the CardBus bridge's IO window.
 				The default value is 256 bytes.
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 3627e495d7af..0f8203ccc701 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -241,6 +241,9 @@ static ssize_t sriov_vf_tags_ctl_store(struct device *dev,
 	if (vf_dev->driver)
 		return -EBUSY;
 
+	if (pcie_tag_config == PCIE_TAG_PEER2PEER)
+		return -EPERM;
+
 	if (!pcie_rp_10bit_tag_cmp_supported(pdev))
 		return -EPERM;
 
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 04fd9a8b9d4e..71647bb3ac06 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -326,6 +326,9 @@ static ssize_t tags_store(struct device *dev,
 	if (pdev->driver)
 		return -EBUSY;
 
+	if (pcie_tag_config == PCIE_TAG_PEER2PEER)
+		return -EPERM;
+
 	if (!pcie_rp_10bit_tag_cmp_supported(pdev))
 		return -EPERM;
 
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 64138a83b0f7..46faf2e8c8ab 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -118,6 +118,8 @@ enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_PEER2PEER;
 enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_DEFAULT;
 #endif
 
+enum pcie_tag_config_types pcie_tag_config = PCIE_TAG_DEFAULT;
+
 /*
  * The default CLS is used if arch didn't set CLS explicitly and not
  * all pci devices agree on the same value.  Arch can override either
@@ -6795,6 +6797,8 @@ static int __init pci_setup(char *str)
 				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
 			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
 				disable_acs_redir_param = str + 18;
+			} else if (!strncmp(str, "pcie_tag_peer2peer", 18)) {
+				pcie_tag_config = PCIE_TAG_PEER2PEER;
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index f719a41dfc7f..7846aa7b85dc 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -59,6 +59,13 @@ struct pci_cap_saved_state *pci_find_saved_cap(struct pci_dev *dev, char cap);
 struct pci_cap_saved_state *pci_find_saved_ext_cap(struct pci_dev *dev,
 						   u16 cap);
 
+enum pcie_tag_config_types {
+	PCIE_TAG_DEFAULT,   /* Enable 10-Bit Tag Requester for devices below
+			       Root Port that support 10-Bit Tag Completer. */
+	PCIE_TAG_PEER2PEER  /* Disable 10-Bit Tag Requester for all devices. */
+};
+extern enum pcie_tag_config_types pcie_tag_config;
+
 #define PCI_PM_D2_DELAY         200	/* usec; see PCIe r4.0, sec 5.9.1 */
 #define PCI_PM_D3HOT_WAIT       10	/* msec */
 #define PCI_PM_D3COLD_WAIT      100	/* msec */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 8f5372c7c737..18a74e257dd9 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2059,12 +2059,20 @@ bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev)
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-	u16 ctl;
+	u16 ctl, ctl2;
 	int ret;
 
 	if (!pci_is_pcie(dev))
 		return 0;
 
+	/*
+	 * PCIe 5.0 section 9.3.5.4 Extended Tag Field Enable in Device Control
+	 * Register is RsvdP fo VF. section 9.3.5.10 10-Bit Tag Requester
+	 * Enable in Device Control 2 Register is RsvdP for VF.
+	 */
+	if (dev->is_virtfn)
+		return 0;
+
 	if (!(dev->devcap & PCI_EXP_DEVCAP_EXT_TAG))
 		return 0;
 
@@ -2072,6 +2080,10 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 	if (ret)
 		return 0;
 
+	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL2, &ctl2);
+	if (ret)
+		return 0;
+
 	host = pci_find_host_bridge(dev->bus);
 	if (!host)
 		return 0;
@@ -2086,6 +2098,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
 						   PCI_EXP_DEVCTL_EXT_TAG);
 		}
+
+		if (ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN) {
+			pci_info(dev, "disabling 10-Bit Tags\n");
+			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
+					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+		}
 		return 0;
 	}
 
@@ -2100,6 +2118,28 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
 					 PCI_EXP_DEVCTL_EXT_TAG);
 	}
+
+	if ((pcie_tag_config == PCIE_TAG_PEER2PEER) &&
+	    (ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN)) {
+		pci_info(dev, "disabling 10-Bit Tags\n");
+		pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
+					   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+		return 0;
+	}
+
+	if (pcie_tag_config != PCIE_TAG_DEFAULT)
+		return 0;
+
+	if (!pcie_rp_10bit_tag_cmp_supported(dev))
+		return 0;
+
+	if (pci_pcie_type(dev) != PCI_EXP_TYPE_ENDPOINT)
+		return 0;
+
+	pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
+	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+				 PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+
 	return 0;
 }
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH V11 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices
  2021-10-30 13:53 [PATCH V11 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (6 preceding siblings ...)
  2021-10-30 13:53 ` [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device Dongdong Liu
@ 2021-10-30 13:53 ` Dongdong Liu
  7 siblings, 0 replies; 17+ messages in thread
From: Dongdong Liu @ 2021-10-30 13:53 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Enable 10-Bit Tag Requester for the VF devices below the
Root Port that support 10-Bit Tag Completer.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 drivers/pci/iov.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 0f8203ccc701..4dd50996204e 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -707,6 +707,15 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 	pci_iov_set_numvfs(dev, nr_virtfn);
 	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+
+	if ((pcie_tag_config == PCIE_TAG_DEFAULT) &&
+	    (iov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ) &&
+	    pcie_rp_10bit_tag_cmp_supported(dev))
+		iov->ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+
+	if (pcie_tag_config == PCIE_TAG_PEER2PEER)
+		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	msleep(100);
@@ -723,6 +732,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 err_pcibios:
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
@@ -755,6 +765,7 @@ static void sriov_disable(struct pci_dev *dev)
 
 	sriov_del_vfs(dev);
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH V11 4/8] PCI/sysfs: Add a tags sysfs file for PCIe Endpoint devices
  2021-10-30 13:53 ` [PATCH V11 4/8] PCI/sysfs: Add a tags sysfs file for PCIe Endpoint devices Dongdong Liu
@ 2021-11-01 20:54   ` Bjorn Helgaas
  2021-11-02 13:03     ` Dongdong Liu
  0 siblings, 1 reply; 17+ messages in thread
From: Bjorn Helgaas @ 2021-11-01 20:54 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Sat, Oct 30, 2021 at 09:53:44PM +0800, Dongdong Liu wrote:
> PCIe spec 5.0 r1.0 section 2.2.6.2 says:
> 
>   If an Endpoint supports sending Requests to other Endpoints (as
>   opposed to host memory), the Endpoint must not send 10-Bit Tag
>   Requests to another given Endpoint unless an implementation-specific
>   mechanism determines that the Endpoint supports 10-Bit Tag Completer
>   capability.
> 
> Add a tags sysfs file, write 0 to disable 10-Bit Tag Requester
> when the driver does not bind the device. The typical use case is for
> p2pdma when the peer device does not support 10-Bit Tag Completer.
> Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
> Completer capability. The typical use case is for host memory targeted
> by DMA Requests. The tags file content indicate current status of Tags
> Enable.
> 
> PCIe r5.0, sec 2.2.6.2 says:
> 
>   Receivers/Completers must handle 8-bit Tag values correctly regardless
>   of the setting of their Extended Tag Field Enable bit (see Section
>   7.5.3.4).
> 
> Add this comment in pci_configure_extended_tags(). As all PCIe completers
> are required to support 8-bit tags, so we do not use tags sysfs file
> to manage 8-bit tags.

> +What:		/sys/bus/pci/devices/.../tags
> +Date:		September 2021
> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
> +Description:
> +		The file will be visible when the device supports 10-Bit Tag
> +		Requester. The file is readable, the value indicate current
> +		status of Tags Enable(5-Bit, 8-Bit, 10-Bit).
> +
> +		The file is also writable, The values accepted are:
> +		* > 0 - this number will be reported as tags bit to be
> +			enabled. current only 10 is accepted
> +		* < 0 - not valid
> +		* = 0 - disable 10-Bit Tag, use Extended Tags(8-Bit or 5-Bit)
> +
> +		write 0 to disable 10-Bit Tag Requester when the driver does
> +		not bind the device. The typical use case is for p2pdma when
> +		the peer device does not support 10-Bit Tag Completer.
> +
> +		Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit
> +		Tag Completer capability. The typical use case is for host
> +		memory targeted by DMA Requests.

1) I think I would rename this from "tags" to "tag_bits".  A file
   named "tags" that contains 8 suggests that we can use 8 tags, but
   in fact, we can use 256 tags.

2) This controls tag size the requester will use.  The current knobs
   in the hardware allow 5, 8, or 10 bits.

   "0" to disable 10-bit tags without specifying whether we should use
   5- or 8-bit tags doesn't seem right.  All completers are *supposed*
   to support 8-bit, but we've tripped over a few that don't.

   I don't think we currently have a run-time (or even a boot-time)
   way to disable 8-bit tags; all we have is the quirk_no_ext_tags()
   quirk.  But if we ever wanted to *add* that, maybe we would want:

      5 - use 5-bit tags
      8 - use 8-bit tags
     10 - use 10-bit tags

   Maybe we just say "0" is invalid, since there's no obvious way to
   map this?

> +static ssize_t tags_show(struct device *dev,
> +			 struct device_attribute *attr,
> +			 char *buf)
> +{
> + ...

> +	if (ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN)
> +		return sysfs_emit(buf, "%s\n", "10-Bit");
> +
> +	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &ctl);
> +	if (ret)
> +		return -EINVAL;
> +
> +	if (ctl & PCI_EXP_DEVCTL_EXT_TAG)
> +		return sysfs_emit(buf, "%s\n", "8-Bit");
> +
> +	return sysfs_emit(buf, "%s\n", "5-Bit");

Since I prefer the "tag_bits" name, my preference would be bare
numbers here: "10", "8", "5".

Both comments apply to the sriov files, too.

> +static umode_t pcie_dev_tags_attrs_is_visible(struct kobject *kobj,
> +					      struct attribute *a, int n)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	if (pdev->is_virtfn)
> +		return 0;
> +
> +	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ENDPOINT)
> +		return 0;
> +
> +	if (!(pdev->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
> +		return 0;

Makes sense for now that the file is only visible if a requester
supports 10-bit tags.  If we ever wanted to extend this to control 5-
vs 8-bit tags, we could make it visible in more cases then.

> +
> +	return a->mode;
> +}

> @@ -2075,6 +2089,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>  		return 0;
>  	}
>  
> +	/*
> +	 * PCIe r5.0, sec 2.2.6.2 says "Receivers/Completers must handle 8-bit
> +	 * Tag values correctly regardless of the setting of their Extended Tag
> +	 * Field Enable bit (see Section 7.5.3.4)", so it is safe to enable
> +	 * Extented Tags.

s/Extented/Extended/

> +	 */
>  	if (!(ctl & PCI_EXP_DEVCTL_EXT_TAG)) {
>  		pci_info(dev, "enabling Extended Tags\n");
>  		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
> -- 
> 2.22.0
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  2021-10-30 13:53 ` [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device Dongdong Liu
@ 2021-11-01 22:02   ` Bjorn Helgaas
  2021-11-01 22:33     ` Bjorn Helgaas
  0 siblings, 1 reply; 17+ messages in thread
From: Bjorn Helgaas @ 2021-11-01 22:02 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Sat, Oct 30, 2021 at 09:53:47PM +0800, Dongdong Liu wrote:
> 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
> field size from 8 bits to 10 bits.
> 
> PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
> 10-Bit Tag Capabilities" Implementation Note:
> 
>   For platforms where the RC supports 10-Bit Tag Completer capability,
>   it is highly recommended for platform firmware or operating software
>   that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
>   bit automatically in Endpoints with 10-Bit Tag Requester capability.
>   This enables the important class of 10-Bit Tag capable adapters that
>   send Memory Read Requests only to host memory.
> 
> It's safe to enable 10-bit tags for all devices below a Root Port that
> supports them. Switches that lack 10-Bit Tag Completer capability are
> still able to forward NPRs and Completions carrying 10-Bit Tags correctly,
> since the two new Tag bits are in TLP Header bits that were formerly
> Reserved.

Side note: the reason we want to do this to increase performance by
allowing more outstanding requests.  Do you have any benchmarking that
we can mention here to show that this is actually a benefit?  I don't
doubt that it is, but I assume you've measured it and it would be nice
to advertise it.

Bjorn

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  2021-11-01 22:02   ` Bjorn Helgaas
@ 2021-11-01 22:33     ` Bjorn Helgaas
  2021-11-03 10:05       ` Dongdong Liu
  0 siblings, 1 reply; 17+ messages in thread
From: Bjorn Helgaas @ 2021-11-01 22:33 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, logang, leon, linux-pci, rajur, hverkuil-cisco, linux-media, netdev

On Mon, Nov 01, 2021 at 05:02:41PM -0500, Bjorn Helgaas wrote:
> On Sat, Oct 30, 2021 at 09:53:47PM +0800, Dongdong Liu wrote:
> > 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
> > field size from 8 bits to 10 bits.
> > 
> > PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
> > 10-Bit Tag Capabilities" Implementation Note:
> > 
> >   For platforms where the RC supports 10-Bit Tag Completer capability,
> >   it is highly recommended for platform firmware or operating software
> >   that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
> >   bit automatically in Endpoints with 10-Bit Tag Requester capability.
> >   This enables the important class of 10-Bit Tag capable adapters that
> >   send Memory Read Requests only to host memory.
> > 
> > It's safe to enable 10-bit tags for all devices below a Root Port that
> > supports them. Switches that lack 10-Bit Tag Completer capability are
> > still able to forward NPRs and Completions carrying 10-Bit Tags correctly,
> > since the two new Tag bits are in TLP Header bits that were formerly
> > Reserved.
> 
> Side note: the reason we want to do this to increase performance by
> allowing more outstanding requests.  Do you have any benchmarking that
> we can mention here to show that this is actually a benefit?  I don't
> doubt that it is, but I assume you've measured it and it would be nice
> to advertise it.

Hmmm.  I did a quick Google search looking for "nvme pcie 10-bit tags"
hoping to find some performance info, but what I *actually* found was
several reports of 10-bit tags causing breakage:

  https://www.reddit.com/r/MSI_Gaming/comments/exjvzg/x570_apro_7c37vh72beta_version_has_anyone_tryed_it/
  https://rog.asus.com/forum/showthread.php?115064-Beware-of-agesa-1-0-0-4B-bios-not-good!/page2
  https://forum-en.msi.com/index.php?threads/sound-blaster-z-has-weird-behaviour-after-updating-bios-x570-gaming-edge-wifi.325223/page-2
  https://gearspace.com/board/electronic-music-instruments-and-electronic-music-production/1317189-h8000fw-firewire-facts-2020-must-read.html
  https://www.soundonsound.com/forum/viewtopic.php?t=69651&start=12
  https://forum.rme-audio.de/viewtopic.php?id=30307

This is a big problem for me.

Some of these might be a broken BIOS that turns on 10-bit tags when
the completer doesn't support them.  I didn't try to debug them to
that level.  But the last thing I want is to enable 10-bit by default
and cause boot issues or sound card issues or whatever.

I'm pretty sure this is a show-stopper for wedging this into v5.16 at
this late date.  It's conceivable we could still do it if everything
defaulted to "off" and we had a knob whereby users could turn it on
via boot param or sysfs.

In any case, we (by which I'm afraid I mean "you" :)) need to
investigate the problem reports, figure out whether we will see
similar problems, and fix them before merging if we can.

Thanks to Krzysztof for pointing out the potential for issues like
this.

Bjorn

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V11 4/8] PCI/sysfs: Add a tags sysfs file for PCIe Endpoint devices
  2021-11-01 20:54   ` Bjorn Helgaas
@ 2021-11-02 13:03     ` Dongdong Liu
  0 siblings, 0 replies; 17+ messages in thread
From: Dongdong Liu @ 2021-11-02 13:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

Hi Bjorn

Many thanks for you review.
On 2021/11/2 4:54, Bjorn Helgaas wrote:
> On Sat, Oct 30, 2021 at 09:53:44PM +0800, Dongdong Liu wrote:
>> PCIe spec 5.0 r1.0 section 2.2.6.2 says:
>>
>>   If an Endpoint supports sending Requests to other Endpoints (as
>>   opposed to host memory), the Endpoint must not send 10-Bit Tag
>>   Requests to another given Endpoint unless an implementation-specific
>>   mechanism determines that the Endpoint supports 10-Bit Tag Completer
>>   capability.
>>
>> Add a tags sysfs file, write 0 to disable 10-Bit Tag Requester
>> when the driver does not bind the device. The typical use case is for
>> p2pdma when the peer device does not support 10-Bit Tag Completer.
>> Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
>> Completer capability. The typical use case is for host memory targeted
>> by DMA Requests. The tags file content indicate current status of Tags
>> Enable.
>>
>> PCIe r5.0, sec 2.2.6.2 says:
>>
>>   Receivers/Completers must handle 8-bit Tag values correctly regardless
>>   of the setting of their Extended Tag Field Enable bit (see Section
>>   7.5.3.4).
>>
>> Add this comment in pci_configure_extended_tags(). As all PCIe completers
>> are required to support 8-bit tags, so we do not use tags sysfs file
>> to manage 8-bit tags.
>
>> +What:		/sys/bus/pci/devices/.../tags
>> +Date:		September 2021
>> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
>> +Description:
>> +		The file will be visible when the device supports 10-Bit Tag
>> +		Requester. The file is readable, the value indicate current
>> +		status of Tags Enable(5-Bit, 8-Bit, 10-Bit).
>> +
>> +		The file is also writable, The values accepted are:
>> +		* > 0 - this number will be reported as tags bit to be
>> +			enabled. current only 10 is accepted
>> +		* < 0 - not valid
>> +		* = 0 - disable 10-Bit Tag, use Extended Tags(8-Bit or 5-Bit)
>> +
>> +		write 0 to disable 10-Bit Tag Requester when the driver does
>> +		not bind the device. The typical use case is for p2pdma when
>> +		the peer device does not support 10-Bit Tag Completer.
>> +
>> +		Write 10 to enable 10-Bit Tag Requester when RC supports 10-Bit
>> +		Tag Completer capability. The typical use case is for host
>> +		memory targeted by DMA Requests.
>
> 1) I think I would rename this from "tags" to "tag_bits".  A file
>    named "tags" that contains 8 suggests that we can use 8 tags, but
>    in fact, we can use 256 tags.
Looks good, Will do.
>
> 2) This controls tag size the requester will use.  The current knobs
>    in the hardware allow 5, 8, or 10 bits.
>
>    "0" to disable 10-bit tags without specifying whether we should use
>    5- or 8-bit tags doesn't seem right.  All completers are *supposed*
>    to support 8-bit, but we've tripped over a few that don't.
>
>    I don't think we currently have a run-time (or even a boot-time)
>    way to disable 8-bit tags; all we have is the quirk_no_ext_tags()
>    quirk.  But if we ever wanted to *add* that, maybe we would want:
>
>       5 - use 5-bit tags
>       8 - use 8-bit tags
>      10 - use 10-bit tags
will do.
>    Maybe we just say "0" is invalid, since there's no obvious way to
>    map this?
OK, will do.
>
>> +static ssize_t tags_show(struct device *dev,
>> +			 struct device_attribute *attr,
>> +			 char *buf)
>> +{
>> + ...
>
>> +	if (ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN)
>> +		return sysfs_emit(buf, "%s\n", "10-Bit");
>> +
>> +	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &ctl);
>> +	if (ret)
>> +		return -EINVAL;
>> +
>> +	if (ctl & PCI_EXP_DEVCTL_EXT_TAG)
>> +		return sysfs_emit(buf, "%s\n", "8-Bit");
>> +
>> +	return sysfs_emit(buf, "%s\n", "5-Bit");
>
> Since I prefer the "tag_bits" name, my preference would be bare
> numbers here: "10", "8", "5".
Will do.
>
> Both comments apply to the sriov files, too.
Will do.
>
>> +static umode_t pcie_dev_tags_attrs_is_visible(struct kobject *kobj,
>> +					      struct attribute *a, int n)
>> +{
>> +	struct device *dev = kobj_to_dev(kobj);
>> +	struct pci_dev *pdev = to_pci_dev(dev);
>> +
>> +	if (pdev->is_virtfn)
>> +		return 0;
>> +
>> +	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ENDPOINT)
>> +		return 0;
>> +
>> +	if (!(pdev->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
>> +		return 0;
>
> Makes sense for now that the file is only visible if a requester
> supports 10-bit tags.  If we ever wanted to extend this to control 5-
> vs 8-bit tags, we could make it visible in more cases then.
Will do.
>
>> +
>> +	return a->mode;
>> +}
>
>> @@ -2075,6 +2089,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>>  		return 0;
>>  	}
>>
>> +	/*
>> +	 * PCIe r5.0, sec 2.2.6.2 says "Receivers/Completers must handle 8-bit
>> +	 * Tag values correctly regardless of the setting of their Extended Tag
>> +	 * Field Enable bit (see Section 7.5.3.4)", so it is safe to enable
>> +	 * Extented Tags.
>
> s/Extented/Extended/
Will fix.

Thanks,
Dongdong
>
>> +	 */
>>  	if (!(ctl & PCI_EXP_DEVCTL_EXT_TAG)) {
>>  		pci_info(dev, "enabling Extended Tags\n");
>>  		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
>> --
>> 2.22.0
>>
> .
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  2021-11-01 22:33     ` Bjorn Helgaas
@ 2021-11-03 10:05       ` Dongdong Liu
  2021-11-03 16:02         ` Bjorn Helgaas
  0 siblings, 1 reply; 17+ messages in thread
From: Dongdong Liu @ 2021-11-03 10:05 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, logang, leon, linux-pci, rajur, hverkuil-cisco, linux-media, netdev



On 2021/11/2 6:33, Bjorn Helgaas wrote:
> On Mon, Nov 01, 2021 at 05:02:41PM -0500, Bjorn Helgaas wrote:
>> On Sat, Oct 30, 2021 at 09:53:47PM +0800, Dongdong Liu wrote:
>>> 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
>>> field size from 8 bits to 10 bits.
>>>
>>> PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
>>> 10-Bit Tag Capabilities" Implementation Note:
>>>
>>>   For platforms where the RC supports 10-Bit Tag Completer capability,
>>>   it is highly recommended for platform firmware or operating software
>>>   that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
>>>   bit automatically in Endpoints with 10-Bit Tag Requester capability.
>>>   This enables the important class of 10-Bit Tag capable adapters that
>>>   send Memory Read Requests only to host memory.
>>>
>>> It's safe to enable 10-bit tags for all devices below a Root Port that
>>> supports them. Switches that lack 10-Bit Tag Completer capability are
>>> still able to forward NPRs and Completions carrying 10-Bit Tags correctly,
>>> since the two new Tag bits are in TLP Header bits that were formerly
>>> Reserved.
>>
>> Side note: the reason we want to do this to increase performance by
>> allowing more outstanding requests.  Do you have any benchmarking that
>> we can mention here to show that this is actually a benefit?  I don't
>> doubt that it is, but I assume you've measured it and it would be nice
>> to advertise it.
>
> Hmmm.  I did a quick Google search looking for "nvme pcie 10-bit tags"
> hoping to find some performance info, but what I *actually* found was
> several reports of 10-bit tags causing breakage:
>
>   https://www.reddit.com/r/MSI_Gaming/comments/exjvzg/x570_apro_7c37vh72beta_version_has_anyone_tryed_it/
>   https://rog.asus.com/forum/showthread.php?115064-Beware-of-agesa-1-0-0-4B-bios-not-good!/page2
>   https://forum-en.msi.com/index.php?threads/sound-blaster-z-has-weird-behaviour-after-updating-bios-x570-gaming-edge-wifi.325223/page-2
>   https://gearspace.com/board/electronic-music-instruments-and-electronic-music-production/1317189-h8000fw-firewire-facts-2020-must-read.html
>   https://www.soundonsound.com/forum/viewtopic.php?t=69651&start=12
>   https://forum.rme-audio.de/viewtopic.php?id=30307
>
> This is a big problem for me.
>
> Some of these might be a broken BIOS that turns on 10-bit tags when
> the completer doesn't support them.  I didn't try to debug them to
> that level.  But the last thing I want is to enable 10-bit by default
> and cause boot issues or sound card issues or whatever.
It seems a BIOS software bug, as it turned on(as default) a 10-Bit Tag
Field for RP, but the card(non-Gen4 card) does not support 10-Bit Completer.

This patch we enable 10-Bit Tag Requester for EP when RC supports
10-Bit Tag Completer capability. So it shuld be worked ok.

The below is one of the link. other links seems the same issue.
https://forum.rme-audio.de/viewtopic.php?id=30307
"Re: AMD Ryzen X570 chipset problems
So for those running into similar problems I found out that in most
recent BIOS AMD turned on(as default) a 10-bit Tag Field, which is only
available on PCI-e Gen4 devices. So your BIOS get stuck on startup if
inserted a non-Gen4 card like my firewire card.

So you need to find that feature in your BIOS and turn it off and set
the PCI compatibility on Gen3 for the slot your card is in.

Hope this helps others running in to similar troubles."
>
> I'm pretty sure this is a show-stopper for wedging this into v5.16 at
> this late date.  It's conceivable we could still do it if everything
> defaulted to "off" and we had a knob whereby users could turn it on
> via boot param or sysfs.
Maybe we can merge this patchset later into v5.17.

But I still think default to "on" will be better,
Current we enable 10-Bit Tag, in the future PCIe 6.0 maybe need to use
14-Bit tags to get good performance.
>
> In any case, we (by which I'm afraid I mean "you" :)) need to
> investigate the problem reports, figure out whether we will see
> similar problems, and fix them before merging if we can.
We have tested a PCIe 5.0 network card on FPGA with 10-Bit tag worked
ok. I have not got the performance data as FPGA is slow.

Current we enable 10-Bit Tag Requester for EP when RC supports
10-Bit Tag Completer capability. It shoud be worked ok except hardware
bugs, we also provide boot param to diasble 10-Bit Tag if the hardware
really have a bug or can do some quirks as 8-bit tag has done if we
have known the hardware.

Thanks,
Dongdong.
>
> Thanks to Krzysztof for pointing out the potential for issues like
> this.
>
> Bjorn
> .
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  2021-11-03 10:05       ` Dongdong Liu
@ 2021-11-03 16:02         ` Bjorn Helgaas
  2021-11-05  8:24           ` Dongdong Liu
  0 siblings, 1 reply; 17+ messages in thread
From: Bjorn Helgaas @ 2021-11-03 16:02 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, logang, leon, linux-pci, rajur, hverkuil-cisco, linux-media, netdev

On Wed, Nov 03, 2021 at 06:05:34PM +0800, Dongdong Liu wrote:
> On 2021/11/2 6:33, Bjorn Helgaas wrote:
> > On Mon, Nov 01, 2021 at 05:02:41PM -0500, Bjorn Helgaas wrote:
> > > On Sat, Oct 30, 2021 at 09:53:47PM +0800, Dongdong Liu wrote:
> > > > 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
> > > > field size from 8 bits to 10 bits.
> > > > 
> > > > PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
> > > > 10-Bit Tag Capabilities" Implementation Note:
> > > > 
> > > >   For platforms where the RC supports 10-Bit Tag Completer capability,
> > > >   it is highly recommended for platform firmware or operating software
> > > >   that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
> > > >   bit automatically in Endpoints with 10-Bit Tag Requester capability.
> > > >   This enables the important class of 10-Bit Tag capable adapters that
> > > >   send Memory Read Requests only to host memory.
> > > > 
> > > > It's safe to enable 10-bit tags for all devices below a Root Port that
> > > > supports them. Switches that lack 10-Bit Tag Completer capability are
> > > > still able to forward NPRs and Completions carrying 10-Bit Tags correctly,
> > > > since the two new Tag bits are in TLP Header bits that were formerly
> > > > Reserved.
> > > 
> > > Side note: the reason we want to do this to increase performance by
> > > allowing more outstanding requests.  Do you have any benchmarking that
> > > we can mention here to show that this is actually a benefit?  I don't
> > > doubt that it is, but I assume you've measured it and it would be nice
> > > to advertise it.
> > 
> > Hmmm.  I did a quick Google search looking for "nvme pcie 10-bit tags"
> > hoping to find some performance info, but what I *actually* found was
> > several reports of 10-bit tags causing breakage:
> > 
> >   https://www.reddit.com/r/MSI_Gaming/comments/exjvzg/x570_apro_7c37vh72beta_version_has_anyone_tryed_it/
> >   https://rog.asus.com/forum/showthread.php?115064-Beware-of-agesa-1-0-0-4B-bios-not-good!/page2
> >   https://forum-en.msi.com/index.php?threads/sound-blaster-z-has-weird-behaviour-after-updating-bios-x570-gaming-edge-wifi.325223/page-2
> >   https://gearspace.com/board/electronic-music-instruments-and-electronic-music-production/1317189-h8000fw-firewire-facts-2020-must-read.html
> >   https://www.soundonsound.com/forum/viewtopic.php?t=69651&start=12
> >   https://forum.rme-audio.de/viewtopic.php?id=30307
> > 
> > This is a big problem for me.
> > 
> > Some of these might be a broken BIOS that turns on 10-bit tags
> > when the completer doesn't support them.  I didn't try to debug
> > them to that level.  But the last thing I want is to enable 10-bit
> > by default and cause boot issues or sound card issues or whatever.
>
> It seems a BIOS software bug, as it turned on (as default) a 10-Bit
> Tag Field for RP, but the card (non-Gen4 card) does not support
> 10-Bit Completer.

It doesn't matter *where* the problem is.  If we change Linux to
*expose* a BIOS bug, that's just as much of a problem as if the bug
were in Linux.  Users are not equipped to diagnose or fix problems
like that.

> This patch we enable 10-Bit Tag Requester for EP when RC supports
> 10-Bit Tag Completer capability. So it shuld be worked ok.

That's true as long as the RC supports 10-bit tags correctly when it
advertises support for them.  It "should" work :)

But it does remind me that if the RC doesn't support 10-bit tags, but
we use sysfs to enable 10-bit tags for a reqester that intends to use
P2PDMA to a peer that *does* support them, I don't think there's
any check in the DMA API that prevents the driver from setting up DMA
to the RC in addition to the peer.

> But I still think default to "on" will be better,
> Current we enable 10-Bit Tag, in the future PCIe 6.0 maybe need to use
> 14-Bit tags to get good performance.

Maybe we can default to "on" based on BIOS date or something.  Older
systems that want the benefit can use the param to enable it, and if
there's a problem, the cause will be obvious ("we booted with
'pci=tag-bits=10' and things broke").

If we enable 10-bit tags by default on systems from 2022 or newer, we
shouldn't break any existing systems, and we have a chance to discover
any problems and add quirk if necessary.

> > In any case, we (by which I'm afraid I mean "you" :)) need to
> > investigate the problem reports, figure out whether we will see
> > similar problems, and fix them before merging if we can.
>
> We have tested a PCIe 5.0 network card on FPGA with 10-Bit tag worked
> ok. I have not got the performance data as FPGA is slow.

10-bit tag support appeared in the spec four years ago (PCIe r4.0, in
September, 2017).  Surely there is production hardware that supports
this and could demonstrate a benefit from this.

We need a commit log that says "enabling 10-bit tags allows more
outstanding transactions, which improves performance of adapters like
X by Y% on these workloads," not a log that says "we think enabling
10-bit tags is safe, but users with non-compliant hardware may see new
PCIe errors or even non-bootable systems, and they should use boot
param X to work around this."

> Current we enable 10-Bit Tag Requester for EP when RC supports
> 10-Bit Tag Completer capability. It should be worked ok except
> hardware bugs, we also provide boot param to disable 10-Bit Tag if
> the hardware really have a bug or can do some quirks as 8-bit tag
> has done if we have known the hardware.

The problem is that turning it on by default means systems with
hardware defects *used* to work but now they mysteriously *stop*
working.  Yes, a boot param can work around that, but it's just
not an acceptable user experience.  Maybe there are no such defects.
I dunno.

Bjorn

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  2021-11-03 16:02         ` Bjorn Helgaas
@ 2021-11-05  8:24           ` Dongdong Liu
  2021-11-05 17:39             ` Bjorn Helgaas
  0 siblings, 1 reply; 17+ messages in thread
From: Dongdong Liu @ 2021-11-05  8:24 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, logang, leon, linux-pci, rajur, hverkuil-cisco, linux-media, netdev



On 2021/11/4 0:02, Bjorn Helgaas wrote:
> On Wed, Nov 03, 2021 at 06:05:34PM +0800, Dongdong Liu wrote:
>> On 2021/11/2 6:33, Bjorn Helgaas wrote:
>>> On Mon, Nov 01, 2021 at 05:02:41PM -0500, Bjorn Helgaas wrote:
>>>> On Sat, Oct 30, 2021 at 09:53:47PM +0800, Dongdong Liu wrote:
>>>>> 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
>>>>> field size from 8 bits to 10 bits.
>>>>>
>>>>> PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
>>>>> 10-Bit Tag Capabilities" Implementation Note:
>>>>>
>>>>>   For platforms where the RC supports 10-Bit Tag Completer capability,
>>>>>   it is highly recommended for platform firmware or operating software
>>>>>   that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
>>>>>   bit automatically in Endpoints with 10-Bit Tag Requester capability.
>>>>>   This enables the important class of 10-Bit Tag capable adapters that
>>>>>   send Memory Read Requests only to host memory.
>>>>>
>>>>> It's safe to enable 10-bit tags for all devices below a Root Port that
>>>>> supports them. Switches that lack 10-Bit Tag Completer capability are
>>>>> still able to forward NPRs and Completions carrying 10-Bit Tags correctly,
>>>>> since the two new Tag bits are in TLP Header bits that were formerly
>>>>> Reserved.
>>>>
>>>> Side note: the reason we want to do this to increase performance by
>>>> allowing more outstanding requests.  Do you have any benchmarking that
>>>> we can mention here to show that this is actually a benefit?  I don't
>>>> doubt that it is, but I assume you've measured it and it would be nice
>>>> to advertise it.
>>>
>>> Hmmm.  I did a quick Google search looking for "nvme pcie 10-bit tags"
>>> hoping to find some performance info, but what I *actually* found was
>>> several reports of 10-bit tags causing breakage:
>>>
>>>   https://www.reddit.com/r/MSI_Gaming/comments/exjvzg/x570_apro_7c37vh72beta_version_has_anyone_tryed_it/
>>>   https://rog.asus.com/forum/showthread.php?115064-Beware-of-agesa-1-0-0-4B-bios-not-good!/page2
>>>   https://forum-en.msi.com/index.php?threads/sound-blaster-z-has-weird-behaviour-after-updating-bios-x570-gaming-edge-wifi.325223/page-2
>>>   https://gearspace.com/board/electronic-music-instruments-and-electronic-music-production/1317189-h8000fw-firewire-facts-2020-must-read.html
>>>   https://www.soundonsound.com/forum/viewtopic.php?t=69651&start=12
>>>   https://forum.rme-audio.de/viewtopic.php?id=30307
>>>
>>> This is a big problem for me.
>>>
>>> Some of these might be a broken BIOS that turns on 10-bit tags
>>> when the completer doesn't support them.  I didn't try to debug
>>> them to that level.  But the last thing I want is to enable 10-bit
>>> by default and cause boot issues or sound card issues or whatever.
>>
>> It seems a BIOS software bug, as it turned on (as default) a 10-Bit
>> Tag Field for RP, but the card (non-Gen4 card) does not support
>> 10-Bit Completer.
>
> It doesn't matter *where* the problem is.  If we change Linux to
> *expose* a BIOS bug, that's just as much of a problem as if the bug
> were in Linux.  Users are not equipped to diagnose or fix problems
> like that.
>
>> This patch we enable 10-Bit Tag Requester for EP when RC supports
>> 10-Bit Tag Completer capability. So it shuld be worked ok.
>
> That's true as long as the RC supports 10-bit tags correctly when it
> advertises support for them.  It "should" work :)
>
> But it does remind me that if the RC doesn't support 10-bit tags, but
> we use sysfs to enable 10-bit tags for a reqester that intends to use
> P2PDMA to a peer that *does* support them, I don't think there's
> any check in the DMA API that prevents the driver from setting up DMA
> to the RC in addition to the peer.
Current we use sysfs to enable/disable 10-bit tags for a requester also
depend on the RP support 10-bit tag completer, so it will be ok.
>
>> But I still think default to "on" will be better,
>> Current we enable 10-Bit Tag, in the future PCIe 6.0 maybe need to use
>> 14-Bit tags to get good performance.
>
> Maybe we can default to "on" based on BIOS date or something.  Older
> systems that want the benefit can use the param to enable it, and if
> there's a problem, the cause will be obvious ("we booted with
> 'pci=tag-bits=10' and things broke").
>
> If we enable 10-bit tags by default on systems from 2022 or newer, we
> shouldn't break any existing systems, and we have a chance to discover
> any problems and add quirk if necessary.
>
>>> In any case, we (by which I'm afraid I mean "you" :)) need to
>>> investigate the problem reports, figure out whether we will see
>>> similar problems, and fix them before merging if we can.
>>
>> We have tested a PCIe 5.0 network card on FPGA with 10-Bit tag worked
>> ok. I have not got the performance data as FPGA is slow.
>
> 10-bit tag support appeared in the spec four years ago (PCIe r4.0, in
> September, 2017).  Surely there is production hardware that supports
> this and could demonstrate a benefit from this.
I found the below introduction about "Number of tags needed to achieve
maximum throughput for PCIe 4.0 and PCIe 5.0 links"
https://www.synopsys.com/designware-ip/technical-bulletin/accelerating-32gtps-pcie5-designs.html

It seems pretty clear.
>
> We need a commit log that says "enabling 10-bit tags allows more
> outstanding transactions, which improves performance of adapters like
> X by Y% on these workloads," not a log that says "we think enabling
> 10-bit tags is safe, but users with non-compliant hardware may see new
> PCIe errors or even non-bootable systems, and they should use boot
> param X to work around this."
Looks good, will fix the commit log.
I investigate some PCIe 4.0 cards such as mlx cx5(PCIe 4.0 16GT/s
x16), a NVME SSD(PCIe 4.0 16GT/s X4), but these cards only support
10-bit tag completer not support 10-bit tag requester. Maybe
these cards use 8-bit tag can achieve its performance specs.
>
>> Current we enable 10-Bit Tag Requester for EP when RC supports
>> 10-Bit Tag Completer capability. It should be worked ok except
>> hardware bugs, we also provide boot param to disable 10-Bit Tag if
>> the hardware really have a bug or can do some quirks as 8-bit tag
>> has done if we have known the hardware.
>
> The problem is that turning it on by default means systems with
> hardware defects *used* to work but now they mysteriously *stop*
> working.  Yes, a boot param can work around that, but it's just
> not an acceptable user experience.  Maybe there are no such defects.
> I dunno.
Ok, current defaulted to "off" and use boot param and sysfs to turn on
maybe a safe choice.

Thanks,
Dongdong
>
> Bjorn
> .
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  2021-11-05  8:24           ` Dongdong Liu
@ 2021-11-05 17:39             ` Bjorn Helgaas
  0 siblings, 0 replies; 17+ messages in thread
From: Bjorn Helgaas @ 2021-11-05 17:39 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, logang, leon, linux-pci, rajur, hverkuil-cisco, linux-media, netdev

On Fri, Nov 05, 2021 at 04:24:24PM +0800, Dongdong Liu wrote:
> On 2021/11/4 0:02, Bjorn Helgaas wrote:

> > But it does remind me that if the RC doesn't support 10-bit tags, but
> > we use sysfs to enable 10-bit tags for a reqester that intends to use
> > P2PDMA to a peer that *does* support them, I don't think there's
> > any check in the DMA API that prevents the driver from setting up DMA
> > to the RC in addition to the peer.
>
> Current we use sysfs to enable/disable 10-bit tags for a requester also
> depend on the RP support 10-bit tag completer, so it will be ok.

Ah, OK.  So we can never *enable* 10-bit tags unless the Root Port
supports them.

I misunderstood the purpose of this file.  When the Root Port doesn't
support 10-bit tags, we won't enable them during enumeration.  I
though the point was that if we want to do P2PDMA to a peer that
*does* support them, we could use this file to enable them.

But my understanding was wrong -- the real purpose of the file is to
*disable* 10-bit tags for the case when a P2PDMA peer doesn't support
them.

It does support enabling 10-bit tags as well, but that's only because
we need a way to get back to the default "enabled during enumeration"
state without having to reboot.

We might be able to highlight this a little more in the commit log.

> > 10-bit tag support appeared in the spec four years ago (PCIe r4.0, in
> > September, 2017).  Surely there is production hardware that supports
> > this and could demonstrate a benefit from this.
>
> I found the below introduction about "Number of tags needed to achieve
> maximum throughput for PCIe 4.0 and PCIe 5.0 links"
> https://www.synopsys.com/designware-ip/technical-bulletin/accelerating-32gtps-pcie5-designs.html
> 
> It seems pretty clear.

Yes, that's a start.  But we don't really need a white paper to tell
us that more outstanding transactions is better.  That's obvious.  But
this adds risk, and if we can't demonstrate a tangible, measurable
benefit, there's no point in doing it.

Bjorn

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2021-11-05 17:39 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-30 13:53 [PATCH V11 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
2021-10-30 13:53 ` [PATCH V11 1/8] PCI: Use cached devcap in more places Dongdong Liu
2021-10-30 13:53 ` [PATCH V11 2/8] PCI: Cache Device Capabilities 2 Register Dongdong Liu
2021-10-30 13:53 ` [PATCH V11 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
2021-10-30 13:53 ` [PATCH V11 4/8] PCI/sysfs: Add a tags sysfs file for PCIe Endpoint devices Dongdong Liu
2021-11-01 20:54   ` Bjorn Helgaas
2021-11-02 13:03     ` Dongdong Liu
2021-10-30 13:53 ` [PATCH V11 5/8] PCI/IOV: Add tags sysfs files for VF devices Dongdong Liu
2021-10-30 13:53 ` [PATCH V11 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
2021-10-30 13:53 ` [PATCH V11 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device Dongdong Liu
2021-11-01 22:02   ` Bjorn Helgaas
2021-11-01 22:33     ` Bjorn Helgaas
2021-11-03 10:05       ` Dongdong Liu
2021-11-03 16:02         ` Bjorn Helgaas
2021-11-05  8:24           ` Dongdong Liu
2021-11-05 17:39             ` Bjorn Helgaas
2021-10-30 13:53 ` [PATCH V11 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices Dongdong Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).