All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V9 0/8] PCI: Enable 10-Bit tag support for PCIe devices
@ 2021-09-22 13:36 Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 1/8] PCI: Use cached devcap in more places Dongdong Liu
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Dongdong Liu @ 2021-09-22 13:36 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
field size from 8 bits to 10 bits.

This patchset is to enable 10-Bit tag for PCIe EP devices (include VF).

V8->V9:
- Rebased on V5.15-rc2.
- Rename pcie_devcap to devcap, pcie_devcap2 to devcap2 to keep the same
  style with commit 691392448065 ("PCI: Cache PCIe Device Capabilities
  register").

V7->V8:
- Add a kernel parameter pcie_tag_peer2peer to disable 10-bit tags.
- Provide sysfs file to enable 10-bit tags.
- Remove [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices.
- Rebased on v5.14-rc6.
- Fix some other comments. Thanks to Bjorn who gave a lot of review
  comments.

V6->V7:
- Rebased on v5.14-rc3.
- Change the "pci=disable_10bit_tag=" parameter to sysfs file to disable
  10-Bit Tag Requester when need for p2pdma suggested by Leon.
- Fix comment for p2pdma 10-bit tag check.

V5->V6:
- Rebased on v5.14-rc2.
- Add Reviewed-by: Christoph Hellwig <hch@lst.de> in [PATCH V6 2/8].
- PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support.
- Add a 10-bit tag check in P2PDMA.
- Simplified implementation in [PATCH V6 6/8].
- Fix some comments in [PATCH V6 4/8].

V4->V5:
- Fix warning variable 'capa' is uninitialized.
- Fix warning unused variable 'pchild'.

V3->V4:
- Get the value of pcie_devcap2 in set_pcie_port_type().
- Add Reviewed-by: Christoph Hellwig <hch@lst.de> in [PATCH V4 1/6],
  [PATCH V4 3/6], [PATCH V4 4/6], [PATCH V4 5/6].
- Fix some code style.
- Rebased on v5.13-rc6.

V2->V3:
- Use cached Device Capabilities Register suggested by Christoph.
- Fix code style to avoid > 80 char lines.
- Rename devcap2 to pcie_devcap2.

V1->V2: Fix some comments by Christoph.
- Store the devcap2 value in the pci_dev instead of reading it multiple
  times.
- Change pci_info to pci_dbg to avoid the noisy log.
- Rename ext_10bit_tag_comp_path to ext_10bit_tag.
- Fix the compile error.
- Rebased on v5.13-rc1.

Dongdong Liu (8):
  PCI: Use cached devcap in more places
  PCI: Cache Device Capabilities 2 Register
  PCI: Add 10-Bit Tag register definitions
  PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices
  PCI/IOV: Add 10-Bit Tag sysfs files for VF devices
  PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices

 Documentation/ABI/testing/sysfs-bus-pci       | 41 +++++++++-
 .../admin-guide/kernel-parameters.txt         |  5 ++
 drivers/media/pci/cobalt/cobalt-driver.c      |  4 +-
 .../net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  4 +-
 drivers/pci/iov.c                             | 67 ++++++++++++++++
 drivers/pci/p2pdma.c                          | 48 +++++++++++
 drivers/pci/pci-sysfs.c                       | 79 +++++++++++++++++++
 drivers/pci/pci.c                             | 12 +--
 drivers/pci/pci.h                             |  9 +++
 drivers/pci/pcie/aspm.c                       | 11 +--
 drivers/pci/probe.c                           | 75 +++++++++++++++---
 drivers/pci/quirks.c                          |  3 +-
 include/linux/pci.h                           |  1 +
 include/uapi/linux/pci_regs.h                 |  5 ++
 14 files changed, 332 insertions(+), 32 deletions(-)

--
2.22.0


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH V9 1/8] PCI: Use cached devcap in more places
  2021-09-22 13:36 [PATCH V9 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
@ 2021-09-22 13:36 ` Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 2/8] PCI: Cache Device Capabilities 2 Register Dongdong Liu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Dongdong Liu @ 2021-09-22 13:36 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Since commit 691392448065 ("PCI: Cache PCIe Device Capabilities register")
has already added a new member called devcap in struct pci_dev for
caching the PCIe Device Capabilities register to avoid reading
PCI_EXP_DEVCAP multiple times. Use devcap in more needed places.

Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/media/pci/cobalt/cobalt-driver.c |  4 ++--
 drivers/pci/pcie/aspm.c                  | 11 ++++-------
 drivers/pci/probe.c                      |  7 +------
 drivers/pci/quirks.c                     |  3 +--
 4 files changed, 8 insertions(+), 17 deletions(-)

diff --git a/drivers/media/pci/cobalt/cobalt-driver.c b/drivers/media/pci/cobalt/cobalt-driver.c
index 16af58f2f93c..bc04184f1f74 100644
--- a/drivers/media/pci/cobalt/cobalt-driver.c
+++ b/drivers/media/pci/cobalt/cobalt-driver.c
@@ -193,11 +193,11 @@ void cobalt_pcie_status_show(struct cobalt *cobalt)
 		return;
 
 	/* Device */
-	pcie_capability_read_dword(pci_dev, PCI_EXP_DEVCAP, &capa);
 	pcie_capability_read_word(pci_dev, PCI_EXP_DEVCTL, &ctrl);
 	pcie_capability_read_word(pci_dev, PCI_EXP_DEVSTA, &stat);
 	cobalt_info("PCIe device capability 0x%08x: Max payload %d\n",
-		    capa, get_payload_size(capa & PCI_EXP_DEVCAP_PAYLOAD));
+		    pci_dev->devcap,
+		    get_payload_size(pci_dev->devcap & PCI_EXP_DEVCAP_PAYLOAD));
 	cobalt_info("PCIe device control 0x%04x: Max payload %d. Max read request %d\n",
 		    ctrl,
 		    get_payload_size((ctrl & PCI_EXP_DEVCTL_PAYLOAD) >> 5),
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 013a47f587ce..82d6234a4aa5 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -660,7 +660,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 
 	/* Get and check endpoint acceptable latencies */
 	list_for_each_entry(child, &linkbus->devices, bus_list) {
-		u32 reg32, encoding;
+		u32 encoding;
 		struct aspm_latency *acceptable =
 			&link->acceptable[PCI_FUNC(child->devfn)];
 
@@ -668,12 +668,11 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 		    pci_pcie_type(child) != PCI_EXP_TYPE_LEG_END)
 			continue;
 
-		pcie_capability_read_dword(child, PCI_EXP_DEVCAP, &reg32);
 		/* Calculate endpoint L0s acceptable latency */
-		encoding = (reg32 & PCI_EXP_DEVCAP_L0S) >> 6;
+		encoding = (child->devcap & PCI_EXP_DEVCAP_L0S) >> 6;
 		acceptable->l0s = calc_l0s_acceptable(encoding);
 		/* Calculate endpoint L1 acceptable latency */
-		encoding = (reg32 & PCI_EXP_DEVCAP_L1) >> 9;
+		encoding = (child->devcap & PCI_EXP_DEVCAP_L1) >> 9;
 		acceptable->l1 = calc_l1_acceptable(encoding);
 
 		pcie_aspm_check_latency(child);
@@ -808,7 +807,6 @@ static void free_link_state(struct pcie_link_state *link)
 static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 {
 	struct pci_dev *child;
-	u32 reg32;
 
 	/*
 	 * Some functions in a slot might not all be PCIe functions,
@@ -831,8 +829,7 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
 		 * RBER bit to determine if a function is 1.1 version device
 		 */
-		pcie_capability_read_dword(child, PCI_EXP_DEVCAP, &reg32);
-		if (!(reg32 & PCI_EXP_DEVCAP_RBER) && !aspm_force) {
+		if (!(child->devcap & PCI_EXP_DEVCAP_RBER) && !aspm_force) {
 			pci_info(child, "disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'\n");
 			return -EINVAL;
 		}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index d9fc02a71baa..96ecdf34f931 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2044,18 +2044,13 @@ static void pci_configure_mps(struct pci_dev *dev)
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-	u32 cap;
 	u16 ctl;
 	int ret;
 
 	if (!pci_is_pcie(dev))
 		return 0;
 
-	ret = pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
-	if (ret)
-		return 0;
-
-	if (!(cap & PCI_EXP_DEVCAP_EXT_TAG))
+	if (!(dev->devcap & PCI_EXP_DEVCAP_EXT_TAG))
 		return 0;
 
 	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL, &ctl);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 4537d1ea14fd..1bd0d610f3e0 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5260,8 +5260,7 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
 		pdev->pcie_cap = pos;
 		pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
 		pdev->pcie_flags_reg = reg16;
-		pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, &reg16);
-		pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD;
+		pdev->pcie_mpss = pdev->devcap & PCI_EXP_DEVCAP_PAYLOAD;
 
 		pdev->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
 		if (pci_read_config_dword(pdev, PCI_CFG_SPACE_SIZE, &status) !=
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V9 2/8] PCI: Cache Device Capabilities 2 Register
  2021-09-22 13:36 [PATCH V9 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 1/8] PCI: Use cached devcap in more places Dongdong Liu
@ 2021-09-22 13:36 ` Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Dongdong Liu @ 2021-09-22 13:36 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add a new member called devcap2 in struct pci_dev for caching the PCIe
Device Capabilities 2 register to avoid reading PCI_EXP_DEVCAP2 multiple
times.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |  4 +---
 drivers/pci/pci.c                               |  8 +++-----
 drivers/pci/probe.c                             | 10 ++++------
 include/linux/pci.h                             |  1 +
 4 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 0d9cda4ab303..ae0b6def994b 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -6304,7 +6304,6 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs)
 		struct pci_dev *pbridge;
 		struct port_info *pi;
 		char name[IFNAMSIZ];
-		u32 devcap2;
 		u16 flags;
 
 		/* If we want to instantiate Virtual Functions, then our
@@ -6314,10 +6313,9 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs)
 		 */
 		pbridge = pdev->bus->self;
 		pcie_capability_read_word(pbridge, PCI_EXP_FLAGS, &flags);
-		pcie_capability_read_dword(pbridge, PCI_EXP_DEVCAP2, &devcap2);
 
 		if ((flags & PCI_EXP_FLAGS_VERS) < 2 ||
-		    !(devcap2 & PCI_EXP_DEVCAP2_ARI)) {
+		    !(pbridge->devcap2 & PCI_EXP_DEVCAP2_ARI)) {
 			/* Our parent bridge does not support ARI so issue a
 			 * warning and skip instantiating the VFs.  They
 			 * won't be reachable.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index ce2ab62b64cf..64138a83b0f7 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3717,7 +3717,7 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
 {
 	struct pci_bus *bus = dev->bus;
 	struct pci_dev *bridge;
-	u32 cap, ctl2;
+	u32 ctl2;
 
 	if (!pci_is_pcie(dev))
 		return -EINVAL;
@@ -3741,19 +3741,17 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
 	while (bus->parent) {
 		bridge = bus->self;
 
-		pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
-
 		switch (pci_pcie_type(bridge)) {
 		/* Ensure switch ports support AtomicOp routing */
 		case PCI_EXP_TYPE_UPSTREAM:
 		case PCI_EXP_TYPE_DOWNSTREAM:
-			if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
+			if (!(bridge->devcap2 & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
 				return -EINVAL;
 			break;
 
 		/* Ensure root port supports all the sizes we care about */
 		case PCI_EXP_TYPE_ROOT_PORT:
-			if ((cap & cap_mask) != cap_mask)
+			if ((bridge->devcap2 & cap_mask) != cap_mask)
 				return -EINVAL;
 			break;
 		}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 96ecdf34f931..7259ad774ac8 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1509,6 +1509,7 @@ void set_pcie_port_type(struct pci_dev *pdev)
 	pdev->pcie_flags_reg = reg16;
 	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->devcap);
 	pdev->pcie_mpss = FIELD_GET(PCI_EXP_DEVCAP_PAYLOAD, pdev->devcap);
+	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP2, &pdev->devcap2);
 
 	parent = pci_upstream_bridge(pdev);
 	if (!parent)
@@ -2129,7 +2130,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 #ifdef CONFIG_PCIEASPM
 	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
 	struct pci_dev *bridge;
-	u32 cap, ctl;
+	u32 ctl;
 
 	if (!pci_is_pcie(dev))
 		return;
@@ -2137,8 +2138,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 	/* Read L1 PM substate capabilities */
 	dev->l1ss = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_L1SS);
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
-	if (!(cap & PCI_EXP_DEVCAP2_LTR))
+	if (!(dev->devcap2 & PCI_EXP_DEVCAP2_LTR))
 		return;
 
 	pcie_capability_read_dword(dev, PCI_EXP_DEVCTL2, &ctl);
@@ -2178,13 +2178,11 @@ static void pci_configure_eetlp_prefix(struct pci_dev *dev)
 #ifdef CONFIG_PCI_PASID
 	struct pci_dev *bridge;
 	int pcie_type;
-	u32 cap;
 
 	if (!pci_is_pcie(dev))
 		return;
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
-	if (!(cap & PCI_EXP_DEVCAP2_EE_PREFIX))
+	if (!(dev->devcap2 & PCI_EXP_DEVCAP2_EE_PREFIX))
 		return;
 
 	pcie_type = pci_pcie_type(dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index cd8aa6fce204..286d89e22738 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -333,6 +333,7 @@ struct pci_dev {
 	struct pci_dev  *rcec;          /* Associated RCEC device */
 #endif
 	u32		devcap;		/* PCIe Device Capabilities */
+	u32		devcap2;	/* PCIe Device Capabilities 2 */
 	u8		pcie_cap;	/* PCIe capability offset */
 	u8		msi_cap;	/* MSI capability offset */
 	u8		msix_cap;	/* MSI-X capability offset */
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V9 3/8] PCI: Add 10-Bit Tag register definitions
  2021-09-22 13:36 [PATCH V9 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 1/8] PCI: Use cached devcap in more places Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 2/8] PCI: Cache Device Capabilities 2 Register Dongdong Liu
@ 2021-09-22 13:36 ` Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices Dongdong Liu
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Dongdong Liu @ 2021-09-22 13:36 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add 10-Bit Tag register definitions for use in subsequen patches.
See the PCIe 5.0 spec section 7.5.3.15 and 9.3.3.2.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 include/uapi/linux/pci_regs.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index e709ae8235e7..cf1ddb82a6b9 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -648,6 +648,8 @@
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* 64b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200 /* 128b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_LTR		0x00000800 /* Latency tolerance reporting */
+#define  PCI_EXP_DEVCAP2_10BIT_TAG_COMP 0x00010000 /* 10-Bit Tag Completer Supported */
+#define  PCI_EXP_DEVCAP2_10BIT_TAG_REQ  0x00020000 /* 10-Bit Tag Requester Supported */
 #define  PCI_EXP_DEVCAP2_OBFF_MASK	0x000c0000 /* OBFF support mechanism */
 #define  PCI_EXP_DEVCAP2_OBFF_MSG	0x00040000 /* New message signaling */
 #define  PCI_EXP_DEVCAP2_OBFF_WAKE	0x00080000 /* Re-use WAKE# for OBFF */
@@ -661,6 +663,7 @@
 #define  PCI_EXP_DEVCTL2_IDO_REQ_EN	0x0100	/* Allow IDO for requests */
 #define  PCI_EXP_DEVCTL2_IDO_CMP_EN	0x0200	/* Allow IDO for completions */
 #define  PCI_EXP_DEVCTL2_LTR_EN		0x0400	/* Enable LTR mechanism */
+#define  PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN 0x1000 /* 10-Bit Tag Requester Enable */
 #define  PCI_EXP_DEVCTL2_OBFF_MSGA_EN	0x2000	/* Enable OBFF Message type A */
 #define  PCI_EXP_DEVCTL2_OBFF_MSGB_EN	0x4000	/* Enable OBFF Message type B */
 #define  PCI_EXP_DEVCTL2_OBFF_WAKE_EN	0x6000	/* OBFF using WAKE# signaling */
@@ -931,6 +934,7 @@
 /* Single Root I/O Virtualization */
 #define PCI_SRIOV_CAP		0x04	/* SR-IOV Capabilities */
 #define  PCI_SRIOV_CAP_VFM	0x00000001  /* VF Migration Capable */
+#define  PCI_SRIOV_CAP_VF_10BIT_TAG_REQ	0x00000004 /* VF 10-Bit Tag Requester Supported */
 #define  PCI_SRIOV_CAP_INTR(x)	((x) >> 21) /* Interrupt Message Number */
 #define PCI_SRIOV_CTRL		0x08	/* SR-IOV Control */
 #define  PCI_SRIOV_CTRL_VFE	0x0001	/* VF Enable */
@@ -938,6 +942,7 @@
 #define  PCI_SRIOV_CTRL_INTR	0x0004	/* VF Migration Interrupt Enable */
 #define  PCI_SRIOV_CTRL_MSE	0x0008	/* VF Memory Space Enable */
 #define  PCI_SRIOV_CTRL_ARI	0x0010	/* ARI Capable Hierarchy */
+#define  PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN 0x0020 /* VF 10-Bit Tag Requester Enable */
 #define PCI_SRIOV_STATUS	0x0a	/* SR-IOV Status */
 #define  PCI_SRIOV_STATUS_VFM	0x0001	/* VF Migration Status */
 #define PCI_SRIOV_INITIAL_VF	0x0c	/* Initial VFs */
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V9 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices
  2021-09-22 13:36 [PATCH V9 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (2 preceding siblings ...)
  2021-09-22 13:36 ` [PATCH V9 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
@ 2021-09-22 13:36 ` Dongdong Liu
  2021-09-23  4:21   ` Krzysztof Wilczyński
  2021-09-22 13:36 ` [PATCH V9 5/8] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices Dongdong Liu
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 11+ messages in thread
From: Dongdong Liu @ 2021-09-22 13:36 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as
  opposed to host memory), the Endpoint must not send 10-Bit Tag
  Requests to another given Endpoint unless an implementation-specific
  mechanism determines that the Endpoint supports 10-Bit Tag Completer
  capability.

Add a 10bit_tag sysfs file, write 0 to disable 10-Bit Tag Requester
when the driver does not bind the device. The typical use case is for
p2pdma when the peer device does not support 10-Bit Tag Completer.
Write 1 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
Completer capability. The typical use case is for host memory targeted
by DMA Requests. The 10bit_tag file content indicate current status of
10-Bit Tag Requester Enable.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 18 +++++-
 drivers/pci/pci-sysfs.c                 | 76 +++++++++++++++++++++++++
 drivers/pci/pci.h                       |  2 +
 drivers/pci/probe.c                     | 14 +++++
 4 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index d4ae03296861..0c26346d1069 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -156,7 +156,7 @@ Description:
 		binary file containing the Vital Product Data for the
 		device.  It should follow the VPD format defined in
 		PCI Specification 2.1 or 2.2, but users should consider
-		that some devices may have incorrectly formatted data.  
+		that some devices may have incorrectly formatted data.
 		If the underlying VPD has a writable section then the
 		corresponding section of this file will be writable.
 
@@ -424,3 +424,19 @@ Description:
 
 		The file is writable if the PF is bound to a driver that
 		implements ->sriov_set_msix_vec_count().
+
+What:		/sys/bus/pci/devices/.../10bit_tag
+Date:		September 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		The file will be visible when the device supports 10-Bit Tag
+		Requester. The file is readable, the value indicate current
+		status of 10-Bit Tag Requester Enable.
+		1 - enabled, 0 - disabled.
+
+		The file is also writable, write 0 to disable 10-Bit Tag
+		Requester when the driver does not bind the device. The typical
+		use case is for p2pdma when the peer device does not support
+		10-Bit Tag Completer. Write 1 to enable 10-Bit Tag Requester
+		when RC supports 10-Bit Tag Completer capability. The typical
+		use case is for host memory targeted by DMA Requests.
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 7fb5cd17cc98..7be65e919d33 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -306,6 +306,53 @@ static ssize_t enable_show(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RW(enable);
 
+static ssize_t pci_10bit_tag_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	bool enable;
+
+	if (kstrtobool(buf, &enable) < 0)
+		return -EINVAL;
+
+	if (pdev->driver)
+		return -EBUSY;
+
+	if (enable) {
+		if (!pcie_rp_10bit_tag_cmp_supported(pdev))
+			return -EPERM;
+
+		pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2,
+				PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	} else {
+		pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
+				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	}
+
+	return count;
+}
+
+static ssize_t pci_10bit_tag_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u16 ctl;
+	int ret;
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &ctl);
+	if (ret)
+		return -EINVAL;
+
+	return sysfs_emit(buf, "%u\n",
+			  !!(ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN));
+}
+
+static struct device_attribute dev_attr_10bit_tag = __ATTR(10bit_tag, 0644,
+							   pci_10bit_tag_show,
+							   pci_10bit_tag_store);
+
 #ifdef CONFIG_NUMA
 static ssize_t numa_node_store(struct device *dev,
 			       struct device_attribute *attr, const char *buf,
@@ -635,6 +682,11 @@ static struct attribute *pcie_dev_attrs[] = {
 	NULL,
 };
 
+static struct attribute *pcie_dev_10bit_tag_attrs[] = {
+	&dev_attr_10bit_tag.attr,
+	NULL,
+};
+
 static struct attribute *pcibus_attrs[] = {
 	&dev_attr_bus_rescan.attr,
 	&dev_attr_cpuaffinity.attr,
@@ -1482,6 +1534,24 @@ static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
 	return 0;
 }
 
+static umode_t pcie_dev_10bit_tag_attrs_are_visible(struct kobject *kobj,
+					  struct attribute *a, int n)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (pdev->is_virtfn)
+		return 0;
+
+	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ENDPOINT)
+		return 0;
+
+	if (!(pdev->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
+		return 0;
+
+	return a->mode;
+}
+
 static const struct attribute_group pci_dev_group = {
 	.attrs = pci_dev_attrs,
 };
@@ -1522,6 +1592,11 @@ static const struct attribute_group pcie_dev_attr_group = {
 	.is_visible = pcie_dev_attrs_are_visible,
 };
 
+static const struct attribute_group pcie_dev_10bit_tag_attr_group = {
+	.attrs = pcie_dev_10bit_tag_attrs,
+	.is_visible = pcie_dev_10bit_tag_attrs_are_visible,
+};
+
 static const struct attribute_group *pci_dev_attr_groups[] = {
 	&pci_dev_attr_group,
 	&pci_dev_hp_attr_group,
@@ -1531,6 +1606,7 @@ static const struct attribute_group *pci_dev_attr_groups[] = {
 #endif
 	&pci_bridge_attr_group,
 	&pcie_dev_attr_group,
+	&pcie_dev_10bit_tag_attr_group,
 #ifdef CONFIG_PCIEAER
 	&aer_stats_attr_group,
 #endif
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 1cce56c2aea0..f719a41dfc7f 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -264,6 +264,8 @@ struct device *pci_get_host_bridge_device(struct pci_dev *dev);
 void pci_put_host_bridge_device(struct device *dev);
 
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
+bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev);
+
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 7259ad774ac8..705dd4e85df5 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2042,6 +2042,20 @@ static void pci_configure_mps(struct pci_dev *dev)
 		 p_mps, mps, mpss);
 }
 
+bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev)
+{
+	struct pci_dev *root;
+
+	root = pcie_find_root_port(dev);
+	if (!root)
+		return false;
+
+	if (!(root->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
+		return false;
+
+	return true;
+}
+
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V9 5/8] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices
  2021-09-22 13:36 [PATCH V9 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (3 preceding siblings ...)
  2021-09-22 13:36 ` [PATCH V9 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices Dongdong Liu
@ 2021-09-22 13:36 ` Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Dongdong Liu @ 2021-09-22 13:36 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as
  opposed to host memory), the Endpoint must not send 10-Bit Tag
  Requests to another given Endpoint unless an implementation-specific
  mechanism determines that the Endpoint supports 10-Bit Tag Completer
  capability.

Add sriov_vf_10bit_tag file to query the status of VF 10-Bit Tag
Requester Enable.

Add a sriov_vf_10bit_tag_ctl sysfs file, write 0 to disable the VF
10-Bit Tag Requester. The typical use case is for p2pdma when the peer
device does not support 10-Bit Tag Completer. Write 1 to enable 10-Bit
Tag Requester when RC supports 10-Bit Tag Completer capability. The
typical use case is for host memory targeted by DMA Requests.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 23 +++++++++++
 drivers/pci/iov.c                       | 53 +++++++++++++++++++++++++
 2 files changed, 76 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 0c26346d1069..28b1f71df620 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -440,3 +440,26 @@ Description:
 		10-Bit Tag Completer. Write 1 to enable 10-Bit Tag Requester
 		when RC supports 10-Bit Tag Completer capability. The typical
 		use case is for host memory targeted by DMA Requests.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag
+Date:		September 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		This file is associated with a SR-IOV physical function (PF).
+		It is visible when the device supports VF 10-Bit Tag Requester.
+		It contains the status of VF 10-Bit Tag Requester Enable.
+		The file is read-only.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl
+Date:		September 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		This file is associated with a SR-IOV virtual function (VF).
+		It is visible when the device supports VF 10-Bit Tag
+		Requester. The file is only writeable when the VF driver
+		does not bind to a device. Write 0 to any VF's file disables
+		10-Bit Tag Requester for all VFs. The typical use case is for
+		p2pdma when the peer device does not support 10-Bit Tag
+		Completer. Write 1 to enable 10-Bit Tag Requester for all VFs
+		when RC supports 10-Bit Tag Completer capability. The typical
+		use case is for host memory targeted by DMA Requests.
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index dafdc652fcd0..7781b87a940c 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -220,10 +220,41 @@ static ssize_t sriov_vf_msix_count_store(struct device *dev,
 static DEVICE_ATTR_WO(sriov_vf_msix_count);
 #endif
 
+static ssize_t sriov_vf_10bit_tag_ctl_store(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t count)
+{
+	struct pci_dev *vf_dev = to_pci_dev(dev);
+	struct pci_dev *pdev = pci_physfn(vf_dev);
+	bool enable;
+
+	if (kstrtobool(buf, &enable) < 0)
+		return -EINVAL;
+
+	if (vf_dev->driver)
+		return -EBUSY;
+
+	if (enable) {
+		if (!pcie_rp_10bit_tag_cmp_supported(pdev))
+			return -EPERM;
+
+		pdev->sriov->ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+	} else {
+		pdev->sriov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+	}
+
+	pci_write_config_word(pdev, pdev->sriov->pos + PCI_SRIOV_CTRL,
+				      pdev->sriov->ctrl);
+
+	return count;
+}
+static DEVICE_ATTR_WO(sriov_vf_10bit_tag_ctl);
+
 static struct attribute *sriov_vf_dev_attrs[] = {
 #ifdef CONFIG_PCI_MSI
 	&dev_attr_sriov_vf_msix_count.attr,
 #endif
+	&dev_attr_sriov_vf_10bit_tag_ctl.attr,
 	NULL,
 };
 
@@ -236,6 +267,11 @@ static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
 	if (!pdev->is_virtfn)
 		return 0;
 
+	pdev = pci_physfn(pdev);
+	if ((a == &dev_attr_sriov_vf_10bit_tag_ctl.attr) &&
+	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
+		return 0;
+
 	return a->mode;
 }
 
@@ -487,12 +523,23 @@ static ssize_t sriov_drivers_autoprobe_store(struct device *dev,
 	return count;
 }
 
+static ssize_t sriov_vf_10bit_tag_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	return sysfs_emit(buf, "%u\n",
+		!!(pdev->sriov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN));
+}
+
 static DEVICE_ATTR_RO(sriov_totalvfs);
 static DEVICE_ATTR_RW(sriov_numvfs);
 static DEVICE_ATTR_RO(sriov_offset);
 static DEVICE_ATTR_RO(sriov_stride);
 static DEVICE_ATTR_RO(sriov_vf_device);
 static DEVICE_ATTR_RW(sriov_drivers_autoprobe);
+static DEVICE_ATTR_RO(sriov_vf_10bit_tag);
 
 static struct attribute *sriov_pf_dev_attrs[] = {
 	&dev_attr_sriov_totalvfs.attr,
@@ -501,6 +548,7 @@ static struct attribute *sriov_pf_dev_attrs[] = {
 	&dev_attr_sriov_stride.attr,
 	&dev_attr_sriov_vf_device.attr,
 	&dev_attr_sriov_drivers_autoprobe.attr,
+	&dev_attr_sriov_vf_10bit_tag.attr,
 #ifdef CONFIG_PCI_MSI
 	&dev_attr_sriov_vf_total_msix.attr,
 #endif
@@ -511,10 +559,15 @@ static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
 					  struct attribute *a, int n)
 {
 	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
 
 	if (!dev_is_pf(dev))
 		return 0;
 
+	if ((a == &dev_attr_sriov_vf_10bit_tag.attr) &&
+	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
+		return 0;
+
 	return a->mode;
 }
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V9 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-09-22 13:36 [PATCH V9 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (4 preceding siblings ...)
  2021-09-22 13:36 ` [PATCH V9 5/8] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices Dongdong Liu
@ 2021-09-22 13:36 ` Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices Dongdong Liu
  7 siblings, 0 replies; 11+ messages in thread
From: Dongdong Liu @ 2021-09-22 13:36 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
10-Bit Tag Requester doesn't interact with a device that does not
support 10-Bit Tag Completer. Before that happens, the kernel should
emit a warning.

"echo 0 > /sys/bus/pci/devices/.../10bit_tag" to disable 10-Bit Tag
Requester for PF device.

"echo 0 > /sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl" to disable
10-Bit Tag Requester for VF device.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
---
 drivers/pci/p2pdma.c | 48 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 50cdde3e9a8b..804e390f4c22 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -19,6 +19,7 @@
 #include <linux/random.h>
 #include <linux/seq_buf.h>
 #include <linux/xarray.h>
+#include "pci.h"
 
 enum pci_p2pdma_map_type {
 	PCI_P2PDMA_MAP_UNKNOWN = 0,
@@ -410,6 +411,50 @@ static unsigned long map_types_idx(struct pci_dev *client)
 		(client->bus->number << 8) | client->devfn;
 }
 
+static bool pci_10bit_tags_unsupported(struct pci_dev *a,
+				       struct pci_dev *b,
+				       bool verbose)
+{
+	bool req;
+	bool comp;
+	u16 ctl;
+	const char *str = "10bit_tag";
+
+	if (a->is_virtfn) {
+#ifdef CONFIG_PCI_IOV
+		req = !!(a->physfn->sriov->ctrl &
+			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
+#endif
+	} else {
+		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl);
+		req = !!(ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	}
+
+	comp = !!(b->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
+
+	/* 10-bit tags not enabled on requester */
+	if (!req)
+		return false;
+
+	 /* Completer can handle anything */
+	if (comp)
+		return false;
+
+	if (!verbose)
+		return true;
+
+	pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set for this device, but peer device (%s) does not support the 10-Bit Tag Completer\n",
+		 pci_name(b));
+
+	if (a->is_virtfn)
+		str = "sriov_vf_10bit_tag_ctl";
+
+	pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/%s\n",
+		 pci_name(a), str);
+
+	return true;
+}
+
 /*
  * Calculate the P2PDMA mapping type and distance between two PCI devices.
  *
@@ -532,6 +577,9 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
 		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
 	}
 done:
+	if (pci_10bit_tags_unsupported(client, provider, verbose))
+		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
+
 	rcu_read_lock();
 	p2pdma = rcu_dereference(provider->p2pdma);
 	if (p2pdma)
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V9 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  2021-09-22 13:36 [PATCH V9 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (5 preceding siblings ...)
  2021-09-22 13:36 ` [PATCH V9 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
@ 2021-09-22 13:36 ` Dongdong Liu
  2021-09-22 13:36 ` [PATCH V9 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices Dongdong Liu
  7 siblings, 0 replies; 11+ messages in thread
From: Dongdong Liu @ 2021-09-22 13:36 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
field size from 8 bits to 10 bits.

PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
10-Bit Tag Capabilities" Implementation Note:

  For platforms where the RC supports 10-Bit Tag Completer capability,
  it is highly recommended for platform firmware or operating software
  that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
  bit automatically in Endpoints with 10-Bit Tag Requester capability.
  This enables the important class of 10-Bit Tag capable adapters that
  send Memory Read Requests only to host memory.

It's safe to enable 10-bit tags for all devices below a Root Port that
supports them. Switches that lack 10-Bit Tag Completer capability are
still able to forward NPRs and Completions carrying 10-Bit Tags correctly,
since the two new Tag bits are in TLP Header bits that were formerly
Reserved.

PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as opposed
  to host memory), the Endpoint must not send 10-Bit Tag Requests to
  another given Endpoint unless an implementation-specific mechanism
  determines that the Endpoint supports 10-Bit Tag Completer capability.

It is not safe for P2P traffic if an Endpoint send 10-Bit Tag Requesters
to another Endpoint that does not support 10-Bit Tag Completer capability,
so we provide sysfs file to disable 10-Bit Tag Requester. Unbind the
device driver, set the sysfs file and then rebind the driver.

Add a kernel parameter pcie_tag_peer2peer that disables 10-Bit Tag
Requester for all PCIe devices. This configuration allows peer-to-peer
DMA between any pair of devices, possibly at the cost of reduced
performance.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 .../admin-guide/kernel-parameters.txt         |  5 ++
 drivers/pci/iov.c                             |  3 ++
 drivers/pci/pci-sysfs.c                       |  3 ++
 drivers/pci/pci.c                             |  4 ++
 drivers/pci/pci.h                             |  7 +++
 drivers/pci/probe.c                           | 46 ++++++++++++++++++-
 6 files changed, 67 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 91ba391f9b32..54fddae7b2d9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3979,6 +3979,11 @@
 				any pair of devices, possibly at the cost of
 				reduced performance.  This also guarantees
 				that hot-added devices will work.
+		pcie_tag_peer2peer	Disable 10-Bit Tag Requester for all
+				PCIe devices. This configuration allows
+				peer-to-peer DMA between any pair of devices,
+				possibly at the cost of reduced performance.
+
 		cbiosize=nn[KMG]	The fixed amount of bus space which is
 				reserved for the CardBus bridge's IO window.
 				The default value is 256 bytes.
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 7781b87a940c..4fd1fca69457 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -234,6 +234,9 @@ static ssize_t sriov_vf_10bit_tag_ctl_store(struct device *dev,
 	if (vf_dev->driver)
 		return -EBUSY;
 
+	if (pcie_tag_config == PCIE_TAG_PEER2PEER)
+		return -EPERM;
+
 	if (enable) {
 		if (!pcie_rp_10bit_tag_cmp_supported(pdev))
 			return -EPERM;
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 7be65e919d33..edb06bb9600e 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -319,6 +319,9 @@ static ssize_t pci_10bit_tag_store(struct device *dev,
 	if (pdev->driver)
 		return -EBUSY;
 
+	if (pcie_tag_config == PCIE_TAG_PEER2PEER)
+		return -EPERM;
+
 	if (enable) {
 		if (!pcie_rp_10bit_tag_cmp_supported(pdev))
 			return -EPERM;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 64138a83b0f7..46faf2e8c8ab 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -118,6 +118,8 @@ enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_PEER2PEER;
 enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_DEFAULT;
 #endif
 
+enum pcie_tag_config_types pcie_tag_config = PCIE_TAG_DEFAULT;
+
 /*
  * The default CLS is used if arch didn't set CLS explicitly and not
  * all pci devices agree on the same value.  Arch can override either
@@ -6795,6 +6797,8 @@ static int __init pci_setup(char *str)
 				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
 			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
 				disable_acs_redir_param = str + 18;
+			} else if (!strncmp(str, "pcie_tag_peer2peer", 18)) {
+				pcie_tag_config = PCIE_TAG_PEER2PEER;
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index f719a41dfc7f..7846aa7b85dc 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -59,6 +59,13 @@ struct pci_cap_saved_state *pci_find_saved_cap(struct pci_dev *dev, char cap);
 struct pci_cap_saved_state *pci_find_saved_ext_cap(struct pci_dev *dev,
 						   u16 cap);
 
+enum pcie_tag_config_types {
+	PCIE_TAG_DEFAULT,   /* Enable 10-Bit Tag Requester for devices below
+			       Root Port that support 10-Bit Tag Completer. */
+	PCIE_TAG_PEER2PEER  /* Disable 10-Bit Tag Requester for all devices. */
+};
+extern enum pcie_tag_config_types pcie_tag_config;
+
 #define PCI_PM_D2_DELAY         200	/* usec; see PCIe r4.0, sec 5.9.1 */
 #define PCI_PM_D3HOT_WAIT       10	/* msec */
 #define PCI_PM_D3COLD_WAIT      100	/* msec */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 705dd4e85df5..ab18fac5d54d 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2056,10 +2056,30 @@ bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev)
 	return true;
 }
 
+static void pci_configure_10bit_tags(struct pci_dev *dev)
+{
+	/*
+	 * PCIe 5.0 section 9.3.5.10 10-Bit Tag Requester Enable in Device
+	 * Control 2 Register is RsvdP for VF.
+	 */
+	if (dev->is_virtfn)
+		return;
+
+	if (!pcie_rp_10bit_tag_cmp_supported(dev))
+		return;
+
+	if (pci_pcie_type(dev) != PCI_EXP_TYPE_ENDPOINT)
+		return;
+
+	pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
+	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+				PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+}
+
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-	u16 ctl;
+	u16 ctl, ctl2;
 	int ret;
 
 	if (!pci_is_pcie(dev))
@@ -2072,6 +2092,10 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 	if (ret)
 		return 0;
 
+	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL2, &ctl2);
+	if (ret)
+		return 0;
+
 	host = pci_find_host_bridge(dev->bus);
 	if (!host)
 		return 0;
@@ -2086,6 +2110,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
 						   PCI_EXP_DEVCTL_EXT_TAG);
 		}
+
+		if (ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN) {
+			pci_info(dev, "disabling 10-Bit Tags\n");
+			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
+					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+		}
 		return 0;
 	}
 
@@ -2094,6 +2124,20 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
 					 PCI_EXP_DEVCTL_EXT_TAG);
 	}
+
+	if ((pcie_tag_config == PCIE_TAG_PEER2PEER) &&
+	    (ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN)) {
+		pci_info(dev, "disabling 10-Bit Tags\n");
+		pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
+					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+		return 0;
+	}
+
+	if (pcie_tag_config != PCIE_TAG_DEFAULT)
+		return 0;
+
+	pci_configure_10bit_tags(dev);
+
 	return 0;
 }
 
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH V9 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices
  2021-09-22 13:36 [PATCH V9 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (6 preceding siblings ...)
  2021-09-22 13:36 ` [PATCH V9 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device Dongdong Liu
@ 2021-09-22 13:36 ` Dongdong Liu
  7 siblings, 0 replies; 11+ messages in thread
From: Dongdong Liu @ 2021-09-22 13:36 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Enable 10-Bit Tag Requester for the VF devices below the
Root Port that support 10-Bit Tag Completer.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 drivers/pci/iov.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 4fd1fca69457..28dd6875e2cb 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -690,6 +690,15 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 	pci_iov_set_numvfs(dev, nr_virtfn);
 	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+
+	if ((pcie_tag_config == PCIE_TAG_DEFAULT) &&
+	    (iov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ) &&
+	    pcie_rp_10bit_tag_cmp_supported(dev))
+		iov->ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+
+	if (pcie_tag_config == PCIE_TAG_PEER2PEER)
+		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	msleep(100);
@@ -706,6 +715,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 err_pcibios:
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
@@ -738,6 +748,7 @@ static void sriov_disable(struct pci_dev *dev)
 
 	sriov_del_vfs(dev);
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
-- 
2.22.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH V9 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices
  2021-09-22 13:36 ` [PATCH V9 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices Dongdong Liu
@ 2021-09-23  4:21   ` Krzysztof Wilczyński
  2021-09-23 11:06     ` Dongdong Liu
  0 siblings, 1 reply; 11+ messages in thread
From: Krzysztof Wilczyński @ 2021-09-23  4:21 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: helgaas, hch, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

Hi,

Thank you for sending the patch over!  A few small comments below.

[...]
> +static ssize_t pci_10bit_tag_store(struct device *dev,
> +				   struct device_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	bool enable;

Would you mind adding the following capabilities check here?

	if (!capable(CAP_SYS_ADMIN))
		return -EPERM;

This is so we make sure that whatever user is going to use this sysfs
attribute actually has enough permissions to update this value safely.

> +	if (kstrtobool(buf, &enable) < 0)
> +		return -EINVAL;
> +
> +	if (pdev->driver)
> +		return -EBUSY;
> +
> +	if (enable) {
> +		if (!pcie_rp_10bit_tag_cmp_supported(pdev))
> +			return -EPERM;

Would it make sense to also verify 10-Bit Tag Completer support on the
"disable" path too?   We won't be able to set a value if there is no
support, but nothing will stop us from clearing it regardless - unless
this would be safe to do?  What do you think?

> +		pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2,
> +				PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> +	} else {
> +		pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
> +				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> +	}
> +
> +	return count;
> +}

[...]
> +> +static umode_t pcie_dev_10bit_tag_attrs_are_visible(struct kobject *kobj,
> +					  struct attribute *a, int n)

The preferred function name for the .is_visible() callback in a case when
there is only a single sysfs attribute being added would be:

  pcie_dev_10bit_tag_attr_is_visible()

Albeit, I appreciate that you followed the existing naming pattern.

	Krzysztof

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH V9 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices
  2021-09-23  4:21   ` Krzysztof Wilczyński
@ 2021-09-23 11:06     ` Dongdong Liu
  0 siblings, 0 replies; 11+ messages in thread
From: Dongdong Liu @ 2021-09-23 11:06 UTC (permalink / raw)
  To: Krzysztof Wilczyński
  Cc: helgaas, hch, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

Hi Krzysztof

Many thanks for your review.
On 2021/9/23 12:21, Krzysztof Wilczyński wrote:
> Hi,
>
> Thank you for sending the patch over!  A few small comments below.
>
> [...]
>> +static ssize_t pci_10bit_tag_store(struct device *dev,
>> +				   struct device_attribute *attr,
>> +				   const char *buf, size_t count)
>> +{
>> +	struct pci_dev *pdev = to_pci_dev(dev);
>> +	bool enable;
>
> Would you mind adding the following capabilities check here?
OK, will do.
>
> 	if (!capable(CAP_SYS_ADMIN))
> 		return -EPERM;
>
> This is so we make sure that whatever user is going to use this sysfs
> attribute actually has enough permissions to update this value safely.
>
>> +	if (kstrtobool(buf, &enable) < 0)
>> +		return -EINVAL;
>> +
>> +	if (pdev->driver)
>> +		return -EBUSY;
>> +
>> +	if (enable) {
>> +		if (!pcie_rp_10bit_tag_cmp_supported(pdev))
>> +			return -EPERM;
>
> Would it make sense to also verify 10-Bit Tag Completer support on the
> "disable" path too?   We won't be able to set a value if there is no
> support, but nothing will stop us from clearing it regardless - unless
> this would be safe to do?  What do you think?
Seems make sense, Will do. It is better do the same thing on the
"disable" path too.
>
>> +		pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2,
>> +				PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>> +	} else {
>> +		pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
>> +				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>> +	}
>> +
>> +	return count;
>> +}
>
> [...]
>> +> +static umode_t pcie_dev_10bit_tag_attrs_are_visible(struct kobject *kobj,
>> +					  struct attribute *a, int n)
>
> The preferred function name for the .is_visible() callback in a case when
> there is only a single sysfs attribute being added would be:
>
>   pcie_dev_10bit_tag_attr_is_visible()
Will fix.

Thanks,
Dongdong
>
> Albeit, I appreciate that you followed the existing naming pattern.
>
> 	Krzysztof
> .
>

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-09-23 11:06 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-22 13:36 [PATCH V9 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
2021-09-22 13:36 ` [PATCH V9 1/8] PCI: Use cached devcap in more places Dongdong Liu
2021-09-22 13:36 ` [PATCH V9 2/8] PCI: Cache Device Capabilities 2 Register Dongdong Liu
2021-09-22 13:36 ` [PATCH V9 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
2021-09-22 13:36 ` [PATCH V9 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices Dongdong Liu
2021-09-23  4:21   ` Krzysztof Wilczyński
2021-09-23 11:06     ` Dongdong Liu
2021-09-22 13:36 ` [PATCH V9 5/8] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices Dongdong Liu
2021-09-22 13:36 ` [PATCH V9 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
2021-09-22 13:36 ` [PATCH V9 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device Dongdong Liu
2021-09-22 13:36 ` [PATCH V9 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices Dongdong Liu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.