linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH V8 0/8] PCI: Enable 10-Bit tag support for PCIe devices
@ 2021-08-18 13:01 Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 1/8] PCI: Use cached Device Capabilities Register Dongdong Liu
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Dongdong Liu @ 2021-08-18 13:01 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
field size from 8 bits to 10 bits.

This patchset is to enable 10-Bit tag for PCIe EP devices (include VF).

V7->V8:
- Add a kernel parameter pcie_tag_peer2peer to disable 10-bit tags.
- Provide sysfs file to enable 10-bit tags.
- Remove [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices.
- Rebased on v5.14-rc6.
- Fix some other comments. Thanks to Bjorn who gave a lot of review comments.

V6->V7:
- Rebased on v5.14-rc3.
- Change the "pci=disable_10bit_tag=" parameter to sysfs file to disable
  10-Bit Tag Requester when need for p2pdma suggested by Leon.
- Fix comment for p2pdma 10-bit tag check.

V5->V6:
- Rebased on v5.14-rc2.
- Add Reviewed-by: Christoph Hellwig <hch@lst.de> in [PATCH V6 2/8].
- PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support.
- Add a 10-bit tag check in P2PDMA.
- Simplified implementation in [PATCH V6 6/8].
- Fix some comments in [PATCH V6 4/8].

V4->V5:
- Fix warning variable 'capa' is uninitialized.
- Fix warning unused variable 'pchild'.

V3->V4:
- Get the value of pcie_devcap2 in set_pcie_port_type().
- Add Reviewed-by: Christoph Hellwig <hch@lst.de> in [PATCH V4 1/6],
  [PATCH V4 3/6], [PATCH V4 4/6], [PATCH V4 5/6].
- Fix some code style.
- Rebased on v5.13-rc6.

V2->V3:
- Use cached Device Capabilities Register suggested by Christoph.
- Fix code style to avoid > 80 char lines.
- Rename devcap2 to pcie_devcap2.

V1->V2: Fix some comments by Christoph.
- Store the devcap2 value in the pci_dev instead of reading it multiple
  times.
- Change pci_info to pci_dbg to avoid the noisy log.
- Rename ext_10bit_tag_comp_path to ext_10bit_tag.
- Fix the compile error.
- Rebased on v5.13-rc1.

Dongdong Liu (8):
  PCI: Use cached Device Capabilities Register
  PCI: Use cached Device Capabilities 2 Register
  PCI: Add 10-Bit Tag register definitions
  PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices
  PCI/IOV: Add 10-Bit Tag sysfs files for VF devices
  PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices

 Documentation/ABI/testing/sysfs-bus-pci         | 41 ++++++++++++-
 Documentation/admin-guide/kernel-parameters.txt |  5 ++
 drivers/media/pci/cobalt/cobalt-driver.c        |  5 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |  4 +-
 drivers/pci/iov.c                               | 68 +++++++++++++++++++++
 drivers/pci/p2pdma.c                            | 48 +++++++++++++++
 drivers/pci/pci-sysfs.c                         | 79 +++++++++++++++++++++++++
 drivers/pci/pci.c                               | 18 +++---
 drivers/pci/pci.h                               | 10 ++++
 drivers/pci/pcie/aspm.c                         | 11 ++--
 drivers/pci/probe.c                             | 79 ++++++++++++++++++++-----
 drivers/pci/quirks.c                            |  3 +-
 include/linux/pci.h                             |  3 +
 include/uapi/linux/pci_regs.h                   |  5 ++
 14 files changed, 341 insertions(+), 38 deletions(-)

--
2.7.4


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH V8 1/8] PCI: Use cached Device Capabilities Register
  2021-08-18 13:01 [PATCH V8 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
@ 2021-08-18 13:01 ` Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 2/8] PCI: Use cached Device Capabilities 2 Register Dongdong Liu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Dongdong Liu @ 2021-08-18 13:01 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

It will make sense to store the pcie_devcap value in the pci_dev
structure instead of reading Device Capabilities Register multiple
times. The fisrt place to use pcie_devcap is in set_pcie_port_type(),
get the pcie_devcap value here, then use cached pcie_devcap in the
needed place.

Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/media/pci/cobalt/cobalt-driver.c |  5 +++--
 drivers/pci/pci.c                        |  5 +----
 drivers/pci/pcie/aspm.c                  | 11 ++++-------
 drivers/pci/probe.c                      | 11 +++--------
 drivers/pci/quirks.c                     |  3 +--
 include/linux/pci.h                      |  1 +
 6 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/drivers/media/pci/cobalt/cobalt-driver.c b/drivers/media/pci/cobalt/cobalt-driver.c
index 16af58f..23c6436 100644
--- a/drivers/media/pci/cobalt/cobalt-driver.c
+++ b/drivers/media/pci/cobalt/cobalt-driver.c
@@ -193,11 +193,12 @@ void cobalt_pcie_status_show(struct cobalt *cobalt)
 		return;
 
 	/* Device */
-	pcie_capability_read_dword(pci_dev, PCI_EXP_DEVCAP, &capa);
 	pcie_capability_read_word(pci_dev, PCI_EXP_DEVCTL, &ctrl);
 	pcie_capability_read_word(pci_dev, PCI_EXP_DEVSTA, &stat);
 	cobalt_info("PCIe device capability 0x%08x: Max payload %d\n",
-		    capa, get_payload_size(capa & PCI_EXP_DEVCAP_PAYLOAD));
+		    pci_dev->pcie_devcap,
+		    get_payload_size(pci_dev->pcie_devcap &
+				     PCI_EXP_DEVCAP_PAYLOAD));
 	cobalt_info("PCIe device control 0x%04x: Max payload %d. Max read request %d\n",
 		    ctrl,
 		    get_payload_size((ctrl & PCI_EXP_DEVCTL_PAYLOAD) >> 5),
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index aacf575..dc3bfb2 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4630,13 +4630,10 @@ EXPORT_SYMBOL(pci_wait_for_pending_transaction);
  */
 bool pcie_has_flr(struct pci_dev *dev)
 {
-	u32 cap;
-
 	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
 		return false;
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
-	return cap & PCI_EXP_DEVCAP_FLR;
+	return dev->pcie_devcap & PCI_EXP_DEVCAP_FLR;
 }
 EXPORT_SYMBOL_GPL(pcie_has_flr);
 
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 013a47f..db944f6 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -660,7 +660,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 
 	/* Get and check endpoint acceptable latencies */
 	list_for_each_entry(child, &linkbus->devices, bus_list) {
-		u32 reg32, encoding;
+		u32 encoding;
 		struct aspm_latency *acceptable =
 			&link->acceptable[PCI_FUNC(child->devfn)];
 
@@ -668,12 +668,11 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 		    pci_pcie_type(child) != PCI_EXP_TYPE_LEG_END)
 			continue;
 
-		pcie_capability_read_dword(child, PCI_EXP_DEVCAP, &reg32);
 		/* Calculate endpoint L0s acceptable latency */
-		encoding = (reg32 & PCI_EXP_DEVCAP_L0S) >> 6;
+		encoding = (child->pcie_devcap & PCI_EXP_DEVCAP_L0S) >> 6;
 		acceptable->l0s = calc_l0s_acceptable(encoding);
 		/* Calculate endpoint L1 acceptable latency */
-		encoding = (reg32 & PCI_EXP_DEVCAP_L1) >> 9;
+		encoding = (child->pcie_devcap & PCI_EXP_DEVCAP_L1) >> 9;
 		acceptable->l1 = calc_l1_acceptable(encoding);
 
 		pcie_aspm_check_latency(child);
@@ -808,7 +807,6 @@ static void free_link_state(struct pcie_link_state *link)
 static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 {
 	struct pci_dev *child;
-	u32 reg32;
 
 	/*
 	 * Some functions in a slot might not all be PCIe functions,
@@ -831,8 +829,7 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
 		 * RBER bit to determine if a function is 1.1 version device
 		 */
-		pcie_capability_read_dword(child, PCI_EXP_DEVCAP, &reg32);
-		if (!(reg32 & PCI_EXP_DEVCAP_RBER) && !aspm_force) {
+		if (!(child->pcie_devcap & PCI_EXP_DEVCAP_RBER) && !aspm_force) {
 			pci_info(child, "disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'\n");
 			return -EINVAL;
 		}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 79177ac..cc700f6 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1498,8 +1498,8 @@ void set_pcie_port_type(struct pci_dev *pdev)
 	pdev->pcie_cap = pos;
 	pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
 	pdev->pcie_flags_reg = reg16;
-	pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, &reg16);
-	pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD;
+	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->pcie_devcap);
+	pdev->pcie_mpss = pdev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD;
 
 	parent = pci_upstream_bridge(pdev);
 	if (!parent)
@@ -2031,18 +2031,13 @@ static void pci_configure_mps(struct pci_dev *dev)
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-	u32 cap;
 	u16 ctl;
 	int ret;
 
 	if (!pci_is_pcie(dev))
 		return 0;
 
-	ret = pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
-	if (ret)
-		return 0;
-
-	if (!(cap & PCI_EXP_DEVCAP_EXT_TAG))
+	if (!(dev->pcie_devcap & PCI_EXP_DEVCAP_EXT_TAG))
 		return 0;
 
 	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL, &ctl);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6d74386..2b405c5 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5173,8 +5173,7 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
 		pdev->pcie_cap = pos;
 		pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
 		pdev->pcie_flags_reg = reg16;
-		pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, &reg16);
-		pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD;
+		pdev->pcie_mpss = pdev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD;
 
 		pdev->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
 		if (pci_read_config_dword(pdev, PCI_CFG_SPACE_SIZE, &status) !=
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 540b377..aee7c85 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -340,6 +340,7 @@ struct pci_dev {
 	u8		rom_base_reg;	/* Config register controlling ROM */
 	u8		pin;		/* Interrupt pin this device uses */
 	u16		pcie_flags_reg;	/* Cached PCIe Capabilities Register */
+	u32		pcie_devcap;	/* Cached Device Capabilities Register */
 	unsigned long	*dma_alias_mask;/* Mask of enabled devfn aliases */
 
 	struct pci_driver *driver;	/* Driver bound to this device */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH V8 2/8] PCI: Use cached Device Capabilities 2 Register
  2021-08-18 13:01 [PATCH V8 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 1/8] PCI: Use cached Device Capabilities Register Dongdong Liu
@ 2021-08-18 13:01 ` Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Dongdong Liu @ 2021-08-18 13:01 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

It will make sense to store the pcie_devcap2 value in the pci_dev
structure instead of reading Device Capabilities 2 Register multiple
times. Get the pcie_devcap2 value set_pcie_port_type(), then use
cached pcie_devcap2 in the needed place.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |  4 +---
 drivers/pci/pci.c                               |  9 ++++-----
 drivers/pci/probe.c                             | 10 ++++------
 include/linux/pci.h                             |  2 ++
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index dbf9a0e..a8e1e22 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -6304,7 +6304,6 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs)
 		struct pci_dev *pbridge;
 		struct port_info *pi;
 		char name[IFNAMSIZ];
-		u32 devcap2;
 		u16 flags;
 
 		/* If we want to instantiate Virtual Functions, then our
@@ -6314,10 +6313,9 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs)
 		 */
 		pbridge = pdev->bus->self;
 		pcie_capability_read_word(pbridge, PCI_EXP_FLAGS, &flags);
-		pcie_capability_read_dword(pbridge, PCI_EXP_DEVCAP2, &devcap2);
 
 		if ((flags & PCI_EXP_FLAGS_VERS) < 2 ||
-		    !(devcap2 & PCI_EXP_DEVCAP2_ARI)) {
+		    !(pbridge->pcie_devcap2 & PCI_EXP_DEVCAP2_ARI)) {
 			/* Our parent bridge does not support ARI so issue a
 			 * warning and skip instantiating the VFs.  They
 			 * won't be reachable.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index dc3bfb2..d14c573 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3700,7 +3700,7 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
 {
 	struct pci_bus *bus = dev->bus;
 	struct pci_dev *bridge;
-	u32 cap, ctl2;
+	u32 ctl2;
 
 	if (!pci_is_pcie(dev))
 		return -EINVAL;
@@ -3724,19 +3724,18 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
 	while (bus->parent) {
 		bridge = bus->self;
 
-		pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
-
 		switch (pci_pcie_type(bridge)) {
 		/* Ensure switch ports support AtomicOp routing */
 		case PCI_EXP_TYPE_UPSTREAM:
 		case PCI_EXP_TYPE_DOWNSTREAM:
-			if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
+			if (!(bridge->pcie_devcap2 &
+			      PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
 				return -EINVAL;
 			break;
 
 		/* Ensure root port supports all the sizes we care about */
 		case PCI_EXP_TYPE_ROOT_PORT:
-			if ((cap & cap_mask) != cap_mask)
+			if ((bridge->pcie_devcap2 & cap_mask) != cap_mask)
 				return -EINVAL;
 			break;
 		}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index cc700f6..c83245b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1500,6 +1500,7 @@ void set_pcie_port_type(struct pci_dev *pdev)
 	pdev->pcie_flags_reg = reg16;
 	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->pcie_devcap);
 	pdev->pcie_mpss = pdev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD;
+	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP2, &pdev->pcie_devcap2);
 
 	parent = pci_upstream_bridge(pdev);
 	if (!parent)
@@ -2116,7 +2117,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 #ifdef CONFIG_PCIEASPM
 	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
 	struct pci_dev *bridge;
-	u32 cap, ctl;
+	u32 ctl;
 
 	if (!pci_is_pcie(dev))
 		return;
@@ -2124,8 +2125,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 	/* Read L1 PM substate capabilities */
 	dev->l1ss = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_L1SS);
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
-	if (!(cap & PCI_EXP_DEVCAP2_LTR))
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_LTR))
 		return;
 
 	pcie_capability_read_dword(dev, PCI_EXP_DEVCTL2, &ctl);
@@ -2165,13 +2165,11 @@ static void pci_configure_eetlp_prefix(struct pci_dev *dev)
 #ifdef CONFIG_PCI_PASID
 	struct pci_dev *bridge;
 	int pcie_type;
-	u32 cap;
 
 	if (!pci_is_pcie(dev))
 		return;
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
-	if (!(cap & PCI_EXP_DEVCAP2_EE_PREFIX))
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_EE_PREFIX))
 		return;
 
 	pcie_type = pci_pcie_type(dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index aee7c85..9aab67f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -341,6 +341,8 @@ struct pci_dev {
 	u8		pin;		/* Interrupt pin this device uses */
 	u16		pcie_flags_reg;	/* Cached PCIe Capabilities Register */
 	u32		pcie_devcap;	/* Cached Device Capabilities Register */
+	u32		pcie_devcap2;	/* Cached Device Capabilities 2
+					   Register */
 	unsigned long	*dma_alias_mask;/* Mask of enabled devfn aliases */
 
 	struct pci_driver *driver;	/* Driver bound to this device */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH V8 3/8] PCI: Add 10-Bit Tag register definitions
  2021-08-18 13:01 [PATCH V8 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 1/8] PCI: Use cached Device Capabilities Register Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 2/8] PCI: Use cached Device Capabilities 2 Register Dongdong Liu
@ 2021-08-18 13:01 ` Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices Dongdong Liu
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Dongdong Liu @ 2021-08-18 13:01 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add 10-Bit Tag register definitions for use in subsequen patches.
See the PCIe 5.0 spec section 7.5.3.15 and 9.3.3.2.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 include/uapi/linux/pci_regs.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index e709ae8..cf1ddb8 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -648,6 +648,8 @@
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* 64b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200 /* 128b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_LTR		0x00000800 /* Latency tolerance reporting */
+#define  PCI_EXP_DEVCAP2_10BIT_TAG_COMP 0x00010000 /* 10-Bit Tag Completer Supported */
+#define  PCI_EXP_DEVCAP2_10BIT_TAG_REQ  0x00020000 /* 10-Bit Tag Requester Supported */
 #define  PCI_EXP_DEVCAP2_OBFF_MASK	0x000c0000 /* OBFF support mechanism */
 #define  PCI_EXP_DEVCAP2_OBFF_MSG	0x00040000 /* New message signaling */
 #define  PCI_EXP_DEVCAP2_OBFF_WAKE	0x00080000 /* Re-use WAKE# for OBFF */
@@ -661,6 +663,7 @@
 #define  PCI_EXP_DEVCTL2_IDO_REQ_EN	0x0100	/* Allow IDO for requests */
 #define  PCI_EXP_DEVCTL2_IDO_CMP_EN	0x0200	/* Allow IDO for completions */
 #define  PCI_EXP_DEVCTL2_LTR_EN		0x0400	/* Enable LTR mechanism */
+#define  PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN 0x1000 /* 10-Bit Tag Requester Enable */
 #define  PCI_EXP_DEVCTL2_OBFF_MSGA_EN	0x2000	/* Enable OBFF Message type A */
 #define  PCI_EXP_DEVCTL2_OBFF_MSGB_EN	0x4000	/* Enable OBFF Message type B */
 #define  PCI_EXP_DEVCTL2_OBFF_WAKE_EN	0x6000	/* OBFF using WAKE# signaling */
@@ -931,6 +934,7 @@
 /* Single Root I/O Virtualization */
 #define PCI_SRIOV_CAP		0x04	/* SR-IOV Capabilities */
 #define  PCI_SRIOV_CAP_VFM	0x00000001  /* VF Migration Capable */
+#define  PCI_SRIOV_CAP_VF_10BIT_TAG_REQ	0x00000004 /* VF 10-Bit Tag Requester Supported */
 #define  PCI_SRIOV_CAP_INTR(x)	((x) >> 21) /* Interrupt Message Number */
 #define PCI_SRIOV_CTRL		0x08	/* SR-IOV Control */
 #define  PCI_SRIOV_CTRL_VFE	0x0001	/* VF Enable */
@@ -938,6 +942,7 @@
 #define  PCI_SRIOV_CTRL_INTR	0x0004	/* VF Migration Interrupt Enable */
 #define  PCI_SRIOV_CTRL_MSE	0x0008	/* VF Memory Space Enable */
 #define  PCI_SRIOV_CTRL_ARI	0x0010	/* ARI Capable Hierarchy */
+#define  PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN 0x0020 /* VF 10-Bit Tag Requester Enable */
 #define PCI_SRIOV_STATUS	0x0a	/* SR-IOV Status */
 #define  PCI_SRIOV_STATUS_VFM	0x0001	/* VF Migration Status */
 #define PCI_SRIOV_INITIAL_VF	0x0c	/* Initial VFs */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH V8 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices
  2021-08-18 13:01 [PATCH V8 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (2 preceding siblings ...)
  2021-08-18 13:01 ` [PATCH V8 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
@ 2021-08-18 13:01 ` Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 5/8] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices Dongdong Liu
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Dongdong Liu @ 2021-08-18 13:01 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as
  opposed to host memory), the Endpoint must not send 10-Bit Tag
  Requests to another given Endpoint unless an implementation-specific
  mechanism determines that the Endpoint supports 10-Bit Tag Completer
  capability.

Add a 10bit_tag sysfs file, write 0 to disable 10-Bit Tag Requester
when the driver does not bind the deivce. The typical use case is for
p2pdma when the peer device does not support 10-Bit Tag Completer.
Write 1 to enable 10-Bit Tag Requester when RC supports 10-Bit Tag
Completer capability. The typical use case is for host memory targeted
by DMA Requests. The 10bit_tag file content indicate current status of
10-Bit Tag Requester Enable.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 18 +++++++-
 drivers/pci/pci-sysfs.c                 | 76 +++++++++++++++++++++++++++++++++
 drivers/pci/pci.h                       |  2 +
 drivers/pci/probe.c                     | 14 ++++++
 4 files changed, 109 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 793cbb7..759682f 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -139,7 +139,7 @@ Description:
 		binary file containing the Vital Product Data for the
 		device.  It should follow the VPD format defined in
 		PCI Specification 2.1 or 2.2, but users should consider
-		that some devices may have incorrectly formatted data.  
+		that some devices may have incorrectly formatted data.
 		If the underlying VPD has a writable section then the
 		corresponding section of this file will be writable.
 
@@ -407,3 +407,19 @@ Description:
 
 		The file is writable if the PF is bound to a driver that
 		implements ->sriov_set_msix_vec_count().
+
+What:		/sys/bus/pci/devices/.../10bit_tag
+Date:		August 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		The file will be visible when the device supports 10-Bit Tag
+		Requester. The file is readable, the value indicate current
+		status of 10-Bit Tag Requester Enable.
+		1 - enabled, 0 - disabled.
+
+		The file is also writable, write 0 to disable 10-Bit Tag
+		Requester when the driver does not bind the device. The typical
+		use case is for p2pdma when the peer device does not support
+		10-Bit Tag Completer. Write 1 to enable 10-Bit Tag Requester
+		when RC supports 10-Bit Tag Completer capability. The typical
+		use case is for host memory targeted by DMA Requests.
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 5d63df7..52d0757 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -306,6 +306,53 @@ static ssize_t enable_show(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RW(enable);
 
+static ssize_t pci_10bit_tag_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	bool enable;
+
+	if (kstrtobool(buf, &enable) < 0)
+		return -EINVAL;
+
+	if (pdev->driver)
+		return -EBUSY;
+
+	if (enable) {
+		if (!pcie_rp_10bit_tag_cmp_supported(pdev))
+			return -EPERM;
+
+		pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2,
+				PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	} else {
+		pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
+				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	}
+
+	return count;
+}
+
+static ssize_t pci_10bit_tag_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u16 ctl;
+	int ret;
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &ctl);
+	if (ret)
+		return -EINVAL;
+
+	return sysfs_emit(buf, "%u\n",
+			  !!(ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN));
+}
+
+static struct device_attribute dev_attr_10bit_tag = __ATTR(10bit_tag, 0644,
+							   pci_10bit_tag_show,
+							   pci_10bit_tag_store);
+
 #ifdef CONFIG_NUMA
 static ssize_t numa_node_store(struct device *dev,
 			       struct device_attribute *attr, const char *buf,
@@ -635,6 +682,11 @@ static struct attribute *pcie_dev_attrs[] = {
 	NULL,
 };
 
+static struct attribute *pcie_dev_10bit_tag_attrs[] = {
+	&dev_attr_10bit_tag.attr,
+	NULL,
+};
+
 static struct attribute *pcibus_attrs[] = {
 	&dev_attr_bus_rescan.attr,
 	&dev_attr_cpuaffinity.attr,
@@ -1482,6 +1534,24 @@ static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
 	return 0;
 }
 
+static umode_t pcie_dev_10bit_tag_attrs_are_visible(struct kobject *kobj,
+					  struct attribute *a, int n)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (pdev->is_virtfn)
+		return 0;
+
+	if (pci_pcie_type(pdev) != PCI_EXP_TYPE_ENDPOINT)
+		return 0;
+
+	if (!(pdev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
+		return 0;
+
+	return a->mode;
+}
+
 static const struct attribute_group pci_dev_group = {
 	.attrs = pci_dev_attrs,
 };
@@ -1521,6 +1591,11 @@ static const struct attribute_group pcie_dev_attr_group = {
 	.is_visible = pcie_dev_attrs_are_visible,
 };
 
+static const struct attribute_group pcie_dev_10bit_tag_attr_group = {
+	.attrs = pcie_dev_10bit_tag_attrs,
+	.is_visible = pcie_dev_10bit_tag_attrs_are_visible,
+};
+
 static const struct attribute_group *pci_dev_attr_groups[] = {
 	&pci_dev_attr_group,
 	&pci_dev_hp_attr_group,
@@ -1530,6 +1605,7 @@ static const struct attribute_group *pci_dev_attr_groups[] = {
 #endif
 	&pci_bridge_attr_group,
 	&pcie_dev_attr_group,
+	&pcie_dev_10bit_tag_attr_group,
 #ifdef CONFIG_PCIEAER
 	&aer_stats_attr_group,
 #endif
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 93dcdd4..dca5aaf 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -244,6 +244,8 @@ struct device *pci_get_host_bridge_device(struct pci_dev *dev);
 void pci_put_host_bridge_device(struct device *dev);
 
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign);
+bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev);
+
 bool pci_bus_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
 				int crs_timeout);
 bool pci_bus_generic_read_dev_vendor_id(struct pci_bus *bus, int devfn, u32 *pl,
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c83245b..f9bc562 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2029,6 +2029,20 @@ static void pci_configure_mps(struct pci_dev *dev)
 		 p_mps, mps, mpss);
 }
 
+bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev)
+{
+	struct pci_dev *root;
+
+	root = pcie_find_root_port(dev);
+	if (!root)
+		return false;
+
+	if (!(root->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
+		return false;
+
+	return true;
+}
+
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH V8 5/8] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices
  2021-08-18 13:01 [PATCH V8 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (3 preceding siblings ...)
  2021-08-18 13:01 ` [PATCH V8 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices Dongdong Liu
@ 2021-08-18 13:01 ` Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Dongdong Liu @ 2021-08-18 13:01 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as
  opposed to host memory), the Endpoint must not send 10-Bit Tag
  Requests to another given Endpoint unless an implementation-specific
  mechanism determines that the Endpoint supports 10-Bit Tag Completer
  capability.

Add sriov_vf_10bit_tag file to query the status of VF 10-Bit Tag
Requester Enable.

Add a sriov_vf_10bit_tag_ctl sysfs file, write 0 to disable the VF
10-Bit Tag Requester. The typical use case is for p2pdma when the peer
device does not support 10-Bit Tag Completer. Write 1 to enable 10-Bit
Tag Requester when RC supports 10-Bit Tag Completer capability. The
typical use case is for host memory targeted by DMA Requests.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 23 ++++++++++++++
 drivers/pci/iov.c                       | 54 +++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 759682f..692bf65 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -423,3 +423,26 @@ Description:
 		10-Bit Tag Completer. Write 1 to enable 10-Bit Tag Requester
 		when RC supports 10-Bit Tag Completer capability. The typical
 		use case is for host memory targeted by DMA Requests.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag
+Date:		August 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		This file is associated with a SR-IOV physical function (PF).
+		It is visible when the device supports VF 10-Bit Tag Requester.
+		It contains the status of VF 10-Bit Tag Requester Enable.
+		The file is read-only.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl
+Date:		August 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		This file is associated with a SR-IOV virtual function (VF).
+		It is visible when the device supports VF 10-Bit Tag
+		Requester. The file is only writeable when the VF driver
+		does not bind to a device. Write 0 to any VF's file disables
+		10-Bit Tag Requester for all VFs. The typical use case is for
+		p2pdma when the peer device does not support 10-Bit Tag
+		Completer. Write 1 to enable 10-Bit Tag Requester for all VFs
+		when RC supports 10-Bit Tag Completer capability. The typical
+		use case is for host memory targeted by DMA Requests.
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index dafdc65..bf15b33 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -220,10 +220,42 @@ static ssize_t sriov_vf_msix_count_store(struct device *dev,
 static DEVICE_ATTR_WO(sriov_vf_msix_count);
 #endif
 
+
+static ssize_t sriov_vf_10bit_tag_ctl_store(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t count)
+{
+	struct pci_dev *vf_dev = to_pci_dev(dev);
+	struct pci_dev *pdev = pci_physfn(vf_dev);
+	bool enable;
+
+	if (kstrtobool(buf, &enable) < 0)
+		return -EINVAL;
+
+	if (vf_dev->driver)
+		return -EBUSY;
+
+	if (enable) {
+		if (!pcie_rp_10bit_tag_cmp_supported(pdev))
+			return -EPERM;
+
+		pdev->sriov->ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+	} else {
+		pdev->sriov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+	}
+
+	pci_write_config_word(pdev, pdev->sriov->pos + PCI_SRIOV_CTRL,
+				      pdev->sriov->ctrl);
+
+	return count;
+}
+static DEVICE_ATTR_WO(sriov_vf_10bit_tag_ctl);
+
 static struct attribute *sriov_vf_dev_attrs[] = {
 #ifdef CONFIG_PCI_MSI
 	&dev_attr_sriov_vf_msix_count.attr,
 #endif
+	&dev_attr_sriov_vf_10bit_tag_ctl.attr,
 	NULL,
 };
 
@@ -236,6 +268,11 @@ static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
 	if (!pdev->is_virtfn)
 		return 0;
 
+	pdev = pci_physfn(pdev);
+	if ((a == &dev_attr_sriov_vf_10bit_tag_ctl.attr) &&
+	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
+		return 0;
+
 	return a->mode;
 }
 
@@ -487,12 +524,23 @@ static ssize_t sriov_drivers_autoprobe_store(struct device *dev,
 	return count;
 }
 
+static ssize_t sriov_vf_10bit_tag_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	return sysfs_emit(buf, "%u\n",
+		!!(pdev->sriov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN));
+}
+
 static DEVICE_ATTR_RO(sriov_totalvfs);
 static DEVICE_ATTR_RW(sriov_numvfs);
 static DEVICE_ATTR_RO(sriov_offset);
 static DEVICE_ATTR_RO(sriov_stride);
 static DEVICE_ATTR_RO(sriov_vf_device);
 static DEVICE_ATTR_RW(sriov_drivers_autoprobe);
+static DEVICE_ATTR_RO(sriov_vf_10bit_tag);
 
 static struct attribute *sriov_pf_dev_attrs[] = {
 	&dev_attr_sriov_totalvfs.attr,
@@ -501,6 +549,7 @@ static struct attribute *sriov_pf_dev_attrs[] = {
 	&dev_attr_sriov_stride.attr,
 	&dev_attr_sriov_vf_device.attr,
 	&dev_attr_sriov_drivers_autoprobe.attr,
+	&dev_attr_sriov_vf_10bit_tag.attr,
 #ifdef CONFIG_PCI_MSI
 	&dev_attr_sriov_vf_total_msix.attr,
 #endif
@@ -511,10 +560,15 @@ static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
 					  struct attribute *a, int n)
 {
 	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
 
 	if (!dev_is_pf(dev))
 		return 0;
 
+	if ((a == &dev_attr_sriov_vf_10bit_tag.attr) &&
+	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
+		return 0;
+
 	return a->mode;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH V8 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-08-18 13:01 [PATCH V8 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (4 preceding siblings ...)
  2021-08-18 13:01 ` [PATCH V8 5/8] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices Dongdong Liu
@ 2021-08-18 13:01 ` Dongdong Liu
  2021-08-18 16:07   ` Logan Gunthorpe
  2021-08-18 13:01 ` [PATCH V8 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices Dongdong Liu
  7 siblings, 1 reply; 10+ messages in thread
From: Dongdong Liu @ 2021-08-18 13:01 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
10-Bit Tag Requester doesn't interact with a device that does not
support 10-Bit Tag Completer. Before that happens, the kernel should
emit a warning.

"echo 0 > /sys/bus/pci/devices/.../10bit_tag" to disable 10-Bit Tag
Requester for PF device.

"echo 0 > /sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl" to disable
10-Bit Tag Requester for VF device.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 drivers/pci/p2pdma.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 50cdde3..2b9c2c9 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -19,6 +19,7 @@
 #include <linux/random.h>
 #include <linux/seq_buf.h>
 #include <linux/xarray.h>
+#include "pci.h"
 
 enum pci_p2pdma_map_type {
 	PCI_P2PDMA_MAP_UNKNOWN = 0,
@@ -410,6 +411,50 @@ static unsigned long map_types_idx(struct pci_dev *client)
 		(client->bus->number << 8) | client->devfn;
 }
 
+static bool pci_10bit_tags_unsupported(struct pci_dev *a,
+				       struct pci_dev *b,
+				       bool verbose)
+{
+	bool req;
+	bool comp;
+	u16 ctl;
+	const char *str = "10bit_tag";
+
+	if (a->is_virtfn) {
+#ifdef CONFIG_PCI_IOV
+		req = !!(a->physfn->sriov->ctrl &
+			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
+#endif
+	} else {
+		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl);
+		req = !!(ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	}
+
+	comp = !!(b->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
+
+	/* 10-bit tags not enabled on requester */
+	if (!req)
+		return false;
+
+	 /* Completer can handle anything */
+	if (comp)
+		return false;
+
+	if (!verbose)
+		return true;
+
+	pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set for this device, but peer device (%s) does not support the 10-Bit Tag Completer\n",
+		 pci_name(b));
+
+	if (a->is_virtfn)
+		str = "sriov_vf_10bit_tag_ctl";
+
+	pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/%s\n",
+		 pci_name(a), str);
+
+	return true;
+}
+
 /*
  * Calculate the P2PDMA mapping type and distance between two PCI devices.
  *
@@ -532,6 +577,9 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
 		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
 	}
 done:
+	if (pci_10bit_tags_unsupported(client, provider, verbose))
+		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
+
 	rcu_read_lock();
 	p2pdma = rcu_dereference(provider->p2pdma);
 	if (p2pdma)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH V8 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device
  2021-08-18 13:01 [PATCH V8 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (5 preceding siblings ...)
  2021-08-18 13:01 ` [PATCH V8 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
@ 2021-08-18 13:01 ` Dongdong Liu
  2021-08-18 13:01 ` [PATCH V8 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices Dongdong Liu
  7 siblings, 0 replies; 10+ messages in thread
From: Dongdong Liu @ 2021-08-18 13:01 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
field size from 8 bits to 10 bits.

PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
10-Bit Tag Capabilities" Implementation Note:

  For platforms where the RC supports 10-Bit Tag Completer capability,
  it is highly recommended for platform firmware or operating software
  that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
  bit automatically in Endpoints with 10-Bit Tag Requester capability.
  This enables the important class of 10-Bit Tag capable adapters that
  send Memory Read Requests only to host memory.

It's safe to enable 10-bit tags for all devices below a Root Port that
supports them. Switches that lack 10-Bit Tag Completer capability are
still able to forward NPRs and Completions carrying 10-Bit Tags correctly,
since the two new Tag bits are in TLP Header bits that were formerly
Reserved.

PCIe spec 5.0 r1.0 section 2.2.6.2 says:

  If an Endpoint supports sending Requests to other Endpoints (as opposed
  to host memory), the Endpoint must not send 10-Bit Tag Requests to
  another given Endpoint unless an implementation-specific mechanism
  determines that the Endpoint supports 10-Bit Tag Completer capability.

It is not safe for P2P traffic if an Endpoint send 10-Bit Tag Requesters
to another Endpoint that does not support 10-Bit Tag Completer capability,
so we provide sysfs file to disable 10-Bit Tag Requester. Unbind the
device driver, set the sysfs file and then rebind the driver.

Add a kernel parameter pcie_tag_peer2peer that disables 10-Bit Tag
Requester for all PCIe devices. This configuration allows peer-to-peer
DMA between any pair of devices, possibly at the cost of reduced
performance.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  5 +++
 drivers/pci/iov.c                               |  3 ++
 drivers/pci/pci-sysfs.c                         |  3 ++
 drivers/pci/pci.c                               |  4 +++
 drivers/pci/pci.h                               |  8 +++++
 drivers/pci/probe.c                             | 46 ++++++++++++++++++++++++-
 6 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index bdb2200..ebcda18 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3948,6 +3948,11 @@
 				any pair of devices, possibly at the cost of
 				reduced performance.  This also guarantees
 				that hot-added devices will work.
+		pcie_tag_peer2peer	Disable 10-Bit Tag Requester for all
+				PCIe devices. This configuration allows
+				peer-to-peer DMA between any pair of devices,
+				possibly at the cost of reduced performance.
+
 		cbiosize=nn[KMG]	The fixed amount of bus space which is
 				reserved for the CardBus bridge's IO window.
 				The default value is 256 bytes.
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index bf15b33..757e77d 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -235,6 +235,9 @@ static ssize_t sriov_vf_10bit_tag_ctl_store(struct device *dev,
 	if (vf_dev->driver)
 		return -EBUSY;
 
+	if (pcie_tag_config == PCIE_TAG_PEER2PEER)
+		return -EPERM;
+
 	if (enable) {
 		if (!pcie_rp_10bit_tag_cmp_supported(pdev))
 			return -EPERM;
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 52d0757..5bc1970 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -319,6 +319,9 @@ static ssize_t pci_10bit_tag_store(struct device *dev,
 	if (pdev->driver)
 		return -EBUSY;
 
+	if (pcie_tag_config == PCIE_TAG_PEER2PEER)
+		return -EPERM;
+
 	if (enable) {
 		if (!pcie_rp_10bit_tag_cmp_supported(pdev))
 			return -EPERM;
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d14c573..e6c6bce 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -112,6 +112,8 @@ enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_PEER2PEER;
 enum pcie_bus_config_types pcie_bus_config = PCIE_BUS_DEFAULT;
 #endif
 
+enum pcie_tag_config_types pcie_tag_config = PCIE_TAG_DEFAULT;
+
 /*
  * The default CLS is used if arch didn't set CLS explicitly and not
  * all pci devices agree on the same value.  Arch can override either
@@ -6643,6 +6645,8 @@ static int __init pci_setup(char *str)
 				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
 			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
 				disable_acs_redir_param = str + 18;
+			} else if (!strncmp(str, "pcie_tag_peer2peer", 18)) {
+				pcie_tag_config = PCIE_TAG_PEER2PEER;
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index dca5aaf..e9b5875 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -37,6 +37,14 @@ int pci_probe_reset_function(struct pci_dev *dev);
 int pci_bridge_secondary_bus_reset(struct pci_dev *dev);
 int pci_bus_error_reset(struct pci_dev *dev);
 
+enum pcie_tag_config_types {
+	PCIE_TAG_DEFAULT,   /* Enable 10-Bit Tag Requester for devices below
+			       Root Port that support 10-Bit Tag Completer. */
+	PCIE_TAG_PEER2PEER  /* Disable 10-Bit Tag Requester for all devices. */
+};
+
+extern enum pcie_tag_config_types pcie_tag_config;
+
 #define PCI_PM_D2_DELAY         200	/* usec; see PCIe r4.0, sec 5.9.1 */
 #define PCI_PM_D3HOT_WAIT       10	/* msec */
 #define PCI_PM_D3COLD_WAIT      100	/* msec */
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index f9bc562..49f16d5 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2043,10 +2043,30 @@ bool pcie_rp_10bit_tag_cmp_supported(struct pci_dev *dev)
 	return true;
 }
 
+static void pci_configure_10bit_tags(struct pci_dev *dev)
+{
+	/*
+	 * PCIe 5.0 section 9.3.5.10 10-Bit Tag Requester Enable in Device
+	 * Control 2 Register is RsvdP for VF.
+	 */
+	if (dev->is_virtfn)
+		return;
+
+	if (!pcie_rp_10bit_tag_cmp_supported(dev))
+		return;
+
+	if (pci_pcie_type(dev) != PCI_EXP_TYPE_ENDPOINT)
+		return;
+
+	pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
+	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+				PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+}
+
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-	u16 ctl;
+	u16 ctl, ctl2;
 	int ret;
 
 	if (!pci_is_pcie(dev))
@@ -2059,6 +2079,10 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 	if (ret)
 		return 0;
 
+	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL2, &ctl2);
+	if (ret)
+		return 0;
+
 	host = pci_find_host_bridge(dev->bus);
 	if (!host)
 		return 0;
@@ -2073,6 +2097,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
 						   PCI_EXP_DEVCTL_EXT_TAG);
 		}
+
+		if (ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN) {
+			pci_info(dev, "disabling 10-Bit Tags\n");
+			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
+					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+		}
 		return 0;
 	}
 
@@ -2081,6 +2111,20 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
 					 PCI_EXP_DEVCTL_EXT_TAG);
 	}
+
+	if ((pcie_tag_config == PCIE_TAG_PEER2PEER) &&
+	    (ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN)) {
+		pci_info(dev, "disabling 10-Bit Tags\n");
+		pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
+					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+		return 0;
+	}
+
+	if (pcie_tag_config != PCIE_TAG_DEFAULT)
+		return 0;
+
+	pci_configure_10bit_tags(dev);
+
 	return 0;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH V8 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices
  2021-08-18 13:01 [PATCH V8 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (6 preceding siblings ...)
  2021-08-18 13:01 ` [PATCH V8 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device Dongdong Liu
@ 2021-08-18 13:01 ` Dongdong Liu
  7 siblings, 0 replies; 10+ messages in thread
From: Dongdong Liu @ 2021-08-18 13:01 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Enable 10-Bit Tag Requester for the VF devices below the
Root Port that support 10-Bit Tag Completer.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 drivers/pci/iov.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 757e77d..1c40baf 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -691,6 +691,15 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 	pci_iov_set_numvfs(dev, nr_virtfn);
 	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+
+	if ((pcie_tag_config == PCIE_TAG_DEFAULT) &&
+	    (iov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ) &&
+	    pcie_rp_10bit_tag_cmp_supported(dev))
+		iov->ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+
+	if (pcie_tag_config == PCIE_TAG_PEER2PEER)
+		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	msleep(100);
@@ -707,6 +716,7 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 err_pcibios:
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
@@ -739,6 +749,7 @@ static void sriov_disable(struct pci_dev *dev)
 
 	sriov_del_vfs(dev);
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH V8 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-08-18 13:01 ` [PATCH V8 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
@ 2021-08-18 16:07   ` Logan Gunthorpe
  0 siblings, 0 replies; 10+ messages in thread
From: Logan Gunthorpe @ 2021-08-18 16:07 UTC (permalink / raw)
  To: Dongdong Liu, helgaas, hch, kw, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev



On 2021-08-18 7:01 a.m., Dongdong Liu wrote:
> Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
> 10-Bit Tag Requester doesn't interact with a device that does not
> support 10-Bit Tag Completer. Before that happens, the kernel should
> emit a warning.
> 
> "echo 0 > /sys/bus/pci/devices/.../10bit_tag" to disable 10-Bit Tag
> Requester for PF device.
> 
> "echo 0 > /sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl" to disable
> 10-Bit Tag Requester for VF device.
> 
> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>

Looks good to me, thanks.

Reviewed-by: Logan Gunthorpe <logang@deltatee.com>

> ---
>  drivers/pci/p2pdma.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 48 insertions(+)
> 
> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index 50cdde3..2b9c2c9 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
> @@ -19,6 +19,7 @@
>  #include <linux/random.h>
>  #include <linux/seq_buf.h>
>  #include <linux/xarray.h>
> +#include "pci.h"
>  
>  enum pci_p2pdma_map_type {
>  	PCI_P2PDMA_MAP_UNKNOWN = 0,
> @@ -410,6 +411,50 @@ static unsigned long map_types_idx(struct pci_dev *client)
>  		(client->bus->number << 8) | client->devfn;
>  }
>  
> +static bool pci_10bit_tags_unsupported(struct pci_dev *a,
> +				       struct pci_dev *b,
> +				       bool verbose)
> +{
> +	bool req;
> +	bool comp;
> +	u16 ctl;
> +	const char *str = "10bit_tag";
> +
> +	if (a->is_virtfn) {
> +#ifdef CONFIG_PCI_IOV
> +		req = !!(a->physfn->sriov->ctrl &
> +			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
> +#endif
> +	} else {
> +		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl);
> +		req = !!(ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> +	}
> +
> +	comp = !!(b->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
> +
> +	/* 10-bit tags not enabled on requester */
> +	if (!req)
> +		return false;
> +
> +	 /* Completer can handle anything */
> +	if (comp)
> +		return false;
> +
> +	if (!verbose)
> +		return true;
> +
> +	pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set for this device, but peer device (%s) does not support the 10-Bit Tag Completer\n",
> +		 pci_name(b));
> +
> +	if (a->is_virtfn)
> +		str = "sriov_vf_10bit_tag_ctl";
> +
> +	pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/%s\n",
> +		 pci_name(a), str);
> +
> +	return true;
> +}
> +
>  /*
>   * Calculate the P2PDMA mapping type and distance between two PCI devices.
>   *
> @@ -532,6 +577,9 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
>  		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
>  	}
>  done:
> +	if (pci_10bit_tags_unsupported(client, provider, verbose))
> +		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
> +
>  	rcu_read_lock();
>  	p2pdma = rcu_dereference(provider->p2pdma);
>  	if (p2pdma)
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2021-08-18 16:07 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-18 13:01 [PATCH V8 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
2021-08-18 13:01 ` [PATCH V8 1/8] PCI: Use cached Device Capabilities Register Dongdong Liu
2021-08-18 13:01 ` [PATCH V8 2/8] PCI: Use cached Device Capabilities 2 Register Dongdong Liu
2021-08-18 13:01 ` [PATCH V8 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
2021-08-18 13:01 ` [PATCH V8 4/8] PCI/sysfs: Add a 10-Bit Tag sysfs file PCIe Endpoint devices Dongdong Liu
2021-08-18 13:01 ` [PATCH V8 5/8] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices Dongdong Liu
2021-08-18 13:01 ` [PATCH V8 6/8] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
2021-08-18 16:07   ` Logan Gunthorpe
2021-08-18 13:01 ` [PATCH V8 7/8] PCI: Enable 10-Bit Tag support for PCIe Endpoint device Dongdong Liu
2021-08-18 13:01 ` [PATCH V8 8/8] PCI/IOV: Enable 10-Bit Tag support for PCIe VF devices Dongdong Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).