* [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices
@ 2021-08-04 13:46 Dongdong Liu
  2021-08-04 13:47 ` [PATCH V7 1/9] PCI: Use cached Device Capabilities Register Dongdong Liu
                   ` (8 more replies)
  0 siblings, 9 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-04 13:46 UTC
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

The 10-Bit Tag capability, introduced in PCIe 4.0, increases the total Tag
field size from 8 bits to 10 bits.

This patchset enables 10-Bit Tags for PCIe Endpoint devices (including VFs)
and Root Port devices.

V6->V7:
- Rebased on v5.14-rc3.
- Replace the "pci=disable_10bit_tag=" parameter with a sysfs file that
  disables 10-Bit Tag Requester when needed for p2pdma, as suggested by Leon.
- Fix comment for p2pdma 10-bit tag check.

V5->V6:
- Rebased on v5.14-rc2.
- Add Reviewed-by: Christoph Hellwig <hch@lst.de> in [PATCH V6 2/8].
- PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support.
- Add a 10-bit tag check in P2PDMA.
- Simplified implementation in [PATCH V6 6/8].
- Fix some comments in [PATCH V6 4/8].

V4->V5:
- Fix warning variable 'capa' is uninitialized.
- Fix warning unused variable 'pchild'.

V3->V4:
- Get the value of pcie_devcap2 in set_pcie_port_type().
- Add Reviewed-by: Christoph Hellwig <hch@lst.de> in [PATCH V4 1/6],
  [PATCH V4 3/6], [PATCH V4 4/6], [PATCH V4 5/6].
- Fix some code style.
- Rebased on v5.13-rc6.

V2->V3:
- Use cached Device Capabilities Register suggested by Christoph.
- Fix code style to avoid > 80 char lines.
- Rename devcap2 to pcie_devcap2.

V1->V2: Address some comments from Christoph.
- Store the devcap2 value in the pci_dev instead of reading it multiple
  times.
- Change pci_info to pci_dbg to avoid the noisy log.
- Rename ext_10bit_tag_comp_path to ext_10bit_tag.
- Fix the compile error.
- Rebased on v5.13-rc1.

Dongdong Liu (9):
  PCI: Use cached Device Capabilities Register
  PCI: Use cached Device Capabilities 2 Register
  PCI: Add 10-Bit Tag register definitions
  PCI: Enable 10-Bit Tag support for PCIe Endpoint devices
  PCI/IOV: Enable 10-Bit tag support for PCIe VF devices
  PCI: Enable 10-Bit Tag support for PCIe RP devices
  PCI/sysfs: Add a 10-Bit Tag sysfs file
  PCI/IOV: Add 10-Bit Tag sysfs files for VF devices
  PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA

 Documentation/ABI/testing/sysfs-bus-pci         | 36 ++++++++++++-
 drivers/media/pci/cobalt/cobalt-driver.c        |  5 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |  4 +-
 drivers/pci/iov.c                               | 58 +++++++++++++++++++++
 drivers/pci/p2pdma.c                            | 40 ++++++++++++++
 drivers/pci/pci-sysfs.c                         | 69 +++++++++++++++++++++++++
 drivers/pci/pci.c                               | 14 ++---
 drivers/pci/pcie/aspm.c                         | 11 ++--
 drivers/pci/pcie/portdrv_pci.c                  | 69 +++++++++++++++++++++++++
 drivers/pci/probe.c                             | 66 ++++++++++++++++++-----
 drivers/pci/quirks.c                            |  3 +-
 include/linux/pci.h                             |  5 ++
 include/uapi/linux/pci_regs.h                   |  5 ++
 13 files changed, 347 insertions(+), 38 deletions(-)

--
2.7.4



* [PATCH V7 1/9] PCI: Use cached Device Capabilities Register
  2021-08-04 13:46 [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
@ 2021-08-04 13:47 ` Dongdong Liu
  2021-08-04 13:47 ` [PATCH V7 2/9] PCI: Use cached Device Capabilities 2 Register Dongdong Liu
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-04 13:47 UTC
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

It makes sense to store the pcie_devcap value in the pci_dev structure
instead of reading the Device Capabilities Register multiple times. The
first place that uses pcie_devcap is set_pcie_port_type(), so read the
value there and use the cached pcie_devcap wherever it is needed.
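
As a rough illustration of the caching pattern described above, here is a
user-space sketch that reads the register once and then decodes fields
from the cached copy. The fake_dev structure and the helper names are
hypothetical stand-ins, not kernel APIs; only the mask values are taken
from include/uapi/linux/pci_regs.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask values as defined in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_DEVCAP_PAYLOAD	0x00000007 /* Max_Payload_Size supported */
#define PCI_EXP_DEVCAP_FLR	0x10000000 /* Function Level Reset */

/* Hypothetical stand-in for struct pci_dev; not a kernel type. */
struct fake_dev {
	uint32_t hw_devcap;	/* what the "hardware" would return */
	unsigned int cfg_reads;	/* number of simulated config reads */
	uint32_t pcie_devcap;	/* cached copy, filled once at probe time */
};

/* Models the single config read done in set_pcie_port_type(). */
static void read_devcap_once(struct fake_dev *dev)
{
	dev->cfg_reads++;
	dev->pcie_devcap = dev->hw_devcap;
}

/* Later users test the cached value instead of re-reading config space. */
static bool has_flr(const struct fake_dev *dev)
{
	return (dev->pcie_devcap & PCI_EXP_DEVCAP_FLR) != 0;
}

/* Max payload in bytes: the 3-bit field encodes 128 << n. */
static unsigned int max_payload_bytes(const struct fake_dev *dev)
{
	return 128u << (dev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD);
}
```

Any number of has_flr() or max_payload_bytes() calls leaves cfg_reads at
one, which is the point of the change.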

Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/media/pci/cobalt/cobalt-driver.c |  5 +++--
 drivers/pci/pci.c                        |  5 +----
 drivers/pci/pcie/aspm.c                  | 11 ++++-------
 drivers/pci/probe.c                      | 11 +++--------
 drivers/pci/quirks.c                     |  3 +--
 include/linux/pci.h                      |  1 +
 6 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/drivers/media/pci/cobalt/cobalt-driver.c b/drivers/media/pci/cobalt/cobalt-driver.c
index 16af58f..23c6436 100644
--- a/drivers/media/pci/cobalt/cobalt-driver.c
+++ b/drivers/media/pci/cobalt/cobalt-driver.c
@@ -193,11 +193,12 @@ void cobalt_pcie_status_show(struct cobalt *cobalt)
 		return;
 
 	/* Device */
-	pcie_capability_read_dword(pci_dev, PCI_EXP_DEVCAP, &capa);
 	pcie_capability_read_word(pci_dev, PCI_EXP_DEVCTL, &ctrl);
 	pcie_capability_read_word(pci_dev, PCI_EXP_DEVSTA, &stat);
 	cobalt_info("PCIe device capability 0x%08x: Max payload %d\n",
-		    capa, get_payload_size(capa & PCI_EXP_DEVCAP_PAYLOAD));
+		    pci_dev->pcie_devcap,
+		    get_payload_size(pci_dev->pcie_devcap &
+				     PCI_EXP_DEVCAP_PAYLOAD));
 	cobalt_info("PCIe device control 0x%04x: Max payload %d. Max read request %d\n",
 		    ctrl,
 		    get_payload_size((ctrl & PCI_EXP_DEVCTL_PAYLOAD) >> 5),
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index aacf575..dc3bfb2 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4630,13 +4630,10 @@ EXPORT_SYMBOL(pci_wait_for_pending_transaction);
  */
 bool pcie_has_flr(struct pci_dev *dev)
 {
-	u32 cap;
-
 	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
 		return false;
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
-	return cap & PCI_EXP_DEVCAP_FLR;
+	return dev->pcie_devcap & PCI_EXP_DEVCAP_FLR;
 }
 EXPORT_SYMBOL_GPL(pcie_has_flr);
 
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 013a47f..db944f6 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -660,7 +660,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 
 	/* Get and check endpoint acceptable latencies */
 	list_for_each_entry(child, &linkbus->devices, bus_list) {
-		u32 reg32, encoding;
+		u32 encoding;
 		struct aspm_latency *acceptable =
 			&link->acceptable[PCI_FUNC(child->devfn)];
 
@@ -668,12 +668,11 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 		    pci_pcie_type(child) != PCI_EXP_TYPE_LEG_END)
 			continue;
 
-		pcie_capability_read_dword(child, PCI_EXP_DEVCAP, &reg32);
 		/* Calculate endpoint L0s acceptable latency */
-		encoding = (reg32 & PCI_EXP_DEVCAP_L0S) >> 6;
+		encoding = (child->pcie_devcap & PCI_EXP_DEVCAP_L0S) >> 6;
 		acceptable->l0s = calc_l0s_acceptable(encoding);
 		/* Calculate endpoint L1 acceptable latency */
-		encoding = (reg32 & PCI_EXP_DEVCAP_L1) >> 9;
+		encoding = (child->pcie_devcap & PCI_EXP_DEVCAP_L1) >> 9;
 		acceptable->l1 = calc_l1_acceptable(encoding);
 
 		pcie_aspm_check_latency(child);
@@ -808,7 +807,6 @@ static void free_link_state(struct pcie_link_state *link)
 static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 {
 	struct pci_dev *child;
-	u32 reg32;
 
 	/*
 	 * Some functions in a slot might not all be PCIe functions,
@@ -831,8 +829,7 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
 		 * RBER bit to determine if a function is 1.1 version device
 		 */
-		pcie_capability_read_dword(child, PCI_EXP_DEVCAP, &reg32);
-		if (!(reg32 & PCI_EXP_DEVCAP_RBER) && !aspm_force) {
+		if (!(child->pcie_devcap & PCI_EXP_DEVCAP_RBER) && !aspm_force) {
 			pci_info(child, "disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'\n");
 			return -EINVAL;
 		}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 79177ac..cc700f6 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1498,8 +1498,8 @@ void set_pcie_port_type(struct pci_dev *pdev)
 	pdev->pcie_cap = pos;
 	pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
 	pdev->pcie_flags_reg = reg16;
-	pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, &reg16);
-	pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD;
+	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->pcie_devcap);
+	pdev->pcie_mpss = pdev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD;
 
 	parent = pci_upstream_bridge(pdev);
 	if (!parent)
@@ -2031,18 +2031,13 @@ static void pci_configure_mps(struct pci_dev *dev)
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-	u32 cap;
 	u16 ctl;
 	int ret;
 
 	if (!pci_is_pcie(dev))
 		return 0;
 
-	ret = pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
-	if (ret)
-		return 0;
-
-	if (!(cap & PCI_EXP_DEVCAP_EXT_TAG))
+	if (!(dev->pcie_devcap & PCI_EXP_DEVCAP_EXT_TAG))
 		return 0;
 
 	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL, &ctl);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6d74386..2b405c5 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5173,8 +5173,7 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
 		pdev->pcie_cap = pos;
 		pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
 		pdev->pcie_flags_reg = reg16;
-		pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, &reg16);
-		pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD;
+		pdev->pcie_mpss = pdev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD;
 
 		pdev->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
 		if (pci_read_config_dword(pdev, PCI_CFG_SPACE_SIZE, &status) !=
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 540b377..aee7c85 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -340,6 +340,7 @@ struct pci_dev {
 	u8		rom_base_reg;	/* Config register controlling ROM */
 	u8		pin;		/* Interrupt pin this device uses */
 	u16		pcie_flags_reg;	/* Cached PCIe Capabilities Register */
+	u32		pcie_devcap;	/* Cached Device Capabilities Register */
 	unsigned long	*dma_alias_mask;/* Mask of enabled devfn aliases */
 
 	struct pci_driver *driver;	/* Driver bound to this device */
-- 
2.7.4



* [PATCH V7 2/9] PCI: Use cached Device Capabilities 2 Register
  2021-08-04 13:46 [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
  2021-08-04 13:47 ` [PATCH V7 1/9] PCI: Use cached Device Capabilities Register Dongdong Liu
@ 2021-08-04 13:47 ` Dongdong Liu
  2021-08-04 13:47 ` [PATCH V7 3/9] PCI: Add 10-Bit Tag register definitions Dongdong Liu
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-04 13:47 UTC
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

It makes sense to store the pcie_devcap2 value in the pci_dev structure
instead of reading the Device Capabilities 2 Register multiple times.
Read the pcie_devcap2 value in set_pcie_port_type(), then use the cached
pcie_devcap2 wherever it is needed.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |  4 +---
 drivers/pci/pci.c                               |  9 ++++-----
 drivers/pci/probe.c                             | 10 ++++------
 include/linux/pci.h                             |  2 ++
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index dbf9a0e..a8e1e22 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -6304,7 +6304,6 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs)
 		struct pci_dev *pbridge;
 		struct port_info *pi;
 		char name[IFNAMSIZ];
-		u32 devcap2;
 		u16 flags;
 
 		/* If we want to instantiate Virtual Functions, then our
@@ -6314,10 +6313,9 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs)
 		 */
 		pbridge = pdev->bus->self;
 		pcie_capability_read_word(pbridge, PCI_EXP_FLAGS, &flags);
-		pcie_capability_read_dword(pbridge, PCI_EXP_DEVCAP2, &devcap2);
 
 		if ((flags & PCI_EXP_FLAGS_VERS) < 2 ||
-		    !(devcap2 & PCI_EXP_DEVCAP2_ARI)) {
+		    !(pbridge->pcie_devcap2 & PCI_EXP_DEVCAP2_ARI)) {
 			/* Our parent bridge does not support ARI so issue a
 			 * warning and skip instantiating the VFs.  They
 			 * won't be reachable.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index dc3bfb2..d14c573 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3700,7 +3700,7 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
 {
 	struct pci_bus *bus = dev->bus;
 	struct pci_dev *bridge;
-	u32 cap, ctl2;
+	u32 ctl2;
 
 	if (!pci_is_pcie(dev))
 		return -EINVAL;
@@ -3724,19 +3724,18 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
 	while (bus->parent) {
 		bridge = bus->self;
 
-		pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
-
 		switch (pci_pcie_type(bridge)) {
 		/* Ensure switch ports support AtomicOp routing */
 		case PCI_EXP_TYPE_UPSTREAM:
 		case PCI_EXP_TYPE_DOWNSTREAM:
-			if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
+			if (!(bridge->pcie_devcap2 &
+			      PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
 				return -EINVAL;
 			break;
 
 		/* Ensure root port supports all the sizes we care about */
 		case PCI_EXP_TYPE_ROOT_PORT:
-			if ((cap & cap_mask) != cap_mask)
+			if ((bridge->pcie_devcap2 & cap_mask) != cap_mask)
 				return -EINVAL;
 			break;
 		}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index cc700f6..c83245b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1500,6 +1500,7 @@ void set_pcie_port_type(struct pci_dev *pdev)
 	pdev->pcie_flags_reg = reg16;
 	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->pcie_devcap);
 	pdev->pcie_mpss = pdev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD;
+	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP2, &pdev->pcie_devcap2);
 
 	parent = pci_upstream_bridge(pdev);
 	if (!parent)
@@ -2116,7 +2117,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 #ifdef CONFIG_PCIEASPM
 	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
 	struct pci_dev *bridge;
-	u32 cap, ctl;
+	u32 ctl;
 
 	if (!pci_is_pcie(dev))
 		return;
@@ -2124,8 +2125,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 	/* Read L1 PM substate capabilities */
 	dev->l1ss = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_L1SS);
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
-	if (!(cap & PCI_EXP_DEVCAP2_LTR))
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_LTR))
 		return;
 
 	pcie_capability_read_dword(dev, PCI_EXP_DEVCTL2, &ctl);
@@ -2165,13 +2165,11 @@ static void pci_configure_eetlp_prefix(struct pci_dev *dev)
 #ifdef CONFIG_PCI_PASID
 	struct pci_dev *bridge;
 	int pcie_type;
-	u32 cap;
 
 	if (!pci_is_pcie(dev))
 		return;
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
-	if (!(cap & PCI_EXP_DEVCAP2_EE_PREFIX))
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_EE_PREFIX))
 		return;
 
 	pcie_type = pci_pcie_type(dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index aee7c85..9aab67f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -341,6 +341,8 @@ struct pci_dev {
 	u8		pin;		/* Interrupt pin this device uses */
 	u16		pcie_flags_reg;	/* Cached PCIe Capabilities Register */
 	u32		pcie_devcap;	/* Cached Device Capabilities Register */
+	u32		pcie_devcap2;	/* Cached Device Capabilities 2
+					   Register */
 	unsigned long	*dma_alias_mask;/* Mask of enabled devfn aliases */
 
 	struct pci_driver *driver;	/* Driver bound to this device */
-- 
2.7.4



* [PATCH V7 3/9] PCI: Add 10-Bit Tag register definitions
  2021-08-04 13:46 [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
  2021-08-04 13:47 ` [PATCH V7 1/9] PCI: Use cached Device Capabilities Register Dongdong Liu
  2021-08-04 13:47 ` [PATCH V7 2/9] PCI: Use cached Device Capabilities 2 Register Dongdong Liu
@ 2021-08-04 13:47 ` Dongdong Liu
  2021-08-04 13:47 ` [PATCH V7 4/9] PCI: Enable 10-Bit Tag support for PCIe Endpoint devices Dongdong Liu
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-04 13:47 UTC
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add 10-Bit Tag register definitions for use in subsequent patches.
See PCIe 5.0 spec sections 7.5.3.15 and 9.3.3.2.
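
For readers who want to see how the new bits are meant to be used, the
following sketch decodes the two Device Capabilities 2 bits and sets the
Device Control 2 enable bit. The mask values match the definitions added
by this patch; the tag10_caps structure and both helpers are hypothetical
and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values match the definitions this patch adds to pci_regs.h. */
#define PCI_EXP_DEVCAP2_10BIT_TAG_COMP		0x00010000
#define PCI_EXP_DEVCAP2_10BIT_TAG_REQ		0x00020000
#define PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN	0x1000

/* Hypothetical decoded view of the two capability bits. */
struct tag10_caps {
	bool completer;	/* can accept Requests carrying 10-Bit Tags */
	bool requester;	/* can generate Requests carrying 10-Bit Tags */
};

static struct tag10_caps decode_devcap2(uint32_t devcap2)
{
	struct tag10_caps caps = {
		.completer = (devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP) != 0,
		.requester = (devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ) != 0,
	};

	return caps;
}

/* Returns the Device Control 2 value with the Requester Enable bit set. */
static uint16_t devctl2_enable_10bit_req(uint16_t devctl2)
{
	return devctl2 | PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN;
}
```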

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 include/uapi/linux/pci_regs.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index e709ae8..cf1ddb8 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -648,6 +648,8 @@
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* 64b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200 /* 128b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_LTR		0x00000800 /* Latency tolerance reporting */
+#define  PCI_EXP_DEVCAP2_10BIT_TAG_COMP 0x00010000 /* 10-Bit Tag Completer Supported */
+#define  PCI_EXP_DEVCAP2_10BIT_TAG_REQ  0x00020000 /* 10-Bit Tag Requester Supported */
 #define  PCI_EXP_DEVCAP2_OBFF_MASK	0x000c0000 /* OBFF support mechanism */
 #define  PCI_EXP_DEVCAP2_OBFF_MSG	0x00040000 /* New message signaling */
 #define  PCI_EXP_DEVCAP2_OBFF_WAKE	0x00080000 /* Re-use WAKE# for OBFF */
@@ -661,6 +663,7 @@
 #define  PCI_EXP_DEVCTL2_IDO_REQ_EN	0x0100	/* Allow IDO for requests */
 #define  PCI_EXP_DEVCTL2_IDO_CMP_EN	0x0200	/* Allow IDO for completions */
 #define  PCI_EXP_DEVCTL2_LTR_EN		0x0400	/* Enable LTR mechanism */
+#define  PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN 0x1000 /* 10-Bit Tag Requester Enable */
 #define  PCI_EXP_DEVCTL2_OBFF_MSGA_EN	0x2000	/* Enable OBFF Message type A */
 #define  PCI_EXP_DEVCTL2_OBFF_MSGB_EN	0x4000	/* Enable OBFF Message type B */
 #define  PCI_EXP_DEVCTL2_OBFF_WAKE_EN	0x6000	/* OBFF using WAKE# signaling */
@@ -931,6 +934,7 @@
 /* Single Root I/O Virtualization */
 #define PCI_SRIOV_CAP		0x04	/* SR-IOV Capabilities */
 #define  PCI_SRIOV_CAP_VFM	0x00000001  /* VF Migration Capable */
+#define  PCI_SRIOV_CAP_VF_10BIT_TAG_REQ	0x00000004 /* VF 10-Bit Tag Requester Supported */
 #define  PCI_SRIOV_CAP_INTR(x)	((x) >> 21) /* Interrupt Message Number */
 #define PCI_SRIOV_CTRL		0x08	/* SR-IOV Control */
 #define  PCI_SRIOV_CTRL_VFE	0x0001	/* VF Enable */
@@ -938,6 +942,7 @@
 #define  PCI_SRIOV_CTRL_INTR	0x0004	/* VF Migration Interrupt Enable */
 #define  PCI_SRIOV_CTRL_MSE	0x0008	/* VF Memory Space Enable */
 #define  PCI_SRIOV_CTRL_ARI	0x0010	/* ARI Capable Hierarchy */
+#define  PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN 0x0020 /* VF 10-Bit Tag Requester Enable */
 #define PCI_SRIOV_STATUS	0x0a	/* SR-IOV Status */
 #define  PCI_SRIOV_STATUS_VFM	0x0001	/* VF Migration Status */
 #define PCI_SRIOV_INITIAL_VF	0x0c	/* Initial VFs */
-- 
2.7.4



* [PATCH V7 4/9] PCI: Enable 10-Bit Tag support for PCIe Endpoint devices
  2021-08-04 13:46 [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (2 preceding siblings ...)
  2021-08-04 13:47 ` [PATCH V7 3/9] PCI: Add 10-Bit Tag register definitions Dongdong Liu
@ 2021-08-04 13:47 ` Dongdong Liu
  2021-08-04 23:17   ` Bjorn Helgaas
  2021-08-04 13:47 ` [PATCH V7 5/9] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices Dongdong Liu
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 43+ messages in thread
From: Dongdong Liu @ 2021-08-04 13:47 UTC
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

The 10-Bit Tag capability, introduced in PCIe 4.0, increases the total Tag
field size from 8 bits to 10 bits.

The Implementation Note "Considerations for Implementing 10-Bit Tag
Capabilities" in PCIe spec 5.0 r1.0 section 2.2.6.2 says:
For platforms where the RC supports 10-Bit Tag Completer capability,
it is highly recommended for platform firmware or operating software
that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
bit automatically in Endpoints with 10-Bit Tag Requester capability. This
enables the important class of 10-Bit Tag capable adapters that send
Memory Read Requests only to host memory.
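
The decision this patch makes for an Endpoint can be sketched in user
space as follows. This is a simplified model under stated assumptions:
model_dev and model_configure_10bit_tags() are hypothetical names, the
req_enabled flag stands in for the 10-Bit Tag Requester Enable bit in
Device Control 2, and devices are assumed to be configured top-down so
that the upstream flag is already settled.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_EXP_DEVCAP2_10BIT_TAG_COMP	0x00010000 /* Completer Supported */
#define PCI_EXP_DEVCAP2_10BIT_TAG_REQ	0x00020000 /* Requester Supported */

/* Hypothetical device model; these are not kernel types. */
enum model_type { MODEL_ROOT_PORT, MODEL_SWITCH_PORT, MODEL_ENDPOINT };

struct model_dev {
	enum model_type type;
	uint32_t devcap2;		/* cached Device Capabilities 2 */
	bool is_virtfn;
	bool ext_10bit_tag;		/* Completer support from root to here */
	bool req_enabled;		/* models 10-Bit Tag Requester Enable */
	struct model_dev *upstream;	/* NULL for a Root Port */
};

static void model_configure_10bit_tags(struct model_dev *dev)
{
	/* A device without Completer support breaks the path. */
	if (!(dev->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
		return;

	if (dev->type == MODEL_ROOT_PORT) {
		dev->ext_10bit_tag = true;
		return;
	}

	if (dev->upstream && dev->upstream->ext_10bit_tag)
		dev->ext_10bit_tag = true;

	/* The DEVCTL2 enable bit is RsvdP for VFs, so leave them alone. */
	if (dev->is_virtfn)
		return;

	if (dev->type == MODEL_ENDPOINT && dev->ext_10bit_tag &&
	    (dev->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
		dev->req_enabled = true;
}
```

An Endpoint below a Root Port that lacks Completer support never gets
req_enabled set, even if the Endpoint itself is Requester capable.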

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/pci/probe.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/pci.h |  2 ++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c83245b..3da7baa 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2029,10 +2029,42 @@ static void pci_configure_mps(struct pci_dev *dev)
 		 p_mps, mps, mpss);
 }
 
+static void pci_configure_10bit_tags(struct pci_dev *dev)
+{
+	struct pci_dev *bridge;
+
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
+		return;
+
+	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
+		dev->ext_10bit_tag = 1;
+		return;
+	}
+
+	bridge = pci_upstream_bridge(dev);
+	if (bridge && bridge->ext_10bit_tag)
+		dev->ext_10bit_tag = 1;
+
+	/*
+	 * 10-Bit Tag Requester Enable in Device Control 2 Register is RsvdP
+	 * for VF.
+	 */
+	if (dev->is_virtfn)
+		return;
+
+	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ENDPOINT &&
+	    dev->ext_10bit_tag == 1 &&
+	    (dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ)) {
+		pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
+		pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	}
+}
+
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-	u16 ctl;
+	u16 ctl, ctl2;
 	int ret;
 
 	if (!pci_is_pcie(dev))
@@ -2045,6 +2077,10 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 	if (ret)
 		return 0;
 
+	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL2, &ctl2);
+	if (ret)
+		return 0;
+
 	host = pci_find_host_bridge(dev->bus);
 	if (!host)
 		return 0;
@@ -2059,6 +2095,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
 						   PCI_EXP_DEVCTL_EXT_TAG);
 		}
+
+		if (ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN) {
+			pci_info(dev, "disabling 10-Bit Tags\n");
+			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
+					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+		}
 		return 0;
 	}
 
@@ -2067,6 +2109,9 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
 					 PCI_EXP_DEVCTL_EXT_TAG);
 	}
+
+	pci_configure_10bit_tags(dev);
+
 	return 0;
 }
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 9aab67f..af6cb53 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -393,6 +393,8 @@ struct pci_dev {
 #endif
 	unsigned int	eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
 
+	unsigned int	ext_10bit_tag:1; /* 10-Bit Tag Completer Supported
+					    from root to here */
 	pci_channel_state_t error_state;	/* Current connectivity state */
 	struct device	dev;			/* Generic device interface */
 
-- 
2.7.4



* [PATCH V7 5/9] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices
  2021-08-04 13:46 [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (3 preceding siblings ...)
  2021-08-04 13:47 ` [PATCH V7 4/9] PCI: Enable 10-Bit Tag support for PCIe Endpoint devices Dongdong Liu
@ 2021-08-04 13:47 ` Dongdong Liu
  2021-08-04 23:29   ` Bjorn Helgaas
  2021-08-04 13:47 ` [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices Dongdong Liu
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 43+ messages in thread
From: Dongdong Liu @ 2021-08-04 13:47 UTC
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Enable the VF 10-Bit Tag Requester when its upstream component supports
the 10-Bit Tag Completer capability.
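
The SR-IOV Control value computed by this patch can be sketched as below.
The mask values are those from pci_regs.h (the two 10-Bit Tag ones are
added earlier in this series); the two helper functions are hypothetical
and only model how the control word is built and torn down, with pf_path
standing in for the PF's ext_10bit_tag flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_SRIOV_CAP_VF_10BIT_TAG_REQ		0x00000004
#define PCI_SRIOV_CTRL_VFE			0x0001 /* VF Enable */
#define PCI_SRIOV_CTRL_MSE			0x0008 /* VF Memory Space Enable */
#define PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN	0x0020

/* Control value written when VFs are enabled. */
static uint16_t sriov_ctrl_on_enable(uint16_t ctrl, uint32_t cap, bool pf_path)
{
	ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
	if ((cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ) && pf_path)
		ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;

	return ctrl;
}

/* Control value written when VFs are disabled or enabling fails. */
static uint16_t sriov_ctrl_on_disable(uint16_t ctrl)
{
	return (uint16_t)(ctrl & ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE |
				   PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN));
}
```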

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/pci/iov.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index dafdc65..0d0bed1 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -634,6 +634,10 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 	pci_iov_set_numvfs(dev, nr_virtfn);
 	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+	if ((iov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ) &&
+	    dev->ext_10bit_tag)
+		iov->ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	msleep(100);
@@ -650,6 +654,8 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 err_pcibios:
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	if (iov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN)
+		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
@@ -682,6 +688,8 @@ static void sriov_disable(struct pci_dev *dev)
 
 	sriov_del_vfs(dev);
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	if (iov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN)
+		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
-- 
2.7.4



* [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices
  2021-08-04 13:46 [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (4 preceding siblings ...)
  2021-08-04 13:47 ` [PATCH V7 5/9] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices Dongdong Liu
@ 2021-08-04 13:47 ` Dongdong Liu
  2021-08-04 23:38   ` Bjorn Helgaas
  2021-08-04 13:47 ` [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file Dongdong Liu
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 43+ messages in thread
From: Dongdong Liu @ 2021-08-04 13:47 UTC
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Per the Implementation Note in PCIe spec 5.0 r1.0 section 2.2.6.2, in
configurations where a Requester with 10-Bit Tag Requester capability
needs to target multiple Completers, one needs to ensure that the
Requester sends 10-Bit Tag Requests only to Completers that have 10-Bit
Tag Completer capability. So enable 10-Bit Tag Requester for a Root Port
only when all devices under the Root Port support 10-Bit Tag Completer.
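
The check described above amounts to a walk over everything below the
Root Port. Here is a user-space sketch of that walk. The tree_node type
and the helper names are hypothetical; the real patch uses pci_walk_bus()
over dev->subordinate rather than an explicit child array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_EXP_DEVCAP2_10BIT_TAG_COMP	0x00010000
#define PCI_EXP_DEVCAP2_10BIT_TAG_REQ	0x00020000

/* Hypothetical topology node; not a kernel type. */
struct tree_node {
	bool is_pcie;
	bool is_hotplug_bridge;
	uint32_t devcap2;
	const struct tree_node *const *children;
	size_t nr_children;
};

/* Per-device check: PCIe, not a hotplug bridge, Completer capable. */
static bool node_is_10bit_completer(const struct tree_node *n)
{
	if (!n->is_pcie || n->is_hotplug_bridge)
		return false;

	return (n->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP) != 0;
}

/* True only if the node and everything below it pass the check. */
static bool subtree_is_10bit_completer(const struct tree_node *n)
{
	size_t i;

	if (!node_is_10bit_completer(n))
		return false;

	for (i = 0; i < n->nr_children; i++)
		if (!subtree_is_10bit_completer(n->children[i]))
			return false;

	return true;
}

/* Root Port decision: needs children, a clean subtree and REQ support. */
static bool rp_should_enable_10bit_req(const struct tree_node *rp)
{
	if (rp->nr_children == 0)
		return false;

	if (!subtree_is_10bit_completer(rp))
		return false;

	return (rp->devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ) != 0;
}
```

A single device without Completer support anywhere below the Root Port
is enough to keep the Requester disabled.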

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 drivers/pci/pcie/portdrv_pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index c7ff1ee..2382cd2 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -90,6 +90,72 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
 #define PCIE_PORTDRV_PM_OPS	NULL
 #endif /* !PM */
 
+static int pci_10bit_tag_comp_support(struct pci_dev *dev, void *data)
+{
+	bool *support = (bool *)data;
+
+	if (!pci_is_pcie(dev)) {
+		*support = false;
+		return 1;
+	}
+
+	/*
+	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
+	 * For configurations where a Requester with 10-Bit Tag Requester
+	 * capability targets Completers where some do and some do not have
+	 * 10-Bit Tag Completer capability, how the Requester determines which
+	 * NPRs include 10-Bit Tags is outside the scope of this specification.
+	 * So we do not consider hotplug scenario.
+	 */
+	if (dev->is_hotplug_bridge) {
+		*support = false;
+		return 1;
+	}
+
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP)) {
+		*support = false;
+		return 1;
+	}
+
+	return 0;
+}
+
+static void pci_configure_rp_10bit_tag(struct pci_dev *dev)
+{
+	bool support = true;
+
+	if (dev->subordinate == NULL)
+		return;
+
+	/* If no devices under the root port, no need to enable 10-Bit Tag. */
+	if (list_empty(&dev->subordinate->devices))
+		return;
+
+	pci_10bit_tag_comp_support(dev, &support);
+	if (!support)
+		return;
+
+	/*
+	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
+	 * In configurations where a Requester with 10-Bit Tag Requester
+	 * capability needs to target multiple Completers, one needs to ensure
+	 * that the Requester sends 10-Bit Tag Requests only to Completers
+	 * that have 10-Bit Tag Completer capability. So we enable 10-Bit Tag
+	 * Requester for root port only when the devices under the root port
+	 * support 10-Bit Tag Completer.
+	 */
+	pci_walk_bus(dev->subordinate, pci_10bit_tag_comp_support, &support);
+	if (!support)
+		return;
+
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
+		return;
+
+	pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
+	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+				 PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+}
+
 /*
  * pcie_portdrv_probe - Probe PCI-Express port devices
  * @dev: PCI-Express port device being probed
@@ -111,6 +177,9 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
 	     (type != PCI_EXP_TYPE_RC_EC)))
 		return -ENODEV;
 
+	if (type == PCI_EXP_TYPE_ROOT_PORT)
+		pci_configure_rp_10bit_tag(dev);
+
 	if (type == PCI_EXP_TYPE_RC_EC)
 		pcie_link_rcec(dev);
 
-- 
2.7.4



* [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-04 13:46 [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (5 preceding siblings ...)
  2021-08-04 13:47 ` [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices Dongdong Liu
@ 2021-08-04 13:47 ` Dongdong Liu
  2021-08-04 15:51   ` Logan Gunthorpe
                     ` (2 more replies)
  2021-08-04 13:47 ` [PATCH V7 8/9] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices Dongdong Liu
  2021-08-04 13:47 ` [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
  8 siblings, 3 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-04 13:47 UTC
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
sending Requests to other Endpoints (as opposed to host memory), the
Endpoint must not send 10-Bit Tag Requests to another given Endpoint
unless an implementation-specific mechanism determines that the Endpoint
supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file.
Writing 0 disables 10-Bit Tag Requester; this is only allowed while no
driver is bound to the device, and is intended for the case where the peer
device does not support 10-Bit Tag Completer. This makes P2P traffic safe.
Reading the file returns the current 10-Bit Tag Requester Enable status.
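
From user space, disabling the Requester would look roughly like the
sketch below. The helper name is hypothetical, and the caller is expected
to pass the full path of the attribute for the device in question
(/sys/bus/pci/devices/.../10bit_tag). With the store handler in this
patch, the write is expected to fail with EBUSY while a driver is bound.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Write "0" to the given 10bit_tag attribute.
 * Returns 0 on success or a negative errno value on failure.
 */
static int disable_10bit_tag_requester(const char *attr_path)
{
	ssize_t n;
	int err = 0;
	int fd;

	fd = open(attr_path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, "0", 1);
	if (n != 1)
		err = (n < 0) ? -errno : -EIO;

	if (close(fd) < 0 && !err)
		err = -errno;

	return err;
}
```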

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 16 +++++++-
 drivers/pci/pci-sysfs.c                 | 69 +++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 793cbb7..0e0c97d 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -139,7 +139,7 @@ Description:
 		binary file containing the Vital Product Data for the
 		device.  It should follow the VPD format defined in
 		PCI Specification 2.1 or 2.2, but users should consider
-		that some devices may have incorrectly formatted data.  
+		that some devices may have incorrectly formatted data.
 		If the underlying VPD has a writable section then the
 		corresponding section of this file will be writable.
 
@@ -407,3 +407,17 @@ Description:
 
 		The file is writable if the PF is bound to a driver that
 		implements ->sriov_set_msix_vec_count().
+
+What:		/sys/bus/pci/devices/.../10bit_tag
+Date:		August 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		If a PCI device support 10-Bit Tag Requester, will create the
+		10bit_tag sysfs file. The file is readable, the value
+		indicate current 10-Bit Tag Requester Enable.
+		1 - enabled, 0 - disabled.
+
+		The file is also writeable, the value only accept by write 0
+		to disable 10-Bit Tag Requester when the driver does not bind
+		the deivce. The typical use case is for p2pdma when the peer
+		device does not support 10-BIT Tag Completer.
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 5d63df7..e93ce8b 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -306,6 +306,49 @@ static ssize_t enable_show(struct device *dev, struct device_attribute *attr,
 }
 static DEVICE_ATTR_RW(enable);
 
+static ssize_t pci_10bit_tag_store(struct device *dev,
+				   struct device_attribute *attr,
+				   const char *buf, size_t count)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	bool enable;
+
+	if (kstrtobool(buf, &enable) < 0)
+		return -EINVAL;
+
+	if (enable != false )
+		return -EINVAL;
+
+	if (pdev->driver)
+		 return -EBUSY;
+
+	pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
+				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	pci_info(pdev, "disabled 10-Bit Tag Requester\n");
+
+	return count;
+}
+
+static ssize_t pci_10bit_tag_show(struct device *dev,
+				  struct device_attribute *attr,
+				  char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+	u16 ctl;
+	int ret;
+
+	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &ctl);
+	if (ret)
+		return -EINVAL;
+
+	return sysfs_emit(buf, "%u\n",
+			  !!(ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN));
+}
+
+static struct device_attribute dev_attr_10bit_tag = __ATTR(10bit_tag, 0600,
+							   pci_10bit_tag_show,
+							   pci_10bit_tag_store);
+
 #ifdef CONFIG_NUMA
 static ssize_t numa_node_store(struct device *dev,
 			       struct device_attribute *attr, const char *buf,
@@ -635,6 +678,11 @@ static struct attribute *pcie_dev_attrs[] = {
 	NULL,
 };
 
+static struct attribute *pcie_dev_10bit_tag_attrs[] = {
+	&dev_attr_10bit_tag.attr,
+	NULL,
+};
+
 static struct attribute *pcibus_attrs[] = {
 	&dev_attr_bus_rescan.attr,
 	&dev_attr_cpuaffinity.attr,
@@ -1482,6 +1530,21 @@ static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
 	return 0;
 }
 
+static umode_t pcie_dev_10bit_tag_attrs_are_visible(struct kobject *kobj,
+					  struct attribute *a, int n)
+{
+	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	if (pdev->is_virtfn)
+		return 0;
+
+	if (!(pdev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
+		return 0;
+
+	return a->mode;
+}
+
 static const struct attribute_group pci_dev_group = {
 	.attrs = pci_dev_attrs,
 };
@@ -1521,6 +1584,11 @@ static const struct attribute_group pcie_dev_attr_group = {
 	.is_visible = pcie_dev_attrs_are_visible,
 };
 
+static const struct attribute_group pcie_dev_10bit_tag_attr_group = {
+	.attrs = pcie_dev_10bit_tag_attrs,
+	.is_visible = pcie_dev_10bit_tag_attrs_are_visible,
+};
+
 static const struct attribute_group *pci_dev_attr_groups[] = {
 	&pci_dev_attr_group,
 	&pci_dev_hp_attr_group,
@@ -1530,6 +1598,7 @@ static const struct attribute_group *pci_dev_attr_groups[] = {
 #endif
 	&pci_bridge_attr_group,
 	&pcie_dev_attr_group,
+	&pcie_dev_10bit_tag_attr_group,
 #ifdef CONFIG_PCIEAER
 	&aer_stats_attr_group,
 #endif
-- 
2.7.4



* [PATCH V7 8/9] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices
  2021-08-04 13:46 [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (6 preceding siblings ...)
  2021-08-04 13:47 ` [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file Dongdong Liu
@ 2021-08-04 13:47 ` Dongdong Liu
  2021-08-05  0:05   ` Bjorn Helgaas
  2021-08-04 13:47 ` [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
  8 siblings, 1 reply; 43+ messages in thread
From: Dongdong Liu @ 2021-08-04 13:47 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
sending Requests to other Endpoints (as opposed to host memory), the
Endpoint must not send 10-Bit Tag Requests to another given Endpoint
unless an implementation-specific mechanism determines that the
Endpoint supports 10-Bit Tag Completer capability.
Add a sriov_vf_10bit_tag file to query the status of VF 10-Bit Tag
Requester Enable, and a sriov_vf_10bit_tag_ctl file to disable the VF
10-Bit Tag Requester. The typical use case is p2pdma when the peer
device does not support 10-Bit Tag Completer.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/ABI/testing/sysfs-bus-pci | 20 +++++++++++++
 drivers/pci/iov.c                       | 50 +++++++++++++++++++++++++++++++++
 2 files changed, 70 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
index 0e0c97d..8fdbfae 100644
--- a/Documentation/ABI/testing/sysfs-bus-pci
+++ b/Documentation/ABI/testing/sysfs-bus-pci
@@ -421,3 +421,23 @@ Description:
 		to disable 10-Bit Tag Requester when the driver does not bind
 		the deivce. The typical use case is for p2pdma when the peer
 		device does not support 10-BIT Tag Completer.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag
+Date:		August 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		This file is associated with a SR-IOV physical function (PF).
+		It is visible when the device has VF 10-Bit Tag Requester
+		Supported. It contains the status of VF 10-Bit Tag Requester
+		Enable. The file is only readable.
+
+What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl
+Date:		August 2021
+Contact:	Dongdong Liu <liudongdong3@huawei.com>
+Description:
+		This file is associated with a SR-IOV virtual function (VF).
+		It is visible when the device has VF 10-Bit Tag Requester
+		Supported. Only 0 may be written, which disables VF 10-Bit
+		Tag Requester. The file is only writable when the VF is not
+		bound to a driver. The typical use case is for p2pdma
+		when the peer device does not support 10-Bit Tag Completer.
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 0d0bed1..04c1298 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -220,10 +220,38 @@ static ssize_t sriov_vf_msix_count_store(struct device *dev,
 static DEVICE_ATTR_WO(sriov_vf_msix_count);
 #endif
 
+static ssize_t sriov_vf_10bit_tag_ctl_store(struct device *dev,
+					    struct device_attribute *attr,
+					    const char *buf, size_t count)
+{
+	struct pci_dev *vf_dev = to_pci_dev(dev);
+	struct pci_dev *pdev = pci_physfn(vf_dev);
+	struct pci_sriov *iov;
+	bool enable;
+
+	if (kstrtobool(buf, &enable) < 0)
+		return -EINVAL;
+
+	if (enable != false)
+		return -EINVAL;
+
+	if (vf_dev->driver)
+		return -EBUSY;
+
+	iov = pdev->sriov;
+	iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+	pci_write_config_word(pdev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
+	pci_info(pdev, "disabled SRIOV 10-Bit Tag Requester\n");
+
+	return count;
+}
+static DEVICE_ATTR_WO(sriov_vf_10bit_tag_ctl);
+
 static struct attribute *sriov_vf_dev_attrs[] = {
 #ifdef CONFIG_PCI_MSI
 	&dev_attr_sriov_vf_msix_count.attr,
 #endif
+	&dev_attr_sriov_vf_10bit_tag_ctl.attr,
 	NULL,
 };
 
@@ -236,6 +264,11 @@ static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
 	if (!pdev->is_virtfn)
 		return 0;
 
+	pdev = pci_physfn(pdev);
+	if ((a == &dev_attr_sriov_vf_10bit_tag_ctl.attr) &&
+	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
+		return 0;
+
 	return a->mode;
 }
 
@@ -487,12 +520,23 @@ static ssize_t sriov_drivers_autoprobe_store(struct device *dev,
 	return count;
 }
 
+static ssize_t sriov_vf_10bit_tag_show(struct device *dev,
+				       struct device_attribute *attr,
+				       char *buf)
+{
+	struct pci_dev *pdev = to_pci_dev(dev);
+
+	return sysfs_emit(buf, "%u\n",
+		!!(pdev->sriov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN));
+}
+
 static DEVICE_ATTR_RO(sriov_totalvfs);
 static DEVICE_ATTR_RW(sriov_numvfs);
 static DEVICE_ATTR_RO(sriov_offset);
 static DEVICE_ATTR_RO(sriov_stride);
 static DEVICE_ATTR_RO(sriov_vf_device);
 static DEVICE_ATTR_RW(sriov_drivers_autoprobe);
+static DEVICE_ATTR_RO(sriov_vf_10bit_tag);
 
 static struct attribute *sriov_pf_dev_attrs[] = {
 	&dev_attr_sriov_totalvfs.attr,
@@ -501,6 +545,7 @@ static struct attribute *sriov_pf_dev_attrs[] = {
 	&dev_attr_sriov_stride.attr,
 	&dev_attr_sriov_vf_device.attr,
 	&dev_attr_sriov_drivers_autoprobe.attr,
+	&dev_attr_sriov_vf_10bit_tag.attr,
 #ifdef CONFIG_PCI_MSI
 	&dev_attr_sriov_vf_total_msix.attr,
 #endif
@@ -511,10 +556,15 @@ static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
 					  struct attribute *a, int n)
 {
 	struct device *dev = kobj_to_dev(kobj);
+	struct pci_dev *pdev = to_pci_dev(dev);
 
 	if (!dev_is_pf(dev))
 		return 0;
 
+	if ((a == &dev_attr_sriov_vf_10bit_tag.attr) &&
+	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
+		return 0;
+
 	return a->mode;
 }
 
-- 
2.7.4



* [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-08-04 13:46 [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (7 preceding siblings ...)
  2021-08-04 13:47 ` [PATCH V7 8/9] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices Dongdong Liu
@ 2021-08-04 13:47 ` Dongdong Liu
  2021-08-04 15:56   ` Logan Gunthorpe
  2021-08-05 18:12   ` Bjorn Helgaas
  8 siblings, 2 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-04 13:47 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
10-Bit Tag Requester enabled doesn't interact with a device that does
not support 10-Bit Tag Completer. When such a pair is found, the kernel
emits a warning and the mapping is marked as not supported.
Use "echo 0 > /sys/bus/pci/devices/.../10bit_tag" to disable 10-Bit Tag
Requester for a PF device, and
"echo 0 > /sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl" to disable
it for a VF device.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 drivers/pci/p2pdma.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 50cdde3..948f2be 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -19,6 +19,7 @@
 #include <linux/random.h>
 #include <linux/seq_buf.h>
 #include <linux/xarray.h>
+#include "pci.h"
 
 enum pci_p2pdma_map_type {
 	PCI_P2PDMA_MAP_UNKNOWN = 0,
@@ -410,6 +411,41 @@ static unsigned long map_types_idx(struct pci_dev *client)
 		(client->bus->number << 8) | client->devfn;
 }
 
+static bool check_10bit_tags_vaild(struct pci_dev *a, struct pci_dev *b,
+				   bool verbose)
+{
+	bool req;
+	bool comp;
+	u16 ctl2;
+
+	if (a->is_virtfn) {
+#ifdef CONFIG_PCI_IOV
+		req = !!(a->physfn->sriov->ctrl &
+			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
+#endif
+	} else {
+		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl2);
+		req = !!(ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	}
+
+	comp = !!(b->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
+	if (req && (!comp)) {
+		if (verbose) {
+			pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set in device (%s), but peer device (%s) does not support the 10-Bit Tag Completer\n",
+				 pci_name(a), pci_name(b));
+			if (a->is_virtfn)
+				pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/sriov_vf_10bit_tag_ctl\n",
+					 pci_name(a));
+			else
+				pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/10bit_tag\n",
+					 pci_name(a));
+		}
+		return false;
+	}
+
+	return true;
+}
+
 /*
  * Calculate the P2PDMA mapping type and distance between two PCI devices.
  *
@@ -532,6 +568,10 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
 		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
 	}
 done:
+	if (!check_10bit_tags_vaild(client, provider, verbose) ||
+	    !check_10bit_tags_vaild(provider, client, verbose))
+		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
+
 	rcu_read_lock();
 	p2pdma = rcu_dereference(provider->p2pdma);
 	if (p2pdma)
-- 
2.7.4



* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-04 13:47 ` [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file Dongdong Liu
@ 2021-08-04 15:51   ` Logan Gunthorpe
  2021-08-05 13:14     ` Dongdong Liu
  2021-08-04 23:49   ` Bjorn Helgaas
  2021-08-04 23:52   ` Bjorn Helgaas
  2 siblings, 1 reply; 43+ messages in thread
From: Logan Gunthorpe @ 2021-08-04 15:51 UTC (permalink / raw)
  To: Dongdong Liu, helgaas, hch, kw, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev




On 2021-08-04 7:47 a.m., Dongdong Liu wrote:
> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> sending Requests to other Endpoints (as opposed to host memory), the
> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> unless an implementation-specific mechanism determines that the Endpoint
> supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
> write 0 to disable 10-Bit Tag Requester when the driver does not bind
> the device if the peer device does not support the 10-Bit Tag Completer.
> This will make P2P traffic safe. the 10bit_tag file content indicate
> current 10-Bit Tag Requester Enable status.

Can we not have both the sysfs file and the command line parameter? If
the user wants to disable it always for a specific device this sysfs
parameter is fairly awkward. A script at boot to unbind the driver, set
the sysfs file and rebind the driver is not trivial and the command line
parameter offers additional options for users.

Logan


* Re: [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-08-04 13:47 ` [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
@ 2021-08-04 15:56   ` Logan Gunthorpe
  2021-08-05  8:49     ` Dongdong Liu
  2021-08-05 18:12   ` Bjorn Helgaas
  1 sibling, 1 reply; 43+ messages in thread
From: Logan Gunthorpe @ 2021-08-04 15:56 UTC (permalink / raw)
  To: Dongdong Liu, helgaas, hch, kw, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev



On 2021-08-04 7:47 a.m., Dongdong Liu wrote:
> Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
> 10-Bit Tag Requester doesn't interact with a device that does not
> support 10-BIT Tag Completer. Before that happens, the kernel should
> emit a warning. "echo 0 > /sys/bus/pci/devices/.../10bit_tag" to
> disable 10-BIT Tag Requester for PF device.
> "echo 0 > /sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl" to disable
> 10-BIT Tag Requester for VF device.
> 
> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>  drivers/pci/p2pdma.c | 40 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 40 insertions(+)
> 
> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index 50cdde3..948f2be 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
> @@ -19,6 +19,7 @@
>  #include <linux/random.h>
>  #include <linux/seq_buf.h>
>  #include <linux/xarray.h>
> +#include "pci.h"
>  
>  enum pci_p2pdma_map_type {
>  	PCI_P2PDMA_MAP_UNKNOWN = 0,
> @@ -410,6 +411,41 @@ static unsigned long map_types_idx(struct pci_dev *client)
>  		(client->bus->number << 8) | client->devfn;
>  }
>  
> +static bool check_10bit_tags_vaild(struct pci_dev *a, struct pci_dev *b,
> +				   bool verbose)
> +{
> +	bool req;
> +	bool comp;
> +	u16 ctl2;
> +
> +	if (a->is_virtfn) {
> +#ifdef CONFIG_PCI_IOV
> +		req = !!(a->physfn->sriov->ctrl &
> +			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
> +#endif
> +	} else {
> +		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl2);
> +		req = !!(ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> +	}
> +
> +	comp = !!(b->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
> +	if (req && (!comp)) {

I think the brackets around !comp are unnecessary.

> +		if (verbose) {
> +			pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set in device (%s), but peer device (%s) does not support the 10-Bit Tag Completer\n",
> +				 pci_name(a), pci_name(b));
> +			if (a->is_virtfn)
> +				pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/sriov_vf_10bit_tag_ctl\n",
> +					 pci_name(a));
> +			else
> +				pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/10bit_tag\n",
> +					 pci_name(a));

Can we not simplify this slightly by having a const char * set to the
tag in the above if (a->is_virtfn)?

pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 >
/sys/bus/pci/devices/%s/%s\n", pci_name(a), tag);
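
For illustration only, the suggestion above could look roughly like the
user-space sketch below. It is not the patch itself: struct fake_pci_dev,
tag_attr_name() and format_disable_hint() are hypothetical stand-ins, and
snprintf() takes the place of pci_warn(). The point is that the attribute
name is chosen once and the message text exists in a single place.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the two struct pci_dev fields used here. */
struct fake_pci_dev {
	const char *name;	/* what pci_name() would return */
	bool is_virtfn;		/* true for an SR-IOV virtual function */
};

/* Choose the sysfs attribute once instead of duplicating the message. */
static const char *tag_attr_name(const struct fake_pci_dev *dev)
{
	return dev->is_virtfn ? "sriov_vf_10bit_tag_ctl" : "10bit_tag";
}

/*
 * Build the hint that the patch prints with pci_warn().
 * Returns the value of snprintf(), i.e. the length of the full message.
 */
static int format_disable_hint(const struct fake_pci_dev *dev,
			       char *buf, size_t len)
{
	assert(dev != NULL && buf != NULL);

	return snprintf(buf, len,
			"to disable 10-Bit Tag Requester for this device, "
			"echo 0 > /sys/bus/pci/devices/%s/%s",
			dev->name, tag_attr_name(dev));
}
```

In the kernel function the same idea would be a local "const char *tag"
assigned in the existing is_virtfn branch, followed by one pci_warn() call.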

> +		}
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
>  /*
>   * Calculate the P2PDMA mapping type and distance between two PCI devices.
>   *
> @@ -532,6 +568,10 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
>  		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
>  	}
>  done:
> +	if (!check_10bit_tags_vaild(client, provider, verbose) ||
> +	    !check_10bit_tags_vaild(provider, client, verbose))
> +		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
> +
>  	rcu_read_lock();
>  	p2pdma = rcu_dereference(provider->p2pdma);
>  	if (p2pdma)
> 


* Re: [PATCH V7 4/9] PCI: Enable 10-Bit Tag support for PCIe Endpoint devices
  2021-08-04 13:47 ` [PATCH V7 4/9] PCI: Enable 10-Bit Tag support for PCIe Endpoint devices Dongdong Liu
@ 2021-08-04 23:17   ` Bjorn Helgaas
  2021-08-05  7:47     ` Dongdong Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-04 23:17 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Wed, Aug 04, 2021 at 09:47:03PM +0800, Dongdong Liu wrote:
> 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
> field size from 8 bits to 10 bits.
> 
> PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
> 10-Bit Tag Capabilities" Implementation Note.
> For platforms where the RC supports 10-Bit Tag Completer capability,
> it is highly recommended for platform firmware or operating software
> that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
> bit automatically in Endpoints with 10-Bit Tag Requester capability. This
> enables the important class of 10-Bit Tag capable adapters that send
> Memory Read Requests only to host memory.

Quoted material should be set off with a blank line before it and
indented by two spaces so it's clear exactly what comes from the spec
and what you've added.  For example, see
https://git.kernel.org/linus/ec411e02b7a2

We need to say why we assume it's safe to enable 10-bit tags for all
devices below a Root Port that supports them.  I think this has to do
with switches being required to forward 10-bit tags correctly even if
they were designed before 10-bit tags were added to the spec.

And it should call out any cases where it is *not* safe, e.g., if P2P
traffic is an issue.

If there are cases where we don't want to enable 10-bit tags, whether
it's to enable P2P traffic or merely to work around device defects,
that ability needs to be here from the beginning.  If somebody needs
to bisect with 10-bit tags disabled, we don't want a bisection hole
between this commit and the commit that adds the control.

> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/pci/probe.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
>  include/linux/pci.h |  2 ++
>  2 files changed, 48 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
> index c83245b..3da7baa 100644
> --- a/drivers/pci/probe.c
> +++ b/drivers/pci/probe.c
> @@ -2029,10 +2029,42 @@ static void pci_configure_mps(struct pci_dev *dev)
>  		 p_mps, mps, mpss);
>  }
>  
> +static void pci_configure_10bit_tags(struct pci_dev *dev)
> +{
> +	struct pci_dev *bridge;
> +
> +	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
> +		return;
> +
> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
> +		dev->ext_10bit_tag = 1;
> +		return;
> +	}
> +
> +	bridge = pci_upstream_bridge(dev);
> +	if (bridge && bridge->ext_10bit_tag)
> +		dev->ext_10bit_tag = 1;

Is it meaningful to set dev->ext_10bit_tag when "dev" is a VF?  I
suspect only if the VF could be a switch.  Is that possible?  If not,
I think the dev->is_virtfn check could be done first.
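
As a rough sketch of the reordering asked about above, the function could
check is_virtfn before anything else. This is a user-space model, not
kernel code: struct fake_pci_dev, the FAKE_* type values and the req_en
field are hypothetical stand-ins, and the capability bit values are assumed
for illustration. It is only equivalent to the posted patch if nothing ever
needs ext_10bit_tag on a VF, which is exactly the open question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit values assumed for illustration; the series defines the real macros. */
#define DEVCAP2_10BIT_TAG_COMP	0x00010000u
#define DEVCAP2_10BIT_TAG_REQ	0x00020000u

enum fake_pcie_type { FAKE_ROOT_PORT, FAKE_ENDPOINT, FAKE_OTHER };

/* Hypothetical stand-in for the struct pci_dev fields the function reads. */
struct fake_pci_dev {
	enum fake_pcie_type type;
	uint32_t devcap2;		/* cached Device Capabilities 2 */
	bool is_virtfn;
	bool ext_10bit_tag;		/* Completer supported from root to here */
	bool req_en;			/* models 10-Bit Tag Requester Enable */
	struct fake_pci_dev *upstream;	/* what pci_upstream_bridge() returns */
};

static void configure_10bit_tags(struct fake_pci_dev *dev)
{
	struct fake_pci_dev *bridge;

	assert(dev != NULL);

	/* The enable bit is RsvdP for a VF, so bail out before anything else. */
	if (dev->is_virtfn)
		return;

	if (!(dev->devcap2 & DEVCAP2_10BIT_TAG_COMP))
		return;

	if (dev->type == FAKE_ROOT_PORT) {
		dev->ext_10bit_tag = true;
		return;
	}

	bridge = dev->upstream;
	if (bridge && bridge->ext_10bit_tag)
		dev->ext_10bit_tag = true;

	if (dev->type == FAKE_ENDPOINT && dev->ext_10bit_tag &&
	    (dev->devcap2 & DEVCAP2_10BIT_TAG_REQ))
		dev->req_en = true;
}
```

With this ordering a VF never gets ext_10bit_tag set, whereas the posted
version sets it and only then returns.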

> +
> +	/*
> +	 * 10-Bit Tag Requester Enable in Device Control 2 Register is RsvdP
> +	 * for VF.

(Per 9.3.5.10)

> +	 */
> +	if (dev->is_virtfn)
> +		return;
> +
> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ENDPOINT &&
> +	    dev->ext_10bit_tag == 1 &&
> +	    (dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ)) {
> +		pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
> +		pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
> +					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> +	}
> +}
> +
>  int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>  {
>  	struct pci_host_bridge *host;
> -	u16 ctl;
> +	u16 ctl, ctl2;
>  	int ret;
>  
>  	if (!pci_is_pcie(dev))
> @@ -2045,6 +2077,10 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>  	if (ret)
>  		return 0;
>  
> +	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL2, &ctl2);
> +	if (ret)
> +		return 0;
> +
>  	host = pci_find_host_bridge(dev->bus);
>  	if (!host)
>  		return 0;
> @@ -2059,6 +2095,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>  			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
>  						   PCI_EXP_DEVCTL_EXT_TAG);
>  		}
> +
> +		if (ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN) {
> +			pci_info(dev, "disabling 10-Bit Tags\n");
> +			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
> +					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> +		}
>  		return 0;
>  	}
>  
> @@ -2067,6 +2109,9 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>  		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
>  					 PCI_EXP_DEVCTL_EXT_TAG);
>  	}
> +
> +	pci_configure_10bit_tags(dev);
> +
>  	return 0;
>  }
>  
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 9aab67f..af6cb53 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -393,6 +393,8 @@ struct pci_dev {
>  #endif
>  	unsigned int	eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
>  
> +	unsigned int	ext_10bit_tag:1; /* 10-Bit Tag Completer Supported
> +					    from root to here */
>  	pci_channel_state_t error_state;	/* Current connectivity state */
>  	struct device	dev;			/* Generic device interface */
>  
> -- 
> 2.7.4
> 


* Re: [PATCH V7 5/9] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices
  2021-08-04 13:47 ` [PATCH V7 5/9] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices Dongdong Liu
@ 2021-08-04 23:29   ` Bjorn Helgaas
  2021-08-05  8:03     ` Dongdong Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-04 23:29 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Wed, Aug 04, 2021 at 09:47:04PM +0800, Dongdong Liu wrote:
> Enable VF 10-Bit Tag Requester when it's upstream component support
> 10-bit Tag Completer.

s/it's/its/
s/support/supports/

I think "upstream component" here means the PF, doesn't it?  I don't
think the PF is really an *upstream* component; there's no routing
like with a switch.

> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> ---
>  drivers/pci/iov.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index dafdc65..0d0bed1 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -634,6 +634,10 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>  
>  	pci_iov_set_numvfs(dev, nr_virtfn);
>  	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
> +	if ((iov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ) &&
> +	    dev->ext_10bit_tag)
> +		iov->ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
> +
>  	pci_cfg_access_lock(dev);
>  	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>  	msleep(100);
> @@ -650,6 +654,8 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>  
>  err_pcibios:
>  	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
> +	if (iov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN)
> +		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
>  	pci_cfg_access_lock(dev);
>  	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>  	ssleep(1);
> @@ -682,6 +688,8 @@ static void sriov_disable(struct pci_dev *dev)
>  
>  	sriov_del_vfs(dev);
>  	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
> +	if (iov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN)
> +		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;

You can just clear PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN unconditionally,
can't you?  I know it wouldn't change anything, but removing the "if"
makes the code prettier.  You could just add it in the existing
PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE mask.
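
To make the equivalence concrete, here is a small user-space sketch that
compares the "if"-guarded clear from the patch with a single combined mask.
The SRIOV_CTRL_* values are assumed for illustration and the function names
are hypothetical; only the bit arithmetic is the point.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values; the kernel and this series define the real ones. */
#define SRIOV_CTRL_VFE			0x0001u
#define SRIOV_CTRL_MSE			0x0008u
#define SRIOV_CTRL_VF_10BIT_TAG_REQ_EN	0x0020u	/* assumed position */

/* Shape of the posted code: clear VFE/MSE, then clear the tag bit if set. */
static uint16_t sriov_ctrl_clear_checked(uint16_t ctrl)
{
	ctrl &= (uint16_t)~(SRIOV_CTRL_VFE | SRIOV_CTRL_MSE);
	if (ctrl & SRIOV_CTRL_VF_10BIT_TAG_REQ_EN)
		ctrl &= (uint16_t)~SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
	return ctrl;
}

/* Suggested shape: fold the tag bit into the existing mask. */
static uint16_t sriov_ctrl_clear_masked(uint16_t ctrl)
{
	return ctrl & (uint16_t)~(SRIOV_CTRL_VFE | SRIOV_CTRL_MSE |
				  SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
}

/* Exhaustively confirm that both forms agree for every 16-bit value. */
static void check_equivalence(void)
{
	uint32_t v;

	for (v = 0; v <= 0xFFFFu; v++)
		assert(sriov_ctrl_clear_checked((uint16_t)v) ==
		       sriov_ctrl_clear_masked((uint16_t)v));
}
```

Clearing a bit that is already clear is a no-op, so the guard adds nothing
and the combined mask gives the same register value in every case.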

>  	pci_cfg_access_lock(dev);
>  	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>  	ssleep(1);
> -- 
> 2.7.4
> 


* Re: [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices
  2021-08-04 13:47 ` [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices Dongdong Liu
@ 2021-08-04 23:38   ` Bjorn Helgaas
  2021-08-05  8:25     ` Dongdong Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-04 23:38 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Wed, Aug 04, 2021 at 09:47:05PM +0800, Dongdong Liu wrote:
> PCIe spec 5.0r1.0 section 2.2.6.2 implementation note, In configurations
> where a Requester with 10-Bit Tag Requester capability needs to target
> multiple Completers, one needs to ensure that the Requester sends 10-Bit
> Tag Requests only to Completers that have 10-Bit Tag Completer capability.
> So we enable 10-Bit Tag Requester for root port only when the devices
> under the root port support 10-Bit Tag Completer.

Fix quoting.  I can't tell what is from the spec and what you wrote.

> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>  drivers/pci/pcie/portdrv_pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 69 insertions(+)
> 
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index c7ff1ee..2382cd2 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -90,6 +90,72 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
>  #define PCIE_PORTDRV_PM_OPS	NULL
>  #endif /* !PM */
>  
> +static int pci_10bit_tag_comp_support(struct pci_dev *dev, void *data)
> +{
> +	bool *support = (bool *)data;
> +
> +	if (!pci_is_pcie(dev)) {
> +		*support = false;
> +		return 1;
> +	}
> +
> +	/*
> +	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
> +	 * For configurations where a Requester with 10-Bit Tag Requester
> +	 * capability targets Completers where some do and some do not have
> +	 * 10-Bit Tag Completer capability, how the Requester determines which
> +	 * NPRs include 10-Bit Tags is outside the scope of this specification.
> +	 * So we do not consider hotplug scenario.
> +	 */
> +	if (dev->is_hotplug_bridge) {
> +		*support = false;
> +		return 1;
> +	}
> +
> +	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP)) {
> +		*support = false;
> +		return 1;
> +	}
> +
> +	return 0;
> +}
> +
> +static void pci_configure_rp_10bit_tag(struct pci_dev *dev)
> +{
> +	bool support = true;
> +
> +	if (dev->subordinate == NULL)
> +		return;
> +
> +	/* If no devices under the root port, no need to enable 10-Bit Tag. */
> +	if (list_empty(&dev->subordinate->devices))
> +		return;
> +
> +	pci_10bit_tag_comp_support(dev, &support);
> +	if (!support)
> +		return;
> +
> +	/*
> +	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
> +	 * In configurations where a Requester with 10-Bit Tag Requester
> +	 * capability needs to target multiple Completers, one needs to ensure
> +	 * that the Requester sends 10-Bit Tag Requests only to Completers
> +	 * that have 10-Bit Tag Completer capability. So we enable 10-Bit Tag
> +	 * Requester for root port only when the devices under the root port
> +	 * support 10-Bit Tag Completer.
> +	 */
> +	pci_walk_bus(dev->subordinate, pci_10bit_tag_comp_support, &support);
> +	if (!support)
> +		return;
> +
> +	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
> +		return;
> +
> +	pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
> +	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
> +				 PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> +}
> +
>  /*
>   * pcie_portdrv_probe - Probe PCI-Express port devices
>   * @dev: PCI-Express port device being probed
> @@ -111,6 +177,9 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
>  	     (type != PCI_EXP_TYPE_RC_EC)))
>  		return -ENODEV;
>  
> +	if (type == PCI_EXP_TYPE_ROOT_PORT)
> +		pci_configure_rp_10bit_tag(dev);

I don't think this has anything to do with the portdrv, so all this
should go somewhere else.

Out of curiosity, IIUC this enables 10-bit tags for MMIO transactions
from the root port toward the device, i.e., traffic that originates
from a CPU.  Is that a significant benefit?  I would expect high-speed
devices would primarily operate via DMA with relatively little MMIO
traffic.

>  	if (type == PCI_EXP_TYPE_RC_EC)
>  		pcie_link_rcec(dev);
>  
> -- 
> 2.7.4
> 


* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-04 13:47 ` [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file Dongdong Liu
  2021-08-04 15:51   ` Logan Gunthorpe
@ 2021-08-04 23:49   ` Bjorn Helgaas
  2021-08-05  8:37     ` Dongdong Liu
  2021-08-04 23:52   ` Bjorn Helgaas
  2 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-04 23:49 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Wed, Aug 04, 2021 at 09:47:06PM +0800, Dongdong Liu wrote:
> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> sending Requests to other Endpoints (as opposed to host memory), the
> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> unless an implementation-specific mechanism determines that the Endpoint
> supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
> write 0 to disable 10-Bit Tag Requester when the driver does not bind
> the device if the peer device does not support the 10-Bit Tag Completer.
> This will make P2P traffic safe. the 10bit_tag file content indicate
> current 10-Bit Tag Requester Enable status.
> 
> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>  Documentation/ABI/testing/sysfs-bus-pci | 16 +++++++-
>  drivers/pci/pci-sysfs.c                 | 69 +++++++++++++++++++++++++++++++++
>  2 files changed, 84 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index 793cbb7..0e0c97d 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -139,7 +139,7 @@ Description:
>  		binary file containing the Vital Product Data for the
>  		device.  It should follow the VPD format defined in
>  		PCI Specification 2.1 or 2.2, but users should consider
> -		that some devices may have incorrectly formatted data.  
> +		that some devices may have incorrectly formatted data.
>  		If the underlying VPD has a writable section then the
>  		corresponding section of this file will be writable.
>  
> @@ -407,3 +407,17 @@ Description:
>  
>  		The file is writable if the PF is bound to a driver that
>  		implements ->sriov_set_msix_vec_count().
> +
> +What:		/sys/bus/pci/devices/.../10bit_tag
> +Date:		August 2021
> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
> +Description:
> +		If a PCI device support 10-Bit Tag Requester, will create the
> +		10bit_tag sysfs file. The file is readable, the value
> +		indicate current 10-Bit Tag Requester Enable.
> +		1 - enabled, 0 - disabled.
> +
> +		The file is also writeable, the value only accept by write 0
> +		to disable 10-Bit Tag Requester when the driver does not bind
> +		the deivce. The typical use case is for p2pdma when the peer
> +		device does not support 10-BIT Tag Completer.

s/writeable/writable/
s/deivce/device/

The first sentence does not parse.

> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 5d63df7..e93ce8b 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -306,6 +306,49 @@ static ssize_t enable_show(struct device *dev, struct device_attribute *attr,
>  }
>  static DEVICE_ATTR_RW(enable);
>  
> +static ssize_t pci_10bit_tag_store(struct device *dev,
> +				   struct device_attribute *attr,
> +				   const char *buf, size_t count)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	bool enable;
> +
> +	if (kstrtobool(buf, &enable) < 0)
> +		return -EINVAL;
> +
> +	if (enable != false )
> +		return -EINVAL;

Is this the same as "if (enable)"?
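
For a bool, "enable != false" and "enable" have the same truth value, so the shorter test is enough. The following is only a user-space sketch of the order of checks in the quoted store handler, not kernel code: struct fake_dev, parse_bool() and tag_store() are hypothetical names, and parse_bool() is a deliberately simplified stand-in for kstrtobool().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state the handler looks at. */
struct fake_dev {
	bool driver_bound;	/* models pdev->driver != NULL */
	bool req_en;		/* models the 10-Bit Tag Requester Enable bit */
};

/* Simplified parser (not kstrtobool()): accepts "0" or "1", optional newline. */
static int parse_bool(const char *buf, bool *res)
{
	if (buf[0] != '0' && buf[0] != '1')
		return -EINVAL;
	if (buf[1] != '\0' && !(buf[1] == '\n' && buf[2] == '\0'))
		return -EINVAL;
	*res = (buf[0] == '1');
	return 0;
}

/* Mirrors the order of checks in the quoted handler. */
static long tag_store(struct fake_dev *dev, const char *buf, size_t count)
{
	bool enable;

	if (parse_bool(buf, &enable) < 0)
		return -EINVAL;

	if (enable)		/* same truth value as "enable != false" */
		return -EINVAL;

	if (dev->driver_bound)	/* only allowed while no driver is bound */
		return -EBUSY;

	dev->req_en = false;
	return (long)count;
}
```

In this model, writing "1" is rejected with -EINVAL, writing "0" to a device with a bound driver returns -EBUSY, and only "0" on an unbound device clears the enable flag.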

> +	if (pdev->driver)
> +		 return -EBUSY;
> +
> +	pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
> +				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> +	pci_info(pdev, "disabled 10-Bit Tag Requester\n");
> +
> +	return count;
> +}
> +
> +static ssize_t pci_10bit_tag_show(struct device *dev,
> +				  struct device_attribute *attr,
> +				  char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +	u16 ctl;
> +	int ret;
> +
> +	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &ctl);
> +	if (ret)
> +		return -EINVAL;
> +
> +	return sysfs_emit(buf, "%u\n",
> +			  !!(ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN));
> +}
> +
> +static struct device_attribute dev_attr_10bit_tag = __ATTR(10bit_tag, 0600,
> +							   pci_10bit_tag_show,
> +							   pci_10bit_tag_store);

Is this DEVICE_ATTR_ADMIN_RW()?

Why is it 0600?  Everything else in this file looks like
DEVICE_ATTR_RO or DEVICE_ATTR_RW.  This should be the same unless
there's a reason to be different.

>  #ifdef CONFIG_NUMA
>  static ssize_t numa_node_store(struct device *dev,
>  			       struct device_attribute *attr, const char *buf,
> @@ -635,6 +678,11 @@ static struct attribute *pcie_dev_attrs[] = {
>  	NULL,
>  };
>  
> +static struct attribute *pcie_dev_10bit_tag_attrs[] = {
> +	&dev_attr_10bit_tag.attr,
> +	NULL,
> +};
> +
>  static struct attribute *pcibus_attrs[] = {
>  	&dev_attr_bus_rescan.attr,
>  	&dev_attr_cpuaffinity.attr,
> @@ -1482,6 +1530,21 @@ static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
>  	return 0;
>  }
>  
> +static umode_t pcie_dev_10bit_tag_attrs_are_visible(struct kobject *kobj,
> +					  struct attribute *a, int n)
> +{
> +	struct device *dev = kobj_to_dev(kobj);
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	if (pdev->is_virtfn)
> +		return 0;
> +
> +	if (!(pdev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
> +		return 0;
> +
> +	return a->mode;
> +}
> +
>  static const struct attribute_group pci_dev_group = {
>  	.attrs = pci_dev_attrs,
>  };
> @@ -1521,6 +1584,11 @@ static const struct attribute_group pcie_dev_attr_group = {
>  	.is_visible = pcie_dev_attrs_are_visible,
>  };
>  
> +static const struct attribute_group pcie_dev_10bit_tag_attr_group = {
> +	.attrs = pcie_dev_10bit_tag_attrs,
> +	.is_visible = pcie_dev_10bit_tag_attrs_are_visible,
> +};
> +
>  static const struct attribute_group *pci_dev_attr_groups[] = {
>  	&pci_dev_attr_group,
>  	&pci_dev_hp_attr_group,
> @@ -1530,6 +1598,7 @@ static const struct attribute_group *pci_dev_attr_groups[] = {
>  #endif
>  	&pci_bridge_attr_group,
>  	&pcie_dev_attr_group,
> +	&pcie_dev_10bit_tag_attr_group,
>  #ifdef CONFIG_PCIEAER
>  	&aer_stats_attr_group,
>  #endif
> -- 
> 2.7.4
> 

* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-04 13:47 ` [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file Dongdong Liu
  2021-08-04 15:51   ` Logan Gunthorpe
  2021-08-04 23:49   ` Bjorn Helgaas
@ 2021-08-04 23:52   ` Bjorn Helgaas
  2021-08-05  8:38     ` Dongdong Liu
  2 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-04 23:52 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Wed, Aug 04, 2021 at 09:47:06PM +0800, Dongdong Liu wrote:
> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> sending Requests to other Endpoints (as opposed to host memory), the
> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> unless an implementation-specific mechanism determines that the Endpoint
> supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
> write 0 to disable 10-Bit Tag Requester when the driver does not bind
> the device if the peer device does not support the 10-Bit Tag Completer.
> This will make P2P traffic safe. the 10bit_tag file content indicate
> current 10-Bit Tag Requester Enable status.
> 
> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>

> +		The file is also writeable, the value only accept by write 0
> +		to disable 10-Bit Tag Requester when the driver does not bind
> +		the deivce. The typical use case is for p2pdma when the peer
> +		device does not support 10-BIT Tag Completer.

s/10-BIT/10-Bit/

* Re: [PATCH V7 8/9] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices
  2021-08-04 13:47 ` [PATCH V7 8/9] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices Dongdong Liu
@ 2021-08-05  0:05   ` Bjorn Helgaas
  2021-08-05  8:47     ` Dongdong Liu
  2021-08-05  9:39     ` Dongdong Liu
  0 siblings, 2 replies; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-05  0:05 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Wed, Aug 04, 2021 at 09:47:07PM +0800, Dongdong Liu wrote:
> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> sending Requests to other Endpoints (as opposed to host memory), the
> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> unless an implementation-specific mechanism determines that the
> Endpoint supports 10-Bit Tag Completer capability.
> Add sriov_vf_10bit_tag file to query the status of VF 10-Bit Tag
> Requester Enable. Add sriov_vf_10bit_tag_ctl file to disable the VF
> 10-Bit Tag Requester. The typical use case is for p2pdma when the peer
> device does not support 10-BIT Tag Completer.

Fix the usual spec quoting issue.  Or maybe this is not actually
quoted but is missing blank lines between paragraphs.

s/10-BIT/10-Bit/

> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>  Documentation/ABI/testing/sysfs-bus-pci | 20 +++++++++++++
>  drivers/pci/iov.c                       | 50 +++++++++++++++++++++++++++++++++
>  2 files changed, 70 insertions(+)
> 
> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> index 0e0c97d..8fdbfae 100644
> --- a/Documentation/ABI/testing/sysfs-bus-pci
> +++ b/Documentation/ABI/testing/sysfs-bus-pci
> @@ -421,3 +421,23 @@ Description:
>  		to disable 10-Bit Tag Requester when the driver does not bind
>  		the deivce. The typical use case is for p2pdma when the peer
>  		device does not support 10-BIT Tag Completer.
> +
> +What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag
> +Date:		August 2021
> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
> +Description:
> +		This file is associated with a SR-IOV physical function (PF).
> +		It is visible when the device has VF 10-Bit Tag Requester
> +		Supported. It contains the status of VF 10-Bit Tag Requester
> +		Enable. The file is only readable.

s/only readable/read-only/

> +What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl

Why does this file have "_ctl" on the end when the one in patch 7/9
does not?

> +Date:		August 2021
> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
> +Description:
> +		This file is associated with a SR-IOV virtual function (VF).
> +		It is visible when the device has VF 10-Bit Tag Requester
> +		Supported. It only allows to write 0 to disable VF 10-Bit
> +		Tag Requester. The file is only writeable when the vf driver
> +		does not bind to a dirver. The typical use case is for p2pdma
> +		when the peer device does not support 10-BIT Tag Completer.

s/vf/VF/
s/dirver/driver/
s/10-BIT/10-Bit/

"when the vf driver does not bind to a driver"?  Not quite right.
Must be a "device" in there somewhere.

So IIUC this file is associated with a VF, but the bit it writes is
actually in the *PF*?  So writing 0 to any VF's file disables 10-bit
tags for *all* VFs?  That's worth mentioning here.
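
A small user-space model may make the concern concrete. It assumes, as the quoted store handler suggests, that the enable bit lives in a single control word owned by the PF. The struct names, helper names and bit values below are illustrative only, not the kernel's.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VF_10BIT_TAG_REQ_EN	0x0020u	/* illustrative bit value */

/* Hypothetical model: one control word in the PF, shared by all VFs. */
struct pf {
	uint16_t sriov_ctrl;
};

struct vf {
	struct pf *physfn;	/* every VF points back at its PF */
};

/* "Writing 0" through any VF clears the bit in the PF's control word. */
static void vf_disable_10bit_tag(struct vf *vf)
{
	vf->physfn->sriov_ctrl &= (uint16_t)~VF_10BIT_TAG_REQ_EN;
}

static bool vf_10bit_tag_enabled(const struct vf *vf)
{
	return vf->physfn->sriov_ctrl & VF_10BIT_TAG_REQ_EN;
}
```

With two VFs that share one PF, calling vf_disable_10bit_tag() on the first also makes vf_10bit_tag_enabled() return false for the second, which is the side effect the documentation would need to mention.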

> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
> index 0d0bed1..04c1298 100644
> --- a/drivers/pci/iov.c
> +++ b/drivers/pci/iov.c
> @@ -220,10 +220,38 @@ static ssize_t sriov_vf_msix_count_store(struct device *dev,
>  static DEVICE_ATTR_WO(sriov_vf_msix_count);
>  #endif
>  
> +static ssize_t sriov_vf_10bit_tag_ctl_store(struct device *dev,
> +					    struct device_attribute *attr,
> +					    const char *buf, size_t count)
> +{
> +	struct pci_dev *vf_dev = to_pci_dev(dev);
> +	struct pci_dev *pdev = pci_physfn(vf_dev);
> +	struct pci_sriov *iov;
> +	bool enable;
> +
> +	if (kstrtobool(buf, &enable) < 0)
> +		return -EINVAL;
> +
> +	if (enable != false)
> +		return -EINVAL;

Is this "if (enable)" again?

> +	if (vf_dev->driver)
> +		return -EBUSY;
> +
> +	iov = pdev->sriov;
> +	iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
> +	pci_write_config_word(pdev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
> +	pci_info(pdev, "disabled SRIOV 10-Bit Tag Requester\n");

s/SRIOV/SR-IOV/ to match spec and other usages.

> +
> +	return count;
> +}
> +static DEVICE_ATTR_WO(sriov_vf_10bit_tag_ctl);
> +
>  static struct attribute *sriov_vf_dev_attrs[] = {
>  #ifdef CONFIG_PCI_MSI
>  	&dev_attr_sriov_vf_msix_count.attr,
>  #endif
> +	&dev_attr_sriov_vf_10bit_tag_ctl.attr,
>  	NULL,
>  };
>  
> @@ -236,6 +264,11 @@ static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
>  	if (!pdev->is_virtfn)
>  		return 0;
>  
> +	pdev = pci_physfn(pdev);
> +	if ((a == &dev_attr_sriov_vf_10bit_tag_ctl.attr) &&
> +	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
> +		return 0;
> +
>  	return a->mode;
>  }
>  
> @@ -487,12 +520,23 @@ static ssize_t sriov_drivers_autoprobe_store(struct device *dev,
>  	return count;
>  }
>  
> +static ssize_t sriov_vf_10bit_tag_show(struct device *dev,
> +				       struct device_attribute *attr,
> +				       char *buf)
> +{
> +	struct pci_dev *pdev = to_pci_dev(dev);
> +
> +	return sysfs_emit(buf, "%u\n",
> +		!!(pdev->sriov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN));
> +}
> +
>  static DEVICE_ATTR_RO(sriov_totalvfs);
>  static DEVICE_ATTR_RW(sriov_numvfs);
>  static DEVICE_ATTR_RO(sriov_offset);
>  static DEVICE_ATTR_RO(sriov_stride);
>  static DEVICE_ATTR_RO(sriov_vf_device);
>  static DEVICE_ATTR_RW(sriov_drivers_autoprobe);
> +static DEVICE_ATTR_RO(sriov_vf_10bit_tag);
>  
>  static struct attribute *sriov_pf_dev_attrs[] = {
>  	&dev_attr_sriov_totalvfs.attr,
> @@ -501,6 +545,7 @@ static struct attribute *sriov_pf_dev_attrs[] = {
>  	&dev_attr_sriov_stride.attr,
>  	&dev_attr_sriov_vf_device.attr,
>  	&dev_attr_sriov_drivers_autoprobe.attr,
> +	&dev_attr_sriov_vf_10bit_tag.attr,
>  #ifdef CONFIG_PCI_MSI
>  	&dev_attr_sriov_vf_total_msix.attr,
>  #endif
> @@ -511,10 +556,15 @@ static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
>  					  struct attribute *a, int n)
>  {
>  	struct device *dev = kobj_to_dev(kobj);
> +	struct pci_dev *pdev = to_pci_dev(dev);
>  
>  	if (!dev_is_pf(dev))
>  		return 0;
>  
> +	if ((a == &dev_attr_sriov_vf_10bit_tag.attr) &&
> +	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
> +		return 0;
> +
>  	return a->mode;
>  }
>  
> -- 
> 2.7.4
> 

* Re: [PATCH V7 4/9] PCI: Enable 10-Bit Tag support for PCIe Endpoint devices
  2021-08-04 23:17   ` Bjorn Helgaas
@ 2021-08-05  7:47     ` Dongdong Liu
  2021-08-05 19:54       ` Bjorn Helgaas
  0 siblings, 1 reply; 43+ messages in thread
From: Dongdong Liu @ 2021-08-05  7:47 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

Hi Bjorn

Many thanks for your review.
On 2021/8/5 7:17, Bjorn Helgaas wrote:
> On Wed, Aug 04, 2021 at 09:47:03PM +0800, Dongdong Liu wrote:
>> 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
>> field size from 8 bits to 10 bits.
>>
>> PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
>> 10-Bit Tag Capabilities" Implementation Note.
>> For platforms where the RC supports 10-Bit Tag Completer capability,
>> it is highly recommended for platform firmware or operating software
>> that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
>> bit automatically in Endpoints with 10-Bit Tag Requester capability. This
>> enables the important class of 10-Bit Tag capable adapters that send
>> Memory Read Requests only to host memory.
>
> Quoted material should be set off with a blank line before it and
> indented by two spaces so it's clear exactly what comes from the spec
> and what you've added.  For example, see
> https://git.kernel.org/linus/ec411e02b7a2
Good point, will fix.
>
> We need to say why we assume it's safe to enable 10-bit tags for all
> devices below a Root Port that supports them.  I think this has to do
> with switches being required to forward 10-bit tags correctly even if
> they were designed before 10-bit tags were added to the spec.

PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
10-Bit Tag Capabilities" Implementation Note:

   Switches that lack 10-Bit Tag Completer capability are still able to
   forward NPRs and Completions carrying 10-Bit Tags correctly, since the
   two new Tag bits are in TLP Header bits that were formerly Reserved,
   and Switches are required to forward Reserved TLP Header bits without
   modification. However, if such a Switch detects an error with an NPR
   carrying a 10-Bit Tag, and that Switch handles the error by acting as
   the Completer for the NPR, the resulting Completion will have an
   invalid 10-Bit Tag. Thus, it is strongly recommended that Switches
   between any components using 10-Bit Tags support 10-Bit Tag Completer
   capability.  Note that Switches supporting 16.0 GT/s data rates or
   greater must support 10-Bit Tag Completer capability.

This patch also takes that into account: it enables the 10-Bit Tag
Requester for an EP device only if the RP and the Switch devices above
it support 10-Bit Tag Completer capability.
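
To illustrate why a Completion from a component without 10-Bit Tag Completer capability is a problem, here is a user-space sketch. It assumes such a component echoes only the low 8 bits of the Tag; that is a modelling assumption, since the spec text above only says the resulting Tag is invalid. All names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define TAG10_MASK	0x3FFu	/* 10-bit Tag field */
#define TAG8_MASK	0x0FFu	/* legacy 8-bit Tag field */

/* Completer that understands all 10 Tag bits: echoes the Tag unchanged. */
static uint16_t completion_tag_10bit(uint16_t req_tag)
{
	return req_tag & TAG10_MASK;
}

/* Assumed legacy behaviour: only the low 8 bits make it into the Completion. */
static uint16_t completion_tag_8bit(uint16_t req_tag)
{
	return req_tag & TAG8_MASK;
}

/* The Requester matches a Completion to an outstanding Request by Tag. */
static int completion_matches(uint16_t req_tag, uint16_t cpl_tag)
{
	return (req_tag & TAG10_MASK) == cpl_tag;
}
```

For a Request with Tag 0x2A5, the 10-bit completer returns 0x2A5 and the Requester finds its outstanding Request. The 8-bit model returns 0xA5, which no longer matches.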
>
> And it should call out any cases where it is *not* safe, e.g., if P2P
> traffic is an issue.
Yes, indeed.
>
> If there are cases where we don't want to enable 10-bit tags, whether
> it's to enable P2P traffic or merely to work around device defects,
> that ability needs to be here from the beginning.  If somebody needs
> to bisect with 10-bit tags disabled, we don't want a bisection hole
> between this commit and the commit that adds the control.
We provide sysfs files to disable 10-bit tags for P2P traffic when needed.
See PATCH 7/8/9 for details.

We do not currently know of any devices with defective 10-bit tag support,
so for now there may be no need for a quirk like quirk_no_ext_tags(), which
handles the 8-bit tag case.
>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> ---
>>  drivers/pci/probe.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
>>  include/linux/pci.h |  2 ++
>>  2 files changed, 48 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
>> index c83245b..3da7baa 100644
>> --- a/drivers/pci/probe.c
>> +++ b/drivers/pci/probe.c
>> @@ -2029,10 +2029,42 @@ static void pci_configure_mps(struct pci_dev *dev)
>>  		 p_mps, mps, mpss);
>>  }
>>
>> +static void pci_configure_10bit_tags(struct pci_dev *dev)
>> +{
>> +	struct pci_dev *bridge;
>> +
>> +	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
>> +		return;
>> +
>> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
>> +		dev->ext_10bit_tag = 1;
>> +		return;
>> +	}
>> +
>> +	bridge = pci_upstream_bridge(dev);
>> +	if (bridge && bridge->ext_10bit_tag)
>> +		dev->ext_10bit_tag = 1;
>
> Is it meaningful to set dev->ext_10bit_tag when "dev" is a VF?  I
> suspect only if the VF could be a switch.  Is that possible?  If not,
Yes, no need.
> I think the dev->is_virtfn check could be done first.
Will do.
>
>> +
>> +	/*
>> +	 * 10-Bit Tag Requester Enable in Device Control 2 Register is RsvdP
>> +	 * for VF.
>
> (Per 9.3.5.10)
Will fix.

Thanks,
Dongdong
>
>> +	 */
>> +	if (dev->is_virtfn)
>> +		return;
>> +
>> +	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ENDPOINT &&
>> +	    dev->ext_10bit_tag == 1 &&
>> +	    (dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ)) {
>> +		pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
>> +		pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
>> +					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>> +	}
>> +}
>> +
>>  int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>>  {
>>  	struct pci_host_bridge *host;
>> -	u16 ctl;
>> +	u16 ctl, ctl2;
>>  	int ret;
>>
>>  	if (!pci_is_pcie(dev))
>> @@ -2045,6 +2077,10 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>>  	if (ret)
>>  		return 0;
>>
>> +	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL2, &ctl2);
>> +	if (ret)
>> +		return 0;
>> +
>>  	host = pci_find_host_bridge(dev->bus);
>>  	if (!host)
>>  		return 0;
>> @@ -2059,6 +2095,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>>  			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
>>  						   PCI_EXP_DEVCTL_EXT_TAG);
>>  		}
>> +
>> +		if (ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN) {
>> +			pci_info(dev, "disabling 10-Bit Tags\n");
>> +			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
>> +					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>> +		}
>>  		return 0;
>>  	}
>>
>> @@ -2067,6 +2109,9 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
>>  		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
>>  					 PCI_EXP_DEVCTL_EXT_TAG);
>>  	}
>> +
>> +	pci_configure_10bit_tags(dev);
>> +
>>  	return 0;
>>  }
>>
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index 9aab67f..af6cb53 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -393,6 +393,8 @@ struct pci_dev {
>>  #endif
>>  	unsigned int	eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
>>
>> +	unsigned int	ext_10bit_tag:1; /* 10-Bit Tag Completer Supported
>> +					    from root to here */
>>  	pci_channel_state_t error_state;	/* Current connectivity state */
>>  	struct device	dev;			/* Generic device interface */
>>
>> --
>> 2.7.4
>>
> .
>

* Re: [PATCH V7 5/9] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices
  2021-08-04 23:29   ` Bjorn Helgaas
@ 2021-08-05  8:03     ` Dongdong Liu
  2021-08-06 22:59       ` Bjorn Helgaas
  0 siblings, 1 reply; 43+ messages in thread
From: Dongdong Liu @ 2021-08-05  8:03 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev


On 2021/8/5 7:29, Bjorn Helgaas wrote:
> On Wed, Aug 04, 2021 at 09:47:04PM +0800, Dongdong Liu wrote:
>> Enable VF 10-Bit Tag Requester when it's upstream component support
>> 10-bit Tag Completer.
>
> s/it's/its/
> s/support/supports/
Will fix.
>
> I think "upstream component" here means the PF, doesn't it?  I don't
> think the PF is really an *upstream* component; there's no routing
> like with a switch.
I meant the switch and root port devices above the VF, which must
support 10-Bit Tag Completer. Of course, the VF also needs to have the
10-Bit Tag Requester Supported capability.
>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>> ---
>>  drivers/pci/iov.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index dafdc65..0d0bed1 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -634,6 +634,10 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>>
>>  	pci_iov_set_numvfs(dev, nr_virtfn);
>>  	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
>> +	if ((iov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ) &&
>> +	    dev->ext_10bit_tag)
>> +		iov->ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
>> +
>>  	pci_cfg_access_lock(dev);
>>  	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>>  	msleep(100);
>> @@ -650,6 +654,8 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
>>
>>  err_pcibios:
>>  	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
>> +	if (iov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN)
>> +		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
>>  	pci_cfg_access_lock(dev);
>>  	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>>  	ssleep(1);
>> @@ -682,6 +688,8 @@ static void sriov_disable(struct pci_dev *dev)
>>
>>  	sriov_del_vfs(dev);
>>  	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
>> +	if (iov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN)
>> +		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
>
> You can just clear PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN unconditionally,
> can't you?  I know it wouldn't change anything, but removing the "if"
> makes the code prettier.  You could just add it in the existing
> PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE mask.
Will do.
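
A quick user-space check that the two forms are equivalent. The function names are hypothetical, and the bit values are for illustration only; the real definitions live in the kernel headers.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit values, not taken from the kernel headers. */
#define CTRL_VFE			0x0001u
#define CTRL_MSE			0x0008u
#define CTRL_VF_10BIT_TAG_REQ_EN	0x0020u

/* Form used in the quoted patch: test the bit, then clear it. */
static uint16_t clear_conditional(uint16_t ctrl)
{
	ctrl &= (uint16_t)~(CTRL_VFE | CTRL_MSE);
	if (ctrl & CTRL_VF_10BIT_TAG_REQ_EN)
		ctrl &= (uint16_t)~CTRL_VF_10BIT_TAG_REQ_EN;
	return ctrl;
}

/* Suggested form: fold the bit into the existing mask. */
static uint16_t clear_unconditional(uint16_t ctrl)
{
	ctrl &= (uint16_t)~(CTRL_VFE | CTRL_MSE | CTRL_VF_10BIT_TAG_REQ_EN);
	return ctrl;
}
```

Clearing a bit that is already clear is a no-op, so both functions return the same value for every possible 16-bit input.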

Thanks,
Dongdong
>
>>  	pci_cfg_access_lock(dev);
>>  	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>>  	ssleep(1);
>> --
>> 2.7.4
>>
> .
>

* Re: [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices
  2021-08-04 23:38   ` Bjorn Helgaas
@ 2021-08-05  8:25     ` Dongdong Liu
  2021-08-09 17:26       ` Bjorn Helgaas
  0 siblings, 1 reply; 43+ messages in thread
From: Dongdong Liu @ 2021-08-05  8:25 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On 2021/8/5 7:38, Bjorn Helgaas wrote:
> On Wed, Aug 04, 2021 at 09:47:05PM +0800, Dongdong Liu wrote:
>> PCIe spec 5.0r1.0 section 2.2.6.2 implementation note, In configurations
>> where a Requester with 10-Bit Tag Requester capability needs to target
>> multiple Completers, one needs to ensure that the Requester sends 10-Bit
>> Tag Requests only to Completers that have 10-Bit Tag Completer capability.
>> So we enable 10-Bit Tag Requester for root port only when the devices
>> under the root port support 10-Bit Tag Completer.
>
> Fix quoting.  I can't tell what is from the spec and what you wrote.
Will fix.
>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> ---
>>  drivers/pci/pcie/portdrv_pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 69 insertions(+)
>>
>> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
>> index c7ff1ee..2382cd2 100644
>> --- a/drivers/pci/pcie/portdrv_pci.c
>> +++ b/drivers/pci/pcie/portdrv_pci.c
>> @@ -90,6 +90,72 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
>>  #define PCIE_PORTDRV_PM_OPS	NULL
>>  #endif /* !PM */
>>
>> +static int pci_10bit_tag_comp_support(struct pci_dev *dev, void *data)
>> +{
>> +	bool *support = (bool *)data;
>> +
>> +	if (!pci_is_pcie(dev)) {
>> +		*support = false;
>> +		return 1;
>> +	}
>> +
>> +	/*
>> +	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
>> +	 * For configurations where a Requester with 10-Bit Tag Requester
>> +	 * capability targets Completers where some do and some do not have
>> +	 * 10-Bit Tag Completer capability, how the Requester determines which
>> +	 * NPRs include 10-Bit Tags is outside the scope of this specification.
>> +	 * So we do not consider hotplug scenario.
>> +	 */
>> +	if (dev->is_hotplug_bridge) {
>> +		*support = false;
>> +		return 1;
>> +	}
>> +
>> +	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP)) {
>> +		*support = false;
>> +		return 1;
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static void pci_configure_rp_10bit_tag(struct pci_dev *dev)
>> +{
>> +	bool support = true;
>> +
>> +	if (dev->subordinate == NULL)
>> +		return;
>> +
>> +	/* If no devices under the root port, no need to enable 10-Bit Tag. */
>> +	if (list_empty(&dev->subordinate->devices))
>> +		return;
>> +
>> +	pci_10bit_tag_comp_support(dev, &support);
>> +	if (!support)
>> +		return;
>> +
>> +	/*
>> +	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
>> +	 * In configurations where a Requester with 10-Bit Tag Requester
>> +	 * capability needs to target multiple Completers, one needs to ensure
>> +	 * that the Requester sends 10-Bit Tag Requests only to Completers
>> +	 * that have 10-Bit Tag Completer capability. So we enable 10-Bit Tag
>> +	 * Requester for root port only when the devices under the root port
>> +	 * support 10-Bit Tag Completer.
>> +	 */
>> +	pci_walk_bus(dev->subordinate, pci_10bit_tag_comp_support, &support);
>> +	if (!support)
>> +		return;
>> +
>> +	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
>> +		return;
>> +
>> +	pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
>> +	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
>> +				 PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>> +}
>> +
>>  /*
>>   * pcie_portdrv_probe - Probe PCI-Express port devices
>>   * @dev: PCI-Express port device being probed
>> @@ -111,6 +177,9 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
>>  	     (type != PCI_EXP_TYPE_RC_EC)))
>>  		return -ENODEV;
>>
>> +	if (type == PCI_EXP_TYPE_ROOT_PORT)
>> +		pci_configure_rp_10bit_tag(dev);
>
> I don't think this has anything to do with the portdrv, so all this
> should go somewhere else.
Yes. Any suggestions on where to put the code?
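
Wherever the code ends up, the decision itself is independent of portdrv. A user-space sketch of the rule in the quoted patch: enable the Root Port's requester only if something is below it and every device in the hierarchy is PCIe, is not a hotplug bridge, and supports 10-Bit Tag Completer. struct node and the helper names are hypothetical, and the recursion stands in for pci_walk_bus().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device in the hierarchy below a Root Port. */
struct node {
	bool is_pcie;
	bool hotplug_bridge;
	bool comp_cap;		/* 10-Bit Tag Completer Supported */
	struct node *child;	/* first device on the secondary bus */
	struct node *sibling;	/* next device on the same bus */
};

/* Per-device check, in the same order as the quoted callback. */
static bool node_ok(const struct node *n)
{
	return n->is_pcie && !n->hotplug_bridge && n->comp_cap;
}

/* True only if every device on this bus and below passes node_ok(). */
static bool all_completers(const struct node *n)
{
	for (; n; n = n->sibling) {
		if (!node_ok(n))
			return false;
		if (!all_completers(n->child))
			return false;
	}
	return true;
}

static bool rp_may_enable_requester(const struct node *rp, bool rp_req_cap)
{
	if (!rp->child)		/* nothing below the Root Port */
		return false;
	return rp_req_cap && node_ok(rp) && all_completers(rp->child);
}
```

A single device without completer support, or a hotplug bridge anywhere in the tree, is enough to keep the Root Port's requester disabled in this model.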
>
> Out of curiosity, IIUC this enables 10-bit tags for MMIO transactions
> from the root port toward the device, i.e., traffic that originates
> from a CPU.  Is that a significant benefit?  I would expect high-speed
> devices would primarily operate via DMA with relatively little MMIO
> traffic.
The benefits of 10-Bit Tags for EPs are obvious.
For an RP there are few scenarios where they help; I can think of two:
1. The RC has its own DMA.
2. The P2P tag is replaced at the RP when P2PDMA traffic goes through the RP.

Thanks,
Dongdong
>
>>  	if (type == PCI_EXP_TYPE_RC_EC)
>>  		pcie_link_rcec(dev);
>>
>> --
>> 2.7.4
>>
> .
>

* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-04 23:49   ` Bjorn Helgaas
@ 2021-08-05  8:37     ` Dongdong Liu
  2021-08-05 15:31       ` Bjorn Helgaas
  0 siblings, 1 reply; 43+ messages in thread
From: Dongdong Liu @ 2021-08-05  8:37 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev



On 2021/8/5 7:49, Bjorn Helgaas wrote:
> On Wed, Aug 04, 2021 at 09:47:06PM +0800, Dongdong Liu wrote:
>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>> sending Requests to other Endpoints (as opposed to host memory), the
>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>> unless an implementation-specific mechanism determines that the Endpoint
>> supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
>> write 0 to disable 10-Bit Tag Requester when the driver does not bind
>> the device if the peer device does not support the 10-Bit Tag Completer.
>> This will make P2P traffic safe. the 10bit_tag file content indicate
>> current 10-Bit Tag Requester Enable status.
>>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> ---
>>  Documentation/ABI/testing/sysfs-bus-pci | 16 +++++++-
>>  drivers/pci/pci-sysfs.c                 | 69 +++++++++++++++++++++++++++++++++
>>  2 files changed, 84 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
>> index 793cbb7..0e0c97d 100644
>> --- a/Documentation/ABI/testing/sysfs-bus-pci
>> +++ b/Documentation/ABI/testing/sysfs-bus-pci
>> @@ -139,7 +139,7 @@ Description:
>>  		binary file containing the Vital Product Data for the
>>  		device.  It should follow the VPD format defined in
>>  		PCI Specification 2.1 or 2.2, but users should consider
>> -		that some devices may have incorrectly formatted data.
>> +		that some devices may have incorrectly formatted data.
>>  		If the underlying VPD has a writable section then the
>>  		corresponding section of this file will be writable.
>>
>> @@ -407,3 +407,17 @@ Description:
>>
>>  		The file is writable if the PF is bound to a driver that
>>  		implements ->sriov_set_msix_vec_count().
>> +
>> +What:		/sys/bus/pci/devices/.../10bit_tag
>> +Date:		August 2021
>> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
>> +Description:
>> +		If a PCI device support 10-Bit Tag Requester, will create the
>> +		10bit_tag sysfs file. The file is readable, the value
>> +		indicate current 10-Bit Tag Requester Enable.
>> +		1 - enabled, 0 - disabled.
>> +
>> +		The file is also writeable, the value only accept by write 0
>> +		to disable 10-Bit Tag Requester when the driver does not bind
>> +		the deivce. The typical use case is for p2pdma when the peer
>> +		device does not support 10-BIT Tag Completer.
>
> s/writeable/writable/
> s/deivce/device/
Will fix.
>
> The first sentence does not parse.
The 10bit_tag sysfs file will be visible only when the device has the
10-Bit Tag Requester Supported capability.

Will fix.
>
>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
>> index 5d63df7..e93ce8b 100644
>> --- a/drivers/pci/pci-sysfs.c
>> +++ b/drivers/pci/pci-sysfs.c
>> @@ -306,6 +306,49 @@ static ssize_t enable_show(struct device *dev, struct device_attribute *attr,
>>  }
>>  static DEVICE_ATTR_RW(enable);
>>
>> +static ssize_t pci_10bit_tag_store(struct device *dev,
>> +				   struct device_attribute *attr,
>> +				   const char *buf, size_t count)
>> +{
>> +	struct pci_dev *pdev = to_pci_dev(dev);
>> +	bool enable;
>> +
>> +	if (kstrtobool(buf, &enable) < 0)
>> +		return -EINVAL;
>> +
>> +	if (enable != false )
>> +		return -EINVAL;
>
> Is this the same as "if (enable)"?
Yes, will fix.
>
>> +	if (pdev->driver)
>> +		 return -EBUSY;
>> +
>> +	pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
>> +				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>> +	pci_info(pdev, "disabled 10-Bit Tag Requester\n");
>> +
>> +	return count;
>> +}
>> +
>> +static ssize_t pci_10bit_tag_show(struct device *dev,
>> +				  struct device_attribute *attr,
>> +				  char *buf)
>> +{
>> +	struct pci_dev *pdev = to_pci_dev(dev);
>> +	u16 ctl;
>> +	int ret;
>> +
>> +	ret = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL2, &ctl);
>> +	if (ret)
>> +		return -EINVAL;
>> +
>> +	return sysfs_emit(buf, "%u\n",
>> +			  !!(ctl & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN));
>> +}
>> +
>> +static struct device_attribute dev_attr_10bit_tag = __ATTR(10bit_tag, 0600,
>> +							   pci_10bit_tag_show,
>> +							   pci_10bit_tag_store);
>
> Is this DEVICE_ATTR_ADMIN_RW()?
>
> Why is it 0600?  Everything else in this file looks like
> DEVICE_ATTR_RO or DEVICE_ATTR_RW.  This should be the same unless
> there's a reason to be different.
It should be 0644. I did not use DEVICE_ATTR_RW because I just want to
use the "10bit_tag" file name.
Will fix.

Thanks,
Dongdong
>
>>  #ifdef CONFIG_NUMA
>>  static ssize_t numa_node_store(struct device *dev,
>>  			       struct device_attribute *attr, const char *buf,
>> @@ -635,6 +678,11 @@ static struct attribute *pcie_dev_attrs[] = {
>>  	NULL,
>>  };
>>
>> +static struct attribute *pcie_dev_10bit_tag_attrs[] = {
>> +	&dev_attr_10bit_tag.attr,
>> +	NULL,
>> +};
>> +
>>  static struct attribute *pcibus_attrs[] = {
>>  	&dev_attr_bus_rescan.attr,
>>  	&dev_attr_cpuaffinity.attr,
>> @@ -1482,6 +1530,21 @@ static umode_t pcie_dev_attrs_are_visible(struct kobject *kobj,
>>  	return 0;
>>  }
>>
>> +static umode_t pcie_dev_10bit_tag_attrs_are_visible(struct kobject *kobj,
>> +					  struct attribute *a, int n)
>> +{
>> +	struct device *dev = kobj_to_dev(kobj);
>> +	struct pci_dev *pdev = to_pci_dev(dev);
>> +
>> +	if (pdev->is_virtfn)
>> +		return 0;
>> +
>> +	if (!(pdev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
>> +		return 0;
>> +
>> +	return a->mode;
>> +}
>> +
>>  static const struct attribute_group pci_dev_group = {
>>  	.attrs = pci_dev_attrs,
>>  };
>> @@ -1521,6 +1584,11 @@ static const struct attribute_group pcie_dev_attr_group = {
>>  	.is_visible = pcie_dev_attrs_are_visible,
>>  };
>>
>> +static const struct attribute_group pcie_dev_10bit_tag_attr_group = {
>> +	.attrs = pcie_dev_10bit_tag_attrs,
>> +	.is_visible = pcie_dev_10bit_tag_attrs_are_visible,
>> +};
>> +
>>  static const struct attribute_group *pci_dev_attr_groups[] = {
>>  	&pci_dev_attr_group,
>>  	&pci_dev_hp_attr_group,
>> @@ -1530,6 +1598,7 @@ static const struct attribute_group *pci_dev_attr_groups[] = {
>>  #endif
>>  	&pci_bridge_attr_group,
>>  	&pcie_dev_attr_group,
>> +	&pcie_dev_10bit_tag_attr_group,
>>  #ifdef CONFIG_PCIEAER
>>  	&aer_stats_attr_group,
>>  #endif
>> --
>> 2.7.4
>>
> .
>


* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-04 23:52   ` Bjorn Helgaas
@ 2021-08-05  8:38     ` Dongdong Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-05  8:38 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev


On 2021/8/5 7:52, Bjorn Helgaas wrote:
> On Wed, Aug 04, 2021 at 09:47:06PM +0800, Dongdong Liu wrote:
>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>> sending Requests to other Endpoints (as opposed to host memory), the
>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>> unless an implementation-specific mechanism determines that the Endpoint
>> supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
>> write 0 to disable 10-Bit Tag Requester when the driver does not bind
>> the device if the peer device does not support the 10-Bit Tag Completer.
>> This will make P2P traffic safe. the 10bit_tag file content indicate
>> current 10-Bit Tag Requester Enable status.
>>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>
>> +		The file is also writeable, the value only accept by write 0
>> +		to disable 10-Bit Tag Requester when the driver does not bind
>> +		the deivce. The typical use case is for p2pdma when the peer
>> +		device does not support 10-BIT Tag Completer.
>
> s/10-BIT/10-Bit/
Will fix, and will check the other places.

Thanks,
Dongdong
> .
>


* Re: [PATCH V7 8/9] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices
  2021-08-05  0:05   ` Bjorn Helgaas
@ 2021-08-05  8:47     ` Dongdong Liu
  2021-08-05  9:39     ` Dongdong Liu
  1 sibling, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-05  8:47 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev


On 2021/8/5 8:05, Bjorn Helgaas wrote:
> On Wed, Aug 04, 2021 at 09:47:07PM +0800, Dongdong Liu wrote:
>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>> sending Requests to other Endpoints (as opposed to host memory), the
>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>> unless an implementation-specific mechanism determines that the
>> Endpoint supports 10-Bit Tag Completer capability.
>> Add sriov_vf_10bit_tag file to query the status of VF 10-Bit Tag
>> Requester Enable. Add sriov_vf_10bit_tag_ctl file to disable the VF
>> 10-Bit Tag Requester. The typical use case is for p2pdma when the peer
>> device does not support 10-BIT Tag Completer.
>
> Fix the usual spec quoting issue.  Or maybe this is not actually
> quoted but is missing blank lines between paragraphs.
Will fix.
>
> s/10-BIT/10-Bit/
Will fix.
>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> ---
>>  Documentation/ABI/testing/sysfs-bus-pci | 20 +++++++++++++
>>  drivers/pci/iov.c                       | 50 +++++++++++++++++++++++++++++++++
>>  2 files changed, 70 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
>> index 0e0c97d..8fdbfae 100644
>> --- a/Documentation/ABI/testing/sysfs-bus-pci
>> +++ b/Documentation/ABI/testing/sysfs-bus-pci
>> @@ -421,3 +421,23 @@ Description:
>>  		to disable 10-Bit Tag Requester when the driver does not bind
>>  		the deivce. The typical use case is for p2pdma when the peer
>>  		device does not support 10-BIT Tag Completer.
>> +
>> +What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag
>> +Date:		August 2021
>> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
>> +Description:
>> +		This file is associated with a SR-IOV physical function (PF).
>> +		It is visible when the device has VF 10-Bit Tag Requester
>> +		Supported. It contains the status of VF 10-Bit Tag Requester
>> +		Enable. The file is only readable.
>
> s/only readable/read-only/
Will fix.
>
>> +What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl
>
> Why does this file have "_ctl" on the end when the one in patch 7/9
> does not?
>
>> +Date:		August 2021
>> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
>> +Description:
>> +		This file is associated with a SR-IOV virtual function (VF).
>> +		It is visible when the device has VF 10-Bit Tag Requester
>> +		Supported. It only allows to write 0 to disable VF 10-Bit
>> +		Tag Requester. The file is only writeable when the vf driver
>> +		does not bind to a dirver. The typical use case is for p2pdma
>> +		when the peer device does not support 10-BIT Tag Completer.
>
> s/vf/VF/
> s/dirver/driver/
> s/10-BIT/10-Bit/
Will fix.
>
> "when the vr driver does not bind to a driver"?  Not quite right.
> Must be a "device" in there somewhere.
Will fix.
>
> So IIUC this file is associated with a VF, but the bit it writes is
> actually in the *PF*?  So writing 0 to any VF's file disables 10-bit
> tags for *all* VFs?  That's worth mentioning here.
Yes, will do.
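
To make the point above concrete, here is a small user-space model of the
sharing, not kernel code. The struct names and the bit value are assumptions
made for illustration; only the shape (a VF-side write that clears a bit held
by the PF) follows the quoted sriov_vf_10bit_tag_ctl_store().

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only; stands in for PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN. */
#define MODEL_VF_10BIT_TAG_REQ_EN 0x0020u

/* Hypothetical stand-in for the PF: it owns the single SR-IOV Control word. */
struct model_pf {
	uint16_t sriov_ctrl;
};

/* Hypothetical stand-in for a VF: every VF points back at the same PF. */
struct model_vf {
	struct model_pf *physfn;
	bool driver_bound;
};

/* Same shape as the quoted store(): only the VF being written is checked. */
static int model_vf_disable_10bit_tag(struct model_vf *vf)
{
	if (vf->driver_bound)
		return -EBUSY;

	vf->physfn->sriov_ctrl &= (uint16_t)~MODEL_VF_10BIT_TAG_REQ_EN;
	return 0;
}

/* The enable state a VF sees is whatever the shared PF word says. */
static bool model_vf_10bit_tag_enabled(const struct model_vf *vf)
{
	return (vf->physfn->sriov_ctrl & MODEL_VF_10BIT_TAG_REQ_EN) != 0;
}
```

In this model, a write through an unbound VF also turns the feature off for a
sibling VF that still has a driver bound, because both read the same PF word.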
>
>> diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
>> index 0d0bed1..04c1298 100644
>> --- a/drivers/pci/iov.c
>> +++ b/drivers/pci/iov.c
>> @@ -220,10 +220,38 @@ static ssize_t sriov_vf_msix_count_store(struct device *dev,
>>  static DEVICE_ATTR_WO(sriov_vf_msix_count);
>>  #endif
>>
>> +static ssize_t sriov_vf_10bit_tag_ctl_store(struct device *dev,
>> +					    struct device_attribute *attr,
>> +					    const char *buf, size_t count)
>> +{
>> +	struct pci_dev *vf_dev = to_pci_dev(dev);
>> +	struct pci_dev *pdev = pci_physfn(vf_dev);
>> +	struct pci_sriov *iov;
>> +	bool enable;
>> +
>> +	if (kstrtobool(buf, &enable) < 0)
>> +		return -EINVAL;
>> +
>> +	if (enable != false)
>> +		return -EINVAL;
>
> Is this "if (enable)" again?
Will fix.
>
>> +	if (vf_dev->driver)
>> +		return -EBUSY;
>> +
>> +	iov = pdev->sriov;
>> +	iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
>> +	pci_write_config_word(pdev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
>> +	pci_info(pdev, "disabled SRIOV 10-Bit Tag Requester\n");
>
> s/SRIOV/SR-IOV/ to match spec and other usages.
Will fix.

Thanks,
Dongdong
>
>> +
>> +	return count;
>> +}
>> +static DEVICE_ATTR_WO(sriov_vf_10bit_tag_ctl);
>> +
>>  static struct attribute *sriov_vf_dev_attrs[] = {
>>  #ifdef CONFIG_PCI_MSI
>>  	&dev_attr_sriov_vf_msix_count.attr,
>>  #endif
>> +	&dev_attr_sriov_vf_10bit_tag_ctl.attr,
>>  	NULL,
>>  };
>>
>> @@ -236,6 +264,11 @@ static umode_t sriov_vf_attrs_are_visible(struct kobject *kobj,
>>  	if (!pdev->is_virtfn)
>>  		return 0;
>>
>> +	pdev = pci_physfn(pdev);
>> +	if ((a == &dev_attr_sriov_vf_10bit_tag_ctl.attr) &&
>> +	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
>> +		return 0;
>> +
>>  	return a->mode;
>>  }
>>
>> @@ -487,12 +520,23 @@ static ssize_t sriov_drivers_autoprobe_store(struct device *dev,
>>  	return count;
>>  }
>>
>> +static ssize_t sriov_vf_10bit_tag_show(struct device *dev,
>> +				       struct device_attribute *attr,
>> +				       char *buf)
>> +{
>> +	struct pci_dev *pdev = to_pci_dev(dev);
>> +
>> +	return sysfs_emit(buf, "%u\n",
>> +		!!(pdev->sriov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN));
>> +}
>> +
>>  static DEVICE_ATTR_RO(sriov_totalvfs);
>>  static DEVICE_ATTR_RW(sriov_numvfs);
>>  static DEVICE_ATTR_RO(sriov_offset);
>>  static DEVICE_ATTR_RO(sriov_stride);
>>  static DEVICE_ATTR_RO(sriov_vf_device);
>>  static DEVICE_ATTR_RW(sriov_drivers_autoprobe);
>> +static DEVICE_ATTR_RO(sriov_vf_10bit_tag);
>>
>>  static struct attribute *sriov_pf_dev_attrs[] = {
>>  	&dev_attr_sriov_totalvfs.attr,
>> @@ -501,6 +545,7 @@ static struct attribute *sriov_pf_dev_attrs[] = {
>>  	&dev_attr_sriov_stride.attr,
>>  	&dev_attr_sriov_vf_device.attr,
>>  	&dev_attr_sriov_drivers_autoprobe.attr,
>> +	&dev_attr_sriov_vf_10bit_tag.attr,
>>  #ifdef CONFIG_PCI_MSI
>>  	&dev_attr_sriov_vf_total_msix.attr,
>>  #endif
>> @@ -511,10 +556,15 @@ static umode_t sriov_pf_attrs_are_visible(struct kobject *kobj,
>>  					  struct attribute *a, int n)
>>  {
>>  	struct device *dev = kobj_to_dev(kobj);
>> +	struct pci_dev *pdev = to_pci_dev(dev);
>>
>>  	if (!dev_is_pf(dev))
>>  		return 0;
>>
>> +	if ((a == &dev_attr_sriov_vf_10bit_tag.attr) &&
>> +	     !(pdev->sriov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ))
>> +		return 0;
>> +
>>  	return a->mode;
>>  }
>>
>> --
>> 2.7.4
>>
> .
>


* Re: [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-08-04 15:56   ` Logan Gunthorpe
@ 2021-08-05  8:49     ` Dongdong Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-05  8:49 UTC (permalink / raw)
  To: Logan Gunthorpe, helgaas, hch, kw, leon, linux-pci, rajur,
	hverkuil-cisco
  Cc: linux-media, netdev

Hi Logan

Many thanks for your review.
On 2021/8/4 23:56, Logan Gunthorpe wrote:
>
>
> On 2021-08-04 7:47 a.m., Dongdong Liu wrote:
>> Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
>> 10-Bit Tag Requester doesn't interact with a device that does not
>> support 10-BIT Tag Completer. Before that happens, the kernel should
>> emit a warning. "echo 0 > /sys/bus/pci/devices/.../10bit_tag" to
>> disable 10-BIT Tag Requester for PF device.
>> "echo 0 > /sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl" to disable
>> 10-BIT Tag Requester for VF device.
>>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> ---
>>  drivers/pci/p2pdma.c | 40 ++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 40 insertions(+)
>>
>> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
>> index 50cdde3..948f2be 100644
>> --- a/drivers/pci/p2pdma.c
>> +++ b/drivers/pci/p2pdma.c
>> @@ -19,6 +19,7 @@
>>  #include <linux/random.h>
>>  #include <linux/seq_buf.h>
>>  #include <linux/xarray.h>
>> +#include "pci.h"
>>
>>  enum pci_p2pdma_map_type {
>>  	PCI_P2PDMA_MAP_UNKNOWN = 0,
>> @@ -410,6 +411,41 @@ static unsigned long map_types_idx(struct pci_dev *client)
>>  		(client->bus->number << 8) | client->devfn;
>>  }
>>
>> +static bool check_10bit_tags_vaild(struct pci_dev *a, struct pci_dev *b,
>> +				   bool verbose)
>> +{
>> +	bool req;
>> +	bool comp;
>> +	u16 ctl2;
>> +
>> +	if (a->is_virtfn) {
>> +#ifdef CONFIG_PCI_IOV
>> +		req = !!(a->physfn->sriov->ctrl &
>> +			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
>> +#endif
>> +	} else {
>> +		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl2);
>> +		req = !!(ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>> +	}
>> +
>> +	comp = !!(b->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
>> +	if (req && (!comp)) {
>
> I think the brackets around !comp are unnecessary.
Yes, will fix.
>
>> +		if (verbose) {
>> +			pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set in device (%s), but peer device (%s) does not support the 10-Bit Tag Completer\n",
>> +				 pci_name(a), pci_name(b));
>> +			if (a->is_virtfn)
>> +				pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/sriov_vf_10bit_tag_ctl\n",
>> +					 pci_name(a));
>> +			else
>> +				pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/10bit_tag\n",
>> +					 pci_name(a));
>
> Can we not simplify this slightly by having a const char * set to the
> tag in the above if (a->is_virtfn)?
>
> pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 >
> /sys/bus/pci/devices/%s/%s\n", pci_name(a), tag);
Good point, will fix.
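
A rough user-space sketch of the suggestion, assuming nothing beyond the two
file names already used in the patch. format_disable_hint() is a hypothetical
helper for illustration and snprintf() stands in for pci_warn(); the only
point is that the file name is chosen once and the message has one format.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical helper: pick the sysfs file name once, then format a single
 * message. Returns whatever snprintf() returns.
 */
static int format_disable_hint(char *buf, size_t len, const char *pci_name,
			       bool is_virtfn)
{
	const char *tag = is_virtfn ? "sriov_vf_10bit_tag_ctl" : "10bit_tag";

	return snprintf(buf, len,
			"to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/%s\n",
			pci_name, tag);
}
```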

Thanks,
Dongdong
>
>> +		}
>> +		return false;
>> +	}
>> +
>> +	return true;
>> +}
>> +
>>  /*
>>   * Calculate the P2PDMA mapping type and distance between two PCI devices.
>>   *
>> @@ -532,6 +568,10 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
>>  		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
>>  	}
>>  done:
>> +	if (!check_10bit_tags_vaild(client, provider, verbose) ||
>> +	    !check_10bit_tags_vaild(provider, client, verbose))
>> +		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
>> +
>>  	rcu_read_lock();
>>  	p2pdma = rcu_dereference(provider->p2pdma);
>>  	if (p2pdma)
>>
> .
>


* Re: [PATCH V7 8/9] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices
  2021-08-05  0:05   ` Bjorn Helgaas
  2021-08-05  8:47     ` Dongdong Liu
@ 2021-08-05  9:39     ` Dongdong Liu
  1 sibling, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-05  9:39 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

I missed replying to one comment :).

On 2021/8/5 8:05, Bjorn Helgaas wrote:
> On Wed, Aug 04, 2021 at 09:47:07PM +0800, Dongdong Liu wrote:
>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>> sending Requests to other Endpoints (as opposed to host memory), the
>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>> unless an implementation-specific mechanism determines that the
>> Endpoint supports 10-Bit Tag Completer capability.
>> Add sriov_vf_10bit_tag file to query the status of VF 10-Bit Tag
>> Requester Enable. Add sriov_vf_10bit_tag_ctl file to disable the VF
>> 10-Bit Tag Requester. The typical use case is for p2pdma when the peer
>> device does not support 10-BIT Tag Completer.
>
> Fix the usual spec quoting issue.  Or maybe this is not actually
> quoted but is missing blank lines between paragraphs.
>
> s/10-BIT/10-Bit/
>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> ---
>>  Documentation/ABI/testing/sysfs-bus-pci | 20 +++++++++++++
>>  drivers/pci/iov.c                       | 50 +++++++++++++++++++++++++++++++++
>>  2 files changed, 70 insertions(+)
>>
>> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
>> index 0e0c97d..8fdbfae 100644
>> --- a/Documentation/ABI/testing/sysfs-bus-pci
>> +++ b/Documentation/ABI/testing/sysfs-bus-pci
>> @@ -421,3 +421,23 @@ Description:
>>  		to disable 10-Bit Tag Requester when the driver does not bind
>>  		the deivce. The typical use case is for p2pdma when the peer
>>  		device does not support 10-BIT Tag Completer.
>> +
>> +What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag
>> +Date:		August 2021
>> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
>> +Description:
>> +		This file is associated with a SR-IOV physical function (PF).
>> +		It is visible when the device has VF 10-Bit Tag Requester
>> +		Supported. It contains the status of VF 10-Bit Tag Requester
>> +		Enable. The file is only readable.
>
> s/only readable/read-only/
>
>> +What:		/sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl
>
> Why does this file have "_ctl" on the end when the one in patch 7/9
> does not?

PF: 0000:82:00.0  VF:0000:82:10.0
/sys/bus/pci/devices/0000:82:00.0/sriov_vf_10bit_tag
/sys/bus/pci/devices/0000:82:10.0/sriov_vf_10bit_tag_ctl

sriov_vf_10bit_tag is used to query the status of VF 10-Bit Tag
Requester Enable; it is attached to the PF device.

sriov_vf_10bit_tag_ctl is used to disable the VF 10-Bit Tag Requester;
it is attached to the VF device. Although it in fact writes the PF SR-IOV
Control register, it sits on the VF so that it can detect whether a driver
is already bound to the VF device.

Thanks,
Dongdong


* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-04 15:51   ` Logan Gunthorpe
@ 2021-08-05 13:14     ` Dongdong Liu
  2021-08-05 13:53       ` Leon Romanovsky
  2021-08-05 15:36       ` Logan Gunthorpe
  0 siblings, 2 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-05 13:14 UTC (permalink / raw)
  To: Logan Gunthorpe, helgaas, hch, kw, leon, linux-pci, rajur,
	hverkuil-cisco
  Cc: linux-media, netdev

On 2021/8/4 23:51, Logan Gunthorpe wrote:
>
>
>
> On 2021-08-04 7:47 a.m., Dongdong Liu wrote:
>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>> sending Requests to other Endpoints (as opposed to host memory), the
>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>> unless an implementation-specific mechanism determines that the Endpoint
>> supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
>> write 0 to disable 10-Bit Tag Requester when the driver does not bind
>> the device if the peer device does not support the 10-Bit Tag Completer.
>> This will make P2P traffic safe. the 10bit_tag file content indicate
>> current 10-Bit Tag Requester Enable status.
>
> Can we not have both the sysfs file and the command line parameter? If
> the user wants to disable it always for a specific device this sysfs
> parameter is fairly awkward. A script at boot to unbind the driver, set
> the sysfs file and rebind the driver is not trivial and the command line
> parameter offers additional options for users.
Do you mean a command line parameter like the one in "[PATCH V6 7/8] PCI:
Add "pci=disable_10bit_tag=" parameter for peer-to-peer support"?

Do we still need such a command line parameter if we already have the
sysfs file? I think we may not need it.

Thanks,
Dongdong
>
> Logan
> .
>


* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-05 13:14     ` Dongdong Liu
@ 2021-08-05 13:53       ` Leon Romanovsky
  2021-08-05 15:36       ` Logan Gunthorpe
  1 sibling, 0 replies; 43+ messages in thread
From: Leon Romanovsky @ 2021-08-05 13:53 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: Logan Gunthorpe, helgaas, hch, kw, linux-pci, rajur,
	hverkuil-cisco, linux-media, netdev

On Thu, Aug 05, 2021 at 09:14:50PM +0800, Dongdong Liu wrote:
> On 2021/8/4 23:51, Logan Gunthorpe wrote:
> > 
> > 
> > 
> > On 2021-08-04 7:47 a.m., Dongdong Liu wrote:
> > > PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> > > sending Requests to other Endpoints (as opposed to host memory), the
> > > Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> > > unless an implementation-specific mechanism determines that the Endpoint
> > > supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
> > > write 0 to disable 10-Bit Tag Requester when the driver does not bind
> > > the device if the peer device does not support the 10-Bit Tag Completer.
> > > This will make P2P traffic safe. the 10bit_tag file content indicate
> > > current 10-Bit Tag Requester Enable status.
> > 
> > Can we not have both the sysfs file and the command line parameter? If
> > the user wants to disable it always for a specific device this sysfs
> > parameter is fairly awkward. A script at boot to unbind the driver, set
> > the sysfs file and rebind the driver is not trivial and the command line
> > parameter offers additional options for users.
> Does the command line parameter as "[PATCH V6 7/8] PCI: Add
> "pci=disable_10bit_tag=" parameter for peer-to-peer support" does?
> 
> Do we also need such command line if we already had sysfs file?
> I think we may not need.

I think the same.

> 
> Thanks,
> Dongdong
> > 
> > Logan
> > .
> > 


* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-05  8:37     ` Dongdong Liu
@ 2021-08-05 15:31       ` Bjorn Helgaas
  2021-08-07  7:01         ` Dongdong Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-05 15:31 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Thu, Aug 05, 2021 at 04:37:39PM +0800, Dongdong Liu wrote:
> 
> 
> On 2021/8/5 7:49, Bjorn Helgaas wrote:
> > On Wed, Aug 04, 2021 at 09:47:06PM +0800, Dongdong Liu wrote:
> > > PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> > > sending Requests to other Endpoints (as opposed to host memory), the
> > > Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> > > unless an implementation-specific mechanism determines that the Endpoint
> > > supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
> > > write 0 to disable 10-Bit Tag Requester when the driver does not bind
> > > the device if the peer device does not support the 10-Bit Tag Completer.
> > > This will make P2P traffic safe. the 10bit_tag file content indicate
> > > current 10-Bit Tag Requester Enable status.
> > > 
> > > Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> > > ---
> > >  Documentation/ABI/testing/sysfs-bus-pci | 16 +++++++-
> > >  drivers/pci/pci-sysfs.c                 | 69 +++++++++++++++++++++++++++++++++
> > >  2 files changed, 84 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> > > index 793cbb7..0e0c97d 100644
> > > --- a/Documentation/ABI/testing/sysfs-bus-pci
> > > +++ b/Documentation/ABI/testing/sysfs-bus-pci
> > > @@ -139,7 +139,7 @@ Description:
> > >  		binary file containing the Vital Product Data for the
> > >  		device.  It should follow the VPD format defined in
> > >  		PCI Specification 2.1 or 2.2, but users should consider
> > > -		that some devices may have incorrectly formatted data.
> > > +		that some devices may have incorrectly formatted data.
> > >  		If the underlying VPD has a writable section then the
> > >  		corresponding section of this file will be writable.
> > > 
> > > @@ -407,3 +407,17 @@ Description:
> > > 
> > >  		The file is writable if the PF is bound to a driver that
> > >  		implements ->sriov_set_msix_vec_count().
> > > +
> > > +What:		/sys/bus/pci/devices/.../10bit_tag
> > > +Date:		August 2021
> > > +Contact:	Dongdong Liu <liudongdong3@huawei.com>
> > > +Description:
> > > +		If a PCI device support 10-Bit Tag Requester, will create the
> > > +		10bit_tag sysfs file. The file is readable, the value
> > > +		indicate current 10-Bit Tag Requester Enable.
> > > +		1 - enabled, 0 - disabled.
> > > +
> > > +		The file is also writeable, the value only accept by write 0
> > > +		to disable 10-Bit Tag Requester when the driver does not bind
> > > +		the deivce. The typical use case is for p2pdma when the peer
> > > +		device does not support 10-BIT Tag Completer.

> > > +static ssize_t pci_10bit_tag_store(struct device *dev,
> > > +				   struct device_attribute *attr,
> > > +				   const char *buf, size_t count)
> > > +{
> > > +	struct pci_dev *pdev = to_pci_dev(dev);
> > > +	bool enable;
> > > +
> > > +	if (kstrtobool(buf, &enable) < 0)
> > > +		return -EINVAL;
> > > +
> > > +	if (enable != false )
> > > +		return -EINVAL;
> > 
> > Is this the same as "if (enable)"?
> Yes, Will fix.

I actually don't like the one-way nature of this.  When the hierarchy
supports 10-bit tags, we automatically enable them during enumeration.

Then we provide this sysfs file, but it can only *disable* 10-bit
tags.  There's no way to re-enable them except by rebooting (or using
setpci, I guess).

Why can't we allow *enabling* them here if they're supported in this
hierarchy?
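
One way to picture the two-way behavior asked about here is the user-space
model below. It is only a sketch, not kernel code: struct model_dev, its
hierarchy_supported flag, and the bit value are assumptions for illustration,
and how the kernel would decide that the hierarchy supports 10-bit tags is
left open.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative value only; stands in for PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN. */
#define MODEL_DEVCTL2_10BIT_TAG_REQ_EN 0x1000u

/* Hypothetical device model; none of these fields are real kernel fields. */
struct model_dev {
	uint16_t devctl2;		/* stands in for Device Control 2 */
	bool req_supported;		/* 10-Bit Tag Requester Supported */
	bool hierarchy_supported;	/* assumption: upstream path can complete 10-bit tags */
	bool driver_bound;
};

/* Accept both 0 and 1; enabling is allowed only where it was possible at enumeration. */
static int model_10bit_tag_store(struct model_dev *dev, bool enable)
{
	if (dev->driver_bound)
		return -EBUSY;

	if (!enable) {
		dev->devctl2 &= (uint16_t)~MODEL_DEVCTL2_10BIT_TAG_REQ_EN;
		return 0;
	}

	if (!dev->req_supported || !dev->hierarchy_supported)
		return -EOPNOTSUPP;

	dev->devctl2 |= MODEL_DEVCTL2_10BIT_TAG_REQ_EN;
	return 0;
}
```

With this shape a user could turn the feature back on without rebooting or
using setpci, while a bound driver or an unsupported hierarchy still causes
the write to be rejected.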

> > > +	if (pdev->driver)
> > > +		 return -EBUSY;
> > > +
> > > +	pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
> > > +				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> > > +	pci_info(pdev, "disabled 10-Bit Tag Requester\n");
> > > +
> > > +	return count;
> > > +}


* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-05 13:14     ` Dongdong Liu
  2021-08-05 13:53       ` Leon Romanovsky
@ 2021-08-05 15:36       ` Logan Gunthorpe
  1 sibling, 0 replies; 43+ messages in thread
From: Logan Gunthorpe @ 2021-08-05 15:36 UTC (permalink / raw)
  To: Dongdong Liu, helgaas, hch, kw, leon, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev



On 2021-08-05 7:14 a.m., Dongdong Liu wrote:
> On 2021/8/4 23:51, Logan Gunthorpe wrote:
>>
>>
>>
>> On 2021-08-04 7:47 a.m., Dongdong Liu wrote:
>>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>>> sending Requests to other Endpoints (as opposed to host memory), the
>>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>>> unless an implementation-specific mechanism determines that the Endpoint
>>> supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
>>> write 0 to disable 10-Bit Tag Requester when the driver does not bind
>>> the device if the peer device does not support the 10-Bit Tag Completer.
>>> This will make P2P traffic safe. the 10bit_tag file content indicate
>>> current 10-Bit Tag Requester Enable status.
>>
>> Can we not have both the sysfs file and the command line parameter? If
>> the user wants to disable it always for a specific device this sysfs
>> parameter is fairly awkward. A script at boot to unbind the driver, set
>> the sysfs file and rebind the driver is not trivial and the command line
>> parameter offers additional options for users.
> Does the command line parameter as "[PATCH V6 7/8] PCI: Add
> "pci=disable_10bit_tag=" parameter for peer-to-peer support" does?
> 
> Do we also need such command line if we already had sysfs file?
> I think we may not need.

In my opinion, for reasons stated above, the command line parameter is
way more convenient.

Logan


* Re: [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-08-04 13:47 ` [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
  2021-08-04 15:56   ` Logan Gunthorpe
@ 2021-08-05 18:12   ` Bjorn Helgaas
  2021-08-07  7:11     ` Dongdong Liu
  1 sibling, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-05 18:12 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Wed, Aug 04, 2021 at 09:47:08PM +0800, Dongdong Liu wrote:
> Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
> 10-Bit Tag Requester doesn't interact with a device that does not
> support 10-BIT Tag Completer. Before that happens, the kernel should
> emit a warning. "echo 0 > /sys/bus/pci/devices/.../10bit_tag" to
> disable 10-BIT Tag Requester for PF device.
> "echo 0 > /sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl" to disable
> 10-BIT Tag Requester for VF device.

s/10-BIT/10-Bit/ several times.

Add blank lines between paragraphs.

> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>  drivers/pci/p2pdma.c | 40 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 40 insertions(+)
> 
> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index 50cdde3..948f2be 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
> @@ -19,6 +19,7 @@
>  #include <linux/random.h>
>  #include <linux/seq_buf.h>
>  #include <linux/xarray.h>
> +#include "pci.h"
>  
>  enum pci_p2pdma_map_type {
>  	PCI_P2PDMA_MAP_UNKNOWN = 0,
> @@ -410,6 +411,41 @@ static unsigned long map_types_idx(struct pci_dev *client)
>  		(client->bus->number << 8) | client->devfn;
>  }
>  
> +static bool check_10bit_tags_vaild(struct pci_dev *a, struct pci_dev *b,

s/vaild/valid/

Or maybe s/valid/safe/ or s/valid/supported/, since "valid" isn't
quite the right word here.  We want to know whether the source is
enabled to generate 10-bit tags, and if so, whether the destination
can handle them.

"if (check_10bit_tags_valid())" does not make sense because
"check_10bit_tags_valid()" is not a question with a yes/no answer.

"10bit_tags_valid()" *might* be, because "if (10bit_tags_valid())"
makes sense.  But I don't think you can start with a digit.

Or maybe you want to invert the sense, e.g.,
"10bit_tags_unsupported()", since that avoids negation at the caller:

  if (10bit_tags_unsupported(a, b) ||
      10bit_tags_unsupported(b, a))
        map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;

Doesn't this patch need to be at the very beginning, before you start
enabling 10-bit tags?  Otherwise there's a hole in the middle where we
enable them and P2P DMA might break.

> +				   bool verbose)
> +{
> +	bool req;
> +	bool comp;
> +	u16 ctl2;
> +
> +	if (a->is_virtfn) {
> +#ifdef CONFIG_PCI_IOV
> +		req = !!(a->physfn->sriov->ctrl &
> +			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
> +#endif
> +	} else {
> +		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl2);
> +		req = !!(ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> +	}
> +
> +	comp = !!(b->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
> +	if (req && (!comp)) {
> +		if (verbose) {
> +			pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set in device (%s), but peer device (%s) does not support the 10-Bit Tag Completer\n",
> +				 pci_name(a), pci_name(b));

No point in printing pci_name(a) twice.  pci_warn() prints it already;
that should be enough.

I think you can simplify this a little, e.g.,

  if (!req)           /* 10-bit tags not enabled on requester */
    return true;

  if (comp)           /* completer can handle anything */
    return true;

  /* error case */
  if (!verbose)
    return false;

  pci_warn(...);
  return false;
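
Put together with the inverted-sense naming discussed earlier in this mail,
the early-return form might look like the stand-alone model below. It is a
sketch under assumptions: struct model_dev and both function names are
hypothetical (chosen so the identifiers do not start with a digit), and the
warning output is left out.

```c
#include <stdbool.h>

/* Hypothetical model: just the two facts the check needs per device. */
struct model_dev {
	bool req_enabled;	/* 10-Bit Tag Requester Enable is set */
	bool comp_supported;	/* 10-Bit Tag Completer Supported */
};

/* True when requests from 'a' could carry 10-bit tags that 'b' cannot handle. */
static bool tags_10bit_unsupported(const struct model_dev *a,
				   const struct model_dev *b)
{
	if (!a->req_enabled)	/* 10-bit tags not enabled on requester */
		return false;

	if (b->comp_supported)	/* completer can handle anything */
		return false;

	return true;		/* error case */
}

/* Caller side: check both directions, with no negation needed. */
static bool p2p_10bit_tags_blocked(const struct model_dev *client,
				   const struct model_dev *provider)
{
	return tags_10bit_unsupported(client, provider) ||
	       tags_10bit_unsupported(provider, client);
}
```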

> +			if (a->is_virtfn)
> +				pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/sriov_vf_10bit_tag_ctl\n",
> +					 pci_name(a));
> +			else
> +				pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/10bit_tag\n",
> +					 pci_name(a));
> +		}
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
>  /*
>   * Calculate the P2PDMA mapping type and distance between two PCI devices.
>   *
> @@ -532,6 +568,10 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
>  		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
>  	}
>  done:
> +	if (!check_10bit_tags_vaild(client, provider, verbose) ||
> +	    !check_10bit_tags_vaild(provider, client, verbose))
> +		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
> +
>  	rcu_read_lock();
>  	p2pdma = rcu_dereference(provider->p2pdma);
>  	if (p2pdma)
> -- 
> 2.7.4
> 

* Re: [PATCH V7 4/9] PCI: Enable 10-Bit Tag support for PCIe Endpoint devices
  2021-08-05  7:47     ` Dongdong Liu
@ 2021-08-05 19:54       ` Bjorn Helgaas
  2021-08-07  6:19         ` Dongdong Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-05 19:54 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Thu, Aug 05, 2021 at 03:47:31PM +0800, Dongdong Liu wrote:
> Hi Bjorn
> 
> Many thanks for your review.
> On 2021/8/5 7:17, Bjorn Helgaas wrote:
> > On Wed, Aug 04, 2021 at 09:47:03PM +0800, Dongdong Liu wrote:
> > > 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
> > > field size from 8 bits to 10 bits.
> > > 
> > > PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
> > > 10-Bit Tag Capabilities" Implementation Note.
> > > For platforms where the RC supports 10-Bit Tag Completer capability,
> > > it is highly recommended for platform firmware or operating software
> > > that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
> > > bit automatically in Endpoints with 10-Bit Tag Requester capability. This
> > > enables the important class of 10-Bit Tag capable adapters that send
> > > Memory Read Requests only to host memory.
> > 
> > Quoted material should be set off with a blank line before it and
> > indented by two spaces so it's clear exactly what comes from the spec
> > and what you've added.  For example, see
> > https://git.kernel.org/linus/ec411e02b7a2
> Good point, will fix.
> > 
> > We need to say why we assume it's safe to enable 10-bit tags for all
> > devices below a Root Port that supports them.  I think this has to do
> > with switches being required to forward 10-bit tags correctly even if
> > they were designed before 10-bit tags were added to the spec.
> 
> PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
> 10-Bit Tag Capabilities" Implementation Note:
> 
>   Switches that lack 10-Bit Tag Completer capability are still able to
>   forward NPRs and Completions carrying 10-Bit Tags correctly, since the
>   two new Tag bits are in TLP Header bits that were formerly Reserved,
>   and Switches are required to forward Reserved TLP Header bits without
>   modification. However, if such a Switch detects an error with an NPR
>   carrying a 10-Bit Tag, and that Switch handles the error by acting as
>   the Completer for the NPR, the resulting Completion will have an
>   invalid 10-Bit Tag. Thus, it is strongly recommended that Switches
>   between any components using 10-Bit Tags support 10-Bit Tag Completer
>   capability.  Note that Switches supporting 16.0 GT/s data rates or
>   greater must support 10-Bit Tag Completer capability.
> 
> This patch also requires the RP and Switch devices to support 10-Bit
> Tag Completer capability before enabling 10-Bit Tag for an EP device.
> > 
> > And it should call out any cases where it is *not* safe, e.g., if P2P
> > traffic is an issue.
> Yes, indeed.
> > 
> > If there are cases where we don't want to enable 10-bit tags, whether
> > it's to enable P2P traffic or merely to work around device defects,
> > that ability needs to be here from the beginning.  If somebody needs
> > to bisect with 10-bit tags disabled, we don't want a bisection hole
> > between this commit and the commit that adds the control.
> We provide sysfs file to disable 10-bit tag for P2P traffic when needed.
> The details see PATCH 7/8/9.

A mechanism for avoiding problems needs to be present from the very
beginning so there's no bisection hole.  It should not be added by a
future patch.

The sysfs file is a start, but if we run into an issue, it could mean
that we can't boot and run long enough to use sysfs to disable 10-bit
tags.  So I think we might need a kernel parameter that disables it
(and possibly other things like MPS optimization).
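
As a rough sketch of what such a boot-time switch could look like, the
userspace model below scans a comma-separated option string for one
token. The token name "no_10bit_tag" is purely hypothetical (the
thread does not settle on a name), and the real kernel parsing lives
elsewhere; this only shows the exact-match token scan.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical option name, chosen only for this example. */
#define OPT_NO_10BIT_TAG "no_10bit_tag"

/* Return true if the comma-separated option string contains the token. */
static bool pci_opts_disable_10bit_tag(const char *opts)
{
	size_t want = strlen(OPT_NO_10BIT_TAG);

	assert(opts);

	while (*opts) {
		/* Length of the current comma-separated token. */
		size_t len = strcspn(opts, ",");

		/* Require an exact match, not just a common prefix. */
		if (len == want && !strncmp(opts, OPT_NO_10BIT_TAG, len))
			return true;

		opts += len;
		if (*opts == ',')
			opts++;
	}

	return false;
}
```

The point of a boot parameter over sysfs is that it takes effect
before enumeration enables any 10-bit tags, so it still helps when the
machine cannot boot far enough to reach sysfs.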

> We do not currently know of any devices with defective 10-bit tag
> support, so for now there may be no need for a quirk like the one
> quirk_no_ext_tags() provides for 8-bit tags.

* Re: [PATCH V7 5/9] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices
  2021-08-05  8:03     ` Dongdong Liu
@ 2021-08-06 22:59       ` Bjorn Helgaas
  2021-08-07  7:46         ` Dongdong Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-06 22:59 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Thu, Aug 05, 2021 at 04:03:58PM +0800, Dongdong Liu wrote:
> 
> On 2021/8/5 7:29, Bjorn Helgaas wrote:
> > On Wed, Aug 04, 2021 at 09:47:04PM +0800, Dongdong Liu wrote:
> > > Enable VF 10-Bit Tag Requester when it's upstream component support
> > > 10-bit Tag Completer.
> > 
> > I think "upstream component" here means the PF, doesn't it?  I don't
> > think the PF is really an *upstream* component; there's no routing
> > like with a switch.
>
> I want to say the switch and root port devices that support 10-Bit
> Tag Completer. Sure, VF also needs to have 10-bit Tag Requester
> Supported capability.

OK.  IIUC we're not talking about P2PDMA here; we're talking about
regular DMA to host memory, which means I *think* only the Root Port
is important, since it is the completer for DMA to host memory.  We're
not talking about P2PDMA to a switch BAR, where the switch would be
the completer.
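
To make the host-memory case concrete, here is a minimal userspace
sketch of that rule: for DMA to host memory, only the Root Port's
completer capability and the function's own requester capability are
consulted. struct node and both helpers are hypothetical names, and
the parent pointer is a simplification of the real PCI topology.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified topology node; parent is NULL for a Root Port. */
struct node {
	const struct node *parent;
	bool req_capable;	/* 10-Bit Tag Requester Supported */
	bool comp_capable;	/* 10-Bit Tag Completer Supported */
};

/* Walk upstream until the node with no parent, i.e. the Root Port. */
static const struct node *find_root_port(const struct node *dev)
{
	assert(dev);

	while (dev->parent)
		dev = dev->parent;

	return dev;
}

/*
 * For DMA to host memory the Root Port is the completer, so intermediate
 * Switches are deliberately not checked here.
 */
static bool ep_should_enable_10bit_req(const struct node *ep)
{
	const struct node *rp = find_root_port(ep);

	return ep->req_capable && rp->comp_capable;
}
```

Note that this sketch says nothing about P2PDMA, where the completer
is the peer device rather than the Root Port.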

* Re: [PATCH V7 4/9] PCI: Enable 10-Bit Tag support for PCIe Endpoint devices
  2021-08-05 19:54       ` Bjorn Helgaas
@ 2021-08-07  6:19         ` Dongdong Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-07  6:19 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev



On 2021/8/6 3:54, Bjorn Helgaas wrote:
> On Thu, Aug 05, 2021 at 03:47:31PM +0800, Dongdong Liu wrote:
>> Hi Bjorn
>>
>> Many thanks for your review.
>> On 2021/8/5 7:17, Bjorn Helgaas wrote:
>>> On Wed, Aug 04, 2021 at 09:47:03PM +0800, Dongdong Liu wrote:
>>>> 10-Bit Tag capability, introduced in PCIe-4.0 increases the total Tag
>>>> field size from 8 bits to 10 bits.
>>>>
>>>> PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
>>>> 10-Bit Tag Capabilities" Implementation Note.
>>>> For platforms where the RC supports 10-Bit Tag Completer capability,
>>>> it is highly recommended for platform firmware or operating software
>>>> that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
>>>> bit automatically in Endpoints with 10-Bit Tag Requester capability. This
>>>> enables the important class of 10-Bit Tag capable adapters that send
>>>> Memory Read Requests only to host memory.
>>>
>>> Quoted material should be set off with a blank line before it and
>>> indented by two spaces so it's clear exactly what comes from the spec
>>> and what you've added.  For example, see
>>> https://git.kernel.org/linus/ec411e02b7a2
>> Good point, will fix.
>>>
>>> We need to say why we assume it's safe to enable 10-bit tags for all
>>> devices below a Root Port that supports them.  I think this has to do
>>> with switches being required to forward 10-bit tags correctly even if
>>> they were designed before 10-bit tags were added to the spec.
>>
>> PCIe spec 5.0 r1.0 section 2.2.6.2 "Considerations for Implementing
>> 10-Bit Tag Capabilities" Implementation Note:
>>
>>   Switches that lack 10-Bit Tag Completer capability are still able to
>>   forward NPRs and Completions carrying 10-Bit Tags correctly, since the
>>   two new Tag bits are in TLP Header bits that were formerly Reserved,
>>   and Switches are required to forward Reserved TLP Header bits without
>>   modification. However, if such a Switch detects an error with an NPR
>>   carrying a 10-Bit Tag, and that Switch handles the error by acting as
>>   the Completer for the NPR, the resulting Completion will have an
>>   invalid 10-Bit Tag. Thus, it is strongly recommended that Switches
>>   between any components using 10-Bit Tags support 10-Bit Tag Completer
>>   capability.  Note that Switches supporting 16.0 GT/s data rates or
>>   greater must support 10-Bit Tag Completer capability.
>>
>> This patch also requires the RP and Switch devices to support 10-Bit
>> Tag Completer capability before enabling 10-Bit Tag for an EP device.
>>>
>>> And it should call out any cases where it is *not* safe, e.g., if P2P
>>> traffic is an issue.
>> Yes, indeed.
>>>
>>> If there are cases where we don't want to enable 10-bit tags, whether
>>> it's to enable P2P traffic or merely to work around device defects,
>>> that ability needs to be here from the beginning.  If somebody needs
>>> to bisect with 10-bit tags disabled, we don't want a bisection hole
>>> between this commit and the commit that adds the control.
>> We provide sysfs file to disable 10-bit tag for P2P traffic when needed.
>> The details see PATCH 7/8/9.
>
> A mechanism for avoiding problems needs to be present from the very
> beginning so there's no bisection hole.  It should not be added by a
> future patch.
Yes, will move PATCH 7/8/9 before PATCH 4.
>
> The sysfs file is a start, but if we run into an issue, it could mean
> that we can't boot and run long enough to use sysfs to disable 10-bit
> tags.  So I think we might need a kernel parameter that disables it
> (and possibly other things like MPS optimization).

Yes, we can add a pcie_tag_p2p kernel parameter that makes all PCIe
devices use only 8-bit tags, i.e., 10-bit tags are not enabled at all.

Thanks,
Dongdong


* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-05 15:31       ` Bjorn Helgaas
@ 2021-08-07  7:01         ` Dongdong Liu
  2021-08-09 17:37           ` Bjorn Helgaas
  0 siblings, 1 reply; 43+ messages in thread
From: Dongdong Liu @ 2021-08-07  7:01 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev


On 2021/8/5 23:31, Bjorn Helgaas wrote:
> On Thu, Aug 05, 2021 at 04:37:39PM +0800, Dongdong Liu wrote:
>>
>>
>> On 2021/8/5 7:49, Bjorn Helgaas wrote:
>>> On Wed, Aug 04, 2021 at 09:47:06PM +0800, Dongdong Liu wrote:
>>>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>>>> sending Requests to other Endpoints (as opposed to host memory), the
>>>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>>>> unless an implementation-specific mechanism determines that the Endpoint
>>>> supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
>>>> write 0 to disable 10-Bit Tag Requester when the driver does not bind
>>>> the device if the peer device does not support the 10-Bit Tag Completer.
>>>> This will make P2P traffic safe. the 10bit_tag file content indicate
>>>> current 10-Bit Tag Requester Enable status.
>>>>
>>>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>>>> ---
>>>>  Documentation/ABI/testing/sysfs-bus-pci | 16 +++++++-
>>>>  drivers/pci/pci-sysfs.c                 | 69 +++++++++++++++++++++++++++++++++
>>>>  2 files changed, 84 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
>>>> index 793cbb7..0e0c97d 100644
>>>> --- a/Documentation/ABI/testing/sysfs-bus-pci
>>>> +++ b/Documentation/ABI/testing/sysfs-bus-pci
>>>> @@ -139,7 +139,7 @@ Description:
>>>>  		binary file containing the Vital Product Data for the
>>>>  		device.  It should follow the VPD format defined in
>>>>  		PCI Specification 2.1 or 2.2, but users should consider
>>>> -		that some devices may have incorrectly formatted data.
>>>> +		that some devices may have incorrectly formatted data.
>>>>  		If the underlying VPD has a writable section then the
>>>>  		corresponding section of this file will be writable.
>>>>
>>>> @@ -407,3 +407,17 @@ Description:
>>>>
>>>>  		The file is writable if the PF is bound to a driver that
>>>>  		implements ->sriov_set_msix_vec_count().
>>>> +
>>>> +What:		/sys/bus/pci/devices/.../10bit_tag
>>>> +Date:		August 2021
>>>> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
>>>> +Description:
>>>> +		If a PCI device support 10-Bit Tag Requester, will create the
>>>> +		10bit_tag sysfs file. The file is readable, the value
>>>> +		indicate current 10-Bit Tag Requester Enable.
>>>> +		1 - enabled, 0 - disabled.
>>>> +
>>>> +		The file is also writeable, the value only accept by write 0
>>>> +		to disable 10-Bit Tag Requester when the driver does not bind
>>>> +		the deivce. The typical use case is for p2pdma when the peer
>>>> +		device does not support 10-BIT Tag Completer.
>
>>>> +static ssize_t pci_10bit_tag_store(struct device *dev,
>>>> +				   struct device_attribute *attr,
>>>> +				   const char *buf, size_t count)
>>>> +{
>>>> +	struct pci_dev *pdev = to_pci_dev(dev);
>>>> +	bool enable;
>>>> +
>>>> +	if (kstrtobool(buf, &enable) < 0)
>>>> +		return -EINVAL;
>>>> +
>>>> +	if (enable != false )
>>>> +		return -EINVAL;
>>>
>>> Is this the same as "if (enable)"?
>> Yes, Will fix.
>
> I actually don't like the one-way nature of this.  When the hierarchy
> supports 10-bit tags, we automatically enable them during enumeration.
>
> Then we provide this sysfs file, but it can only *disable* 10-bit
> tags.  There's no way to re-enable them except by rebooting (or using
> setpci, I guess).
>
> Why can't we allow *enabling* them here if they're supported in this
> hierarchy?
Yes, we can also let this sysfs file enable 10-bit tags for EP devices
when the hierarchy supports 10-bit tags.

I do not want to provide a sysfs file to enable/disable 10-bit tags for
RP devices, because I currently cannot tell whether the Function has
outstanding Non-Posted Requests. We might need to unbind all the EP
drivers under the RP, and there currently seems to be no scenario that
needs this. It would make things more complex.
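
A two-way store for EP devices could follow the shape sketched below.
This is a userspace model under stated assumptions, not the patch:
struct fake_ep and ep_10bit_tag_store() are hypothetical, the
"hierarchy supports 10-bit tags" condition is reduced to one boolean,
and the choice of -EOPNOTSUPP for the unsupported case is mine.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state a sysfs store would consult. */
struct fake_ep {
	bool driver_bound;	/* a driver is bound to the device */
	bool req_capable;	/* 10-Bit Tag Requester Supported */
	bool hier_comp_capable;	/* hierarchy can complete 10-bit tags */
	bool req_enabled;	/* models the Requester Enable bit */
};

/* Returns 0 on success or a negative errno value. */
static int ep_10bit_tag_store(struct fake_ep *ep, bool enable)
{
	assert(ep);

	/* Do not flip the bit while a driver may have requests in flight. */
	if (ep->driver_bound)
		return -EBUSY;

	/* Enabling is only allowed where it would have been auto-enabled. */
	if (enable && !(ep->req_capable && ep->hier_comp_capable))
		return -EOPNOTSUPP;

	ep->req_enabled = enable;
	return 0;
}
```

Disabling is always accepted when no driver is bound; enabling is
gated on the same conditions enumeration would have checked.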

Thanks,
Dongdong
>
>>>> +	if (pdev->driver)
>>>> +		 return -EBUSY;
>>>> +
>>>> +	pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2,
>>>> +				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>>>> +	pci_info(pdev, "disabled 10-Bit Tag Requester\n");
>>>> +
>>>> +	return count;
>>>> +}
> .
>

* Re: [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-08-05 18:12   ` Bjorn Helgaas
@ 2021-08-07  7:11     ` Dongdong Liu
  2021-08-09 17:31       ` Bjorn Helgaas
  0 siblings, 1 reply; 43+ messages in thread
From: Dongdong Liu @ 2021-08-07  7:11 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev


On 2021/8/6 2:12, Bjorn Helgaas wrote:
> On Wed, Aug 04, 2021 at 09:47:08PM +0800, Dongdong Liu wrote:
>> Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
>> 10-Bit Tag Requester doesn't interact with a device that does not
>> support 10-BIT Tag Completer. Before that happens, the kernel should
>> emit a warning. "echo 0 > /sys/bus/pci/devices/.../10bit_tag" to
>> disable 10-BIT Tag Requester for PF device.
>> "echo 0 > /sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl" to disable
>> 10-BIT Tag Requester for VF device.
>
> s/10-BIT/10-Bit/ several times.
Will fix.
>
> Add blank lines between paragraphs.
Will fix.
>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> ---
>>  drivers/pci/p2pdma.c | 40 ++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 40 insertions(+)
>>
>> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
>> index 50cdde3..948f2be 100644
>> --- a/drivers/pci/p2pdma.c
>> +++ b/drivers/pci/p2pdma.c
>> @@ -19,6 +19,7 @@
>>  #include <linux/random.h>
>>  #include <linux/seq_buf.h>
>>  #include <linux/xarray.h>
>> +#include "pci.h"
>>
>>  enum pci_p2pdma_map_type {
>>  	PCI_P2PDMA_MAP_UNKNOWN = 0,
>> @@ -410,6 +411,41 @@ static unsigned long map_types_idx(struct pci_dev *client)
>>  		(client->bus->number << 8) | client->devfn;
>>  }
>>
>> +static bool check_10bit_tags_vaild(struct pci_dev *a, struct pci_dev *b,
>
> s/vaild/valid/
>
> Or maybe s/valid/safe/ or s/valid/supported/, since "valid" isn't
> quite the right word here.  We want to know whether the source is
> enabled to generate 10-bit tags, and if so, whether the destination
> can handle them.
>
> "if (check_10bit_tags_valid())" does not make sense because
> "check_10bit_tags_valid()" is not a question with a yes/no answer.
>
> "10bit_tags_valid()" *might* be, because "if (10bit_tags_valid())"
> makes sense.  But I don't think you can start with a digit.
>
> Or maybe you want to invert the sense, e.g.,
> "10bit_tags_unsupported()", since that avoids negation at the caller:
>
>   if (10bit_tags_unsupported(a, b) ||
>       10bit_tags_unsupported(b, a))
>         map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
Good suggestion. I will also add a pci_ prefix:

  if (pci_10bit_tags_unsupported(a, b) ||
      pci_10bit_tags_unsupported(b, a))
        map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;

> Doesn't this patch need to be at the very beginning, before you start
> enabling 10-bit tags?  Otherwise there's a hole in the middle where we
> enable them and P2P DMA might break.
Yes, will do.
>
>> +				   bool verbose)
>> +{
>> +	bool req;
>> +	bool comp;
>> +	u16 ctl2;
>> +
>> +	if (a->is_virtfn) {
>> +#ifdef CONFIG_PCI_IOV
>> +		req = !!(a->physfn->sriov->ctrl &
>> +			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
>> +#endif
>> +	} else {
>> +		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl2);
>> +		req = !!(ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>> +	}
>> +
>> +	comp = !!(b->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
>> +	if (req && (!comp)) {
>> +		if (verbose) {
>> +			pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set in device (%s), but peer device (%s) does not support the 10-Bit Tag Completer\n",
>> +				 pci_name(a), pci_name(b));
>
> No point in printing pci_name(a) twice.  pci_warn() prints it already;
> that should be enough.
Will fix.
>
> I think you can simplify this a little, e.g.,
>
>   if (!req)           /* 10-bit tags not enabled on requester */
>     return true;
>
>   if (comp)           /* completer can handle anything */
>     return true;
>
>   /* error case */
>   if (!verbose)
>     return false;
>
>   pci_warn(...);
>   return false;

Good point, this will make the code cleaner and more readable.

Thanks,
Dongdong
>
>> +			if (a->is_virtfn)
>> +				pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/sriov_vf_10bit_tag_ctl\n",
>> +					 pci_name(a));
>> +			else
>> +				pci_warn(a, "to disable 10-Bit Tag Requester for this device, echo 0 > /sys/bus/pci/devices/%s/10bit_tag\n",
>> +					 pci_name(a));
>> +		}
>> +		return false;
>> +	}
>> +
>> +	return true;
>> +}
>> +
>>  /*
>>   * Calculate the P2PDMA mapping type and distance between two PCI devices.
>>   *
>> @@ -532,6 +568,10 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
>>  		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
>>  	}
>>  done:
>> +	if (!check_10bit_tags_vaild(client, provider, verbose) ||
>> +	    !check_10bit_tags_vaild(provider, client, verbose))
>> +		map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
>> +
>>  	rcu_read_lock();
>>  	p2pdma = rcu_dereference(provider->p2pdma);
>>  	if (p2pdma)
>> --
>> 2.7.4
>>
> .
>

* Re: [PATCH V7 5/9] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices
  2021-08-06 22:59       ` Bjorn Helgaas
@ 2021-08-07  7:46         ` Dongdong Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-07  7:46 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev



On 2021/8/7 6:59, Bjorn Helgaas wrote:
> On Thu, Aug 05, 2021 at 04:03:58PM +0800, Dongdong Liu wrote:
>>
>> On 2021/8/5 7:29, Bjorn Helgaas wrote:
>>> On Wed, Aug 04, 2021 at 09:47:04PM +0800, Dongdong Liu wrote:
>>>> Enable VF 10-Bit Tag Requester when it's upstream component support
>>>> 10-bit Tag Completer.
>>>
>>> I think "upstream component" here means the PF, doesn't it?  I don't
>>> think the PF is really an *upstream* component; there's no routing
>>> like with a switch.
>>
>> I want to say the switch and root port devices that support 10-Bit
>> Tag Completer. Sure, VF also needs to have 10-bit Tag Requester
>> Supported capability.
>
> OK.  IIUC we're not talking about P2PDMA here; we're talking about
> regular DMA to host memory, which means I *think* only the Root Port
> is important, since it is the completer for DMA to host memory.  We're
> not talking about P2PDMA to a switch BAR, where the switch would be
> the completer.

Yes, only the Root Port is important; this is also what the PCIe spec
recommends.

The resulting Completion has an invalid 10-Bit Tag only when a Switch
detects an error with an NPR carrying a 10-Bit Tag and handles the
error by acting as the Completer for the NPR.

In "[PATCH V7 4/9] PCI: Enable 10-Bit Tag support for PCIe Endpoint
devices", enabling 10-bit tags for EP devices depends on the whole
hierarchy (including Switches) supporting 10-bit tags. This seems
complex. I will fix this so that, for regular DMA to host memory,
enabling 10-bit tags for EP devices depends only on the Root Port
supporting 10-Bit Tag Completer.

Below is the relevant text from PCIe spec 5.0 r1.0 section 2.2.6.2
"Considerations for Implementing 10-Bit Tag Capabilities"
Implementation Note:

   For platforms where the RC supports 10-Bit Tag Completer capability,
   it is highly recommended for platform firmware or operating software
   that configures PCIe hierarchies to Set the 10-Bit Tag Requester
   Enable bit automatically in Endpoints with 10-Bit Tag Requester
   capability. This enables the important class of 10-Bit Tag capable
   adapters that send Memory Read Requests only to host memory.

...
   Switches that lack 10-Bit Tag Completer capability are still able to
   forward NPRs and Completions carrying 10-Bit Tags correctly, since the
   two new Tag bits are in TLP Header bits that were formerly Reserved,
   and Switches are required to forward Reserved TLP Header bits without
   modification. However, if such a Switch detects an error with an NPR
   carrying a 10-Bit Tag, and that Switch handles the error by acting as
   the Completer for the NPR, the resulting Completion will have an
   invalid 10-Bit Tag. Thus, it is strongly recommended that Switches
   between any components using 10-Bit Tags support 10-Bit Tag Completer
   capability.  Note that Switches supporting 16.0 GT/s data rates or
   greater must support 10-Bit Tag Completer capability.

Thanks,
Dongdong

> .
>

* Re: [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices
  2021-08-05  8:25     ` Dongdong Liu
@ 2021-08-09 17:26       ` Bjorn Helgaas
  2021-08-10 11:59         ` Dongdong Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-09 17:26 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Thu, Aug 05, 2021 at 04:25:23PM +0800, Dongdong Liu wrote:
> On 2021/8/5 7:38, Bjorn Helgaas wrote:
> > On Wed, Aug 04, 2021 at 09:47:05PM +0800, Dongdong Liu wrote:
> > > PCIe spec 5.0r1.0 section 2.2.6.2 implementation note, In configurations
> > > where a Requester with 10-Bit Tag Requester capability needs to target
> > > multiple Completers, one needs to ensure that the Requester sends 10-Bit
> > > Tag Requests only to Completers that have 10-Bit Tag Completer capability.
> > > So we enable 10-Bit Tag Requester for root port only when the devices
> > > under the root port support 10-Bit Tag Completer.
> > 
> > Fix quoting.  I can't tell what is from the spec and what you wrote.
> Will fix.
> > 
> > > Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> > > ---
> > >  drivers/pci/pcie/portdrv_pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 69 insertions(+)
> > > 
> > > diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> > > index c7ff1ee..2382cd2 100644
> > > --- a/drivers/pci/pcie/portdrv_pci.c
> > > +++ b/drivers/pci/pcie/portdrv_pci.c
> > > @@ -90,6 +90,72 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
> > >  #define PCIE_PORTDRV_PM_OPS	NULL
> > >  #endif /* !PM */
> > > 
> > > +static int pci_10bit_tag_comp_support(struct pci_dev *dev, void *data)
> > > +{
> > > +	bool *support = (bool *)data;
> > > +
> > > +	if (!pci_is_pcie(dev)) {
> > > +		*support = false;
> > > +		return 1;
> > > +	}
> > > +
> > > +	/*
> > > +	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
> > > +	 * For configurations where a Requester with 10-Bit Tag Requester
> > > +	 * capability targets Completers where some do and some do not have
> > > +	 * 10-Bit Tag Completer capability, how the Requester determines which
> > > +	 * NPRs include 10-Bit Tags is outside the scope of this specification.
> > > +	 * So we do not consider hotplug scenario.
> > > +	 */
> > > +	if (dev->is_hotplug_bridge) {
> > > +		*support = false;
> > > +		return 1;
> > > +	}
> > > +
> > > +	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP)) {
> > > +		*support = false;
> > > +		return 1;
> > > +	}
> > > +
> > > +	return 0;
> > > +}
> > > +
> > > +static void pci_configure_rp_10bit_tag(struct pci_dev *dev)
> > > +{
> > > +	bool support = true;
> > > +
> > > +	if (dev->subordinate == NULL)
> > > +		return;
> > > +
> > > +	/* If no devices under the root port, no need to enable 10-Bit Tag. */
> > > +	if (list_empty(&dev->subordinate->devices))
> > > +		return;
> > > +
> > > +	pci_10bit_tag_comp_support(dev, &support);
> > > +	if (!support)
> > > +		return;
> > > +
> > > +	/*
> > > +	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
> > > +	 * In configurations where a Requester with 10-Bit Tag Requester
> > > +	 * capability needs to target multiple Completers, one needs to ensure
> > > +	 * that the Requester sends 10-Bit Tag Requests only to Completers
> > > +	 * that have 10-Bit Tag Completer capability. So we enable 10-Bit Tag
> > > +	 * Requester for root port only when the devices under the root port
> > > +	 * support 10-Bit Tag Completer.
> > > +	 */
> > > +	pci_walk_bus(dev->subordinate, pci_10bit_tag_comp_support, &support);
> > > +	if (!support)
> > > +		return;
> > > +
> > > +	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
> > > +		return;
> > > +
> > > +	pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
> > > +	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
> > > +				 PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> > > +}
> > > +
> > >  /*
> > >   * pcie_portdrv_probe - Probe PCI-Express port devices
> > >   * @dev: PCI-Express port device being probed
> > > @@ -111,6 +177,9 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
> > >  	     (type != PCI_EXP_TYPE_RC_EC)))
> > >  		return -ENODEV;
> > > 
> > > +	if (type == PCI_EXP_TYPE_ROOT_PORT)
> > > +		pci_configure_rp_10bit_tag(dev);
> > 
> > I don't think this has anything to do with the portdrv, so all this
> > should go somewhere else.
>
> Yes, any suggestion where to put the code?

It seems similar to pci_configure_ltr(), pci_configure_eetlp_prefix(),
and other things in drivers/pci/probe.c, so maybe there?

Or, if this is more of a theoretical advantage than a demonstrated
performance improvement, we could just hold off on doing it until it
becomes important.  I can't tell if you have a scenario that actually
benefits from this yet.
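
Wherever the code ends up, the logic in the quoted patch is a
walk-and-stop pattern: visit every device below the Root Port and
abort at the first one that cannot complete 10-bit tags. The userspace
model below sketches that pattern only. struct bus_dev, walk_devs()
and the other names are hypothetical; the real patch uses
pci_walk_bus() on dev->subordinate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; children are the devices on the secondary bus. */
struct bus_dev {
	bool is_pcie;
	bool is_hotplug_bridge;
	bool comp_capable;	/* 10-Bit Tag Completer Supported */
	const struct bus_dev *children;
	size_t nr_children;
};

typedef int (*walk_cb)(const struct bus_dev *dev, void *data);

/* Depth-first walk that stops as soon as the callback returns non-zero. */
static int walk_devs(const struct bus_dev *devs, size_t n, walk_cb cb,
		     void *data)
{
	for (size_t i = 0; i < n; i++) {
		int ret = cb(&devs[i], data);

		if (ret)
			return ret;

		ret = walk_devs(devs[i].children, devs[i].nr_children,
				cb, data);
		if (ret)
			return ret;
	}

	return 0;
}

/* Clear *support and stop the walk at the first unsuitable device. */
static int comp_support_cb(const struct bus_dev *dev, void *data)
{
	bool *support = data;

	if (!dev->is_pcie || dev->is_hotplug_bridge || !dev->comp_capable) {
		*support = false;
		return 1;
	}

	return 0;
}

/* True only if there are devices below 'rp' and all can complete 10-bit tags. */
static bool all_below_support_comp(const struct bus_dev *rp)
{
	bool support = true;

	assert(rp);

	if (rp->nr_children == 0)	/* nothing below, nothing to enable for */
		return false;

	walk_devs(rp->children, rp->nr_children, comp_support_cb, &support);
	return support;
}
```

As in the patch, a hotplug bridge anywhere below the Root Port makes
the answer "no", since devices added later cannot be checked up front.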

> > Out of curiosity, IIUC this enables 10-bit tags for MMIO transactions
> > from the root port toward the device, i.e., traffic that originates
> > from a CPU.  Is that a significant benefit?  I would expect high-speed
> > devices would primarily operate via DMA with relatively little MMIO
> > traffic.
>
> The benefits of 10-Bit Tag for EPs are obvious.
> There are few scenarios for RPs. I can think of only two:
> 1. The RC has its own DMA.
> 2. The P2P tag is replaced at the RP when P2PDMA goes through the RP.


* Re: [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-08-07  7:11     ` Dongdong Liu
@ 2021-08-09 17:31       ` Bjorn Helgaas
  2021-08-10 12:31         ` Dongdong Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-09 17:31 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Sat, Aug 07, 2021 at 03:11:34PM +0800, Dongdong Liu wrote:
> 
> On 2021/8/6 2:12, Bjorn Helgaas wrote:
> > On Wed, Aug 04, 2021 at 09:47:08PM +0800, Dongdong Liu wrote:
> > > Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
> > > 10-Bit Tag Requester doesn't interact with a device that does not
> > > support 10-BIT Tag Completer. Before that happens, the kernel should
> > > emit a warning. "echo 0 > /sys/bus/pci/devices/.../10bit_tag" to
> > > disable 10-BIT Tag Requester for PF device.
> > > "echo 0 > /sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl" to disable
> > > 10-BIT Tag Requester for VF device.
> > 
> > s/10-BIT/10-Bit/ several times.
> Will fix.
> > 
> > Add blank lines between paragraphs.
> Will fix.
> > 
> > > Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> > > ---
> > >  drivers/pci/p2pdma.c | 40 ++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 40 insertions(+)
> > > 
> > > diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> > > index 50cdde3..948f2be 100644
> > > --- a/drivers/pci/p2pdma.c
> > > +++ b/drivers/pci/p2pdma.c
> > > @@ -19,6 +19,7 @@
> > >  #include <linux/random.h>
> > >  #include <linux/seq_buf.h>
> > >  #include <linux/xarray.h>
> > > +#include "pci.h"
> > > 
> > >  enum pci_p2pdma_map_type {
> > >  	PCI_P2PDMA_MAP_UNKNOWN = 0,
> > > @@ -410,6 +411,41 @@ static unsigned long map_types_idx(struct pci_dev *client)
> > >  		(client->bus->number << 8) | client->devfn;
> > >  }
> > > 
> > > +static bool check_10bit_tags_vaild(struct pci_dev *a, struct pci_dev *b,
> > 
> > s/vaild/valid/
> > 
> > Or maybe s/valid/safe/ or s/valid/supported/, since "valid" isn't
> > quite the right word here.  We want to know whether the source is
> > enabled to generate 10-bit tags, and if so, whether the destination
> > can handle them.
> > 
> > "if (check_10bit_tags_valid())" does not make sense because
> > "check_10bit_tags_valid()" is not a question with a yes/no answer.
> > 
> > "10bit_tags_valid()" *might* be, because "if (10bit_tags_valid())"
> > makes sense.  But I don't think you can start with a digit.
> > 
> > Or maybe you want to invert the sense, e.g.,
> > "10bit_tags_unsupported()", since that avoids negation at the caller:
> > 
> >   if (10bit_tags_unsupported(a, b) ||
> >       10bit_tags_unsupported(b, a))
> >         map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
> Good suggestion. add a pci_ prefix.
> 
> if (pci_10bit_tags_unsupported(a, b) ||
>     pci_10bit_tags_unsupported(b, a))
> 	map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;

This treats both directions as equally important.  I don't know P2PDMA
very well, but that doesn't seem like it would necessarily be the
case.  I would think a common case would be device A doing DMA to B,
but B *not* doing DMA to A.  So can you tell which direction you're
setting up here, and can you take advantage of any asymmetry, e.g., by
enabling 10-bit tags in the direction that supports it even if the
other direction does not?
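
To illustrate the asymmetry question, the sketch below evaluates each
direction on its own and reports which ones are safe, instead of
collapsing both into a single yes/no. It is a userspace model with
hypothetical names (struct p2p_dev, enum p2p_dir, p2p_safe_dirs());
whether the P2PDMA code can actually tell the directions apart is
exactly what is being asked above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state relevant to 10-bit tags. */
struct p2p_dev {
	bool req_enabled;	/* 10-Bit Tag Requester Enable is set */
	bool comp_capable;	/* 10-Bit Tag Completer Supported */
};

/* Bitmask of directions in which requests are safe. */
enum p2p_dir {
	P2P_DIR_NONE	= 0,
	P2P_DIR_A_TO_B	= 1 << 0,
	P2P_DIR_B_TO_A	= 1 << 1,
};

/* A direction is safe unless the requester uses tags the completer lacks. */
static bool dir_safe(const struct p2p_dev *req, const struct p2p_dev *comp)
{
	return !req->req_enabled || comp->comp_capable;
}

static unsigned int p2p_safe_dirs(const struct p2p_dev *a,
				  const struct p2p_dev *b)
{
	unsigned int dirs = P2P_DIR_NONE;

	assert(a && b);

	if (dir_safe(a, b))
		dirs |= P2P_DIR_A_TO_B;
	if (dir_safe(b, a))
		dirs |= P2P_DIR_B_TO_A;

	return dirs;
}
```

With a result like this, a mapping that only ever has A issuing
requests to B would need just the P2P_DIR_A_TO_B bit.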

* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-07  7:01         ` Dongdong Liu
@ 2021-08-09 17:37           ` Bjorn Helgaas
  2021-08-10 12:16             ` Dongdong Liu
  0 siblings, 1 reply; 43+ messages in thread
From: Bjorn Helgaas @ 2021-08-09 17:37 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Sat, Aug 07, 2021 at 03:01:56PM +0800, Dongdong Liu wrote:
> 
> On 2021/8/5 23:31, Bjorn Helgaas wrote:
> > On Thu, Aug 05, 2021 at 04:37:39PM +0800, Dongdong Liu wrote:
> > > 
> > > 
> > > On 2021/8/5 7:49, Bjorn Helgaas wrote:
> > > > On Wed, Aug 04, 2021 at 09:47:06PM +0800, Dongdong Liu wrote:
> > > > > PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> > > > > sending Requests to other Endpoints (as opposed to host memory), the
> > > > > Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> > > > > unless an implementation-specific mechanism determines that the Endpoint
> > > > > supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
> > > > > write 0 to disable 10-Bit Tag Requester when the driver does not bind
> > > > > the device if the peer device does not support the 10-Bit Tag Completer.
> > > > > This will make P2P traffic safe. the 10bit_tag file content indicate
> > > > > current 10-Bit Tag Requester Enable status.
> > > > > 
> > > > > Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> > > > > ---
> > > > >  Documentation/ABI/testing/sysfs-bus-pci | 16 +++++++-
> > > > >  drivers/pci/pci-sysfs.c                 | 69 +++++++++++++++++++++++++++++++++
> > > > >  2 files changed, 84 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
> > > > > index 793cbb7..0e0c97d 100644
> > > > > --- a/Documentation/ABI/testing/sysfs-bus-pci
> > > > > +++ b/Documentation/ABI/testing/sysfs-bus-pci
> > > > > @@ -139,7 +139,7 @@ Description:
> > > > >  		binary file containing the Vital Product Data for the
> > > > >  		device.  It should follow the VPD format defined in
> > > > >  		PCI Specification 2.1 or 2.2, but users should consider
> > > > > -		that some devices may have incorrectly formatted data.
> > > > > +		that some devices may have incorrectly formatted data.
> > > > >  		If the underlying VPD has a writable section then the
> > > > >  		corresponding section of this file will be writable.
> > > > > 
> > > > > @@ -407,3 +407,17 @@ Description:
> > > > > 
> > > > >  		The file is writable if the PF is bound to a driver that
> > > > >  		implements ->sriov_set_msix_vec_count().
> > > > > +
> > > > > +What:		/sys/bus/pci/devices/.../10bit_tag
> > > > > +Date:		August 2021
> > > > > +Contact:	Dongdong Liu <liudongdong3@huawei.com>
> > > > > +Description:
> > > > > +		If a PCI device support 10-Bit Tag Requester, will create the
> > > > > +		10bit_tag sysfs file. The file is readable, the value
> > > > > +		indicate current 10-Bit Tag Requester Enable.
> > > > > +		1 - enabled, 0 - disabled.
> > > > > +
> > > > > +		The file is also writeable, the value only accept by write 0
> > > > > +		to disable 10-Bit Tag Requester when the driver does not bind
> > > > > +		the deivce. The typical use case is for p2pdma when the peer
> > > > > +		device does not support 10-BIT Tag Completer.
> > 
> > > > > +static ssize_t pci_10bit_tag_store(struct device *dev,
> > > > > +				   struct device_attribute *attr,
> > > > > +				   const char *buf, size_t count)
> > > > > +{
> > > > > +	struct pci_dev *pdev = to_pci_dev(dev);
> > > > > +	bool enable;
> > > > > +
> > > > > +	if (kstrtobool(buf, &enable) < 0)
> > > > > +		return -EINVAL;
> > > > > +
> > > > > +	if (enable != false )
> > > > > +		return -EINVAL;
> > > > 
> > > > Is this the same as "if (enable)"?
> > > Yes, Will fix.
> > 
> > I actually don't like the one-way nature of this.  When the hierarchy
> > supports 10-bit tags, we automatically enable them during enumeration.
> > 
> > Then we provide this sysfs file, but it can only *disable* 10-bit
> > tags.  There's no way to re-enable them except by rebooting (or using
> > setpci, I guess).
> > 
> > Why can't we allow *enabling* them here if they're supported in this
> > hierarchy?
> Yes, we can also provide this sysfs to enable 10-bit tag for EP devices
> when the hierarchy supports 10-bit tags.
> 
> I do not want to provide sysfs to enable/disable 10-bit tag for RP
> devices as I can not tell current if the the Function has outstanding
> Non-Posted Requests, may need to unbind all the EP drivers under the
> RP, and current seems no scenario need to do this. This will make things
> more complex.

You mean "no scenario requires disabling 10-bit tags and then
re-enabling them"?  That may be true, but I'm still hesitant to
provide a switch that can only be reversed by rebooting.

This is similar to the issue Leon raised that it's not practical to
reboot machines.  Maybe we accept a one-way switch if the sole purpose
is to work around a hardware defect.  Or maybe a kernel parameter that
disables 10-bit tags completely is the defect mitigation.  I think we
probably need such a parameter in case a defect prevents us from
booting far enough to use the sysfs switch.


* Re: [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices
  2021-08-09 17:26       ` Bjorn Helgaas
@ 2021-08-10 11:59         ` Dongdong Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-10 11:59 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev



On 2021/8/10 1:26, Bjorn Helgaas wrote:
> On Thu, Aug 05, 2021 at 04:25:23PM +0800, Dongdong Liu wrote:
>> On 2021/8/5 7:38, Bjorn Helgaas wrote:
>>> On Wed, Aug 04, 2021 at 09:47:05PM +0800, Dongdong Liu wrote:
>>>> PCIe spec 5.0r1.0 section 2.2.6.2 implementation note, In configurations
>>>> where a Requester with 10-Bit Tag Requester capability needs to target
>>>> multiple Completers, one needs to ensure that the Requester sends 10-Bit
>>>> Tag Requests only to Completers that have 10-Bit Tag Completer capability.
>>>> So we enable 10-Bit Tag Requester for root port only when the devices
>>>> under the root port support 10-Bit Tag Completer.
>>>
>>> Fix quoting.  I can't tell what is from the spec and what you wrote.
>> Will fix.
>>>
>>>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>>>> ---
>>>>  drivers/pci/pcie/portdrv_pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 69 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
>>>> index c7ff1ee..2382cd2 100644
>>>> --- a/drivers/pci/pcie/portdrv_pci.c
>>>> +++ b/drivers/pci/pcie/portdrv_pci.c
>>>> @@ -90,6 +90,72 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
>>>>  #define PCIE_PORTDRV_PM_OPS	NULL
>>>>  #endif /* !PM */
>>>>
>>>> +static int pci_10bit_tag_comp_support(struct pci_dev *dev, void *data)
>>>> +{
>>>> +	bool *support = (bool *)data;
>>>> +
>>>> +	if (!pci_is_pcie(dev)) {
>>>> +		*support = false;
>>>> +		return 1;
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
>>>> +	 * For configurations where a Requester with 10-Bit Tag Requester
>>>> +	 * capability targets Completers where some do and some do not have
>>>> +	 * 10-Bit Tag Completer capability, how the Requester determines which
>>>> +	 * NPRs include 10-Bit Tags is outside the scope of this specification.
>>>> +	 * So we do not consider hotplug scenario.
>>>> +	 */
>>>> +	if (dev->is_hotplug_bridge) {
>>>> +		*support = false;
>>>> +		return 1;
>>>> +	}
>>>> +
>>>> +	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP)) {
>>>> +		*support = false;
>>>> +		return 1;
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void pci_configure_rp_10bit_tag(struct pci_dev *dev)
>>>> +{
>>>> +	bool support = true;
>>>> +
>>>> +	if (dev->subordinate == NULL)
>>>> +		return;
>>>> +
>>>> +	/* If no devices under the root port, no need to enable 10-Bit Tag. */
>>>> +	if (list_empty(&dev->subordinate->devices))
>>>> +		return;
>>>> +
>>>> +	pci_10bit_tag_comp_support(dev, &support);
>>>> +	if (!support)
>>>> +		return;
>>>> +
>>>> +	/*
>>>> +	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
>>>> +	 * In configurations where a Requester with 10-Bit Tag Requester
>>>> +	 * capability needs to target multiple Completers, one needs to ensure
>>>> +	 * that the Requester sends 10-Bit Tag Requests only to Completers
>>>> +	 * that have 10-Bit Tag Completer capability. So we enable 10-Bit Tag
>>>> +	 * Requester for root port only when the devices under the root port
>>>> +	 * support 10-Bit Tag Completer.
>>>> +	 */
>>>> +	pci_walk_bus(dev->subordinate, pci_10bit_tag_comp_support, &support);
>>>> +	if (!support)
>>>> +		return;
>>>> +
>>>> +	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
>>>> +		return;
>>>> +
>>>> +	pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
>>>> +	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
>>>> +				 PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>>>> +}
>>>> +
>>>>  /*
>>>>   * pcie_portdrv_probe - Probe PCI-Express port devices
>>>>   * @dev: PCI-Express port device being probed
>>>> @@ -111,6 +177,9 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
>>>>  	     (type != PCI_EXP_TYPE_RC_EC)))
>>>>  		return -ENODEV;
>>>>
>>>> +	if (type == PCI_EXP_TYPE_ROOT_PORT)
>>>> +		pci_configure_rp_10bit_tag(dev);
>>>
>>> I don't think this has anything to do with the portdrv, so all this
>>> should go somewhere else.
>>
>> Yes, any suggestion where to put the code?
>
> It seems similar to pci_configure_ltr(), pci_configure_eetlp_prefix(),
> and other things in drivers/pci/probe.c, so maybe there?
It seems similar to pcie_bus_configure_settings().
Enabling the RP 10-Bit Tag Requester requires knowing the 10-Bit Tag
Completer capability of all the EP devices under the RP.
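
The "all devices under the RP" condition can be sketched as follows.
This is a simplified userspace model under stated assumptions: struct
ep_model and rp_can_enable_10bit_tag_req() are hypothetical names, and a
flat array replaces the pci_walk_bus() traversal used in the quoted
patch.  The per-device tests mirror the ones in the quoted
pci_10bit_tag_comp_support() callback; the Root Port's own Requester
capability is not modeled here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-device view of what the walk callback inspects. */
struct ep_model {
	bool is_pcie;
	bool is_hotplug_bridge;
	bool tag_comp_capable;	/* 10-Bit Tag Completer Supported */
};

/* True only if every device below the Root Port can complete 10-bit tags. */
static inline bool rp_can_enable_10bit_tag_req(const struct ep_model *devs,
					       size_t n)
{
	size_t i;

	/* Nothing below the port: no reason to enable the Requester. */
	if (n == 0)
		return false;

	for (i = 0; i < n; i++) {
		if (!devs[i].is_pcie || devs[i].is_hotplug_bridge ||
		    !devs[i].tag_comp_capable)
			return false;
	}

	return true;
}
```

A single device without Completer capability anywhere below the port is
enough to keep the Requester disabled, which is why the decision cannot
be made while probing the port alone.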

>
> Or, if this is more of a theoretical advantage than a demonstrated
> performance improvement, we could just hold off on doing it until it
> becomes important.  I can't tell if you have a scenario that actually
> benefits from this yet.

OK, I will remove this patch from the patchset.
We will do this later once we have performance improvement data.

Thanks,
Dongdong
>
>>> Out of curiosity, IIUC this enables 10-bit tags for MMIO transactions
>>> from the root port toward the device, i.e., traffic that originates
>>> from a CPU.  Is that a significant benefit?  I would expect high-speed
>>> devices would primarily operate via DMA with relatively little MMIO
>>> traffic.
>>
>> The benefits of 10-Bit Tag for EP are obvious.
>> There are few RP scenarios. Unless there are two:
>> 1. RC has its own DMA.
>> 2. The P2P tag is replaced at the RP when the P2PDMA go through RP.
>


* Re: [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file
  2021-08-09 17:37           ` Bjorn Helgaas
@ 2021-08-10 12:16             ` Dongdong Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-10 12:16 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev



On 2021/8/10 1:37, Bjorn Helgaas wrote:
> On Sat, Aug 07, 2021 at 03:01:56PM +0800, Dongdong Liu wrote:
>>
>> On 2021/8/5 23:31, Bjorn Helgaas wrote:
>>> On Thu, Aug 05, 2021 at 04:37:39PM +0800, Dongdong Liu wrote:
>>>>
>>>>
>>>> On 2021/8/5 7:49, Bjorn Helgaas wrote:
>>>>> On Wed, Aug 04, 2021 at 09:47:06PM +0800, Dongdong Liu wrote:
>>>>>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>>>>>> sending Requests to other Endpoints (as opposed to host memory), the
>>>>>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>>>>>> unless an implementation-specific mechanism determines that the Endpoint
>>>>>> supports 10-Bit Tag Completer capability. Add a 10bit_tag sysfs file,
>>>>>> write 0 to disable 10-Bit Tag Requester when the driver does not bind
>>>>>> the device if the peer device does not support the 10-Bit Tag Completer.
>>>>>> This will make P2P traffic safe. the 10bit_tag file content indicate
>>>>>> current 10-Bit Tag Requester Enable status.
>>>>>>
>>>>>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>>>>>> ---
>>>>>>  Documentation/ABI/testing/sysfs-bus-pci | 16 +++++++-
>>>>>>  drivers/pci/pci-sysfs.c                 | 69 +++++++++++++++++++++++++++++++++
>>>>>>  2 files changed, 84 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/Documentation/ABI/testing/sysfs-bus-pci b/Documentation/ABI/testing/sysfs-bus-pci
>>>>>> index 793cbb7..0e0c97d 100644
>>>>>> --- a/Documentation/ABI/testing/sysfs-bus-pci
>>>>>> +++ b/Documentation/ABI/testing/sysfs-bus-pci
>>>>>> @@ -139,7 +139,7 @@ Description:
>>>>>>  		binary file containing the Vital Product Data for the
>>>>>>  		device.  It should follow the VPD format defined in
>>>>>>  		PCI Specification 2.1 or 2.2, but users should consider
>>>>>> -		that some devices may have incorrectly formatted data.
>>>>>> +		that some devices may have incorrectly formatted data.
>>>>>>  		If the underlying VPD has a writable section then the
>>>>>>  		corresponding section of this file will be writable.
>>>>>>
>>>>>> @@ -407,3 +407,17 @@ Description:
>>>>>>
>>>>>>  		The file is writable if the PF is bound to a driver that
>>>>>>  		implements ->sriov_set_msix_vec_count().
>>>>>> +
>>>>>> +What:		/sys/bus/pci/devices/.../10bit_tag
>>>>>> +Date:		August 2021
>>>>>> +Contact:	Dongdong Liu <liudongdong3@huawei.com>
>>>>>> +Description:
>>>>>> +		If a PCI device support 10-Bit Tag Requester, will create the
>>>>>> +		10bit_tag sysfs file. The file is readable, the value
>>>>>> +		indicate current 10-Bit Tag Requester Enable.
>>>>>> +		1 - enabled, 0 - disabled.
>>>>>> +
>>>>>> +		The file is also writeable, the value only accept by write 0
>>>>>> +		to disable 10-Bit Tag Requester when the driver does not bind
>>>>>> +		the deivce. The typical use case is for p2pdma when the peer
>>>>>> +		device does not support 10-BIT Tag Completer.
>>>
>>>>>> +static ssize_t pci_10bit_tag_store(struct device *dev,
>>>>>> +				   struct device_attribute *attr,
>>>>>> +				   const char *buf, size_t count)
>>>>>> +{
>>>>>> +	struct pci_dev *pdev = to_pci_dev(dev);
>>>>>> +	bool enable;
>>>>>> +
>>>>>> +	if (kstrtobool(buf, &enable) < 0)
>>>>>> +		return -EINVAL;
>>>>>> +
>>>>>> +	if (enable != false )
>>>>>> +		return -EINVAL;
>>>>>
>>>>> Is this the same as "if (enable)"?
>>>> Yes, Will fix.
>>>
>>> I actually don't like the one-way nature of this.  When the hierarchy
>>> supports 10-bit tags, we automatically enable them during enumeration.
>>>
>>> Then we provide this sysfs file, but it can only *disable* 10-bit
>>> tags.  There's no way to re-enable them except by rebooting (or using
>>> setpci, I guess).
>>>
>>> Why can't we allow *enabling* them here if they're supported in this
>>> hierarchy?
>> Yes, we can also provide this sysfs to enable 10-bit tag for EP devices
>> when the hierarchy supports 10-bit tags.
>>
>> I do not want to provide sysfs to enable/disable 10-bit tag for RP
>> devices as I can not tell current if the the Function has outstanding
>> Non-Posted Requests, may need to unbind all the EP drivers under the
>> RP, and current seems no scenario need to do this. This will make things
>> more complex.
>
> You mean "no scenario requires disabling 10-bit tags and then
> re-enabling them"?
Just for Root Port devices.
> That may be true, but I'm still hesitant to
> provide a switch that can only be reversed by rebooting.
>
> This is similar to the issue Leon raised that it's not practical to
> reboot machines.  Maybe we accept a one-way switch if the sole purpose
> is to work around a hardware defect.  Or maybe a kernel parameter that
> disables 10-bit tags completely is the defect mitigation.  I think we
> probably need such a parameter in case a defect prevents us from
> booting far enough to use the sysfs switch.
Makes sense, I will provide a sysfs file to both enable and disable 10-Bit Tag.
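
A two-way switch could follow roughly the logic below.  This is only a
userspace sketch, not the planned patch: struct tag_dev_model and
tag_req_store() are hypothetical names, and the error codes (-EBUSY,
-EOPNOTSUPP) are assumptions chosen for illustration.  The two guards
come from the discussion above: the setting changes only while no driver
is bound, and enabling is allowed only when the hierarchy supports
10-bit tags.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the state a store handler would consult. */
struct tag_dev_model {
	bool driver_bound;		/* a driver is bound to the device */
	bool hierarchy_supports;	/* hierarchy can handle 10-bit tags */
	bool req_enabled;		/* 10-Bit Tag Requester Enable */
};

/* Returns 0 on success or a negative errno value, kernel style. */
static inline int tag_req_store(struct tag_dev_model *dev, bool enable)
{
	/* Changing the tag width with requests in flight is unsafe. */
	if (dev->driver_bound)
		return -EBUSY;

	/* Disabling is always allowed; enabling needs hierarchy support. */
	if (enable && !dev->hierarchy_supports)
		return -EOPNOTSUPP;

	dev->req_enabled = enable;
	return 0;
}
```

Because disabling never depends on hierarchy support, the switch can
still serve as a defect workaround, while a later write of 1 restores
the enumeration-time setting without a reboot.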

Thanks,
Dongdong


* Re: [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA
  2021-08-09 17:31       ` Bjorn Helgaas
@ 2021-08-10 12:31         ` Dongdong Liu
  0 siblings, 0 replies; 43+ messages in thread
From: Dongdong Liu @ 2021-08-10 12:31 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: hch, kw, logang, leon, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev



On 2021/8/10 1:31, Bjorn Helgaas wrote:
> On Sat, Aug 07, 2021 at 03:11:34PM +0800, Dongdong Liu wrote:
>>
>> On 2021/8/6 2:12, Bjorn Helgaas wrote:
>>> On Wed, Aug 04, 2021 at 09:47:08PM +0800, Dongdong Liu wrote:
>>>> Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
>>>> 10-Bit Tag Requester doesn't interact with a device that does not
>>>> support 10-BIT Tag Completer. Before that happens, the kernel should
>>>> emit a warning. "echo 0 > /sys/bus/pci/devices/.../10bit_tag" to
>>>> disable 10-BIT Tag Requester for PF device.
>>>> "echo 0 > /sys/bus/pci/devices/.../sriov_vf_10bit_tag_ctl" to disable
>>>> 10-BIT Tag Requester for VF device.
>>>
>>> s/10-BIT/10-Bit/ several times.
>> Will fix.
>>>
>>> Add blank lines between paragraphs.
>> Will fix.
>>>
>>>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>>>> ---
>>>>  drivers/pci/p2pdma.c | 40 ++++++++++++++++++++++++++++++++++++++++
>>>>  1 file changed, 40 insertions(+)
>>>>
>>>> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
>>>> index 50cdde3..948f2be 100644
>>>> --- a/drivers/pci/p2pdma.c
>>>> +++ b/drivers/pci/p2pdma.c
>>>> @@ -19,6 +19,7 @@
>>>>  #include <linux/random.h>
>>>>  #include <linux/seq_buf.h>
>>>>  #include <linux/xarray.h>
>>>> +#include "pci.h"
>>>>
>>>>  enum pci_p2pdma_map_type {
>>>>  	PCI_P2PDMA_MAP_UNKNOWN = 0,
>>>> @@ -410,6 +411,41 @@ static unsigned long map_types_idx(struct pci_dev *client)
>>>>  		(client->bus->number << 8) | client->devfn;
>>>>  }
>>>>
>>>> +static bool check_10bit_tags_vaild(struct pci_dev *a, struct pci_dev *b,
>>>
>>> s/vaild/valid/
>>>
>>> Or maybe s/valid/safe/ or s/valid/supported/, since "valid" isn't
>>> quite the right word here.  We want to know whether the source is
>>> enabled to generate 10-bit tags, and if so, whether the destination
>>> can handle them.
>>>
>>> "if (check_10bit_tags_valid())" does not make sense because
>>> "check_10bit_tags_valid()" is not a question with a yes/no answer.
>>>
>>> "10bit_tags_valid()" *might* be, because "if (10bit_tags_valid())"
>>> makes sense.  But I don't think you can start with a digit.
>>>
>>> Or maybe you want to invert the sense, e.g.,
>>> "10bit_tags_unsupported()", since that avoids negation at the caller:
>>>
>>>   if (10bit_tags_unsupported(a, b) ||
>>>       10bit_tags_unsupported(b, a))
>>>         map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
>> Good suggestion. add a pci_ prefix.
>>
>> if (pci_10bit_tags_unsupported(a, b) ||
>>     pci_10bit_tags_unsupported(b, a))
>> 	map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;
>
> This treats both directions as equally important.  I don't know P2PDMA
> very well, but that doesn't seem like it would necessarily be the
> case.  I would think a common case would be device A doing DMA to B,
> but B *not* doing DMA to A.  So can you tell which direction you're
> setting up here, and can you take advantage of any asymmetry, e.g., by
> enabling 10-bit tags in the direction that supports it even if the
> other direction does not?

Documentation/driver-api/pci/p2pdma.rst defines the two roles:
* Provider - A driver which provides or publishes P2P resources like
   memory or doorbell registers to other drivers.
* Client - A driver which makes use of a resource by setting up a
   DMA transaction to or from it.

So we may only need to check the client -> provider direction, as below.
if (pci_10bit_tags_unsupported(client, provider, verbose))
	map_type = PCI_P2PDMA_MAP_NOT_SUPPORTED;

@Logan What's your opinion?

Thanks,
Dongdong


Thread overview: 43+ messages
2021-08-04 13:46 [PATCH V7 0/9] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
2021-08-04 13:47 ` [PATCH V7 1/9] PCI: Use cached Device Capabilities Register Dongdong Liu
2021-08-04 13:47 ` [PATCH V7 2/9] PCI: Use cached Device Capabilities 2 Register Dongdong Liu
2021-08-04 13:47 ` [PATCH V7 3/9] PCI: Add 10-Bit Tag register definitions Dongdong Liu
2021-08-04 13:47 ` [PATCH V7 4/9] PCI: Enable 10-Bit Tag support for PCIe Endpoint devices Dongdong Liu
2021-08-04 23:17   ` Bjorn Helgaas
2021-08-05  7:47     ` Dongdong Liu
2021-08-05 19:54       ` Bjorn Helgaas
2021-08-07  6:19         ` Dongdong Liu
2021-08-04 13:47 ` [PATCH V7 5/9] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices Dongdong Liu
2021-08-04 23:29   ` Bjorn Helgaas
2021-08-05  8:03     ` Dongdong Liu
2021-08-06 22:59       ` Bjorn Helgaas
2021-08-07  7:46         ` Dongdong Liu
2021-08-04 13:47 ` [PATCH V7 6/9] PCI: Enable 10-Bit Tag support for PCIe RP devices Dongdong Liu
2021-08-04 23:38   ` Bjorn Helgaas
2021-08-05  8:25     ` Dongdong Liu
2021-08-09 17:26       ` Bjorn Helgaas
2021-08-10 11:59         ` Dongdong Liu
2021-08-04 13:47 ` [PATCH V7 7/9] PCI/sysfs: Add a 10-Bit Tag sysfs file Dongdong Liu
2021-08-04 15:51   ` Logan Gunthorpe
2021-08-05 13:14     ` Dongdong Liu
2021-08-05 13:53       ` Leon Romanovsky
2021-08-05 15:36       ` Logan Gunthorpe
2021-08-04 23:49   ` Bjorn Helgaas
2021-08-05  8:37     ` Dongdong Liu
2021-08-05 15:31       ` Bjorn Helgaas
2021-08-07  7:01         ` Dongdong Liu
2021-08-09 17:37           ` Bjorn Helgaas
2021-08-10 12:16             ` Dongdong Liu
2021-08-04 23:52   ` Bjorn Helgaas
2021-08-05  8:38     ` Dongdong Liu
2021-08-04 13:47 ` [PATCH V7 8/9] PCI/IOV: Add 10-Bit Tag sysfs files for VF devices Dongdong Liu
2021-08-05  0:05   ` Bjorn Helgaas
2021-08-05  8:47     ` Dongdong Liu
2021-08-05  9:39     ` Dongdong Liu
2021-08-04 13:47 ` [PATCH V7 9/9] PCI/P2PDMA: Add a 10-Bit Tag check in P2PDMA Dongdong Liu
2021-08-04 15:56   ` Logan Gunthorpe
2021-08-05  8:49     ` Dongdong Liu
2021-08-05 18:12   ` Bjorn Helgaas
2021-08-07  7:11     ` Dongdong Liu
2021-08-09 17:31       ` Bjorn Helgaas
2021-08-10 12:31         ` Dongdong Liu
