linux-media.vger.kernel.org archive mirror
* [PATCH V6 0/8] PCI: Enable 10-Bit tag support for PCIe devices
@ 2021-07-23 11:06 Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 1/8] PCI: Use cached Device Capabilities Register Dongdong Liu
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-23 11:06 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

The 10-Bit Tag capability, introduced in PCIe 4.0, increases the total Tag
field size from 8 bits to 10 bits.

This patchset enables 10-Bit Tags for PCIe Endpoint devices (including
VFs) and Root Port devices.

V5->V6:
- Rebased on v5.14-rc2.
- Add Reviewed-by: Christoph Hellwig <hch@lst.de> in [PATCH V6 2/8].
- PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support.
- Add a 10-bit tag check in P2PDMA.
- Simplified implementation in [PATCH V6 6/8].
- Fix some comments in [PATCH V6 4/8].

V4->V5:
- Fix warning variable 'capa' is uninitialized.
- Fix warning unused variable 'pchild'.

V3->V4:
- Get the value of pcie_devcap2 in set_pcie_port_type().
- Add Reviewed-by: Christoph Hellwig <hch@lst.de> in [PATCH V4 1/6],
  [PATCH V4 3/6], [PATCH V4 4/6], [PATCH V4 5/6].
- Fix some code style.
- Rebased on v5.13-rc6.

V2->V3:
- Use cached Device Capabilities Register suggested by Christoph.
- Fix code style to avoid > 80 char lines.
- Rename devcap2 to pcie_devcap2.

V1->V2: Address review comments from Christoph.
- Store the devcap2 value in the pci_dev instead of reading it multiple
  times.
- Change pci_info to pci_dbg to avoid the noisy log.
- Rename ext_10bit_tag_comp_path to ext_10bit_tag.
- Fix the compile error.
- Rebased on v5.13-rc1.

Dongdong Liu (8):
  PCI: Use cached Device Capabilities Register
  PCI: Use cached Device Capabilities 2 Register
  PCI: Add 10-Bit Tag register definitions
  PCI: Enable 10-Bit tag support for PCIe Endpoint devices
  PCI/IOV: Enable 10-Bit tag support for PCIe VF devices
  PCI: Enable 10-Bit tag support for PCIe RP devices
  PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  PCI/P2PDMA: Add a 10-bit tag check in P2PDMA

 Documentation/admin-guide/kernel-parameters.txt |  7 +++
 drivers/media/pci/cobalt/cobalt-driver.c        |  5 +-
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |  4 +-
 drivers/pci/iov.c                               |  8 +++
 drivers/pci/p2pdma.c                            | 38 +++++++++++++
 drivers/pci/pci.c                               | 70 ++++++++++++++++++++----
 drivers/pci/pci.h                               |  1 +
 drivers/pci/pcie/aspm.c                         | 11 ++--
 drivers/pci/pcie/portdrv_pci.c                  | 72 +++++++++++++++++++++++++
 drivers/pci/probe.c                             | 69 +++++++++++++++++++-----
 drivers/pci/quirks.c                            |  3 +-
 include/linux/pci.h                             |  5 ++
 include/uapi/linux/pci_regs.h                   |  5 ++
 13 files changed, 261 insertions(+), 37 deletions(-)

--
2.7.4



* [PATCH V6 1/8] PCI: Use cached Device Capabilities Register
  2021-07-23 11:06 [PATCH V6 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
@ 2021-07-23 11:06 ` Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 2/8] PCI: Use cached Device Capabilities 2 Register Dongdong Liu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-23 11:06 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

It makes sense to store the pcie_devcap value in the pci_dev structure
instead of reading the Device Capabilities Register multiple times. The
first place that needs pcie_devcap is set_pcie_port_type(), so read the
register there and use the cached pcie_devcap wherever it is needed.
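
As a rough userspace illustration of the idea (not kernel code), the
sketch below keeps one cached copy of the register and derives the
individual fields from it with the same masks and shifts the diff uses.
The mask values are the ones from include/uapi/linux/pci_regs.h;
struct model_dev and the model_*() helpers are hypothetical names used
only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks as defined in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_DEVCAP_PAYLOAD 0x00000007u /* Max_Payload_Size supported */
#define PCI_EXP_DEVCAP_L0S     0x000001c0u /* L0s acceptable latency */
#define PCI_EXP_DEVCAP_L1      0x00000e00u /* L1 acceptable latency */
#define PCI_EXP_DEVCAP_FLR     0x10000000u /* Function Level Reset */

/* Hypothetical stand-in for struct pci_dev: only the cached register. */
struct model_dev {
	uint32_t pcie_devcap; /* read once at enumeration, then reused */
};

/* Mirrors the pcie_has_flr() test, without any config space access. */
static bool model_has_flr(const struct model_dev *dev)
{
	return (dev->pcie_devcap & PCI_EXP_DEVCAP_FLR) != 0;
}

/* Max payload size field, as cached into pcie_mpss by the patch. */
static uint32_t model_mpss(const struct model_dev *dev)
{
	return dev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD;
}

/* L0s acceptable latency encoding, as used by the ASPM code. */
static uint32_t model_l0s_encoding(const struct model_dev *dev)
{
	return (dev->pcie_devcap & PCI_EXP_DEVCAP_L0S) >> 6;
}

/* L1 acceptable latency encoding, as used by the ASPM code. */
static uint32_t model_l1_encoding(const struct model_dev *dev)
{
	return (dev->pcie_devcap & PCI_EXP_DEVCAP_L1) >> 9;
}
```

Every consumer works on the same 32-bit snapshot, so the register is
read from the device only once.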

Acked-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/media/pci/cobalt/cobalt-driver.c |  5 +++--
 drivers/pci/pci.c                        |  5 +----
 drivers/pci/pcie/aspm.c                  | 11 ++++-------
 drivers/pci/probe.c                      | 11 +++--------
 drivers/pci/quirks.c                     |  3 +--
 include/linux/pci.h                      |  1 +
 6 files changed, 13 insertions(+), 23 deletions(-)

diff --git a/drivers/media/pci/cobalt/cobalt-driver.c b/drivers/media/pci/cobalt/cobalt-driver.c
index 16af58f..23c6436 100644
--- a/drivers/media/pci/cobalt/cobalt-driver.c
+++ b/drivers/media/pci/cobalt/cobalt-driver.c
@@ -193,11 +193,12 @@ void cobalt_pcie_status_show(struct cobalt *cobalt)
 		return;
 
 	/* Device */
-	pcie_capability_read_dword(pci_dev, PCI_EXP_DEVCAP, &capa);
 	pcie_capability_read_word(pci_dev, PCI_EXP_DEVCTL, &ctrl);
 	pcie_capability_read_word(pci_dev, PCI_EXP_DEVSTA, &stat);
 	cobalt_info("PCIe device capability 0x%08x: Max payload %d\n",
-		    capa, get_payload_size(capa & PCI_EXP_DEVCAP_PAYLOAD));
+		    pci_dev->pcie_devcap,
+		    get_payload_size(pci_dev->pcie_devcap &
+				     PCI_EXP_DEVCAP_PAYLOAD));
 	cobalt_info("PCIe device control 0x%04x: Max payload %d. Max read request %d\n",
 		    ctrl,
 		    get_payload_size((ctrl & PCI_EXP_DEVCTL_PAYLOAD) >> 5),
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index aacf575..dc3bfb2 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4630,13 +4630,10 @@ EXPORT_SYMBOL(pci_wait_for_pending_transaction);
  */
 bool pcie_has_flr(struct pci_dev *dev)
 {
-	u32 cap;
-
 	if (dev->dev_flags & PCI_DEV_FLAGS_NO_FLR_RESET)
 		return false;
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
-	return cap & PCI_EXP_DEVCAP_FLR;
+	return dev->pcie_devcap & PCI_EXP_DEVCAP_FLR;
 }
 EXPORT_SYMBOL_GPL(pcie_has_flr);
 
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 013a47f..db944f6 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -660,7 +660,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 
 	/* Get and check endpoint acceptable latencies */
 	list_for_each_entry(child, &linkbus->devices, bus_list) {
-		u32 reg32, encoding;
+		u32 encoding;
 		struct aspm_latency *acceptable =
 			&link->acceptable[PCI_FUNC(child->devfn)];
 
@@ -668,12 +668,11 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
 		    pci_pcie_type(child) != PCI_EXP_TYPE_LEG_END)
 			continue;
 
-		pcie_capability_read_dword(child, PCI_EXP_DEVCAP, &reg32);
 		/* Calculate endpoint L0s acceptable latency */
-		encoding = (reg32 & PCI_EXP_DEVCAP_L0S) >> 6;
+		encoding = (child->pcie_devcap & PCI_EXP_DEVCAP_L0S) >> 6;
 		acceptable->l0s = calc_l0s_acceptable(encoding);
 		/* Calculate endpoint L1 acceptable latency */
-		encoding = (reg32 & PCI_EXP_DEVCAP_L1) >> 9;
+		encoding = (child->pcie_devcap & PCI_EXP_DEVCAP_L1) >> 9;
 		acceptable->l1 = calc_l1_acceptable(encoding);
 
 		pcie_aspm_check_latency(child);
@@ -808,7 +807,6 @@ static void free_link_state(struct pcie_link_state *link)
 static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 {
 	struct pci_dev *child;
-	u32 reg32;
 
 	/*
 	 * Some functions in a slot might not all be PCIe functions,
@@ -831,8 +829,7 @@ static int pcie_aspm_sanity_check(struct pci_dev *pdev)
 		 * Disable ASPM for pre-1.1 PCIe device, we follow MS to use
 		 * RBER bit to determine if a function is 1.1 version device
 		 */
-		pcie_capability_read_dword(child, PCI_EXP_DEVCAP, &reg32);
-		if (!(reg32 & PCI_EXP_DEVCAP_RBER) && !aspm_force) {
+		if (!(child->pcie_devcap & PCI_EXP_DEVCAP_RBER) && !aspm_force) {
 			pci_info(child, "disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'\n");
 			return -EINVAL;
 		}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 79177ac..cc700f6 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1498,8 +1498,8 @@ void set_pcie_port_type(struct pci_dev *pdev)
 	pdev->pcie_cap = pos;
 	pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
 	pdev->pcie_flags_reg = reg16;
-	pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, &reg16);
-	pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD;
+	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->pcie_devcap);
+	pdev->pcie_mpss = pdev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD;
 
 	parent = pci_upstream_bridge(pdev);
 	if (!parent)
@@ -2031,18 +2031,13 @@ static void pci_configure_mps(struct pci_dev *dev)
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-	u32 cap;
 	u16 ctl;
 	int ret;
 
 	if (!pci_is_pcie(dev))
 		return 0;
 
-	ret = pcie_capability_read_dword(dev, PCI_EXP_DEVCAP, &cap);
-	if (ret)
-		return 0;
-
-	if (!(cap & PCI_EXP_DEVCAP_EXT_TAG))
+	if (!(dev->pcie_devcap & PCI_EXP_DEVCAP_EXT_TAG))
 		return 0;
 
 	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL, &ctl);
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6d74386..2b405c5 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5173,8 +5173,7 @@ static void quirk_intel_qat_vf_cap(struct pci_dev *pdev)
 		pdev->pcie_cap = pos;
 		pci_read_config_word(pdev, pos + PCI_EXP_FLAGS, &reg16);
 		pdev->pcie_flags_reg = reg16;
-		pci_read_config_word(pdev, pos + PCI_EXP_DEVCAP, &reg16);
-		pdev->pcie_mpss = reg16 & PCI_EXP_DEVCAP_PAYLOAD;
+		pdev->pcie_mpss = pdev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD;
 
 		pdev->cfg_size = PCI_CFG_SPACE_EXP_SIZE;
 		if (pci_read_config_dword(pdev, PCI_CFG_SPACE_SIZE, &status) !=
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 540b377..aee7c85 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -340,6 +340,7 @@ struct pci_dev {
 	u8		rom_base_reg;	/* Config register controlling ROM */
 	u8		pin;		/* Interrupt pin this device uses */
 	u16		pcie_flags_reg;	/* Cached PCIe Capabilities Register */
+	u32		pcie_devcap;	/* Cached Device Capabilities Register */
 	unsigned long	*dma_alias_mask;/* Mask of enabled devfn aliases */
 
 	struct pci_driver *driver;	/* Driver bound to this device */
-- 
2.7.4



* [PATCH V6 2/8] PCI: Use cached Device Capabilities 2 Register
  2021-07-23 11:06 [PATCH V6 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 1/8] PCI: Use cached Device Capabilities Register Dongdong Liu
@ 2021-07-23 11:06 ` Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-23 11:06 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

It makes sense to store the pcie_devcap2 value in the pci_dev structure
instead of reading the Device Capabilities 2 Register multiple times.
Read the register in set_pcie_port_type(), then use the cached
pcie_devcap2 wherever it is needed.
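
A simplified userspace model of the AtomicOp path check in
pci_enable_atomic_ops_to_root(), working purely on cached values, might
look like the sketch below. The mask values come from
include/uapi/linux/pci_regs.h; struct model_bridge, the port type enum
and the -1 return value (the kernel returns -EINVAL) are assumptions
made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit masks as defined in include/uapi/linux/pci_regs.h. */
#define PCI_EXP_DEVCAP2_ATOMIC_ROUTE   0x00000040u /* AtomicOp routing */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP32  0x00000080u /* 32b AtomicOp completion */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP64  0x00000100u /* 64b AtomicOp completion */
#define PCI_EXP_DEVCAP2_ATOMIC_COMP128 0x00000200u /* 128b AtomicOp completion */

/* Hypothetical port classification for this sketch. */
enum model_port_type { MODEL_SWITCH_PORT, MODEL_ROOT_PORT };

/* Hypothetical stand-in for a bridge's struct pci_dev. */
struct model_bridge {
	enum model_port_type type;
	uint32_t pcie_devcap2; /* cached Device Capabilities 2 */
};

/*
 * Walk the bridges between a device and the root. Switch ports must
 * route AtomicOps; the root port must complete every size in cap_mask.
 * Returns 0 when the whole path qualifies, -1 otherwise.
 */
static int model_atomic_path_ok(const struct model_bridge *path, size_t n,
				uint32_t cap_mask)
{
	for (size_t i = 0; i < n; i++) {
		uint32_t cap = path[i].pcie_devcap2;

		if (path[i].type == MODEL_SWITCH_PORT &&
		    !(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
			return -1;

		if (path[i].type == MODEL_ROOT_PORT &&
		    (cap & cap_mask) != cap_mask)
			return -1;
	}
	return 0;
}
```

No bridge on the path needs a fresh config read; each check is a mask
test on the value cached at enumeration.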

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c |  4 +---
 drivers/pci/pci.c                               |  9 ++++-----
 drivers/pci/probe.c                             | 10 ++++------
 include/linux/pci.h                             |  2 ++
 4 files changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index dbf9a0e..a8e1e22 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -6304,7 +6304,6 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs)
 		struct pci_dev *pbridge;
 		struct port_info *pi;
 		char name[IFNAMSIZ];
-		u32 devcap2;
 		u16 flags;
 
 		/* If we want to instantiate Virtual Functions, then our
@@ -6314,10 +6313,9 @@ static int cxgb4_iov_configure(struct pci_dev *pdev, int num_vfs)
 		 */
 		pbridge = pdev->bus->self;
 		pcie_capability_read_word(pbridge, PCI_EXP_FLAGS, &flags);
-		pcie_capability_read_dword(pbridge, PCI_EXP_DEVCAP2, &devcap2);
 
 		if ((flags & PCI_EXP_FLAGS_VERS) < 2 ||
-		    !(devcap2 & PCI_EXP_DEVCAP2_ARI)) {
+		    !(pbridge->pcie_devcap2 & PCI_EXP_DEVCAP2_ARI)) {
 			/* Our parent bridge does not support ARI so issue a
 			 * warning and skip instantiating the VFs.  They
 			 * won't be reachable.
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index dc3bfb2..d14c573 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -3700,7 +3700,7 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
 {
 	struct pci_bus *bus = dev->bus;
 	struct pci_dev *bridge;
-	u32 cap, ctl2;
+	u32 ctl2;
 
 	if (!pci_is_pcie(dev))
 		return -EINVAL;
@@ -3724,19 +3724,18 @@ int pci_enable_atomic_ops_to_root(struct pci_dev *dev, u32 cap_mask)
 	while (bus->parent) {
 		bridge = bus->self;
 
-		pcie_capability_read_dword(bridge, PCI_EXP_DEVCAP2, &cap);
-
 		switch (pci_pcie_type(bridge)) {
 		/* Ensure switch ports support AtomicOp routing */
 		case PCI_EXP_TYPE_UPSTREAM:
 		case PCI_EXP_TYPE_DOWNSTREAM:
-			if (!(cap & PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
+			if (!(bridge->pcie_devcap2 &
+			      PCI_EXP_DEVCAP2_ATOMIC_ROUTE))
 				return -EINVAL;
 			break;
 
 		/* Ensure root port supports all the sizes we care about */
 		case PCI_EXP_TYPE_ROOT_PORT:
-			if ((cap & cap_mask) != cap_mask)
+			if ((bridge->pcie_devcap2 & cap_mask) != cap_mask)
 				return -EINVAL;
 			break;
 		}
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index cc700f6..c83245b 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1500,6 +1500,7 @@ void set_pcie_port_type(struct pci_dev *pdev)
 	pdev->pcie_flags_reg = reg16;
 	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP, &pdev->pcie_devcap);
 	pdev->pcie_mpss = pdev->pcie_devcap & PCI_EXP_DEVCAP_PAYLOAD;
+	pci_read_config_dword(pdev, pos + PCI_EXP_DEVCAP2, &pdev->pcie_devcap2);
 
 	parent = pci_upstream_bridge(pdev);
 	if (!parent)
@@ -2116,7 +2117,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 #ifdef CONFIG_PCIEASPM
 	struct pci_host_bridge *host = pci_find_host_bridge(dev->bus);
 	struct pci_dev *bridge;
-	u32 cap, ctl;
+	u32 ctl;
 
 	if (!pci_is_pcie(dev))
 		return;
@@ -2124,8 +2125,7 @@ static void pci_configure_ltr(struct pci_dev *dev)
 	/* Read L1 PM substate capabilities */
 	dev->l1ss = pci_find_ext_capability(dev, PCI_EXT_CAP_ID_L1SS);
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
-	if (!(cap & PCI_EXP_DEVCAP2_LTR))
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_LTR))
 		return;
 
 	pcie_capability_read_dword(dev, PCI_EXP_DEVCTL2, &ctl);
@@ -2165,13 +2165,11 @@ static void pci_configure_eetlp_prefix(struct pci_dev *dev)
 #ifdef CONFIG_PCI_PASID
 	struct pci_dev *bridge;
 	int pcie_type;
-	u32 cap;
 
 	if (!pci_is_pcie(dev))
 		return;
 
-	pcie_capability_read_dword(dev, PCI_EXP_DEVCAP2, &cap);
-	if (!(cap & PCI_EXP_DEVCAP2_EE_PREFIX))
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_EE_PREFIX))
 		return;
 
 	pcie_type = pci_pcie_type(dev);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index aee7c85..9aab67f 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -341,6 +341,8 @@ struct pci_dev {
 	u8		pin;		/* Interrupt pin this device uses */
 	u16		pcie_flags_reg;	/* Cached PCIe Capabilities Register */
 	u32		pcie_devcap;	/* Cached Device Capabilities Register */
+	u32		pcie_devcap2;	/* Cached Device Capabilities 2
+					   Register */
 	unsigned long	*dma_alias_mask;/* Mask of enabled devfn aliases */
 
 	struct pci_driver *driver;	/* Driver bound to this device */
-- 
2.7.4



* [PATCH V6 3/8] PCI: Add 10-Bit Tag register definitions
  2021-07-23 11:06 [PATCH V6 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 1/8] PCI: Use cached Device Capabilities Register Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 2/8] PCI: Use cached Device Capabilities 2 Register Dongdong Liu
@ 2021-07-23 11:06 ` Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 4/8] PCI: Enable 10-Bit tag support for PCIe Endpoint devices Dongdong Liu
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-23 11:06 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add 10-Bit Tag register definitions for use in subsequent patches.
See the PCIe 5.0 spec sections 7.5.3.15 and 9.3.3.2.
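
For orientation, here is a small sketch of how the new Device
Capabilities 2 and Device Control 2 bits could be decoded from raw
register values. The three mask values are the ones this patch adds;
struct model_tag_caps and model_decode_10bit_tag() are hypothetical
names that exist only in this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values added to include/uapi/linux/pci_regs.h by this patch. */
#define PCI_EXP_DEVCAP2_10BIT_TAG_COMP   0x00010000u /* Completer Supported */
#define PCI_EXP_DEVCAP2_10BIT_TAG_REQ    0x00020000u /* Requester Supported */
#define PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN 0x1000u     /* Requester Enable */

/* Hypothetical summary of a function's 10-Bit Tag state. */
struct model_tag_caps {
	bool completer;         /* can complete 10-Bit Tag requests */
	bool requester;         /* can issue 10-Bit Tag requests */
	bool requester_enabled; /* currently allowed to issue them */
};

/* Decode the three bits from raw DEVCAP2 and DEVCTL2 values. */
static struct model_tag_caps model_decode_10bit_tag(uint32_t devcap2,
						    uint16_t devctl2)
{
	struct model_tag_caps caps = {
		.completer = (devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP) != 0,
		.requester = (devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ) != 0,
		.requester_enabled =
			(devctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN) != 0,
	};
	return caps;
}
```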

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 include/uapi/linux/pci_regs.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index e709ae8..cf1ddb8 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -648,6 +648,8 @@
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP64	0x00000100 /* 64b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_ATOMIC_COMP128	0x00000200 /* 128b AtomicOp completion */
 #define  PCI_EXP_DEVCAP2_LTR		0x00000800 /* Latency tolerance reporting */
+#define  PCI_EXP_DEVCAP2_10BIT_TAG_COMP 0x00010000 /* 10-Bit Tag Completer Supported */
+#define  PCI_EXP_DEVCAP2_10BIT_TAG_REQ  0x00020000 /* 10-Bit Tag Requester Supported */
 #define  PCI_EXP_DEVCAP2_OBFF_MASK	0x000c0000 /* OBFF support mechanism */
 #define  PCI_EXP_DEVCAP2_OBFF_MSG	0x00040000 /* New message signaling */
 #define  PCI_EXP_DEVCAP2_OBFF_WAKE	0x00080000 /* Re-use WAKE# for OBFF */
@@ -661,6 +663,7 @@
 #define  PCI_EXP_DEVCTL2_IDO_REQ_EN	0x0100	/* Allow IDO for requests */
 #define  PCI_EXP_DEVCTL2_IDO_CMP_EN	0x0200	/* Allow IDO for completions */
 #define  PCI_EXP_DEVCTL2_LTR_EN		0x0400	/* Enable LTR mechanism */
+#define  PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN 0x1000 /* 10-Bit Tag Requester Enable */
 #define  PCI_EXP_DEVCTL2_OBFF_MSGA_EN	0x2000	/* Enable OBFF Message type A */
 #define  PCI_EXP_DEVCTL2_OBFF_MSGB_EN	0x4000	/* Enable OBFF Message type B */
 #define  PCI_EXP_DEVCTL2_OBFF_WAKE_EN	0x6000	/* OBFF using WAKE# signaling */
@@ -931,6 +934,7 @@
 /* Single Root I/O Virtualization */
 #define PCI_SRIOV_CAP		0x04	/* SR-IOV Capabilities */
 #define  PCI_SRIOV_CAP_VFM	0x00000001  /* VF Migration Capable */
+#define  PCI_SRIOV_CAP_VF_10BIT_TAG_REQ	0x00000004 /* VF 10-Bit Tag Requester Supported */
 #define  PCI_SRIOV_CAP_INTR(x)	((x) >> 21) /* Interrupt Message Number */
 #define PCI_SRIOV_CTRL		0x08	/* SR-IOV Control */
 #define  PCI_SRIOV_CTRL_VFE	0x0001	/* VF Enable */
@@ -938,6 +942,7 @@
 #define  PCI_SRIOV_CTRL_INTR	0x0004	/* VF Migration Interrupt Enable */
 #define  PCI_SRIOV_CTRL_MSE	0x0008	/* VF Memory Space Enable */
 #define  PCI_SRIOV_CTRL_ARI	0x0010	/* ARI Capable Hierarchy */
+#define  PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN 0x0020 /* VF 10-Bit Tag Requester Enable */
 #define PCI_SRIOV_STATUS	0x0a	/* SR-IOV Status */
 #define  PCI_SRIOV_STATUS_VFM	0x0001	/* VF Migration Status */
 #define PCI_SRIOV_INITIAL_VF	0x0c	/* Initial VFs */
-- 
2.7.4



* [PATCH V6 4/8] PCI: Enable 10-Bit tag support for PCIe Endpoint devices
  2021-07-23 11:06 [PATCH V6 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (2 preceding siblings ...)
  2021-07-23 11:06 ` [PATCH V6 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
@ 2021-07-23 11:06 ` Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 5/8] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices Dongdong Liu
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-23 11:06 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

The 10-Bit Tag capability, introduced in PCIe 4.0, increases the total Tag
field size from 8 bits to 10 bits.

The Implementation Note in PCIe spec 5.0 r1.0 section 2.2.6.2,
"Considerations for Implementing 10-Bit Tag Capabilities", says:

  For platforms where the RC supports 10-Bit Tag Completer capability,
  it is highly recommended for platform firmware or operating software
  that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
  bit automatically in Endpoints with 10-Bit Tag Requester capability.
  This enables the important class of 10-Bit Tag capable adapters that
  send Memory Read Requests only to host memory.
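
The decision logic in pci_configure_10bit_tags() can be modelled in
userspace roughly as follows. This is only a sketch: struct model_dev
and its boolean fields are hypothetical stand-ins for struct pci_dev,
pci_pcie_type() and the Device Control 2 write, and devices are assumed
to be configured top-down, parents before children.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_EXP_DEVCAP2_10BIT_TAG_COMP 0x00010000u
#define PCI_EXP_DEVCAP2_10BIT_TAG_REQ  0x00020000u

/* Hypothetical stand-in for struct pci_dev. */
struct model_dev {
	struct model_dev *upstream; /* NULL for a root port */
	bool is_root_port;
	bool is_endpoint;
	bool is_virtfn;
	uint32_t pcie_devcap2;      /* cached Device Capabilities 2 */
	bool ext_10bit_tag;         /* Completer supported from root to here */
	bool req_enabled;           /* models 10-Bit Tag Requester Enable */
};

static void model_configure_10bit_tags(struct model_dev *dev)
{
	/* A device that cannot complete 10-Bit Tags breaks the chain. */
	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
		return;

	/* The root port is the start of the chain. */
	if (dev->is_root_port) {
		dev->ext_10bit_tag = true;
		return;
	}

	/* Inherit the flag only if the whole upstream path has it. */
	if (dev->upstream && dev->upstream->ext_10bit_tag)
		dev->ext_10bit_tag = true;

	/* The enable bit in Device Control 2 is RsvdP for VFs. */
	if (dev->is_virtfn)
		return;

	if (dev->is_endpoint && dev->ext_10bit_tag &&
	    (dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
		dev->req_enabled = true;
}
```

In this model an Endpoint is enabled only when it is itself a capable
Requester and every component from the Root Port down to it, including
the Endpoint, reports Completer support.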

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/pci/probe.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
 include/linux/pci.h |  2 ++
 2 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index c83245b..3da7baa 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2029,10 +2029,42 @@ static void pci_configure_mps(struct pci_dev *dev)
 		 p_mps, mps, mpss);
 }
 
+static void pci_configure_10bit_tags(struct pci_dev *dev)
+{
+	struct pci_dev *bridge;
+
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
+		return;
+
+	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
+		dev->ext_10bit_tag = 1;
+		return;
+	}
+
+	bridge = pci_upstream_bridge(dev);
+	if (bridge && bridge->ext_10bit_tag)
+		dev->ext_10bit_tag = 1;
+
+	/*
+	 * 10-Bit Tag Requester Enable in Device Control 2 Register is RsvdP
+	 * for VF.
+	 */
+	if (dev->is_virtfn)
+		return;
+
+	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ENDPOINT &&
+	    dev->ext_10bit_tag == 1 &&
+	    (dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ)) {
+		pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
+		pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	}
+}
+
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 {
 	struct pci_host_bridge *host;
-	u16 ctl;
+	u16 ctl, ctl2;
 	int ret;
 
 	if (!pci_is_pcie(dev))
@@ -2045,6 +2077,10 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 	if (ret)
 		return 0;
 
+	ret = pcie_capability_read_word(dev, PCI_EXP_DEVCTL2, &ctl2);
+	if (ret)
+		return 0;
+
 	host = pci_find_host_bridge(dev->bus);
 	if (!host)
 		return 0;
@@ -2059,6 +2095,12 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL,
 						   PCI_EXP_DEVCTL_EXT_TAG);
 		}
+
+		if (ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN) {
+			pci_info(dev, "disabling 10-Bit Tags\n");
+			pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
+					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+		}
 		return 0;
 	}
 
@@ -2067,6 +2109,9 @@ int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
 		pcie_capability_set_word(dev, PCI_EXP_DEVCTL,
 					 PCI_EXP_DEVCTL_EXT_TAG);
 	}
+
+	pci_configure_10bit_tags(dev);
+
 	return 0;
 }
 
diff --git a/include/linux/pci.h b/include/linux/pci.h
index 9aab67f..af6cb53 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -393,6 +393,8 @@ struct pci_dev {
 #endif
 	unsigned int	eetlp_prefix_path:1;	/* End-to-End TLP Prefix */
 
+	unsigned int	ext_10bit_tag:1; /* 10-Bit Tag Completer Supported
+					    from root to here */
 	pci_channel_state_t error_state;	/* Current connectivity state */
 	struct device	dev;			/* Generic device interface */
 
-- 
2.7.4



* [PATCH V6 5/8] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices
  2021-07-23 11:06 [PATCH V6 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (3 preceding siblings ...)
  2021-07-23 11:06 ` [PATCH V6 4/8] PCI: Enable 10-Bit tag support for PCIe Endpoint devices Dongdong Liu
@ 2021-07-23 11:06 ` Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 6/8] PCI: Enable 10-Bit tag support for PCIe RP devices Dongdong Liu
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-23 11:06 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Enable the VF 10-Bit Tag Requester when its upstream component supports
the 10-Bit Tag Completer capability.
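
The effect on the SR-IOV Control register value can be sketched as a
pair of pure functions. The bit values are taken from
include/uapi/linux/pci_regs.h and patch 3/8; the model_*() helpers are
hypothetical and only compute the value that sriov_enable() and
sriov_disable() would write.

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_SRIOV_CAP_VF_10BIT_TAG_REQ     0x00000004u /* VF Requester Supported */
#define PCI_SRIOV_CTRL_VFE                 0x0001u     /* VF Enable */
#define PCI_SRIOV_CTRL_MSE                 0x0008u     /* VF Memory Space Enable */
#define PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN 0x0020u     /* VF Requester Enable */

/* Control value written when VFs are enabled. */
static uint16_t model_sriov_enable_ctrl(uint16_t ctrl, uint32_t cap,
					bool ext_10bit_tag)
{
	ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;

	/* Only when the VFs can request and the path can complete. */
	if ((cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ) && ext_10bit_tag)
		ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;

	return ctrl;
}

/* Control value written when VFs are disabled or enabling fails. */
static uint16_t model_sriov_disable_ctrl(uint16_t ctrl)
{
	ctrl &= (uint16_t)~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE |
			    PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
	return ctrl;
}
```

Unrelated control bits, such as ARI Capable Hierarchy (0x0010), pass
through both helpers unchanged.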

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 drivers/pci/iov.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index dafdc65..0d0bed1 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -634,6 +634,10 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 	pci_iov_set_numvfs(dev, nr_virtfn);
 	iov->ctrl |= PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE;
+	if ((iov->cap & PCI_SRIOV_CAP_VF_10BIT_TAG_REQ) &&
+	    dev->ext_10bit_tag)
+		iov->ctrl |= PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	msleep(100);
@@ -650,6 +654,8 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn)
 
 err_pcibios:
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	if (iov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN)
+		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
@@ -682,6 +688,8 @@ static void sriov_disable(struct pci_dev *dev)
 
 	sriov_del_vfs(dev);
 	iov->ctrl &= ~(PCI_SRIOV_CTRL_VFE | PCI_SRIOV_CTRL_MSE);
+	if (iov->ctrl & PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN)
+		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
 	pci_cfg_access_lock(dev);
 	pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL, iov->ctrl);
 	ssleep(1);
-- 
2.7.4



* [PATCH V6 6/8] PCI: Enable 10-Bit tag support for PCIe RP devices
  2021-07-23 11:06 [PATCH V6 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (4 preceding siblings ...)
  2021-07-23 11:06 ` [PATCH V6 5/8] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices Dongdong Liu
@ 2021-07-23 11:06 ` Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support Dongdong Liu
  2021-07-23 11:06 ` [PATCH V6 8/8] PCI/P2PDMA: Add a 10-bit tag check in P2PDMA Dongdong Liu
  7 siblings, 0 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-23 11:06 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

The Implementation Note in PCIe spec 5.0 r1.0 section 2.2.6.2 says that
in configurations where a Requester with 10-Bit Tag Requester capability
needs to target multiple Completers, one needs to ensure that the
Requester sends 10-Bit Tag Requests only to Completers that have 10-Bit
Tag Completer capability. So enable the 10-Bit Tag Requester for a Root
Port only when all devices under that Root Port support the 10-Bit Tag
Completer.
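
The "all devices under the Root Port" rule amounts to a tree walk that
fails on the first unsuitable device. The sketch below models it in
userspace with a hypothetical struct model_node in place of struct
pci_dev and plain recursion in place of pci_walk_bus(); it mirrors the
checks in the patch but is not the kernel implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PCI_EXP_DEVCAP2_10BIT_TAG_COMP 0x00010000u
#define PCI_EXP_DEVCAP2_10BIT_TAG_REQ  0x00020000u

/* Hypothetical stand-in for a device and the bus below it. */
struct model_node {
	bool is_pcie;
	bool is_hotplug_bridge;
	uint32_t pcie_devcap2;
	const struct model_node *children;
	size_t n_children;
};

/* Per-device test, mirroring pci_10bit_tag_comp_support(). */
static bool model_node_supports_comp(const struct model_node *dev)
{
	if (!dev->is_pcie || dev->is_hotplug_bridge)
		return false;
	return (dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP) != 0;
}

/* True only if dev and everything below it can complete 10-Bit Tags. */
static bool model_subtree_supports_comp(const struct model_node *dev)
{
	if (!model_node_supports_comp(dev))
		return false;

	for (size_t i = 0; i < dev->n_children; i++) {
		if (!model_subtree_supports_comp(&dev->children[i]))
			return false;
	}
	return true;
}

/* Should the root port's 10-Bit Tag Requester Enable bit be set? */
static bool model_rp_should_enable(const struct model_node *rp)
{
	/* Nothing below the port: nothing to send requests to. */
	if (rp->n_children == 0)
		return false;

	if (!model_subtree_supports_comp(rp))
		return false;

	return (rp->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ) != 0;
}
```

A single hotplug bridge or non-Completer device anywhere below the port
is enough to leave the Requester disabled.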

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 drivers/pci/pcie/portdrv_pci.c | 69 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index c7ff1ee..2382cd2 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -90,6 +90,72 @@ static const struct dev_pm_ops pcie_portdrv_pm_ops = {
 #define PCIE_PORTDRV_PM_OPS	NULL
 #endif /* !PM */
 
+static int pci_10bit_tag_comp_support(struct pci_dev *dev, void *data)
+{
+	bool *support = (bool *)data;
+
+	if (!pci_is_pcie(dev)) {
+		*support = false;
+		return 1;
+	}
+
+	/*
+	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
+	 * For configurations where a Requester with 10-Bit Tag Requester
+	 * capability targets Completers where some do and some do not have
+	 * 10-Bit Tag Completer capability, how the Requester determines which
+	 * NPRs include 10-Bit Tags is outside the scope of this specification.
+	 * So we do not consider hotplug scenario.
+	 */
+	if (dev->is_hotplug_bridge) {
+		*support = false;
+		return 1;
+	}
+
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP)) {
+		*support = false;
+		return 1;
+	}
+
+	return 0;
+}
+
+static void pci_configure_rp_10bit_tag(struct pci_dev *dev)
+{
+	bool support = true;
+
+	if (dev->subordinate == NULL)
+		return;
+
+	/* If no devices under the root port, no need to enable 10-Bit Tag. */
+	if (list_empty(&dev->subordinate->devices))
+		return;
+
+	pci_10bit_tag_comp_support(dev, &support);
+	if (!support)
+		return;
+
+	/*
+	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
+	 * In configurations where a Requester with 10-Bit Tag Requester
+	 * capability needs to target multiple Completers, one needs to ensure
+	 * that the Requester sends 10-Bit Tag Requests only to Completers
+	 * that have 10-Bit Tag Completer capability. So we enable 10-Bit Tag
+	 * Requester for root port only when the devices under the root port
+	 * support 10-Bit Tag Completer.
+	 */
+	pci_walk_bus(dev->subordinate, pci_10bit_tag_comp_support, &support);
+	if (!support)
+		return;
+
+	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
+		return;
+
+	pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
+	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
+				 PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+}
+
 /*
  * pcie_portdrv_probe - Probe PCI-Express port devices
  * @dev: PCI-Express port device being probed
@@ -111,6 +177,9 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
 	     (type != PCI_EXP_TYPE_RC_EC)))
 		return -ENODEV;
 
+	if (type == PCI_EXP_TYPE_ROOT_PORT)
+		pci_configure_rp_10bit_tag(dev);
+
 	if (type == PCI_EXP_TYPE_RC_EC)
 		pcie_link_rcec(dev);
 
-- 
2.7.4



* [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-23 11:06 [PATCH V6 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (5 preceding siblings ...)
  2021-07-23 11:06 ` [PATCH V6 6/8] PCI: Enable 10-Bit tag support for PCIe RP devices Dongdong Liu
@ 2021-07-23 11:06 ` Dongdong Liu
  2021-07-23 11:32   ` Leon Romanovsky
  2021-07-23 16:58   ` kernel test robot
  2021-07-23 11:06 ` [PATCH V6 8/8] PCI/P2PDMA: Add a 10-bit tag check in P2PDMA Dongdong Liu
  7 siblings, 2 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-23 11:06 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
sending Requests to other Endpoints (as opposed to host memory), the
Endpoint must not send 10-Bit Tag Requests to another given Endpoint
unless an implementation-specific mechanism determines that the Endpoint
supports 10-Bit Tag Completer capability. Add the "pci=disable_10bit_tag="
parameter to disable the 10-Bit Tag Requester on a device whose peer does
not support the 10-Bit Tag Completer. This will make P2P traffic safe.
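
The list handling can be illustrated with a deliberately simplified
userspace matcher. It treats each entry as a plain device name such as
"0000:03:00.0" and compares whole entries separated by ';' or ',',
whereas the kernel code uses pci_dev_str_match(), which accepts richer
device specifications. model_param_lists_device() is a hypothetical
helper, not a kernel function.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if `name` appears as one complete entry in `param`,
 * a list of device names separated by ';' or ','. A NULL `param`
 * means the option was not given on the command line.
 */
static bool model_param_lists_device(const char *param, const char *name)
{
	size_t name_len;
	const char *p;

	if (!param || !name)
		return false;

	name_len = strlen(name);
	p = param;
	while (*p) {
		/* Length of the current entry, up to the next separator. */
		size_t len = strcspn(p, ";,");

		if (len == name_len && strncmp(p, name, len) == 0)
			return true;

		p += len;
		if (*p)
			p++; /* skip the separator */
	}
	return false;
}
```

With a command line containing
pci=disable_10bit_tag=0000:03:00.0;0000:04:00.0 this model would report
a match for either of the two listed functions and for no others.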

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 Documentation/admin-guide/kernel-parameters.txt |  7 ++++
 drivers/pci/pci.c                               | 56 +++++++++++++++++++++++++
 drivers/pci/pci.h                               |  1 +
 drivers/pci/pcie/portdrv_pci.c                  | 13 +++---
 drivers/pci/probe.c                             |  9 ++--
 5 files changed, 78 insertions(+), 8 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index bdb2200..c2c4585 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4019,6 +4019,13 @@
 				bridges without forcing it upstream. Note:
 				this removes isolation between devices and
 				may put more devices in an IOMMU group.
+		disable_10bit_tag=<pci_dev>[; ...]
+				  Specify one or more PCI devices (in the format
+				  specified above) separated by semicolons.
+				  Disable 10-Bit Tag Requester if the peer
+				  device does not support the 10-Bit Tag
+				  Completer. This makes P2P traffic safe.
+
 		force_floating	[S390] Force usage of floating interrupts.
 		nomio		[S390] Do not use MIO instructions.
 		norid		[S390] ignore the RID field and force use of
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index d14c573..8494e4f 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -6568,6 +6568,59 @@ int pci_bus_find_domain_nr(struct pci_bus *bus, struct device *parent)
 }
 #endif
 
+static const char *disable_10bit_tag_param;
+
+void pci_disable_10bit_tag(struct pci_dev *dev)
+{
+	int ret = 0;
+	const char *p;
+#ifdef CONFIG_PCI_IOV
+	struct pci_sriov *iov;
+#endif
+
+	if (!disable_10bit_tag_param)
+		return;
+
+	p = disable_10bit_tag_param;
+	while (*p) {
+		ret = pci_dev_str_match(dev, p, &p);
+		if (ret < 0) {
+			pr_info_once("PCI: Can't parse disable_10bit_tag parameter: %s\n",
+				     disable_10bit_tag_param);
+
+			break;
+		} else if (ret == 1) {
+			/* Found a match */
+			break;
+		}
+
+		if (*p != ';' && *p != ',') {
+			/* End of param or invalid format */
+			break;
+		}
+		p++;
+	}
+
+	if (ret != 1)
+		return;
+
+#ifdef CONFIG_PCI_IOV
+	if (dev->is_virtfn) {
+		iov = dev->physfn->sriov;
+		iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
+		pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL,
+				      iov->ctrl);
+		pci_info(dev, "disabled PF SRIOV 10-Bit Tag Requester\n");
+		return;
+#endif
+	}
+
+	pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
+				   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+
+	pci_info(dev, "disabled 10-Bit Tag Requester\n");
+}
+
 /**
  * pci_ext_cfg_avail - can we access extended PCI config space?
  *
@@ -6643,6 +6696,8 @@ static int __init pci_setup(char *str)
 				pci_add_flags(PCI_SCAN_ALL_PCIE_DEVS);
 			} else if (!strncmp(str, "disable_acs_redir=", 18)) {
 				disable_acs_redir_param = str + 18;
+			} else if (!strncmp(str, "disable_10bit_tag=", 18)) {
+				disable_10bit_tag_param = str + 18;
 			} else {
 				pr_err("PCI: Unknown option `%s'\n", str);
 			}
@@ -6667,6 +6722,7 @@ static int __init pci_realloc_setup_params(void)
 	resource_alignment_param = kstrdup(resource_alignment_param,
 					   GFP_KERNEL);
 	disable_acs_redir_param = kstrdup(disable_acs_redir_param, GFP_KERNEL);
+	disable_10bit_tag_param = kstrdup(disable_10bit_tag_param, GFP_KERNEL);
 
 	return 0;
 }
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index 93dcdd4..87c8187 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -16,6 +16,7 @@ extern bool pci_early_dump;
 
 bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
 bool pcie_cap_has_rtctl(const struct pci_dev *dev);
+void pci_disable_10bit_tag(struct pci_dev *dev);
 
 /* Functions internal to the PCI core code */
 
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index 2382cd2..747728e 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -125,15 +125,15 @@ static void pci_configure_rp_10bit_tag(struct pci_dev *dev)
 	bool support = true;
 
 	if (dev->subordinate == NULL)
-		return;
+		goto disable_10bit_tag_req;
 
 	/* If no devices under the root port, no need to enable 10-Bit Tag. */
 	if (list_empty(&dev->subordinate->devices))
-		return;
+		goto disable_10bit_tag_req;
 
 	pci_10bit_tag_comp_support(dev, &support);
 	if (!support)
-		return;
+		goto disable_10bit_tag_req;
 
 	/*
 	 * PCIe spec 5.0r1.0 section 2.2.6.2 implementation note.
@@ -146,14 +146,17 @@ static void pci_configure_rp_10bit_tag(struct pci_dev *dev)
 	 */
 	pci_walk_bus(dev->subordinate, pci_10bit_tag_comp_support, &support);
 	if (!support)
-		return;
+		goto disable_10bit_tag_req;
 
 	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_REQ))
-		return;
+		goto disable_10bit_tag_req;
 
 	pci_dbg(dev, "enabling 10-Bit Tag Requester\n");
 	pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
 				 PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+
+disable_10bit_tag_req:
+	pci_disable_10bit_tag(dev);
 }
 
 /*
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 3da7baa..0b7b053 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2034,11 +2034,11 @@ static void pci_configure_10bit_tags(struct pci_dev *dev)
 	struct pci_dev *bridge;
 
 	if (!(dev->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP))
-		return;
+		goto disable_10bit_tag_req;
 
 	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT) {
 		dev->ext_10bit_tag = 1;
-		return;
+		goto disable_10bit_tag_req;
 	}
 
 	bridge = pci_upstream_bridge(dev);
@@ -2050,7 +2050,7 @@ static void pci_configure_10bit_tags(struct pci_dev *dev)
 	 * for VF.
 	 */
 	if (dev->is_virtfn)
-		return;
+		goto disable_10bit_tag_req;
 
 	if (pci_pcie_type(dev) == PCI_EXP_TYPE_ENDPOINT &&
 	    dev->ext_10bit_tag == 1 &&
@@ -2059,6 +2059,9 @@ static void pci_configure_10bit_tags(struct pci_dev *dev)
 		pcie_capability_set_word(dev, PCI_EXP_DEVCTL2,
 					PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
 	}
+
+disable_10bit_tag_req:
+	pci_disable_10bit_tag(dev);
 }
 
 int pci_configure_extended_tags(struct pci_dev *dev, void *ign)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH V6 8/8] PCI/P2PDMA: Add a 10-bit tag check in P2PDMA
  2021-07-23 11:06 [PATCH V6 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
                   ` (6 preceding siblings ...)
  2021-07-23 11:06 ` [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support Dongdong Liu
@ 2021-07-23 11:06 ` Dongdong Liu
  2021-07-23 16:25   ` Logan Gunthorpe
  7 siblings, 1 reply; 21+ messages in thread
From: Dongdong Liu @ 2021-07-23 11:06 UTC (permalink / raw)
  To: helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
10-Bit Tag Requester enabled doesn't interact with a device that does not
support 10-Bit Tag Completer. Before that happens, the kernel should
emit a warning telling the user to add the "pci=disable_10bit_tag="
kernel parameter.

Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
---
 drivers/pci/p2pdma.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
index 50cdde3..bd93840 100644
--- a/drivers/pci/p2pdma.c
+++ b/drivers/pci/p2pdma.c
@@ -19,6 +19,7 @@
 #include <linux/random.h>
 #include <linux/seq_buf.h>
 #include <linux/xarray.h>
+#include "pci.h"
 
 enum pci_p2pdma_map_type {
 	PCI_P2PDMA_MAP_UNKNOWN = 0,
@@ -541,6 +542,39 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
 	return map_type;
 }
 
+
+static bool check_10bit_tags_vaild(struct pci_dev *a, struct pci_dev *b,
+				   bool verbose)
+{
+	bool req;
+	bool comp;
+	u16 ctl2;
+
+	if (a->is_virtfn) {
+#ifdef CONFIG_PCI_IOV
+		req = !!(a->physfn->sriov->ctrl &
+			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
+#endif
+	} else {
+		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl2);
+		req = !!(ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
+	}
+
+	comp = !!(b->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
+	if (req && (!comp)) {
+		if (verbose) {
+			pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set in device (%s), but peer device (%s) does not support the 10-Bit Tag Completer\n",
+				 pci_name(a), pci_name(b));
+
+			pci_warn(a, "to disable 10-Bit Tag Requester for this device, add the kernel parameter: pci=disable_10bit_tag=%s\n",
+				 pci_name(a));
+		}
+		return false;
+	}
+
+	return true;
+}
+
 /**
  * pci_p2pdma_distance_many - Determine the cumulative distance between
  *	a p2pdma provider and the clients in use.
@@ -579,6 +613,10 @@ int pci_p2pdma_distance_many(struct pci_dev *provider, struct device **clients,
 			return -1;
 		}
 
+		if (!check_10bit_tags_vaild(pci_client, provider, verbose) ||
+		    !check_10bit_tags_vaild(provider, pci_client, verbose))
+			not_supported = true;
+
 		map = calc_map_type_and_dist(provider, pci_client, &distance,
 					     verbose);
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-23 11:06 ` [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support Dongdong Liu
@ 2021-07-23 11:32   ` Leon Romanovsky
  2021-07-23 16:20     ` Logan Gunthorpe
  2021-07-23 16:58   ` kernel test robot
  1 sibling, 1 reply; 21+ messages in thread
From: Leon Romanovsky @ 2021-07-23 11:32 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Fri, Jul 23, 2021 at 07:06:41PM +0800, Dongdong Liu wrote:
> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> sending Requests to other Endpoints (as opposed to host memory), the
> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> unless an implementation-specific mechanism determines that the Endpoint
> supports 10-Bit Tag Completer capability. Add "pci=disable_10bit_tag="
> parameter to disable 10-Bit Tag Requester if the peer device does not
> support the 10-Bit Tag Completer. This will make P2P traffic safe.
> 
> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>  Documentation/admin-guide/kernel-parameters.txt |  7 ++++
>  drivers/pci/pci.c                               | 56 +++++++++++++++++++++++++
>  drivers/pci/pci.h                               |  1 +
>  drivers/pci/pcie/portdrv_pci.c                  | 13 +++---
>  drivers/pci/probe.c                             |  9 ++--
>  5 files changed, 78 insertions(+), 8 deletions(-)
> 
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index bdb2200..c2c4585 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4019,6 +4019,13 @@
>  				bridges without forcing it upstream. Note:
>  				this removes isolation between devices and
>  				may put more devices in an IOMMU group.
> +		disable_10bit_tag=<pci_dev>[; ...]
> +				  Specify one or more PCI devices (in the format
> +				  specified above) separated by semicolons.
> +				  Disable 10-Bit Tag Requester if the peer
> +				  device does not support the 10-Bit Tag
> +				  Completer. This makes P2P traffic safe.

I can't imagine a more awkward user experience than such a kernel parameter.

As a user, I will need to boot the system, hope that it works, write
down all PCI device numbers, guess which one doesn't work properly,
update grub with the new command line argument and reboot the system.
After any HW change, this dance has to be repeated.

Thanks

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-23 11:32   ` Leon Romanovsky
@ 2021-07-23 16:20     ` Logan Gunthorpe
  2021-07-25  6:39       ` Leon Romanovsky
  0 siblings, 1 reply; 21+ messages in thread
From: Logan Gunthorpe @ 2021-07-23 16:20 UTC (permalink / raw)
  To: Leon Romanovsky, Dongdong Liu
  Cc: helgaas, hch, kw, linux-pci, rajur, hverkuil-cisco, linux-media, netdev




On 2021-07-23 5:32 a.m., Leon Romanovsky wrote:
> On Fri, Jul 23, 2021 at 07:06:41PM +0800, Dongdong Liu wrote:
>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>> sending Requests to other Endpoints (as opposed to host memory), the
>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>> unless an implementation-specific mechanism determines that the Endpoint
>> supports 10-Bit Tag Completer capability. Add "pci=disable_10bit_tag="
>> parameter to disable 10-Bit Tag Requester if the peer device does not
>> support the 10-Bit Tag Completer. This will make P2P traffic safe.
>>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> ---
>>  Documentation/admin-guide/kernel-parameters.txt |  7 ++++
>>  drivers/pci/pci.c                               | 56 +++++++++++++++++++++++++
>>  drivers/pci/pci.h                               |  1 +
>>  drivers/pci/pcie/portdrv_pci.c                  | 13 +++---
>>  drivers/pci/probe.c                             |  9 ++--
>>  5 files changed, 78 insertions(+), 8 deletions(-)
>>
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>> index bdb2200..c2c4585 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -4019,6 +4019,13 @@
>>  				bridges without forcing it upstream. Note:
>>  				this removes isolation between devices and
>>  				may put more devices in an IOMMU group.
>> +		disable_10bit_tag=<pci_dev>[; ...]
>> +				  Specify one or more PCI devices (in the format
>> +				  specified above) separated by semicolons.
>> +				  Disable 10-Bit Tag Requester if the peer
>> +				  device does not support the 10-Bit Tag
>> +				  Completer. This makes P2P traffic safe.
> 
> I can't imagine a more awkward user experience than such a kernel parameter.
> 
> As a user, I will need to boot the system, hope that it works, write
> down all PCI device numbers, guess which one doesn't work properly,
> update grub with the new command line argument and reboot the system.
> After any HW change, this dance has to be repeated.

There are already two such PCI parameters with this pattern and they are
not that awkward. pci_dev may be specified with either vendor/device IDs
or with a path of BDFs (which protects against renumbering).

This flag is only useful in P2PDMA traffic, and if the user attempts
such a transfer, it prints a warning (see the next patch) with the exact
parameter that needs to be added to the command line.

This has worked well for disable_acs_redir and was used for
resource_alignment before that for quite some time. So, barring a better
suggestion, I think this is more than acceptable.

Logan
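
[Editorial sketch] The list matching discussed above can be illustrated in
userspace C. This is a simplified stand-in, not the kernel's
pci_dev_str_match(): it assumes only the "pci:<vendor>:<device>" form is
used, and struct mock_dev and dev_in_list() are hypothetical names chosen
for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the fields of struct pci_dev used here. */
struct mock_dev {
	unsigned int vendor;
	unsigned int device;
};

/*
 * Return true if dev matches any "pci:<vendor>:<device>" entry in a
 * list separated by ';' or ','. Parsing stops at the first entry that
 * cannot be parsed, which is treated as "no match".
 */
static bool dev_in_list(const struct mock_dev *dev, const char *list)
{
	const char *p = list;

	assert(dev != NULL);
	if (!p)
		return false;

	while (*p) {
		unsigned int vendor, device;
		int consumed = 0;

		if (sscanf(p, "pci:%x:%x%n", &vendor, &device, &consumed) != 2)
			return false;	/* cannot parse this entry */
		if (vendor == dev->vendor && device == dev->device)
			return true;	/* found a match */

		p += consumed;
		if (*p != ';' && *p != ',')
			break;		/* end of list or invalid format */
		p++;
	}
	return false;
}
```

Matching by vendor/device ID, as in this sketch, does not depend on bus
numbering, which is the property mentioned above for the real parameter
formats.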

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V6 8/8] PCI/P2PDMA: Add a 10-bit tag check in P2PDMA
  2021-07-23 11:06 ` [PATCH V6 8/8] PCI/P2PDMA: Add a 10-bit tag check in P2PDMA Dongdong Liu
@ 2021-07-23 16:25   ` Logan Gunthorpe
  2021-07-24 10:36     ` Dongdong Liu
  0 siblings, 1 reply; 21+ messages in thread
From: Logan Gunthorpe @ 2021-07-23 16:25 UTC (permalink / raw)
  To: Dongdong Liu, helgaas, hch, kw, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev




On 2021-07-23 5:06 a.m., Dongdong Liu wrote:
> Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
> 10-Bit Tag Requester enabled doesn't interact with a device that does not
> support 10-Bit Tag Completer. Before that happens, the kernel should
> emit a warning telling the user to add the "pci=disable_10bit_tag="
> kernel parameter.
> 
> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> ---
>  drivers/pci/p2pdma.c | 38 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 38 insertions(+)
> 
> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
> index 50cdde3..bd93840 100644
> --- a/drivers/pci/p2pdma.c
> +++ b/drivers/pci/p2pdma.c
> @@ -19,6 +19,7 @@
>  #include <linux/random.h>
>  #include <linux/seq_buf.h>
>  #include <linux/xarray.h>
> +#include "pci.h"
>  
>  enum pci_p2pdma_map_type {
>  	PCI_P2PDMA_MAP_UNKNOWN = 0,
> @@ -541,6 +542,39 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
>  	return map_type;
>  }
>  
> +
> +static bool check_10bit_tags_vaild(struct pci_dev *a, struct pci_dev *b,
> +				   bool verbose)
> +{
> +	bool req;
> +	bool comp;
> +	u16 ctl2;
> +
> +	if (a->is_virtfn) {
> +#ifdef CONFIG_PCI_IOV
> +		req = !!(a->physfn->sriov->ctrl &
> +			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
> +#endif
> +	} else {
> +		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl2);
> +		req = !!(ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
> +	}
> +
> +	comp = !!(b->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
> +	if (req && (!comp)) {
> +		if (verbose) {
> +			pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set in device (%s), but peer device (%s) does not support the 10-Bit Tag Completer\n",
> +				 pci_name(a), pci_name(b));
> +
> +			pci_warn(a, "to disable 10-Bit Tag Requester for this device, add the kernel parameter: pci=disable_10bit_tag=%s\n",
> +				 pci_name(a));
> +		}
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
>  /**
>   * pci_p2pdma_distance_many - Determine the cumulative distance between
>   *	a p2pdma provider and the clients in use.
> @@ -579,6 +613,10 @@ int pci_p2pdma_distance_many(struct pci_dev *provider, struct device **clients,
>  			return -1;
>  		}
>  
> +		if (!check_10bit_tags_vaild(pci_client, provider, verbose) ||
> +		    !check_10bit_tags_vaild(provider, pci_client, verbose))
> +			not_supported = true;
> +

This check needs to be done in calc_map_type_and_dist(). The mapping
type needs to be correctly stored in the xarray cache as other functions
rely on the cached value (and upcoming work will be calling
calc_map_type_and_dist() without pci_p2pdma_distance_many()).

Logan
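
[Editorial sketch] One way to read the request above is that the tag check
should live in the function that computes and caches the mapping type, so
that every later lookup sees the same answer. The userspace C below only
illustrates that shape. struct mock_dev, tags_valid() and calc_map_type()
are hypothetical names, the single "cached" field stands in for the xarray
entry, and the topology result is simply assumed to be "supported".

```c
#include <assert.h>
#include <stdbool.h>

/* Mock mapping types; names echo the patch, values are illustrative. */
enum map_type {
	MAP_UNKNOWN = 0,
	MAP_NOT_SUPPORTED,
	MAP_BUS_ADDR,
};

/* Hypothetical stand-in for the device state the check looks at. */
struct mock_dev {
	bool tag_req_enabled;	/* 10-Bit Tag Requester Enable is set */
	bool tag_comp_capable;	/* 10-Bit Tag Completer is supported */
	enum map_type cached;	/* stands in for the cached per-pair entry */
};

/* A requester using 10-bit tags needs a completer that accepts them. */
static bool tags_valid(const struct mock_dev *req, const struct mock_dev *comp)
{
	return !req->tag_req_enabled || comp->tag_comp_capable;
}

/*
 * Perform the tag check inside the function that computes and caches
 * the mapping type, so callers that rely on the cache get the same result.
 */
static enum map_type calc_map_type(struct mock_dev *provider,
				   struct mock_dev *client)
{
	enum map_type map = MAP_BUS_ADDR;	/* assume topology allows P2P */

	/* Check both directions, as the patch does. */
	if (!tags_valid(client, provider) || !tags_valid(provider, client))
		map = MAP_NOT_SUPPORTED;

	client->cached = map;	/* the cached value now reflects the check */
	return map;
}
```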

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-23 11:06 ` [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support Dongdong Liu
  2021-07-23 11:32   ` Leon Romanovsky
@ 2021-07-23 16:58   ` kernel test robot
  2021-07-24 10:35     ` Dongdong Liu
  1 sibling, 1 reply; 21+ messages in thread
From: kernel test robot @ 2021-07-23 16:58 UTC (permalink / raw)
  To: Dongdong Liu, helgaas, hch, kw, logang, linux-pci, rajur, hverkuil-cisco
  Cc: clang-built-linux, kbuild-all, linux-media, netdev

[-- Attachment #1: Type: text/plain, Size: 6758 bytes --]

Hi Dongdong,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on pci/next]
[also build test WARNING on linuxtv-media/master linus/master v5.14-rc2 next-20210723]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Dongdong-Liu/PCI-Enable-10-Bit-tag-support-for-PCIe-devices/20210723-190930
base:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
config: x86_64-randconfig-b001-20210723 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 9625ca5b602616b2f5584e8a49ba93c52c141e40)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # https://github.com/0day-ci/linux/commit/2ff0b803971a3df5815c96c5c4874f4eef64fa2f
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Dongdong-Liu/PCI-Enable-10-Bit-tag-support-for-PCIe-devices/20210723-190930
        git checkout 2ff0b803971a3df5815c96c5c4874f4eef64fa2f
        # save the attached .config to linux build tree
        mkdir build_dir
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/pci/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/pci/pci.c:6618:34: error: expected identifier
           pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
                                           ^
   include/uapi/linux/pci_regs.h:657:26: note: expanded from macro 'PCI_EXP_DEVCTL2'
   #define PCI_EXP_DEVCTL2         40      /* Device Control 2 */
                                   ^
>> drivers/pci/pci.c:6618:2: warning: declaration specifier missing, defaulting to 'int'
           pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
           ^
           int
   drivers/pci/pci.c:6618:28: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
           pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
                                     ^
   drivers/pci/pci.c:6618:2: error: conflicting types for 'pcie_capability_clear_word'
           pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
           ^
   include/linux/pci.h:1161:19: note: previous definition is here
   static inline int pcie_capability_clear_word(struct pci_dev *dev, int pos,
                     ^
   drivers/pci/pci.c:6621:2: error: expected parameter declarator
           pci_info(dev, "disabled 10-Bit Tag Requester\n");
           ^
   include/linux/pci.h:2472:46: note: expanded from macro 'pci_info'
   #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
                                                    ^
   drivers/pci/pci.c:6621:2: error: expected ')'
   include/linux/pci.h:2472:46: note: expanded from macro 'pci_info'
   #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
                                                    ^
   drivers/pci/pci.c:6621:2: note: to match this '('
   include/linux/pci.h:2472:37: note: expanded from macro 'pci_info'
   #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
                                           ^
   include/linux/dev_printk.h:118:11: note: expanded from macro 'dev_info'
           _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
                    ^
   drivers/pci/pci.c:6621:2: warning: declaration specifier missing, defaulting to 'int'
           pci_info(dev, "disabled 10-Bit Tag Requester\n");
           ^
           int
   include/linux/pci.h:2472:37: note: expanded from macro 'pci_info'
   #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
                                           ^
   include/linux/dev_printk.h:118:2: note: expanded from macro 'dev_info'
           _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
           ^
   drivers/pci/pci.c:6621:2: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
   include/linux/pci.h:2472:37: note: expanded from macro 'pci_info'
   #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
                                           ^
   include/linux/dev_printk.h:118:11: note: expanded from macro 'dev_info'
           _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
                    ^
   drivers/pci/pci.c:6621:2: error: conflicting types for '_dev_info'
   include/linux/pci.h:2472:37: note: expanded from macro 'pci_info'
   #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
                                           ^
   include/linux/dev_printk.h:118:2: note: expanded from macro 'dev_info'
           _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
           ^
   include/linux/dev_printk.h:56:6: note: previous declaration is here
   void _dev_info(const struct device *dev, const char *fmt, ...);
        ^
   drivers/pci/pci.c:6622:1: error: extraneous closing brace ('}')
   }
   ^
   2 warnings and 8 errors generated.


vim +/int +6618 drivers/pci/pci.c

  6580	
  6581		if (!disable_10bit_tag_param)
  6582			return;
  6583	
  6584		p = disable_10bit_tag_param;
  6585		while (*p) {
  6586			ret = pci_dev_str_match(dev, p, &p);
  6587			if (ret < 0) {
  6588				pr_info_once("PCI: Can't parse disable_10bit_tag parameter: %s\n",
  6589					     disable_10bit_tag_param);
  6590	
  6591				break;
  6592			} else if (ret == 1) {
  6593				/* Found a match */
  6594				break;
  6595			}
  6596	
  6597			if (*p != ';' && *p != ',') {
  6598				/* End of param or invalid format */
  6599				break;
  6600			}
  6601			p++;
  6602		}
  6603	
  6604		if (ret != 1)
  6605			return;
  6606	
  6607	#ifdef CONFIG_PCI_IOV
  6608		if (dev->is_virtfn) {
  6609			iov = dev->physfn->sriov;
  6610			iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
  6611			pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL,
  6612					      iov->ctrl);
  6613			pci_info(dev, "disabled PF SRIOV 10-Bit Tag Requester\n");
  6614			return;
  6615	#endif
  6616		}
  6617	
> 6618		pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
  6619					   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
  6620	
  6621		pci_info(dev, "disabled 10-Bit Tag Requester\n");
  6622	}
  6623	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28923 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-23 16:58   ` kernel test robot
@ 2021-07-24 10:35     ` Dongdong Liu
  0 siblings, 0 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-24 10:35 UTC (permalink / raw)
  To: kernel test robot, helgaas, hch, kw, logang, linux-pci, rajur,
	hverkuil-cisco
  Cc: clang-built-linux, kbuild-all, linux-media, netdev



On 2021/7/24 0:58, kernel test robot wrote:
> Hi Dongdong,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on pci/next]
> [also build test WARNING on linuxtv-media/master linus/master v5.14-rc2 next-20210723]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url:    https://github.com/0day-ci/linux/commits/Dongdong-Liu/PCI-Enable-10-Bit-tag-support-for-PCIe-devices/20210723-190930
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git next
> config: x86_64-randconfig-b001-20210723 (attached as .config)
> compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 9625ca5b602616b2f5584e8a49ba93c52c141e40)
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # install x86_64 cross compiling tool for clang build
>         # apt-get install binutils-x86-64-linux-gnu
>         # https://github.com/0day-ci/linux/commit/2ff0b803971a3df5815c96c5c4874f4eef64fa2f
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review Dongdong-Liu/PCI-Enable-10-Bit-tag-support-for-PCIe-devices/20210723-190930
>         git checkout 2ff0b803971a3df5815c96c5c4874f4eef64fa2f
>         # save the attached .config to linux build tree
>         mkdir build_dir
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/pci/
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All warnings (new ones prefixed by >>):
>
>    drivers/pci/pci.c:6618:34: error: expected identifier
>            pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
>                                            ^
>    include/uapi/linux/pci_regs.h:657:26: note: expanded from macro 'PCI_EXP_DEVCTL2'
>    #define PCI_EXP_DEVCTL2         40      /* Device Control 2 */
>                                    ^
>>> drivers/pci/pci.c:6618:2: warning: declaration specifier missing, defaulting to 'int'
>            pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
>            ^
>            int
>    drivers/pci/pci.c:6618:28: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
>            pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
>                                      ^
>    drivers/pci/pci.c:6618:2: error: conflicting types for 'pcie_capability_clear_word'
>            pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
>            ^
>    include/linux/pci.h:1161:19: note: previous definition is here
>    static inline int pcie_capability_clear_word(struct pci_dev *dev, int pos,
>                      ^
>    drivers/pci/pci.c:6621:2: error: expected parameter declarator
>            pci_info(dev, "disabled 10-Bit Tag Requester\n");
>            ^
>    include/linux/pci.h:2472:46: note: expanded from macro 'pci_info'
>    #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
>                                                     ^
>    drivers/pci/pci.c:6621:2: error: expected ')'
>    include/linux/pci.h:2472:46: note: expanded from macro 'pci_info'
>    #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
>                                                     ^
>    drivers/pci/pci.c:6621:2: note: to match this '('
>    include/linux/pci.h:2472:37: note: expanded from macro 'pci_info'
>    #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
>                                            ^
>    include/linux/dev_printk.h:118:11: note: expanded from macro 'dev_info'
>            _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
>                     ^
>    drivers/pci/pci.c:6621:2: warning: declaration specifier missing, defaulting to 'int'
>            pci_info(dev, "disabled 10-Bit Tag Requester\n");
>            ^
>            int
>    include/linux/pci.h:2472:37: note: expanded from macro 'pci_info'
>    #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
>                                            ^
>    include/linux/dev_printk.h:118:2: note: expanded from macro 'dev_info'
>            _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
>            ^
>    drivers/pci/pci.c:6621:2: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
>    include/linux/pci.h:2472:37: note: expanded from macro 'pci_info'
>    #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
>                                            ^
>    include/linux/dev_printk.h:118:11: note: expanded from macro 'dev_info'
>            _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
>                     ^
>    drivers/pci/pci.c:6621:2: error: conflicting types for '_dev_info'
>    include/linux/pci.h:2472:37: note: expanded from macro 'pci_info'
>    #define pci_info(pdev, fmt, arg...)     dev_info(&(pdev)->dev, fmt, ##arg)
>                                            ^
>    include/linux/dev_printk.h:118:2: note: expanded from macro 'dev_info'
>            _dev_info(dev, dev_fmt(fmt), ##__VA_ARGS__)
>            ^
>    include/linux/dev_printk.h:56:6: note: previous declaration is here
>    void _dev_info(const struct device *dev, const char *fmt, ...);
>         ^
>    drivers/pci/pci.c:6622:1: error: extraneous closing brace ('}')
>    }
>    ^
>    2 warnings and 8 errors generated.
>
>
> vim +/int +6618 drivers/pci/pci.c
>
>   6580	
>   6581		if (!disable_10bit_tag_param)
>   6582			return;
>   6583	
>   6584		p = disable_10bit_tag_param;
>   6585		while (*p) {
>   6586			ret = pci_dev_str_match(dev, p, &p);
>   6587			if (ret < 0) {
>   6588				pr_info_once("PCI: Can't parse disable_10bit_tag parameter: %s\n",
>   6589					     disable_10bit_tag_param);
>   6590	
>   6591				break;
>   6592			} else if (ret == 1) {
>   6593				/* Found a match */
>   6594				break;
>   6595			}
>   6596	
>   6597			if (*p != ';' && *p != ',') {
>   6598				/* End of param or invalid format */
>   6599				break;
>   6600			}
>   6601			p++;
>   6602		}
>   6603	
>   6604		if (ret != 1)
>   6605			return;
>   6606	
>   6607	#ifdef CONFIG_PCI_IOV
>   6608		if (dev->is_virtfn) {
>   6609			iov = dev->physfn->sriov;
>   6610			iov->ctrl &= ~PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
>   6611			pci_write_config_word(dev, iov->pos + PCI_SRIOV_CTRL,
>   6612					      iov->ctrl);
>   6613			pci_info(dev, "disabled PF SRIOV 10-Bit Tag Requester\n");
>   6614			return;
>   6615	#endif
>   6616		}
I made a mistake here, will fix.

Thanks,
Dongdong
>   6617	
>> 6618		pcie_capability_clear_word(dev, PCI_EXP_DEVCTL2,
>   6619					   PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>   6620	
>   6621		pci_info(dev, "disabled 10-Bit Tag Requester\n");
>   6622	}
>   6623	
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
>
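
The "extraneous closing brace" error above comes from the `#endif` sitting before the `}` that closes the `if (dev->is_virtfn) {` block, so with CONFIG_PCI_IOV disabled the stray brace ends the function early. A minimal sketch of one possible fix, assuming the whole VF branch moves inside the `#ifdef`; the `mock_*` names, `MOCK_PCI_IOV`, and the bit values are hypothetical stand-ins, not kernel definitions:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel bit definitions (illustrative values). */
#define MOCK_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN	0x0020u
#define MOCK_DEVCTL2_10BIT_TAG_REQ_EN		0x1000u

/* Hypothetical model of the few pci_dev fields the function touches. */
struct mock_dev {
	bool is_virtfn;
	uint16_t sriov_ctrl;	/* models dev->physfn->sriov->ctrl */
	uint16_t devctl2;	/* models the Device Control 2 register */
};

/*
 * Define MOCK_PCI_IOV to model CONFIG_PCI_IOV=y. The opening and closing
 * braces of the VF branch are both inside the #ifdef, so the function is
 * balanced whether or not the option is set.
 */
static void mock_disable_10bit_tag(struct mock_dev *dev)
{
	assert(dev);

#ifdef MOCK_PCI_IOV
	if (dev->is_virtfn) {
		dev->sriov_ctrl &= (uint16_t)~MOCK_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN;
		return;
	}
#endif

	dev->devctl2 &= (uint16_t)~MOCK_DEVCTL2_10BIT_TAG_REQ_EN;
}
```

The same placement would apply to the real function: keep `if`, body, and closing brace together on one side of the preprocessor conditional.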


* Re: [PATCH V6 8/8] PCI/P2PDMA: Add a 10-bit tag check in P2PDMA
  2021-07-23 16:25   ` Logan Gunthorpe
@ 2021-07-24 10:36     ` Dongdong Liu
  0 siblings, 0 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-24 10:36 UTC (permalink / raw)
  To: Logan Gunthorpe, helgaas, hch, kw, linux-pci, rajur, hverkuil-cisco
  Cc: linux-media, netdev

Hi Logan
Many thanks for your review.
On 2021/7/24 0:25, Logan Gunthorpe wrote:
>
>
>
> On 2021-07-23 5:06 a.m., Dongdong Liu wrote:
>> Add a 10-Bit Tag check in the P2PDMA code to ensure that a device with
>> 10-Bit Tag Requester doesn't interact with a device that does not
>> support 10-BIT tag Completer. Before that happens, the kernel should
>> emit a warning saying to enable a "pci=disable_10bit_tag=" kernel
>> parameter.
>>
>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>> ---
>>  drivers/pci/p2pdma.c | 38 ++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 38 insertions(+)
>>
>> diff --git a/drivers/pci/p2pdma.c b/drivers/pci/p2pdma.c
>> index 50cdde3..bd93840 100644
>> --- a/drivers/pci/p2pdma.c
>> +++ b/drivers/pci/p2pdma.c
>> @@ -19,6 +19,7 @@
>>  #include <linux/random.h>
>>  #include <linux/seq_buf.h>
>>  #include <linux/xarray.h>
>> +#include "pci.h"
>>
>>  enum pci_p2pdma_map_type {
>>  	PCI_P2PDMA_MAP_UNKNOWN = 0,
>> @@ -541,6 +542,39 @@ calc_map_type_and_dist(struct pci_dev *provider, struct pci_dev *client,
>>  	return map_type;
>>  }
>>
>> +
>> +static bool check_10bit_tags_vaild(struct pci_dev *a, struct pci_dev *b,
>> +				   bool verbose)
>> +{
>> +	bool req;
>> +	bool comp;
>> +	u16 ctl2;
>> +
>> +	if (a->is_virtfn) {
>> +#ifdef CONFIG_PCI_IOV
>> +		req = !!(a->physfn->sriov->ctrl &
>> +			 PCI_SRIOV_CTRL_VF_10BIT_TAG_REQ_EN);
>> +#endif
>> +	} else {
>> +		pcie_capability_read_word(a, PCI_EXP_DEVCTL2, &ctl2);
>> +		req = !!(ctl2 & PCI_EXP_DEVCTL2_10BIT_TAG_REQ_EN);
>> +	}
>> +
>> +	comp = !!(b->pcie_devcap2 & PCI_EXP_DEVCAP2_10BIT_TAG_COMP);
>> +	if (req && (!comp)) {
>> +		if (verbose) {
>> +			pci_warn(a, "cannot be used for peer-to-peer DMA as 10-Bit Tag Requester enable is set in device (%s), but peer device (%s) does not support the 10-Bit Tag Completer\n",
>> +				 pci_name(a), pci_name(b));
>> +
>> +			pci_warn(a, "to disable 10-Bit Tag Requester for this device, add the kernel parameter: pci=disable_10bit_tag=%s\n",
>> +				 pci_name(a));
>> +		}
>> +		return false;
>> +	}
>> +
>> +	return true;
>> +}
>> +
>>  /**
>>   * pci_p2pdma_distance_many - Determine the cumulative distance between
>>   *	a p2pdma provider and the clients in use.
>> @@ -579,6 +613,10 @@ int pci_p2pdma_distance_many(struct pci_dev *provider, struct device **clients,
>>  			return -1;
>>  		}
>>
>> +		if (!check_10bit_tags_vaild(pci_client, provider, verbose) ||
>> +		    !check_10bit_tags_vaild(provider, pci_client, verbose))
>> +			not_supported = true;
>> +
>
> This check needs to be done in calc_map_type_and_dist(). The mapping
> type needs to be correctly stored in the xarray cache as other functions
> rely on the cached value (and upcoming work will be calling
> calc_map_type_and_dist() without pci_p2pdma_distance_many()).
>
Will fix.

Thanks,
Dongdong
> Logan
> .
>
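
Logan's point is that the 10-bit tag result has to be part of the cached mapping type, not a side check in one caller. A minimal sketch of that idea, assuming a simple array in place of the xarray cache and eliding the topology checks; all `mock_*` names are hypothetical and not the p2pdma API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum mock_map_type {
	MOCK_MAP_UNKNOWN = 0,
	MOCK_MAP_NOT_SUPPORTED,
	MOCK_MAP_BUS_ADDR,
};

/* Hypothetical model of a device; id indexes the cache below. */
struct mock_dev {
	int id;
	bool tag_req_enabled;	/* 10-Bit Tag Requester Enable is set */
	bool tag_comp_capable;	/* 10-Bit Tag Completer Supported */
};

#define MOCK_MAX_DEVS 4
static enum mock_map_type mock_cache[MOCK_MAX_DEVS][MOCK_MAX_DEVS];

/* A requester using 10-bit tags needs a completer that supports them. */
static bool mock_tags_valid(const struct mock_dev *req,
			    const struct mock_dev *comp)
{
	return !req->tag_req_enabled || comp->tag_comp_capable;
}

/* Computes the mapping type once and stores it, tag check included. */
static enum mock_map_type mock_calc_map_type(const struct mock_dev *provider,
					     const struct mock_dev *client)
{
	enum mock_map_type type;

	assert(provider && client);

	type = mock_cache[provider->id][client->id];
	if (type != MOCK_MAP_UNKNOWN)
		return type;

	type = MOCK_MAP_BUS_ADDR;	/* topology checks elided in this sketch */
	if (!mock_tags_valid(client, provider) ||
	    !mock_tags_valid(provider, client))
		type = MOCK_MAP_NOT_SUPPORTED;

	mock_cache[provider->id][client->id] = type;
	return type;
}
```

Any later caller that only consults the cache then sees the unsupported pairing without repeating the check.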


* Re: [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-23 16:20     ` Logan Gunthorpe
@ 2021-07-25  6:39       ` Leon Romanovsky
  2021-07-26 15:48         ` Logan Gunthorpe
  0 siblings, 1 reply; 21+ messages in thread
From: Leon Romanovsky @ 2021-07-25  6:39 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Dongdong Liu, helgaas, hch, kw, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Fri, Jul 23, 2021 at 10:20:50AM -0600, Logan Gunthorpe wrote:
> 
> 
> 
> On 2021-07-23 5:32 a.m., Leon Romanovsky wrote:
> > On Fri, Jul 23, 2021 at 07:06:41PM +0800, Dongdong Liu wrote:
> >> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> >> sending Requests to other Endpoints (as opposed to host memory), the
> >> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> >> unless an implementation-specific mechanism determines that the Endpoint
> >> supports 10-Bit Tag Completer capability. Add "pci=disable_10bit_tag="
> >> parameter to disable 10-Bit Tag Requester if the peer device does not
> >> support the 10-Bit Tag Completer. This will make P2P traffic safe.
> >>
> >> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> >> ---
> >>  Documentation/admin-guide/kernel-parameters.txt |  7 ++++
> >>  drivers/pci/pci.c                               | 56 +++++++++++++++++++++++++
> >>  drivers/pci/pci.h                               |  1 +
> >>  drivers/pci/pcie/portdrv_pci.c                  | 13 +++---
> >>  drivers/pci/probe.c                             |  9 ++--
> >>  5 files changed, 78 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> >> index bdb2200..c2c4585 100644
> >> --- a/Documentation/admin-guide/kernel-parameters.txt
> >> +++ b/Documentation/admin-guide/kernel-parameters.txt
> >> @@ -4019,6 +4019,13 @@
> >>  				bridges without forcing it upstream. Note:
> >>  				this removes isolation between devices and
> >>  				may put more devices in an IOMMU group.
> >> +		disable_10bit_tag=<pci_dev>[; ...]
> >> +				  Specify one or more PCI devices (in the format
> >> +				  specified above) separated by semicolons.
> >> +				  Disable 10-Bit Tag Requester if the peer
> >> +				  device does not support the 10-Bit Tag
> >> +				  Completer.This will make P2P traffic safe.
> > 
> > I can't imagine more awkward user experience than such kernel parameter.
> > 
> > As a user, I will need to boot the system, hope for the best that system
> > works, write down all PCI device numbers, guess which one doesn't work
> > properly, update grub with new command line argument and reboot the
> > system. Any HW change and this dance should be repeated.
> 
> There are already two such PCI parameters with this pattern and they are
> not that awkward. pci_dev may be specified with either vendor/device IDS
> or with a path of BDFs (which protects against renumbering).

Unfortunately, in the real world, BDF is not so stable. It changes with
addition of new hardware, BIOS upgrades and even broken servers.

Vendor/device IDs don't work if you have multiple devices of the same
vendor in the system.

> 
> This flag is only useful in P2PDMA traffic, and if the user attempts
> such a transfer, it prints a warning (see the next patch) with the exact
> parameter that needs to be added to the command line.

Dongdong cited the PCI spec, and it was very clear: don't enable this
feature unless you clearly know that it is safe to enable. This is the
complete opposite of the proposal here: always enable, and disable
if something is printed to dmesg.

> 
> This has worked well for disable_acs_redir and was used for
> resource_alignment before that for quite some time. So save a better
> suggestion I think this is more than acceptable.

I don't know about other parameters and their history, but we are not in
the 90s anymore, and new module parameters (for PCI, kernel cmdline
arguments) are better replaced by some configuration tool or sysfs.

Even a FW upgrade with such a kernel parameter can be problematic.

Thanks

> 
> Logan


* Re: [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-25  6:39       ` Leon Romanovsky
@ 2021-07-26 15:48         ` Logan Gunthorpe
  2021-07-27 11:05           ` Leon Romanovsky
  2021-07-27 14:00           ` Dongdong Liu
  0 siblings, 2 replies; 21+ messages in thread
From: Logan Gunthorpe @ 2021-07-26 15:48 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Dongdong Liu, helgaas, hch, kw, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev



On 2021-07-25 12:39 a.m., Leon Romanovsky wrote:
> On Fri, Jul 23, 2021 at 10:20:50AM -0600, Logan Gunthorpe wrote:
>>
>>
>>
>> On 2021-07-23 5:32 a.m., Leon Romanovsky wrote:
>>> On Fri, Jul 23, 2021 at 07:06:41PM +0800, Dongdong Liu wrote:
>>>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>>>> sending Requests to other Endpoints (as opposed to host memory), the
>>>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>>>> unless an implementation-specific mechanism determines that the Endpoint
>>>> supports 10-Bit Tag Completer capability. Add "pci=disable_10bit_tag="
>>>> parameter to disable 10-Bit Tag Requester if the peer device does not
>>>> support the 10-Bit Tag Completer. This will make P2P traffic safe.
>>>>
>>>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>>>> ---
>>>>  Documentation/admin-guide/kernel-parameters.txt |  7 ++++
>>>>  drivers/pci/pci.c                               | 56 +++++++++++++++++++++++++
>>>>  drivers/pci/pci.h                               |  1 +
>>>>  drivers/pci/pcie/portdrv_pci.c                  | 13 +++---
>>>>  drivers/pci/probe.c                             |  9 ++--
>>>>  5 files changed, 78 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>>> index bdb2200..c2c4585 100644
>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>> @@ -4019,6 +4019,13 @@
>>>>  				bridges without forcing it upstream. Note:
>>>>  				this removes isolation between devices and
>>>>  				may put more devices in an IOMMU group.
>>>> +		disable_10bit_tag=<pci_dev>[; ...]
>>>> +				  Specify one or more PCI devices (in the format
>>>> +				  specified above) separated by semicolons.
>>>> +				  Disable 10-Bit Tag Requester if the peer
>>>> +				  device does not support the 10-Bit Tag
>>>> +				  Completer.This will make P2P traffic safe.
>>>
>>> I can't imagine more awkward user experience than such kernel parameter.
>>>
>>> As a user, I will need to boot the system, hope for the best that system
>>> works, write down all PCI device numbers, guess which one doesn't work
>>> properly, update grub with new command line argument and reboot the
>>> system. Any HW change and this dance should be repeated.
>>
>> There are already two such PCI parameters with this pattern and they are
>> not that awkward. pci_dev may be specified with either vendor/device IDS
>> or with a path of BDFs (which protects against renumbering).
> 
> Unfortunately, in the real world, BDF is not so stable. It changes with
> addition of new hardware, BIOS upgrades and even broken servers.

That's why it supports using a *path* of BDFs which tends not to catch
the wrong device if the topology changes.

> Vendor/device IDs don't work if you have multiple devices of the same
> vendor in the system.

Yes, but it's fine for some use cases. That's why there's a range of
options.

>>
>> This flag is only useful in P2PDMA traffic, and if the user attempts
>> such a transfer, it prints a warning (see the next patch) with the exact
>> parameter that needs to be added to the command line.
> 
> Dongdong cited the PCI spec, and it was very clear: don't enable this
> feature unless you clearly know that it is safe to enable. This is the
> complete opposite of the proposal here: always enable, and disable
> if something is printed to dmesg.

Quoting from patch 4:

"For platforms where the RC supports 10-Bit Tag Completer capability,
it is highly recommended for platform firmware or operating software
that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
bit automatically in Endpoints with 10-Bit Tag Requester capability.
This enables the important class of 10-Bit Tag capable adapters that
send Memory Read Requests only to host memory."

Notice the last sentence. It's saying that devices who only talk to host
memory should have 10-bit tags enabled. In the kernel we call devices
that talk to things besides host memory "P2PDMA". So the spec is saying
not to enable 10bit tags for devices participating in P2PDMA. The kernel
needs a way to allow users to do that. The kernel parameter only stops
the feature from being enabled for a specific device, and the only
use-case is P2PDMA which is not that common and requires the user to be
aware of their topology. So I really don't think this is that big a problem.

>>
>> This has worked well for disable_acs_redir and was used for
>> resource_alignment before that for quite some time. So save a better
>> suggestion I think this is more than acceptable.
> 
> I don't know about other parameters and their history, but we are not in
> the 90s anymore, and new module parameters (for PCI, kernel cmdline
> arguments) are better replaced by some configuration tool or sysfs.

The problem was that the ACS bits had to be set before the kernel
enumerated the devices. The IOMMU code simply was not able to support
dynamic adjustments to its groups. I assume changing 10bit tags
dynamically is similarly tricky -- but if it's not then, yes a sysfs
interface in addition to the kernel parameter would be a good idea.

Logan
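
The policy described in this message can be read as a three-input decision made at enumeration time. A minimal sketch under that reading; `mock_want_10bit_tag_req` and its parameters are hypothetical names for illustration, not kernel functions:

```c
#include <stdbool.h>

/*
 * Hypothetical policy helper: should 10-Bit Tag Requester Enable be set on
 * an Endpoint? Enable by default when both ends are capable, unless the
 * device was named in the pci=disable_10bit_tag= parameter.
 */
static bool mock_want_10bit_tag_req(bool rc_completer_capable,
				    bool ep_requester_capable,
				    bool listed_in_disable_param)
{
	if (!rc_completer_capable || !ep_requester_capable)
		return false;

	return !listed_in_disable_param;
}
```

Under this reading the parameter never turns the feature on; it only opts a specific device out of the default.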


* Re: [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-26 15:48         ` Logan Gunthorpe
@ 2021-07-27 11:05           ` Leon Romanovsky
  2021-07-27 14:30             ` Dongdong Liu
  2021-07-27 14:00           ` Dongdong Liu
  1 sibling, 1 reply; 21+ messages in thread
From: Leon Romanovsky @ 2021-07-27 11:05 UTC (permalink / raw)
  To: Logan Gunthorpe
  Cc: Dongdong Liu, helgaas, hch, kw, linux-pci, rajur, hverkuil-cisco,
	linux-media, netdev

On Mon, Jul 26, 2021 at 09:48:57AM -0600, Logan Gunthorpe wrote:
> 
> 
> On 2021-07-25 12:39 a.m., Leon Romanovsky wrote:
> > On Fri, Jul 23, 2021 at 10:20:50AM -0600, Logan Gunthorpe wrote:
> >>
> >>
> >>
> >> On 2021-07-23 5:32 a.m., Leon Romanovsky wrote:
> >>> On Fri, Jul 23, 2021 at 07:06:41PM +0800, Dongdong Liu wrote:
> >>>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> >>>> sending Requests to other Endpoints (as opposed to host memory), the
> >>>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> >>>> unless an implementation-specific mechanism determines that the Endpoint
> >>>> supports 10-Bit Tag Completer capability. Add "pci=disable_10bit_tag="
> >>>> parameter to disable 10-Bit Tag Requester if the peer device does not
> >>>> support the 10-Bit Tag Completer. This will make P2P traffic safe.
> >>>>
> >>>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> >>>> ---
> >>>>  Documentation/admin-guide/kernel-parameters.txt |  7 ++++
> >>>>  drivers/pci/pci.c                               | 56 +++++++++++++++++++++++++
> >>>>  drivers/pci/pci.h                               |  1 +
> >>>>  drivers/pci/pcie/portdrv_pci.c                  | 13 +++---
> >>>>  drivers/pci/probe.c                             |  9 ++--
> >>>>  5 files changed, 78 insertions(+), 8 deletions(-)
> >>>>
> >>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> >>>> index bdb2200..c2c4585 100644
> >>>> --- a/Documentation/admin-guide/kernel-parameters.txt
> >>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
> >>>> @@ -4019,6 +4019,13 @@
> >>>>  				bridges without forcing it upstream. Note:
> >>>>  				this removes isolation between devices and
> >>>>  				may put more devices in an IOMMU group.
> >>>> +		disable_10bit_tag=<pci_dev>[; ...]
> >>>> +				  Specify one or more PCI devices (in the format
> >>>> +				  specified above) separated by semicolons.
> >>>> +				  Disable 10-Bit Tag Requester if the peer
> >>>> +				  device does not support the 10-Bit Tag
> >>>> +				  Completer.This will make P2P traffic safe.
> >>>
> >>> I can't imagine more awkward user experience than such kernel parameter.
> >>>
> >>> As a user, I will need to boot the system, hope for the best that system
> >>> works, write down all PCI device numbers, guess which one doesn't work
> >>> properly, update grub with new command line argument and reboot the
> >>> system. Any HW change and this dance should be repeated.
> >>
> >> There are already two such PCI parameters with this pattern and they are
> >> not that awkward. pci_dev may be specified with either vendor/device IDS
> >> or with a path of BDFs (which protects against renumbering).
> > 
> > Unfortunately, in the real world, BDF is not so stable. It changes with
> > addition of new hardware, BIOS upgrades and even broken servers.
> 
> That's why it supports using a *path* of BDFs which tends not to catch
> the wrong device if the topology changes.
> 
> > Vendor/device IDs don't work if you have multiple devices of the same
> > vendor in the system.
> 
> Yes, but it's fine for some use cases. That's why there's a range of
> options.

The thing is that you are adding a PCI parameter that is applicable to everyone.

We probably see different usage models for this feature. In my world, users
have thousands of servers that run 24x7, with VMs on top, and some of them
perform FW upgrades without stopping anything. The idea that you can reboot
such a server at any time simply doesn't exist.

So if I need to enable/disable this feature for one of the VFs, I will be stuck.

> 
> >>
> >> This flag is only useful in P2PDMA traffic, and if the user attempts
> >> such a transfer, it prints a warning (see the next patch) with the exact
> >> parameter that needs to be added to the command line.
> > 
> > Dongdong cited the PCI spec, and it was very clear: don't enable this
> > feature unless you clearly know that it is safe to enable. This is the
> > complete opposite of the proposal here: always enable, and disable
> > if something is printed to dmesg.
> 
> Quoting from patch 4:
> 
> "For platforms where the RC supports 10-Bit Tag Completer capability,
> it is highly recommended for platform firmware or operating software
> that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
> bit automatically in Endpoints with 10-Bit Tag Requester capability.
> This enables the important class of 10-Bit Tag capable adapters that
> send Memory Read Requests only to host memory."
> 
> Notice the last sentence. It's saying that devices who only talk to host
> memory should have 10-bit tags enabled. In the kernel we call devices
> that talk to things besides host memory "P2PDMA". So the spec is saying
> not to enable 10bit tags for devices participating in P2PDMA. The kernel
> needs a way to allow users to do that. The kernel parameter only stops
> the feature from being enabled for a specific device, and the only
> use-case is P2PDMA which is not that common and requires the user to be
> aware of their topology. So I really don't think this is that big a problem.

I'm not questioning the feature or the need for configuration. My concern
is just *how* this feature is configured.

> 
> >>
> >> This has worked well for disable_acs_redir and was used for
> >> resource_alignment before that for quite some time. So save a better
> >> suggestion I think this is more than acceptable.
> > 
> > I don't know about other parameters and their history, but we are not in
> > the 90s anymore, and new module parameters (for PCI, kernel cmdline
> > arguments) are better replaced by some configuration tool or sysfs.
> 
> The problem was that the ACS bits had to be set before the kernel
> enumerated the devices. The IOMMU code simply was not able to support
> dynamic adjustments to its groups. I assume changing 10bit tags
> dynamically is similarly tricky -- but if it's not then, yes a sysfs
> interface in addition to the kernel parameter would be a good idea.

I think that it is doable with a combination of disabling drivers_autoprobe
and some sysfs knob to enable/disable this feature before driver bind.

It should be very similar to what we did for the dynamic MSI-X, see
/sys/bus/pci/devices/.../sriov_vf_msix_count

Thanks

> 
> Logan
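
The sysfs approach suggested above depends on refusing the change once a driver is bound. A minimal sketch of that guard, assuming a hypothetical store handler; `mock_dev` and `mock_tag_req_store` are illustrative names and no such attribute exists in the patch set under review:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the device state the knob would look at. */
struct mock_dev {
	bool driver_bound;	/* a driver is currently bound to the device */
	bool tag_req_enabled;	/* models 10-Bit Tag Requester Enable */
};

/*
 * Hypothetical store handler: allow the bit to change only while no driver
 * is bound, so the device cannot have requests in flight from a driver.
 */
static int mock_tag_req_store(struct mock_dev *dev, bool enable)
{
	assert(dev);

	if (dev->driver_bound)
		return -EBUSY;

	dev->tag_req_enabled = enable;
	return 0;
}
```

With drivers_autoprobe disabled, an administrator would write the knob first and bind the driver afterwards; a write against a bound device fails with -EBUSY and leaves the bit unchanged.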


* Re: [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-26 15:48         ` Logan Gunthorpe
  2021-07-27 11:05           ` Leon Romanovsky
@ 2021-07-27 14:00           ` Dongdong Liu
  1 sibling, 0 replies; 21+ messages in thread
From: Dongdong Liu @ 2021-07-27 14:00 UTC (permalink / raw)
  To: Logan Gunthorpe, Leon Romanovsky
  Cc: helgaas, hch, kw, linux-pci, rajur, hverkuil-cisco, linux-media, netdev



On 2021/7/26 23:48, Logan Gunthorpe wrote:
>
>
> On 2021-07-25 12:39 a.m., Leon Romanovsky wrote:
>> On Fri, Jul 23, 2021 at 10:20:50AM -0600, Logan Gunthorpe wrote:
>>>
>>>
>>>
>>> On 2021-07-23 5:32 a.m., Leon Romanovsky wrote:
>>>> On Fri, Jul 23, 2021 at 07:06:41PM +0800, Dongdong Liu wrote:
>>>>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>>>>> sending Requests to other Endpoints (as opposed to host memory), the
>>>>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>>>>> unless an implementation-specific mechanism determines that the Endpoint
>>>>> supports 10-Bit Tag Completer capability. Add "pci=disable_10bit_tag="
>>>>> parameter to disable 10-Bit Tag Requester if the peer device does not
>>>>> support the 10-Bit Tag Completer. This will make P2P traffic safe.
>>>>>
>>>>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>>>>> ---
>>>>>  Documentation/admin-guide/kernel-parameters.txt |  7 ++++
>>>>>  drivers/pci/pci.c                               | 56 +++++++++++++++++++++++++
>>>>>  drivers/pci/pci.h                               |  1 +
>>>>>  drivers/pci/pcie/portdrv_pci.c                  | 13 +++---
>>>>>  drivers/pci/probe.c                             |  9 ++--
>>>>>  5 files changed, 78 insertions(+), 8 deletions(-)
>>>>>
>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>>>> index bdb2200..c2c4585 100644
>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>>> @@ -4019,6 +4019,13 @@
>>>>>  				bridges without forcing it upstream. Note:
>>>>>  				this removes isolation between devices and
>>>>>  				may put more devices in an IOMMU group.
>>>>> +		disable_10bit_tag=<pci_dev>[; ...]
>>>>> +				  Specify one or more PCI devices (in the format
>>>>> +				  specified above) separated by semicolons.
>>>>> +				  Disable 10-Bit Tag Requester if the peer
>>>>> +				  device does not support the 10-Bit Tag
>>>>> +				  Completer.This will make P2P traffic safe.
>>>>
>>>> I can't imagine more awkward user experience than such kernel parameter.
>>>>
>>>> As a user, I will need to boot the system, hope for the best that system
>>>> works, write down all PCI device numbers, guess which one doesn't work
>>>> properly, update grub with new command line argument and reboot the
>>>> system. Any HW change and this dance should be repeated.
>>>
>>> There are already two such PCI parameters with this pattern and they are
>>> not that awkward. pci_dev may be specified with either vendor/device IDS
>>> or with a path of BDFs (which protects against renumbering).
>>
>> Unfortunately, in the real world, BDF is not so stable. It changes with
>> addition of new hardware, BIOS upgrades and even broken servers.
>
> That's why it supports using a *path* of BDFs which tends not to catch
> the wrong device if the topology changes.
>
>> Vendor/device IDs don't work if you have multiple devices of the same
>> vendor in the system.
>
> Yes, but it's fine for some use cases. That's why there's a range of
> options.
>
>>>
>>> This flag is only useful in P2PDMA traffic, and if the user attempts
>>> such a transfer, it prints a warning (see the next patch) with the exact
>>> parameter that needs to be added to the command line.
>>
>> Dongdong cited the PCI spec, and it was very clear: don't enable this
>> feature unless you clearly know that it is safe to enable. This is the
>> complete opposite of the proposal here: always enable, and disable
>> if something is printed to dmesg.
>
> Quoting from patch 4:
>
> "For platforms where the RC supports 10-Bit Tag Completer capability,
> it is highly recommended for platform firmware or operating software
> that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
> bit automatically in Endpoints with 10-Bit Tag Requester capability.
> This enables the important class of 10-Bit Tag capable adapters that
> send Memory Read Requests only to host memory."
>
> Notice the last sentence. It's saying that devices who only talk to host
> memory should have 10-bit tags enabled. In the kernel we call devices
> that talk to things besides host memory "P2PDMA". So the spec is saying
> not to enable 10bit tags for devices participating in P2PDMA. The kernel
> needs a way to allow users to do that. The kernel parameter only stops
> the feature from being enabled for a specific device, and the only
> use-case is P2PDMA which is not that common and requires the user to be
> aware of their topology. So I really don't think this is that big a problem.
>
>>>
>>> This has worked well for disable_acs_redir and was used for
>>> resource_alignment before that for quite some time. So save a better
>>> suggestion I think this is more than acceptable.
>>
>> I don't know about other parameters and their history, but we are not in
>> the 90s anymore, and new module parameters (for PCI, kernel cmdline
>> arguments) are better replaced by some configuration tool or sysfs.
>
> The problem was that the ACS bits had to be set before the kernel
> enumerated the devices. The IOMMU code simply was not able to support
> dynamic adjustments to its groups. I assume changing 10bit tags
> dynamically is similarly tricky -- but if it's not then, yes a sysfs
> interface in addition to the kernel parameter would be a good idea.
PCIe spec 5.0 section 7.5.3.16 (Device Control 2 Register), under
10-Bit Tag Requester Enable, says:
"If software changes the value of this bit while the Function
has outstanding Non-Posted Requests, the result is undefined."

So 10-Bit Tag Requester Enable should be set before the device driver
is probed.

Thanks,
Dongdong
>
> Logan
> .
>


* Re: [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-27 11:05           ` Leon Romanovsky
@ 2021-07-27 14:30             ` Dongdong Liu
  2021-07-27 15:41               ` Leon Romanovsky
  0 siblings, 1 reply; 21+ messages in thread
From: Dongdong Liu @ 2021-07-27 14:30 UTC (permalink / raw)
  To: Leon Romanovsky, Logan Gunthorpe
  Cc: helgaas, hch, kw, linux-pci, rajur, hverkuil-cisco, linux-media, netdev



On 2021/7/27 19:05, Leon Romanovsky wrote:
> On Mon, Jul 26, 2021 at 09:48:57AM -0600, Logan Gunthorpe wrote:
>>
>>
>> On 2021-07-25 12:39 a.m., Leon Romanovsky wrote:
>>> On Fri, Jul 23, 2021 at 10:20:50AM -0600, Logan Gunthorpe wrote:
>>>>
>>>>
>>>>
>>>> On 2021-07-23 5:32 a.m., Leon Romanovsky wrote:
>>>>> On Fri, Jul 23, 2021 at 07:06:41PM +0800, Dongdong Liu wrote:
>>>>>> PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
>>>>>> sending Requests to other Endpoints (as opposed to host memory), the
>>>>>> Endpoint must not send 10-Bit Tag Requests to another given Endpoint
>>>>>> unless an implementation-specific mechanism determines that the Endpoint
>>>>>> supports 10-Bit Tag Completer capability. Add "pci=disable_10bit_tag="
>>>>>> parameter to disable 10-Bit Tag Requester if the peer device does not
>>>>>> support the 10-Bit Tag Completer. This will make P2P traffic safe.
>>>>>>
>>>>>> Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
>>>>>> ---
>>>>>>  Documentation/admin-guide/kernel-parameters.txt |  7 ++++
>>>>>>  drivers/pci/pci.c                               | 56 +++++++++++++++++++++++++
>>>>>>  drivers/pci/pci.h                               |  1 +
>>>>>>  drivers/pci/pcie/portdrv_pci.c                  | 13 +++---
>>>>>>  drivers/pci/probe.c                             |  9 ++--
>>>>>>  5 files changed, 78 insertions(+), 8 deletions(-)
>>>>>>
>>>>>> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
>>>>>> index bdb2200..c2c4585 100644
>>>>>> --- a/Documentation/admin-guide/kernel-parameters.txt
>>>>>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>>>>>> @@ -4019,6 +4019,13 @@
>>>>>>  				bridges without forcing it upstream. Note:
>>>>>>  				this removes isolation between devices and
>>>>>>  				may put more devices in an IOMMU group.
>>>>>> +		disable_10bit_tag=<pci_dev>[; ...]
>>>>>> +				  Specify one or more PCI devices (in the format
>>>>>> +				  specified above) separated by semicolons.
>>>>>> +				  Disable 10-Bit Tag Requester if the peer
>>>>>> +				  device does not support the 10-Bit Tag
>>>>>> +				  Completer.This will make P2P traffic safe.
>>>>>
>>>>> I can't imagine more awkward user experience than such kernel parameter.
>>>>>
>>>>> As a user, I will need to boot the system, hope for the best that system
>>>>> works, write down all PCI device numbers, guess which one doesn't work
>>>>> properly, update grub with new command line argument and reboot the
>>>>> system. Any HW change and this dance should be repeated.
>>>>
>>>> There are already two such PCI parameters with this pattern and they are
>>>> not that awkward. pci_dev may be specified with either vendor/device IDS
>>>> or with a path of BDFs (which protects against renumbering).
>>>
>>> Unfortunately, in the real world, BDF is not so stable. It changes with
>>> addition of new hardware, BIOS upgrades and even broken servers.
>>
>> That's why it supports using a *path* of BDFs which tends not to catch
>> the wrong device if the topology changes.
>>
>>> Vendor/device IDs don't work if you have multiple devices of the same
>>> vendor in the system.
>>
>> Yes, but it's fine for some use cases. That's why there's a range of
>> options.
>
> The thing is that you are adding a PCI parameter that is applicable to everyone.
>
> We probably see different usage models for this feature. In my world, users
> have thousands of servers that run 24x7, with VMs on top, and some of them
> perform FW upgrades without stopping anything. The idea that you can reboot
> such a server at any time simply doesn't exist.
>
> So if I need to enable/disable this feature for one of the VFs, I will be stuck.
>
>>
>>>>
>>>> This flag is only useful in P2PDMA traffic, and if the user attempts
>>>> such a transfer, it prints a warning (see the next patch) with the exact
>>>> parameter that needs to be added to the command line.
>>>
>>> Dongdong cited the PCI spec, and it was very clear: don't enable this
>>> feature unless you clearly know that it is safe to enable. This is the
>>> complete opposite of the proposal here: always enable, and disable
>>> if something is printed to dmesg.
>>
>> Quoting from patch 4:
>>
>> "For platforms where the RC supports 10-Bit Tag Completer capability,
>> it is highly recommended for platform firmware or operating software
>> that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
>> bit automatically in Endpoints with 10-Bit Tag Requester capability.
>> This enables the important class of 10-Bit Tag capable adapters that
>> send Memory Read Requests only to host memory."
>>
>> Notice the last sentence. It's saying that devices who only talk to host
>> memory should have 10-bit tags enabled. In the kernel we call devices
>> that talk to things besides host memory "P2PDMA". So the spec is saying
>> not to enable 10bit tags for devices participating in P2PDMA. The kernel
>> needs a way to allow users to do that. The kernel parameter only stops
>> the feature from being enabled for a specific device, and the only
>> use-case is P2PDMA which is not that common and requires the user to be
>> aware of their topology. So I really don't think this is that big a problem.
>
> I'm not questioning the feature or the need for configuration. My concern
> is just *how* this feature is configured.
>
>>
>>>>
>>>> This has worked well for disable_acs_redir and was used for
>>>> resource_alignment before that for quite some time. So save a better
>>>> suggestion I think this is more than acceptable.
>>>
>>> I don't know about other parameters and their history, but we are not in
>>> the 90s anymore, and new module parameters (for PCI, these are kernel
>>> cmdline arguments) are better replaced by some configuration tool/sysfs.
>>
>> The problem was that the ACS bits had to be set before the kernel
>> enumerated the devices. The IOMMU code simply was not able to support
>> dynamic adjustments to its groups. I assume changing 10bit tags
>> dynamically is similarly tricky -- but if it's not then, yes a sysfs
>> interface in addition to the kernel parameter would be a good idea.
>
> I think that it is doable with a combination of disabling drivers_autoprobe
> and some sysfs knob to enable/disable this feature before driver bind.
>
> It should be very similar to what we did for the dynamic MSI-X, see
> /sys/bus/pci/devices/.../sriov_vf_msix_count

Many thanks for your suggestion.

It seems a sysfs knob could work, but we need to make sure the 10-Bit Tag
Requester Enable bit is set before the device driver is bound. PCIe spec 5.0
section 7.5.3.16 (Device Control 2 Register), under 10-Bit Tag Requester
Enable, says:
"If software changes the value of this bit while the Function
has outstanding Non-Posted Requests, the result is undefined."
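
To make the constraint above concrete, here is a minimal userspace sketch
of the guard a sysfs store handler might apply. It is only a model under
stated assumptions: struct fake_pci_fn and fake_set_10bit_tag_req() are
hypothetical names, not kernel APIs, and "a driver is bound" is used as a
conservative stand-in for "the Function may have outstanding Non-Posted
Requests".

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a PCI function; NOT the kernel's struct pci_dev. */
struct fake_pci_fn {
	const char *bound_driver;  /* NULL while no driver is bound */
	bool tag10_req_capable;    /* models 10-Bit Tag Requester Supported */
	bool tag10_req_enabled;    /* models 10-Bit Tag Requester Enable */
};

/*
 * Change the modelled enable bit only while the function is driverless.
 * A bound driver may have Non-Posted Requests in flight, and the spec
 * leaves the result of changing the bit in that state undefined, so the
 * request is refused instead.
 */
static int fake_set_10bit_tag_req(struct fake_pci_fn *fn, bool enable)
{
	if (fn->bound_driver)
		return -EBUSY;          /* too late: a driver already owns it */
	if (enable && !fn->tag10_req_capable)
		return -EOPNOTSUPP;     /* cannot enable what is not supported */

	fn->tag10_req_enabled = enable;
	return 0;
}
```

Refusing the write with an error, rather than quiescing the device, keeps
the policy simple: the administrator unbinds the driver (or never lets it
bind), changes the setting, and then binds.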

Thanks,
Dongdong
>
> Thanks
>
>>
>> Logan
> .
>


* Re: [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support
  2021-07-27 14:30             ` Dongdong Liu
@ 2021-07-27 15:41               ` Leon Romanovsky
  0 siblings, 0 replies; 21+ messages in thread
From: Leon Romanovsky @ 2021-07-27 15:41 UTC (permalink / raw)
  To: Dongdong Liu
  Cc: Logan Gunthorpe, helgaas, hch, kw, linux-pci, rajur,
	hverkuil-cisco, linux-media, netdev

On Tue, Jul 27, 2021 at 10:30:40PM +0800, Dongdong Liu wrote:
> 
> 
> On 2021/7/27 19:05, Leon Romanovsky wrote:
> > On Mon, Jul 26, 2021 at 09:48:57AM -0600, Logan Gunthorpe wrote:
> > > 
> > > 
> > > On 2021-07-25 12:39 a.m., Leon Romanovsky wrote:
> > > > On Fri, Jul 23, 2021 at 10:20:50AM -0600, Logan Gunthorpe wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > On 2021-07-23 5:32 a.m., Leon Romanovsky wrote:
> > > > > > On Fri, Jul 23, 2021 at 07:06:41PM +0800, Dongdong Liu wrote:
> > > > > > > PCIe spec 5.0 r1.0 section 2.2.6.2 says that if an Endpoint supports
> > > > > > > sending Requests to other Endpoints (as opposed to host memory), the
> > > > > > > Endpoint must not send 10-Bit Tag Requests to another given Endpoint
> > > > > > > unless an implementation-specific mechanism determines that the Endpoint
> > > > > > > supports 10-Bit Tag Completer capability. Add "pci=disable_10bit_tag="
> > > > > > > parameter to disable 10-Bit Tag Requester if the peer device does not
> > > > > > > support the 10-Bit Tag Completer. This will make P2P traffic safe.
> > > > > > > 
> > > > > > > Signed-off-by: Dongdong Liu <liudongdong3@huawei.com>
> > > > > > > ---
> > > > > > >  Documentation/admin-guide/kernel-parameters.txt |  7 ++++
> > > > > > >  drivers/pci/pci.c                               | 56 +++++++++++++++++++++++++
> > > > > > >  drivers/pci/pci.h                               |  1 +
> > > > > > >  drivers/pci/pcie/portdrv_pci.c                  | 13 +++---
> > > > > > >  drivers/pci/probe.c                             |  9 ++--
> > > > > > >  5 files changed, 78 insertions(+), 8 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > > > > > > index bdb2200..c2c4585 100644
> > > > > > > --- a/Documentation/admin-guide/kernel-parameters.txt
> > > > > > > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > > > > > > @@ -4019,6 +4019,13 @@
> > > > > > >  				bridges without forcing it upstream. Note:
> > > > > > >  				this removes isolation between devices and
> > > > > > >  				may put more devices in an IOMMU group.
> > > > > > > +		disable_10bit_tag=<pci_dev>[; ...]
> > > > > > > +				  Specify one or more PCI devices (in the format
> > > > > > > +				  specified above) separated by semicolons.
> > > > > > > +				  Disable 10-Bit Tag Requester if the peer
> > > > > > > +				  device does not support the 10-Bit Tag
> > > > > > > > +				  Completer. This will make P2P traffic safe.
> > > > > > 
> > > > > > I can't imagine more awkward user experience than such kernel parameter.
> > > > > > 
> > > > > > As a user, I will need to boot the system, hope for the best that system
> > > > > > works, write down all PCI device numbers, guess which one doesn't work
> > > > > > properly, update grub with new command line argument and reboot the
> > > > > > system. Any HW change and this dance should be repeated.
> > > > > 
> > > > > There are already two such PCI parameters with this pattern and they are
> > > > > > not that awkward. pci_dev may be specified with either vendor/device IDs
> > > > > or with a path of BDFs (which protects against renumbering).
> > > > 
> > > > Unfortunately, in the real world, BDF is not so stable. It changes with
> > > > addition of new hardware, BIOS upgrades and even broken servers.
> > > 
> > > That's why it supports using a *path* of BDFs which tends not to catch
> > > the wrong device if the topology changes.
> > > 
> > > > > Vendor/device IDs don't work if you have multiple devices from the same
> > > > > vendor in the system.
> > > 
> > > Yes, but it's fine for some use cases. That's why there's a range of
> > > options.
> > 
> > > The thing is that you are adding a PCI parameter that is applicable to everyone.
> > > 
> > > We probably see different usage models for this feature. In my world, users
> > > have thousands of servers that run 24x7, with VMs on top, and some of them perform
> > > FW upgrades without stopping anything. The idea that you can reboot such a server
> > > at any time simply doesn't exist.
> > > 
> > > So if I need to enable/disable this feature for one of the VFs, I will be stuck.
> > 
> > > 
> > > > > 
> > > > > This flag is only useful in P2PDMA traffic, and if the user attempts
> > > > > such a transfer, it prints a warning (see the next patch) with the exact
> > > > > parameter that needs to be added to the command line.
> > > > 
> > > > > Dongdong cited the PCI spec and it was very clear - don't enable this
> > > > feature unless you clearly know that it is safe to enable. This is
> > > > completely opposite to the proposal here - always enable and disable
> > > > if something is printed to the dmesg.
> > > 
> > > Quoting from patch 4:
> > > 
> > > "For platforms where the RC supports 10-Bit Tag Completer capability,
> > > it is highly recommended for platform firmware or operating software
> > > that configures PCIe hierarchies to Set the 10-Bit Tag Requester Enable
> > > bit automatically in Endpoints with 10-Bit Tag Requester capability.
> > > This enables the important class of 10-Bit Tag capable adapters that
> > > send Memory Read Requests only to host memory."
> > > 
> > > Notice the last sentence. It's saying that devices who only talk to host
> > > memory should have 10-bit tags enabled. In the kernel we call devices
> > > that talk to things besides host memory "P2PDMA". So the spec is saying
> > > not to enable 10bit tags for devices participating in P2PDMA. The kernel
> > > needs a way to allow users to do that. The kernel parameter only stops
> > > the feature from being enabled for a specific device, and the only
> > > use-case is P2PDMA which is not that common and requires the user to be
> > > aware of their topology. So I really don't think this is that big a problem.
> > 
> > > I'm not questioning the feature or the need for configuration. My concern
> > > is just *how* this feature is configured.
> > 
> > > 
> > > > > 
> > > > > This has worked well for disable_acs_redir and was used for
> > > > > resource_alignment before that for quite some time. So save a better
> > > > > suggestion I think this is more than acceptable.
> > > > 
> > > > > I don't know about other parameters and their history, but we are not in
> > > > > the 90s anymore, and new module parameters (for PCI, these are kernel
> > > > > cmdline arguments) are better replaced by some configuration tool/sysfs.
> > > 
> > > The problem was that the ACS bits had to be set before the kernel
> > > enumerated the devices. The IOMMU code simply was not able to support
> > > dynamic adjustments to its groups. I assume changing 10bit tags
> > > dynamically is similarly tricky -- but if it's not then, yes a sysfs
> > > interface in addition to the kernel parameter would be a good idea.
> > 
> > > I think that it is doable with a combination of disabling drivers_autoprobe
> > > and some sysfs knob to enable/disable this feature before driver bind.
> > > 
> > > It should be very similar to what we did for the dynamic MSI-X, see
> > > /sys/bus/pci/devices/.../sriov_vf_msix_count
> 
> Many thanks for your suggestion.
> 
> It seems a sysfs knob could work, but we need to make sure the 10-Bit Tag
> Requester Enable bit is set before the device driver is bound. PCIe spec 5.0
> section 7.5.3.16 (Device Control 2 Register), under 10-Bit Tag Requester
> Enable, says:
> "If software changes the value of this bit while the Function
> has outstanding Non-Posted Requests, the result is undefined."

This is where drivers_autoprobe will help.
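
As a rough sketch of that sequence from userspace, assuming a per-device
knob existed: the helper below only plans the three sysfs writes, in
order, and a second helper applies one of them. drivers_autoprobe and
drivers_probe are the existing PCI bus attributes; "tag_10bit_requester"
is a hypothetical attribute name invented for this example, and the
function and struct names are likewise illustrative.

```c
#include <stdio.h>

struct sysfs_write {
	char path[160];
	char val[32];
};

/*
 * Fill steps[0..2] with the writes to perform, in order, for the PCI
 * function named by bdf (e.g. "0000:03:00.1"). Returns 0 on success or
 * -1 if a path or value would not fit in its buffer.
 */
static int plan_pre_bind_config(const char *bdf, int enable,
				struct sysfs_write steps[3])
{
	int n;

	/* 1. Stop the PCI core from binding drivers automatically. */
	snprintf(steps[0].path, sizeof(steps[0].path),
		 "/sys/bus/pci/drivers_autoprobe");
	snprintf(steps[0].val, sizeof(steps[0].val), "0");

	/* 2. HYPOTHETICAL knob, written while no driver is bound. */
	n = snprintf(steps[1].path, sizeof(steps[1].path),
		     "/sys/bus/pci/devices/%s/tag_10bit_requester", bdf);
	if (n < 0 || (size_t)n >= sizeof(steps[1].path))
		return -1;
	snprintf(steps[1].val, sizeof(steps[1].val), "%d", enable ? 1 : 0);

	/* 3. Ask the PCI core to probe this one device now. */
	snprintf(steps[2].path, sizeof(steps[2].path),
		 "/sys/bus/pci/drivers_probe");
	n = snprintf(steps[2].val, sizeof(steps[2].val), "%s", bdf);
	if (n < 0 || (size_t)n >= sizeof(steps[2].val))
		return -1;

	return 0;
}

/* Perform one planned write; returns 0 on success, -1 on any error. */
static int apply_write(const struct sysfs_write *w)
{
	FILE *f = fopen(w->path, "w");
	int ok;

	if (!f)
		return -1;
	ok = fputs(w->val, f) >= 0;
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}
```

A caller would run plan_pre_bind_config() and then apply_write() on each
step in turn, stopping at the first failure. Keeping the plan separate
from the writes makes the ordering easy to inspect without touching /sys.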

Thanks


> 
> Thanks,
> Dongdong
> > 
> > Thanks
> > 
> > > 
> > > Logan
> > .
> > 


Thread overview: 21+ messages
2021-07-23 11:06 [PATCH V6 0/8] PCI: Enable 10-Bit tag support for PCIe devices Dongdong Liu
2021-07-23 11:06 ` [PATCH V6 1/8] PCI: Use cached Device Capabilities Register Dongdong Liu
2021-07-23 11:06 ` [PATCH V6 2/8] PCI: Use cached Device Capabilities 2 Register Dongdong Liu
2021-07-23 11:06 ` [PATCH V6 3/8] PCI: Add 10-Bit Tag register definitions Dongdong Liu
2021-07-23 11:06 ` [PATCH V6 4/8] PCI: Enable 10-Bit tag support for PCIe Endpoint devices Dongdong Liu
2021-07-23 11:06 ` [PATCH V6 5/8] PCI/IOV: Enable 10-Bit tag support for PCIe VF devices Dongdong Liu
2021-07-23 11:06 ` [PATCH V6 6/8] PCI: Enable 10-Bit tag support for PCIe RP devices Dongdong Liu
2021-07-23 11:06 ` [PATCH V6 7/8] PCI: Add "pci=disable_10bit_tag=" parameter for peer-to-peer support Dongdong Liu
2021-07-23 11:32   ` Leon Romanovsky
2021-07-23 16:20     ` Logan Gunthorpe
2021-07-25  6:39       ` Leon Romanovsky
2021-07-26 15:48         ` Logan Gunthorpe
2021-07-27 11:05           ` Leon Romanovsky
2021-07-27 14:30             ` Dongdong Liu
2021-07-27 15:41               ` Leon Romanovsky
2021-07-27 14:00           ` Dongdong Liu
2021-07-23 16:58   ` kernel test robot
2021-07-24 10:35     ` Dongdong Liu
2021-07-23 11:06 ` [PATCH V6 8/8] PCI/P2PDMA: Add a 10-bit tag check in P2PDMA Dongdong Liu
2021-07-23 16:25   ` Logan Gunthorpe
2021-07-24 10:36     ` Dongdong Liu
