netdev.vger.kernel.org archive mirror
* [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs
@ 2023-04-10 13:07 Leon Romanovsky
  2023-04-10 13:07 ` [PATCH mlx5-next 1/4] RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write Leon Romanovsky
                   ` (5 more replies)
  0 siblings, 6 replies; 16+ messages in thread
From: Leon Romanovsky @ 2023-04-10 13:07 UTC
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Avihai Horon, Aya Levin, Eric Dumazet,
	Jakub Kicinski, linux-kernel, linux-rdma, Meir Lichtinger,
	Michael Guralnik, netdev, Paolo Abeni, Saeed Mahameed,
	Shay Drory

From: Leon Romanovsky <leonro@nvidia.com>

From Avihai,

Currently, Relaxed Ordering (RO) can't be used in VFs directly and in
VFs assigned to QEMU, even if the PF supports RO. This is due to issues
in reporting/emulation of PCI config space RO bit and due to current
HCA capability behavior.

This series fixes it by using a new HCA capability and by relying on FW
to do the "right thing" according to the PF's PCI config space RO value.

Allowing RO in VFs and VMs is valuable since it can greatly improve
performance on some setups. For example, testing throughput of a VF on
an AMD EPYC 7763 and ConnectX-6 Dx setup showed roughly 60% performance
improvement.

Thanks

Avihai Horon (4):
  RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
  RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
  net/mlx5: Update relaxed ordering read HCA capabilities
  RDMA/mlx5: Allow relaxed ordering read in VFs and VMs

 drivers/infiniband/hw/mlx5/mr.c                     | 12 ++++++++----
 drivers/infiniband/hw/mlx5/umr.c                    |  7 +++++--
 drivers/infiniband/hw/mlx5/umr.h                    |  3 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en/params.c |  3 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_common.c |  9 +++++----
 include/linux/mlx5/mlx5_ifc.h                       |  5 +++--
 6 files changed, 24 insertions(+), 15 deletions(-)

-- 
2.39.2



* [PATCH mlx5-next 1/4] RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
  2023-04-10 13:07 [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs Leon Romanovsky
@ 2023-04-10 13:07 ` Leon Romanovsky
  2023-04-11 23:18   ` Jacob Keller
  2023-04-10 13:07 ` [PATCH rdma-next 2/4] RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR Leon Romanovsky
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Leon Romanovsky @ 2023-04-10 13:07 UTC
  To: Jason Gunthorpe
  Cc: Avihai Horon, Eric Dumazet, Jakub Kicinski, linux-rdma,
	Meir Lichtinger, Michael Guralnik, netdev, Paolo Abeni,
	Saeed Mahameed, Shay Drory

From: Avihai Horon <avihaih@nvidia.com>

pcie_relaxed_ordering_enabled() check was added to avoid a syndrome when
creating a MKey with relaxed ordering (RO) enabled when the driver's
relaxed_ordering_{read,write} HCA capabilities are out of sync with FW.

While this can happen with relaxed_ordering_read, it can't happen with
relaxed_ordering_write as it's set if the device supports RO write,
regardless of RO in PCI config space, and thus can't change during
runtime.

Therefore, drop the pcie_relaxed_ordering_enabled() check for
relaxed_ordering_write while keeping it for relaxed_ordering_read.
Doing so will also allow the usage of RO write in VFs and VMs (where RO
in PCI config space is not reported/emulated properly).

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/mr.c                     | 6 +++---
 drivers/net/ethernet/mellanox/mlx5/core/en/params.c | 3 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_common.c | 2 +-
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 347000d30cec..bb8f318bd5a5 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -69,11 +69,11 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,
 	MLX5_SET(mkc, mkc, lw, !!(acc & IB_ACCESS_LOCAL_WRITE));
 	MLX5_SET(mkc, mkc, lr, 1);
 
-	if ((acc & IB_ACCESS_RELAXED_ORDERING) &&
-	    pcie_relaxed_ordering_enabled(dev->mdev->pdev)) {
+	if (acc & IB_ACCESS_RELAXED_ORDERING) {
 		if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write))
 			MLX5_SET(mkc, mkc, relaxed_ordering_write, 1);
-		if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read))
+		if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read) &&
+		    pcie_relaxed_ordering_enabled(dev->mdev->pdev))
 			MLX5_SET(mkc, mkc, relaxed_ordering_read, 1);
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
index a21bd1179477..d840a59aec88 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -867,8 +867,7 @@ static void mlx5e_build_rx_cq_param(struct mlx5_core_dev *mdev,
 static u8 rq_end_pad_mode(struct mlx5_core_dev *mdev, struct mlx5e_params *params)
 {
 	bool lro_en = params->packet_merge.type == MLX5E_PACKET_MERGE_LRO;
-	bool ro = pcie_relaxed_ordering_enabled(mdev->pdev) &&
-		MLX5_CAP_GEN(mdev, relaxed_ordering_write);
+	bool ro = MLX5_CAP_GEN(mdev, relaxed_ordering_write);
 
 	return ro && lro_en ?
 		MLX5_WQ_END_PAD_MODE_NONE : MLX5_WQ_END_PAD_MODE_ALIGN;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
index 4c9a3210600c..993af4c12d90 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
@@ -44,7 +44,7 @@ void mlx5e_mkey_set_relaxed_ordering(struct mlx5_core_dev *mdev, void *mkc)
 	bool ro_read = MLX5_CAP_GEN(mdev, relaxed_ordering_read);
 
 	MLX5_SET(mkc, mkc, relaxed_ordering_read, ro_pci_enable && ro_read);
-	MLX5_SET(mkc, mkc, relaxed_ordering_write, ro_pci_enable && ro_write);
+	MLX5_SET(mkc, mkc, relaxed_ordering_write, ro_write);
 }
 
 int mlx5e_create_mkey(struct mlx5_core_dev *mdev, u32 pdn, u32 *mkey)
-- 
2.39.2
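
The decision that patch 1/4 leaves in set_mkc_access_pd_addr_fields() can be
modelled in plain userspace C. This is only a sketch: struct ro_state,
struct mkc_ro_bits and mkc_ro_after_patch1() are hypothetical names that
stand in for the MLX5_CAP_GEN() lookups and pcie_relaxed_ordering_enabled();
they are not driver API.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the driver consults. */
struct ro_state {
	bool cap_ro_write;   /* models MLX5_CAP_GEN(mdev, relaxed_ordering_write) */
	bool cap_ro_read;    /* models MLX5_CAP_GEN(mdev, relaxed_ordering_read) */
	bool pci_ro_enabled; /* models pcie_relaxed_ordering_enabled(pdev) */
};

/* Hypothetical view of the two RO bits written into the mkey context. */
struct mkc_ro_bits {
	bool ro_write;
	bool ro_read;
};

/*
 * After patch 1/4: RO write depends only on the write capability,
 * while RO read still requires RO to be enabled in PCI config space.
 */
static struct mkc_ro_bits mkc_ro_after_patch1(const struct ro_state *s,
					      bool want_ro)
{
	struct mkc_ro_bits bits = { false, false };

	if (!want_ro)
		return bits;

	bits.ro_write = s->cap_ro_write;
	bits.ro_read = s->cap_ro_read && s->pci_ro_enabled;
	return bits;
}
```

With pci_ro_enabled == false, as seen from a VF or a guest, the sketch yields
RO write on and RO read off, which is the intermediate state this patch is
meant to produce.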



* [PATCH rdma-next 2/4] RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
  2023-04-10 13:07 [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs Leon Romanovsky
  2023-04-10 13:07 ` [PATCH mlx5-next 1/4] RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write Leon Romanovsky
@ 2023-04-10 13:07 ` Leon Romanovsky
  2023-04-11 23:18   ` Jacob Keller
  2023-04-10 13:07 ` [PATCH mlx5-next 3/4] net/mlx5: Update relaxed ordering read HCA capabilities Leon Romanovsky
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Leon Romanovsky @ 2023-04-10 13:07 UTC
  To: Jason Gunthorpe
  Cc: Avihai Horon, David S. Miller, Eric Dumazet, Jakub Kicinski,
	linux-rdma, Meir Lichtinger, Michael Guralnik, netdev,
	Paolo Abeni, Saeed Mahameed, Shay Drory

From: Avihai Horon <avihaih@nvidia.com>

relaxed_ordering_read HCA capability is set if both the device supports
relaxed ordering (RO) read and RO is set in PCI config space.

RO in PCI config space can change during runtime. This will change the
value of relaxed_ordering_read HCA capability in FW, but the driver will
not see it since it queries the capabilities only once.

This can lead to the following scenario:
1. RO in PCI config space is enabled.
2. User creates MKey without RO.
3. RO in PCI config space is disabled.
   As a result, relaxed_ordering_read HCA capability is turned off in FW
   but remains on in driver copy of the capabilities.
4. User requests to reconfig the MKey with RO via UMR.
5. Driver will try to reconfig the MKey with RO read although it
   shouldn't (as relaxed_ordering_read HCA capability is really off).

To fix this, check pcie_relaxed_ordering_enabled() before setting RO
read in UMR.

Fixes: 896ec9735336 ("RDMA/mlx5: Set mkey relaxed ordering by UMR with ConnectX-7")
Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/umr.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/umr.c b/drivers/infiniband/hw/mlx5/umr.c
index 55f4e048d947..c9e176e8ced4 100644
--- a/drivers/infiniband/hw/mlx5/umr.c
+++ b/drivers/infiniband/hw/mlx5/umr.c
@@ -380,6 +380,9 @@ static void mlx5r_umr_set_access_flags(struct mlx5_ib_dev *dev,
 				       struct mlx5_mkey_seg *seg,
 				       unsigned int access_flags)
 {
+	bool ro_read = (access_flags & IB_ACCESS_RELAXED_ORDERING) &&
+		       pcie_relaxed_ordering_enabled(dev->mdev->pdev);
+
 	MLX5_SET(mkc, seg, a, !!(access_flags & IB_ACCESS_REMOTE_ATOMIC));
 	MLX5_SET(mkc, seg, rw, !!(access_flags & IB_ACCESS_REMOTE_WRITE));
 	MLX5_SET(mkc, seg, rr, !!(access_flags & IB_ACCESS_REMOTE_READ));
@@ -387,8 +390,7 @@ static void mlx5r_umr_set_access_flags(struct mlx5_ib_dev *dev,
 	MLX5_SET(mkc, seg, lr, 1);
 	MLX5_SET(mkc, seg, relaxed_ordering_write,
 		 !!(access_flags & IB_ACCESS_RELAXED_ORDERING));
-	MLX5_SET(mkc, seg, relaxed_ordering_read,
-		 !!(access_flags & IB_ACCESS_RELAXED_ORDERING));
+	MLX5_SET(mkc, seg, relaxed_ordering_read, ro_read);
 }
 
 int mlx5r_umr_rereg_pd_access(struct mlx5_ib_mr *mr, struct ib_pd *pd,
-- 
2.39.2
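
The five-step scenario from the commit message of patch 2/4 can be replayed
with a small userspace model. Everything here is an assumption made for
illustration: struct fake_dev and the helper names are hypothetical, and the
"before" function only mirrors the removed lines that set
relaxed_ordering_read straight from the access flag.

```c
#include <stdbool.h>

/* Hypothetical model of one device as seen by FW and by the driver. */
struct fake_dev {
	bool pci_ro;          /* live RO enable bit in PCI config space */
	bool hw_ro_read;      /* device itself supports RO read */
	bool cached_cap_read; /* driver's one-time snapshot of the FW cap */
};

/* FW view: the cap is set only if the device supports RO read and PCI RO is on. */
static bool fw_cap_read(const struct fake_dev *d)
{
	return d->hw_ro_read && d->pci_ro;
}

/* The driver queries capabilities once and keeps the result. */
static void driver_query_caps(struct fake_dev *d)
{
	d->cached_cap_read = fw_cap_read(d);
}

/* UMR before the fix: RO read follows the access flag alone. */
static bool umr_ro_read_before(const struct fake_dev *d, bool want_ro)
{
	(void)d;
	return want_ro;
}

/* UMR after the fix: the live PCI config space value is checked as well. */
static bool umr_ro_read_after(const struct fake_dev *d, bool want_ro)
{
	return want_ro && d->pci_ro;
}
```

Calling driver_query_caps() while pci_ro is true and then clearing pci_ro
leaves cached_cap_read set although fw_cap_read() now returns false. In that
state umr_ro_read_before() still asks for RO read, while umr_ro_read_after()
does not.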



* [PATCH mlx5-next 3/4] net/mlx5: Update relaxed ordering read HCA capabilities
  2023-04-10 13:07 [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs Leon Romanovsky
  2023-04-10 13:07 ` [PATCH mlx5-next 1/4] RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write Leon Romanovsky
  2023-04-10 13:07 ` [PATCH rdma-next 2/4] RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR Leon Romanovsky
@ 2023-04-10 13:07 ` Leon Romanovsky
  2023-04-11 23:19   ` Jacob Keller
  2023-04-10 13:07 ` [PATCH mlx5-next 4/4] RDMA/mlx5: Allow relaxed ordering read in VFs and VMs Leon Romanovsky
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 16+ messages in thread
From: Leon Romanovsky @ 2023-04-10 13:07 UTC
  To: Jason Gunthorpe
  Cc: Avihai Horon, Eric Dumazet, Jakub Kicinski, linux-rdma,
	Meir Lichtinger, Michael Guralnik, netdev, Paolo Abeni,
	Saeed Mahameed, Shay Drory

From: Avihai Horon <avihaih@nvidia.com>

Rename existing HCA capability relaxed_ordering_read to
relaxed_ordering_read_pci_enabled. This is in accordance with recent PRM
change to better describe the capability, as it's set only if both the
device supports relaxed ordering (RO) read and RO is enabled in PCI
config space.

In addition, add new HCA capability relaxed_ordering_read which is set
if the device supports RO read, regardless of RO in PCI config space.
This will be used in the following patch to allow RO in VFs and VMs.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/mr.c                     | 5 +++--
 drivers/infiniband/hw/mlx5/umr.h                    | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_common.c | 2 +-
 include/linux/mlx5/mlx5_ifc.h                       | 5 +++--
 4 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index bb8f318bd5a5..a7f0119cc959 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -72,7 +72,8 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,
 	if (acc & IB_ACCESS_RELAXED_ORDERING) {
 		if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write))
 			MLX5_SET(mkc, mkc, relaxed_ordering_write, 1);
-		if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read) &&
+		if (MLX5_CAP_GEN(dev->mdev,
+				 relaxed_ordering_read_pci_enabled) &&
 		    pcie_relaxed_ordering_enabled(dev->mdev->pdev))
 			MLX5_SET(mkc, mkc, relaxed_ordering_read, 1);
 	}
@@ -793,7 +794,7 @@ static int get_unchangeable_access_flags(struct mlx5_ib_dev *dev,
 		ret |= IB_ACCESS_RELAXED_ORDERING;
 
 	if ((access_flags & IB_ACCESS_RELAXED_ORDERING) &&
-	    MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read) &&
+	    MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_pci_enabled) &&
 	    !MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_umr))
 		ret |= IB_ACCESS_RELAXED_ORDERING;
 
diff --git a/drivers/infiniband/hw/mlx5/umr.h b/drivers/infiniband/hw/mlx5/umr.h
index c9d0021381a2..e12ecd7e079c 100644
--- a/drivers/infiniband/hw/mlx5/umr.h
+++ b/drivers/infiniband/hw/mlx5/umr.h
@@ -62,7 +62,7 @@ static inline bool mlx5r_umr_can_reconfig(struct mlx5_ib_dev *dev,
 		return false;
 
 	if ((diffs & IB_ACCESS_RELAXED_ORDERING) &&
-	    MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read) &&
+	    MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_pci_enabled) &&
 	    !MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_umr))
 		return false;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
index 993af4c12d90..3c765a1f91a5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
@@ -41,7 +41,7 @@ void mlx5e_mkey_set_relaxed_ordering(struct mlx5_core_dev *mdev, void *mkc)
 {
 	bool ro_pci_enable = pcie_relaxed_ordering_enabled(mdev->pdev);
 	bool ro_write = MLX5_CAP_GEN(mdev, relaxed_ordering_write);
-	bool ro_read = MLX5_CAP_GEN(mdev, relaxed_ordering_read);
+	bool ro_read = MLX5_CAP_GEN(mdev, relaxed_ordering_read_pci_enabled);
 
 	MLX5_SET(mkc, mkc, relaxed_ordering_read, ro_pci_enable && ro_read);
 	MLX5_SET(mkc, mkc, relaxed_ordering_write, ro_write);
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index e4306cd87cd7..b54339a1b1c6 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1511,7 +1511,7 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 
 	u8         log_max_eq_sz[0x8];
 	u8         relaxed_ordering_write[0x1];
-	u8         relaxed_ordering_read[0x1];
+	u8         relaxed_ordering_read_pci_enabled[0x1];
 	u8         log_max_mkey[0x6];
 	u8         reserved_at_f0[0x6];
 	u8	   terminate_scatter_list_mkey[0x1];
@@ -1727,7 +1727,8 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 
 	u8         reserved_at_320[0x3];
 	u8         log_max_transport_domain[0x5];
-	u8         reserved_at_328[0x3];
+	u8         reserved_at_328[0x2];
+	u8	   relaxed_ordering_read[0x1];
 	u8         log_max_pd[0x5];
 	u8         reserved_at_330[0x9];
 	u8         q_counter_aggregation[0x1];
-- 
2.39.2
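
To make the mlx5_ifc.h hunks of patch 3/4 concrete, the sketch below reads the
two capability bits from a raw byte buffer. The bit offsets are inferred from
the reserved_at_* names in the diff (0xe9 for the renamed bit, 0x32a for the
new one) and should be treated as assumptions. The helpers are hypothetical
and are not the driver's MLX5_GET()/MLX5_SET() macros; they assume offsets
count from the most significant bit of a big-endian layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit offsets inferred from the reserved_at_* names in the diff (assumed). */
enum {
	RO_READ_PCI_ENABLED_BIT = 0xe9, /* relaxed_ordering_read_pci_enabled */
	RO_READ_BIT             = 0x32a /* new relaxed_ordering_read */
};

/* Read one bit; bit 0 is the most significant bit of byte 0. */
static bool hca_cap_bit(const uint8_t *cap, size_t bit_off)
{
	return (cap[bit_off / 8] >> (7 - bit_off % 8)) & 1;
}

/* Set one bit using the same numbering. */
static void hca_cap_set_bit(uint8_t *cap, size_t bit_off)
{
	cap[bit_off / 8] |= (uint8_t)(1u << (7 - bit_off % 8));
}
```

The two bits live in different dwords, so a FW that reports only the old bit
leaves the new one at zero. That is what lets the driver tell older and newer
FW apart in the following patch.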



* [PATCH mlx5-next 4/4] RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
  2023-04-10 13:07 [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs Leon Romanovsky
                   ` (2 preceding siblings ...)
  2023-04-10 13:07 ` [PATCH mlx5-next 3/4] net/mlx5: Update relaxed ordering read HCA capabilities Leon Romanovsky
@ 2023-04-10 13:07 ` Leon Romanovsky
  2023-04-11 23:19   ` Jacob Keller
  2023-04-11 14:01 ` [PATCH rdma-next 0/4] " Jason Gunthorpe
  2023-04-16 10:30 ` Leon Romanovsky
  5 siblings, 1 reply; 16+ messages in thread
From: Leon Romanovsky @ 2023-04-10 13:07 UTC
  To: Jason Gunthorpe
  Cc: Avihai Horon, Aya Levin, Eric Dumazet, Jakub Kicinski,
	linux-rdma, Meir Lichtinger, Michael Guralnik, netdev,
	Paolo Abeni, Saeed Mahameed, Shay Drory

From: Avihai Horon <avihaih@nvidia.com>

According to PCIe spec, Enable Relaxed Ordering value in the VF's PCI
config space is wired to 0 and PF relaxed ordering (RO) setting should
be applied to the VF. In QEMU (and maybe others), when assigning VFs,
the RO bit in PCI config space is not emulated properly and is always
set to 0.

Therefore, pcie_relaxed_ordering_enabled() always returns 0 for VFs and
VMs and thus MKeys can't be created with RO read even if the PF supports
it.

pcie_relaxed_ordering_enabled() check was added to avoid a syndrome when
creating a MKey with relaxed ordering (RO) enabled when the driver's
relaxed_ordering_read_pci_enabled HCA capability is out of sync with FW.
With the new relaxed_ordering_read capability this can't happen, as it's
set regardless of RO value in PCI config space and thus can't change
during runtime.

Hence, to allow RO read in VFs and VMs, use the new HCA capability
relaxed_ordering_read without checking pcie_relaxed_ordering_enabled().
The old capability checks are kept for backward compatibility with older
FWs.

Allowing RO in VFs and VMs is valuable since it can greatly improve
performance on some setups. For example, testing throughput of a VF on
an AMD EPYC 7763 and ConnectX-6 Dx setup showed roughly 60% performance
improvement.

Signed-off-by: Avihai Horon <avihaih@nvidia.com>
Reviewed-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
 drivers/infiniband/hw/mlx5/mr.c                     | 11 +++++++----
 drivers/infiniband/hw/mlx5/umr.c                    |  3 ++-
 drivers/infiniband/hw/mlx5/umr.h                    |  3 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en_common.c |  7 ++++---
 4 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index a7f0119cc959..1ce48e485c5b 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -72,9 +72,11 @@ static void set_mkc_access_pd_addr_fields(void *mkc, int acc, u64 start_addr,
 	if (acc & IB_ACCESS_RELAXED_ORDERING) {
 		if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_write))
 			MLX5_SET(mkc, mkc, relaxed_ordering_write, 1);
-		if (MLX5_CAP_GEN(dev->mdev,
-				 relaxed_ordering_read_pci_enabled) &&
-		    pcie_relaxed_ordering_enabled(dev->mdev->pdev))
+
+		if (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read) ||
+		    (MLX5_CAP_GEN(dev->mdev,
+				  relaxed_ordering_read_pci_enabled) &&
+		     pcie_relaxed_ordering_enabled(dev->mdev->pdev)))
 			MLX5_SET(mkc, mkc, relaxed_ordering_read, 1);
 	}
 
@@ -794,7 +796,8 @@ static int get_unchangeable_access_flags(struct mlx5_ib_dev *dev,
 		ret |= IB_ACCESS_RELAXED_ORDERING;
 
 	if ((access_flags & IB_ACCESS_RELAXED_ORDERING) &&
-	    MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_pci_enabled) &&
+	    (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read) ||
+	     MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_pci_enabled)) &&
 	    !MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_umr))
 		ret |= IB_ACCESS_RELAXED_ORDERING;
 
diff --git a/drivers/infiniband/hw/mlx5/umr.c b/drivers/infiniband/hw/mlx5/umr.c
index c9e176e8ced4..234bf30db731 100644
--- a/drivers/infiniband/hw/mlx5/umr.c
+++ b/drivers/infiniband/hw/mlx5/umr.c
@@ -381,7 +381,8 @@ static void mlx5r_umr_set_access_flags(struct mlx5_ib_dev *dev,
 				       unsigned int access_flags)
 {
 	bool ro_read = (access_flags & IB_ACCESS_RELAXED_ORDERING) &&
-		       pcie_relaxed_ordering_enabled(dev->mdev->pdev);
+		       (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read) ||
+			pcie_relaxed_ordering_enabled(dev->mdev->pdev));
 
 	MLX5_SET(mkc, seg, a, !!(access_flags & IB_ACCESS_REMOTE_ATOMIC));
 	MLX5_SET(mkc, seg, rw, !!(access_flags & IB_ACCESS_REMOTE_WRITE));
diff --git a/drivers/infiniband/hw/mlx5/umr.h b/drivers/infiniband/hw/mlx5/umr.h
index e12ecd7e079c..3799bb758e49 100644
--- a/drivers/infiniband/hw/mlx5/umr.h
+++ b/drivers/infiniband/hw/mlx5/umr.h
@@ -62,7 +62,8 @@ static inline bool mlx5r_umr_can_reconfig(struct mlx5_ib_dev *dev,
 		return false;
 
 	if ((diffs & IB_ACCESS_RELAXED_ORDERING) &&
-	    MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_pci_enabled) &&
+	    (MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read) ||
+	     MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_pci_enabled)) &&
 	    !MLX5_CAP_GEN(dev->mdev, relaxed_ordering_read_umr))
 		return false;
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
index 3c765a1f91a5..1f90594499c6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_common.c
@@ -39,11 +39,12 @@
 
 void mlx5e_mkey_set_relaxed_ordering(struct mlx5_core_dev *mdev, void *mkc)
 {
-	bool ro_pci_enable = pcie_relaxed_ordering_enabled(mdev->pdev);
 	bool ro_write = MLX5_CAP_GEN(mdev, relaxed_ordering_write);
-	bool ro_read = MLX5_CAP_GEN(mdev, relaxed_ordering_read_pci_enabled);
+	bool ro_read = MLX5_CAP_GEN(mdev, relaxed_ordering_read) ||
+		       (pcie_relaxed_ordering_enabled(mdev->pdev) &&
+			MLX5_CAP_GEN(mdev, relaxed_ordering_read_pci_enabled));
 
-	MLX5_SET(mkc, mkc, relaxed_ordering_read, ro_pci_enable && ro_read);
+	MLX5_SET(mkc, mkc, relaxed_ordering_read, ro_read);
 	MLX5_SET(mkc, mkc, relaxed_ordering_write, ro_write);
 }
 
-- 
2.39.2
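
The read-side condition that patch 4/4 puts into mlx5r_umr_can_reconfig() and
get_unchangeable_access_flags() can be sketched in isolation. struct ro_caps
and ro_read_change_allowed_via_umr() are hypothetical names used only for this
model. The real functions check other access flags too, which are left out
here.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the three read-related HCA capabilities. */
struct ro_caps {
	bool ro_read;             /* new cap: relaxed_ordering_read */
	bool ro_read_pci_enabled; /* old cap: relaxed_ordering_read_pci_enabled */
	bool ro_read_umr;         /* models relaxed_ordering_read_umr */
};

/*
 * Read side only: a change of the RO access flag cannot go through UMR
 * when the device can do RO read (per either cap) but cannot update the
 * RO read bit via UMR.
 */
static bool ro_read_change_allowed_via_umr(const struct ro_caps *c,
					   bool ro_flag_differs)
{
	if (ro_flag_differs &&
	    (c->ro_read || c->ro_read_pci_enabled) &&
	    !c->ro_read_umr)
		return false;

	return true;
}
```

A FW that reports only the new capability now takes the same path that
previously needed the PCI-dependent one, so VFs and guests are covered by the
same rule as PFs.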



* Re: [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs
  2023-04-10 13:07 [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs Leon Romanovsky
                   ` (3 preceding siblings ...)
  2023-04-10 13:07 ` [PATCH mlx5-next 4/4] RDMA/mlx5: Allow relaxed ordering read in VFs and VMs Leon Romanovsky
@ 2023-04-11 14:01 ` Jason Gunthorpe
  2023-04-11 14:09   ` Leon Romanovsky
  2023-04-11 23:21   ` Jacob Keller
  2023-04-16 10:30 ` Leon Romanovsky
  5 siblings, 2 replies; 16+ messages in thread
From: Jason Gunthorpe @ 2023-04-11 14:01 UTC
  To: Leon Romanovsky
  Cc: Leon Romanovsky, Avihai Horon, Aya Levin, Eric Dumazet,
	Jakub Kicinski, linux-kernel, linux-rdma, Meir Lichtinger,
	Michael Guralnik, netdev, Paolo Abeni, Saeed Mahameed,
	Shay Drory

On Mon, Apr 10, 2023 at 04:07:49PM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> From Avihai,
> 
> Currently, Relaxed Ordering (RO) can't be used in VFs directly and in
> VFs assigned to QEMU, even if the PF supports RO. This is due to issues
> in reporting/emulation of PCI config space RO bit and due to current
> HCA capability behavior.
> 
> This series fixes it by using a new HCA capability and by relying on FW
> to do the "right thing" according to the PF's PCI config space RO value.
> 
> Allowing RO in VFs and VMs is valuable since it can greatly improve
> performance on some setups. For example, testing throughput of a VF on
> an AMD EPYC 7763 and ConnectX-6 Dx setup showed roughly 60% performance
> improvement.
> 
> Thanks
> 
> Avihai Horon (4):
>   RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
>   RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
>   net/mlx5: Update relaxed ordering read HCA capabilities
>   RDMA/mlx5: Allow relaxed ordering read in VFs and VMs

This looks OK, but the patch structure is pretty confusing.

It seems to me there are really only two patches here, the first is to
add some static inline

'mlx5 supports read ro'

which supports both the cap bits described in
the PRM, with a little comment to explain that old devices only set
the old cap.

And a second patch to call it in all the places we need to check before
setting the mkc ro read bit.

Maybe a final third patch to sort out that mistake in the write side.

But this really doesn't have anything to do with VFs and VMs, this is
adjusting the code to follow the current PRM because the old one was
mis-designed.

Jason
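
The single helper suggested in the review above might look roughly like the
following userspace sketch. The name mlx5_ro_read_supported() and
struct fake_mdev are hypothetical and do not come from the series. The three
fields stand in for the two MLX5_CAP_GEN() bits and
pcie_relaxed_ordering_enabled().

```c
#include <stdbool.h>

/* Hypothetical stand-in for the parts of mlx5_core_dev the helper needs. */
struct fake_mdev {
	bool cap_ro_read;             /* new cap: relaxed_ordering_read */
	bool cap_ro_read_pci_enabled; /* old cap: relaxed_ordering_read_pci_enabled */
	bool pci_ro_enabled;          /* live RO bit in PCI config space */
};

/*
 * "mlx5 supports read RO": newer FW sets relaxed_ordering_read, which does
 * not depend on PCI config space. Older FW sets only the PCI-dependent cap,
 * so the live PCI value has to be checked as well.
 */
static inline bool mlx5_ro_read_supported(const struct fake_mdev *mdev)
{
	return mdev->cap_ro_read ||
	       (mdev->cap_ro_read_pci_enabled && mdev->pci_ro_enabled);
}
```

Every site that sets the mkc relaxed_ordering_read bit could then call this
one function instead of repeating the open-coded condition.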


* Re: [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs
  2023-04-11 14:01 ` [PATCH rdma-next 0/4] " Jason Gunthorpe
@ 2023-04-11 14:09   ` Leon Romanovsky
  2023-04-11 23:21   ` Jacob Keller
  1 sibling, 0 replies; 16+ messages in thread
From: Leon Romanovsky @ 2023-04-11 14:09 UTC
  To: Jason Gunthorpe
  Cc: Avihai Horon, Aya Levin, Eric Dumazet, Jakub Kicinski,
	linux-kernel, linux-rdma, Meir Lichtinger, Michael Guralnik,
	netdev, Paolo Abeni, Saeed Mahameed, Shay Drory

On Tue, Apr 11, 2023 at 11:01:03AM -0300, Jason Gunthorpe wrote:
> On Mon, Apr 10, 2023 at 04:07:49PM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@nvidia.com>
> > 
> > From Avihai,
> > 
> > Currently, Relaxed Ordering (RO) can't be used in VFs directly and in
> > VFs assigned to QEMU, even if the PF supports RO. This is due to issues
> > in reporting/emulation of PCI config space RO bit and due to current
> > HCA capability behavior.
> > 
> > This series fixes it by using a new HCA capability and by relying on FW
> > to do the "right thing" according to the PF's PCI config space RO value.
> > 
> > Allowing RO in VFs and VMs is valuable since it can greatly improve
> > performance on some setups. For example, testing throughput of a VF on
> > an AMD EPYC 7763 and ConnectX-6 Dx setup showed roughly 60% performance
> > improvement.
> > 
> > Thanks
> > 
> > Avihai Horon (4):
> >   RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
> >   RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
> >   net/mlx5: Update relaxed ordering read HCA capabilities
> >   RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
> 
> This looks OK, but the patch structure is pretty confusing.
> 
> It seems to me there are really only two patches here, the first is to
> add some static inline

I asked Avihai to align all pcie_relaxed_ordering_enabled() calls so
that they are relevant for RO only. This is how we arrived at the first
two patches.

Thanks

> 
> 'mlx5 supports read ro'
> 
> which supports both the cap bits described in
> the PRM, with a little comment to explain that old devices only set
> the old cap.
> 
> And a second patch to call it in all the places we need to check before
> setting the mkc ro read bit.
> 
> Maybe a final third patch to sort out that mistake in the write side.
> 
> But this really doesn't have anything to do with VFs and VMs, this is
> adjusting the code to follow the current PRM because the old one was
> mis-designed.
> 
> Jason


* Re: [PATCH mlx5-next 1/4] RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
  2023-04-10 13:07 ` [PATCH mlx5-next 1/4] RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write Leon Romanovsky
@ 2023-04-11 23:18   ` Jacob Keller
  0 siblings, 0 replies; 16+ messages in thread
From: Jacob Keller @ 2023-04-11 23:18 UTC
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Avihai Horon, Eric Dumazet, Jakub Kicinski, linux-rdma,
	Meir Lichtinger, Michael Guralnik, netdev, Paolo Abeni,
	Saeed Mahameed, Shay Drory



On 4/10/2023 6:07 AM, Leon Romanovsky wrote:
> From: Avihai Horon <avihaih@nvidia.com>
> 
> pcie_relaxed_ordering_enabled() check was added to avoid a syndrome when
> creating a MKey with relaxed ordering (RO) enabled when the driver's
> relaxed_ordering_{read,write} HCA capabilities are out of sync with FW.
> 
> While this can happen with relaxed_ordering_read, it can't happen with
> relaxed_ordering_write as it's set if the device supports RO write,
> regardless of RO in PCI config space, and thus can't change during
> runtime.
> 
> Therefore, drop the pcie_relaxed_ordering_enabled() check for
> relaxed_ordering_write while keeping it for relaxed_ordering_read.
> Doing so will also allow the usage of RO write in VFs and VMs (where RO
> in PCI config space is not reported/emulated properly).
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> Reviewed-by: Shay Drory <shayd@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>


* Re: [PATCH rdma-next 2/4] RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
  2023-04-10 13:07 ` [PATCH rdma-next 2/4] RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR Leon Romanovsky
@ 2023-04-11 23:18   ` Jacob Keller
  0 siblings, 0 replies; 16+ messages in thread
From: Jacob Keller @ 2023-04-11 23:18 UTC
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Avihai Horon, David S. Miller, Eric Dumazet, Jakub Kicinski,
	linux-rdma, Meir Lichtinger, Michael Guralnik, netdev,
	Paolo Abeni, Saeed Mahameed, Shay Drory



On 4/10/2023 6:07 AM, Leon Romanovsky wrote:
> From: Avihai Horon <avihaih@nvidia.com>
> 
> relaxed_ordering_read HCA capability is set if both the device supports
> relaxed ordering (RO) read and RO is set in PCI config space.
> 
> RO in PCI config space can change during runtime. This will change the
> value of relaxed_ordering_read HCA capability in FW, but the driver will
> not see it since it queries the capabilities only once.
> 
> This can lead to the following scenario:
> 1. RO in PCI config space is enabled.
> 2. User creates MKey without RO.
> 3. RO in PCI config space is disabled.
>    As a result, relaxed_ordering_read HCA capability is turned off in FW
>    but remains on in driver copy of the capabilities.
> 4. User requests to reconfig the MKey with RO via UMR.
> 5. Driver will try to reconfig the MKey with RO read although it
>    shouldn't (as relaxed_ordering_read HCA capability is really off).
> 
> To fix this, check pcie_relaxed_ordering_enabled() before setting RO
> read in UMR.
> 
> Fixes: 896ec9735336 ("RDMA/mlx5: Set mkey relaxed ordering by UMR with ConnectX-7")
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> Reviewed-by: Shay Drory <shayd@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---


Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>


* Re: [PATCH mlx5-next 3/4] net/mlx5: Update relaxed ordering read HCA capabilities
  2023-04-10 13:07 ` [PATCH mlx5-next 3/4] net/mlx5: Update relaxed ordering read HCA capabilities Leon Romanovsky
@ 2023-04-11 23:19   ` Jacob Keller
  0 siblings, 0 replies; 16+ messages in thread
From: Jacob Keller @ 2023-04-11 23:19 UTC
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Avihai Horon, Eric Dumazet, Jakub Kicinski, linux-rdma,
	Meir Lichtinger, Michael Guralnik, netdev, Paolo Abeni,
	Saeed Mahameed, Shay Drory



On 4/10/2023 6:07 AM, Leon Romanovsky wrote:
> From: Avihai Horon <avihaih@nvidia.com>
> 
> Rename existing HCA capability relaxed_ordering_read to
> relaxed_ordering_read_pci_enabled. This is in accordance with recent PRM
> change to better describe the capability, as it's set only if both the
> device supports relaxed ordering (RO) read and RO is enabled in PCI
> config space.
> 
> In addition, add new HCA capability relaxed_ordering_read which is set
> if the device supports RO read, regardless of RO in PCI config space.
> This will be used in the following patch to allow RO in VFs and VMs.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> Reviewed-by: Shay Drory <shayd@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---


Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>


* Re: [PATCH mlx5-next 4/4] RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
  2023-04-10 13:07 ` [PATCH mlx5-next 4/4] RDMA/mlx5: Allow relaxed ordering read in VFs and VMs Leon Romanovsky
@ 2023-04-11 23:19   ` Jacob Keller
  0 siblings, 0 replies; 16+ messages in thread
From: Jacob Keller @ 2023-04-11 23:19 UTC
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Avihai Horon, Aya Levin, Eric Dumazet, Jakub Kicinski,
	linux-rdma, Meir Lichtinger, Michael Guralnik, netdev,
	Paolo Abeni, Saeed Mahameed, Shay Drory



On 4/10/2023 6:07 AM, Leon Romanovsky wrote:
> From: Avihai Horon <avihaih@nvidia.com>
> 
> According to PCIe spec, Enable Relaxed Ordering value in the VF's PCI
> config space is wired to 0 and PF relaxed ordering (RO) setting should
> be applied to the VF. In QEMU (and maybe others), when assigning VFs,
> the RO bit in PCI config space is not emulated properly and is always
> set to 0.
> 
> Therefore, pcie_relaxed_ordering_enabled() always returns 0 for VFs and
> VMs and thus MKeys can't be created with RO read even if the PF supports
> it.
> 
> pcie_relaxed_ordering_enabled() check was added to avoid a syndrome when
> creating a MKey with relaxed ordering (RO) enabled when the driver's
> relaxed_ordering_read_pci_enabled HCA capability is out of sync with FW.
> With the new relaxed_ordering_read capability this can't happen, as it's
> set regardless of RO value in PCI config space and thus can't change
> during runtime.
> 
> Hence, to allow RO read in VFs and VMs, use the new HCA capability
> relaxed_ordering_read without checking pcie_relaxed_ordering_enabled().
> The old capability checks are kept for backward compatibility with older
> FWs.
> 
> Allowing RO in VFs and VMs is valuable since it can greatly improve
> performance on some setups. For example, testing throughput of a VF on
> an AMD EPYC 7763 and ConnectX-6 Dx setup showed roughly 60% performance
> improvement.
> 
> Signed-off-by: Avihai Horon <avihaih@nvidia.com>
> Reviewed-by: Shay Drory <shayd@nvidia.com>
> Reviewed-by: Aya Levin <ayal@nvidia.com>
> Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
> ---


Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
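
The check described in the commit message can be sketched as a small
predicate. This is a minimal sketch under assumptions, not the code from the
patch: `ro_inputs` and `can_use_ro_read` are hypothetical names, and the
`pcie_ro_enabled` field merely stands in for the result of
pcie_relaxed_ordering_enabled().

```c
#include <stdbool.h>

/* Hypothetical bundle of the three values the decision depends on. */
struct ro_inputs {
	bool cap_ro_read;             /* new cap: relaxed_ordering_read */
	bool cap_ro_read_pci_enabled; /* old cap: relaxed_ordering_read_pci_enabled */
	bool pcie_ro_enabled;         /* stand-in for pcie_relaxed_ordering_enabled() */
};

/* Decide whether an MKey may be created with RO read. */
static inline bool can_use_ro_read(struct ro_inputs in)
{
	/* New cap: no PCI config space check is needed. */
	if (in.cap_ro_read)
		return true;

	/* Older FW: keep the old cap together with the PCI config space check. */
	return in.cap_ro_read_pci_enabled && in.pcie_ro_enabled;
}
```

In this model a VF whose config space reads 0 is allowed RO read once the new
cap is set, while the older-FW path behaves as before.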


* Re: [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs
  2023-04-11 14:01 ` [PATCH rdma-next 0/4] " Jason Gunthorpe
  2023-04-11 14:09   ` Leon Romanovsky
@ 2023-04-11 23:21   ` Jacob Keller
  2023-04-13 12:49     ` Leon Romanovsky
  1 sibling, 1 reply; 16+ messages in thread
From: Jacob Keller @ 2023-04-11 23:21 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky
  Cc: Leon Romanovsky, Avihai Horon, Aya Levin, Eric Dumazet,
	Jakub Kicinski, linux-kernel, linux-rdma, Meir Lichtinger,
	Michael Guralnik, netdev, Paolo Abeni, Saeed Mahameed,
	Shay Drory



On 4/11/2023 7:01 AM, Jason Gunthorpe wrote:
> On Mon, Apr 10, 2023 at 04:07:49PM +0300, Leon Romanovsky wrote:
>> From: Leon Romanovsky <leonro@nvidia.com>
>>
>> From Avihai,
>>
>> Currently, Relaxed Ordering (RO) can't be used in VFs directly and in
>> VFs assigned to QEMU, even if the PF supports RO. This is due to issues
>> in reporting/emulation of PCI config space RO bit and due to current
>> HCA capability behavior.
>>
>> This series fixes it by using a new HCA capability and by relying on FW
>> to do the "right thing" according to the PF's PCI config space RO value.
>>
>> Allowing RO in VFs and VMs is valuable since it can greatly improve
>> performance on some setups. For example, testing throughput of a VF on
>> an AMD EPYC 7763 and ConnectX-6 Dx setup showed roughly 60% performance
>> improvement.
>>
>> Thanks
>>
>> Avihai Horon (4):
>>   RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
>>   RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
>>   net/mlx5: Update relaxed ordering read HCA capabilities
>>   RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
> 
> This looks OK, but the patch structure is pretty confusing.
> 
> It seems to me there are really only two patches here, the first is to
> add some static inline
> 
> 'mlx5 supports read ro'
> 
> which supports both the cap bits described in
> the PRM, with a little comment to explain that old devices only set
> the old cap.
> 
> And a second patch to call it in all the places we need to check before
> setting the mkc ro read bit.
> 
> Maybe a final third patch to sort out that mistake in the write side.
> 
> But this really doesn't have anything to do with VFs and VMs, this is
> adjusting the code to follow the current PRM because the old one was
> mis-designed.
> 
> Jason

FWIW I think Jason's outline here makes sense too and might be slightly
better. However, reading through the series I was reasonably able to
understand things enough that I think it's fine as-is.

In some sense it's not about VF or VM, but fixing this has the result
that it fixes a setup with VF and VM, so I think that's an ok thing to
call out as the goal.


* Re: [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs
  2023-04-11 23:21   ` Jacob Keller
@ 2023-04-13 12:49     ` Leon Romanovsky
  2023-04-13 14:46       ` Jason Gunthorpe
  0 siblings, 1 reply; 16+ messages in thread
From: Leon Romanovsky @ 2023-04-13 12:49 UTC (permalink / raw)
  To: Jacob Keller
  Cc: Jason Gunthorpe, Avihai Horon, Aya Levin, Eric Dumazet,
	Jakub Kicinski, linux-kernel, linux-rdma, Meir Lichtinger,
	Michael Guralnik, netdev, Paolo Abeni, Saeed Mahameed,
	Shay Drory

On Tue, Apr 11, 2023 at 04:21:09PM -0700, Jacob Keller wrote:
> 
> 
> On 4/11/2023 7:01 AM, Jason Gunthorpe wrote:
> > On Mon, Apr 10, 2023 at 04:07:49PM +0300, Leon Romanovsky wrote:
> >> From: Leon Romanovsky <leonro@nvidia.com>
> >>
> >> From Avihai,
> >>
> >> Currently, Relaxed Ordering (RO) can't be used in VFs directly and in
> >> VFs assigned to QEMU, even if the PF supports RO. This is due to issues
> >> in reporting/emulation of PCI config space RO bit and due to current
> >> HCA capability behavior.
> >>
> >> This series fixes it by using a new HCA capability and by relying on FW
> >> to do the "right thing" according to the PF's PCI config space RO value.
> >>
> >> Allowing RO in VFs and VMs is valuable since it can greatly improve
> >> performance on some setups. For example, testing throughput of a VF on
> >> an AMD EPYC 7763 and ConnectX-6 Dx setup showed roughly 60% performance
> >> improvement.
> >>
> >> Thanks
> >>
> >> Avihai Horon (4):
> >>   RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
> >>   RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
> >>   net/mlx5: Update relaxed ordering read HCA capabilities
> >>   RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
> > 
> > This looks OK, but the patch structure is pretty confusing.
> > 
> > It seems to me there are really only two patches here, the first is to
> > add some static inline
> > 
> > 'mlx5 supports read ro'
> > 
> > which supports both the cap bits described in
> > the PRM, with a little comment to explain that old devices only set
> > the old cap.
> > 
> > And a second patch to call it in all the places we need to check before
> > setting the mkc ro read bit.
> > 
> > Maybe a final third patch to sort out that mistake in the write side.
> > 
> > But this really doesn't have anything to do with VFs and VMs, this is
> > adjusting the code to follow the current PRM because the old one was
> > > mis-designed.
> > 
> > Jason
> 
> FWIW I think Jason's outline here makes sense too and might be slightly
> better. However, reading through the series I was reasonably able to
> understand things enough that I think it's fine as-is.
> 
> In some sense it's not about VF or VM, but fixing this has the result
> that it fixes a setup with VF and VM, so I think that's an ok thing to
> call out as the goal.

The VF/VM wording comes from the user's perspective, which is where the
incorrect behavior shows up. Avihai saw this in QEMU, so he described it
in terms that are clearer to the end user.

Thanks


* Re: [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs
  2023-04-13 12:49     ` Leon Romanovsky
@ 2023-04-13 14:46       ` Jason Gunthorpe
  2023-04-16 10:28         ` Leon Romanovsky
  0 siblings, 1 reply; 16+ messages in thread
From: Jason Gunthorpe @ 2023-04-13 14:46 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Jacob Keller, Avihai Horon, Aya Levin, Eric Dumazet,
	Jakub Kicinski, linux-kernel, linux-rdma, Meir Lichtinger,
	Michael Guralnik, netdev, Paolo Abeni, Saeed Mahameed,
	Shay Drory

On Thu, Apr 13, 2023 at 03:49:29PM +0300, Leon Romanovsky wrote:

> > that it fixes a setup with VF and VM, so I think that's an ok thing to
> > call out as the goal.
> 
> The VF/VM wording comes from the user's perspective, which is where the
> incorrect behavior shows up. Avihai saw this in QEMU, so he described it
> in terms that are clearer to the end user.

Except it is not clear; the VF/VM issue is more properly solved by
showing the real relaxed ordering cap to the VM.

This series really is about fixing the FW mistake that had a dynamic
cap bit for relaxed ordering. The driver does not support cap bits
that change during runtime.

mlx5 racily bodged around the broken cap by protecting the feature
with the same test the FW was using to make the cap dynamic, but this
is all just wrong.

The new cap bit is static: it doesn't change at runtime, which is how a
cap bit should behave, and so we don't need the bodge anymore.

That the bodge didn't work in VMs because of a QEMU/vfio issue is
another bad side effect, but it isn't really the point of this series.

This is why I'd like it if the code was more closely organized to make
it clear that the old cap is OLD and that the bodge that goes along
with it is part of making the cap bit work. With the way things are
organized, what is old and what is new kind of gets lost.

Jason
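
To make the dynamic versus static distinction concrete, here is a small C
model of a driver that reads capability bits once and caches them. It is a
sketch built on the description in this thread, not on the mlx5 sources:
`fw_model`, `cached_caps` and the helper names are hypothetical, and the way
firmware derives each bit is assumed from the commit messages.

```c
#include <stdbool.h>

/* Hypothetical firmware-side state; pci_ro_enabled can change at runtime. */
struct fw_model {
	bool device_supports_ro_read;
	bool pci_ro_enabled;
};

/* Value firmware would report right now for the old (dynamic) cap. */
static inline bool fw_old_cap(const struct fw_model *fw)
{
	return fw->device_supports_ro_read && fw->pci_ro_enabled;
}

/* Value firmware would report right now for the new (static) cap. */
static inline bool fw_new_cap(const struct fw_model *fw)
{
	return fw->device_supports_ro_read;
}

/* Driver-side copy, taken once (for example at probe time). */
struct cached_caps {
	bool old_cap;
	bool new_cap;
};

static inline struct cached_caps snapshot_caps(const struct fw_model *fw)
{
	struct cached_caps c = {
		.old_cap = fw_old_cap(fw),
		.new_cap = fw_new_cap(fw),
	};
	return c;
}

/* A cached bit is stale when it no longer matches what firmware reports. */
static inline bool old_cap_is_stale(const struct cached_caps *c,
				    const struct fw_model *fw)
{
	return c->old_cap != fw_old_cap(fw);
}

static inline bool new_cap_is_stale(const struct cached_caps *c,
				    const struct fw_model *fw)
{
	return c->new_cap != fw_new_cap(fw);
}
```

If `pci_ro_enabled` flips after the snapshot, only the old bit goes stale in
this model, which is the situation the extra PCI check was papering over.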


* Re: [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs
  2023-04-13 14:46       ` Jason Gunthorpe
@ 2023-04-16 10:28         ` Leon Romanovsky
  0 siblings, 0 replies; 16+ messages in thread
From: Leon Romanovsky @ 2023-04-16 10:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Jacob Keller, Avihai Horon, Aya Levin, Eric Dumazet,
	Jakub Kicinski, linux-kernel, linux-rdma, Meir Lichtinger,
	Michael Guralnik, netdev, Paolo Abeni, Saeed Mahameed,
	Shay Drory

On Thu, Apr 13, 2023 at 11:46:16AM -0300, Jason Gunthorpe wrote:
> On Thu, Apr 13, 2023 at 03:49:29PM +0300, Leon Romanovsky wrote:
> 
> > > that it fixes a setup with VF and VM, so I think that's an ok thing to
> > > call out as the goal.
> > 
> > The VF/VM wording comes from the user's perspective, which is where the
> > incorrect behavior shows up. Avihai saw this in QEMU, so he described it
> > in terms that are clearer to the end user.
> 
> Except it is not clear; the VF/VM issue is more properly solved by
> showing the real relaxed ordering cap to the VM.

I'm not convinced that restructuring the patches is really needed for
something as minor as a fix for problematic FW. I'm applying the series
as is; a curious reader can find this discussion through the Link tag in
the patch.

Thanks


* Re: [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs
  2023-04-10 13:07 [PATCH rdma-next 0/4] Allow relaxed ordering read in VFs and VMs Leon Romanovsky
                   ` (4 preceding siblings ...)
  2023-04-11 14:01 ` [PATCH rdma-next 0/4] " Jason Gunthorpe
@ 2023-04-16 10:30 ` Leon Romanovsky
  5 siblings, 0 replies; 16+ messages in thread
From: Leon Romanovsky @ 2023-04-16 10:30 UTC (permalink / raw)
  To: Jason Gunthorpe, Leon Romanovsky
  Cc: Avihai Horon, Aya Levin, Eric Dumazet, Jakub Kicinski,
	linux-kernel, linux-rdma, Meir Lichtinger, Michael Guralnik,
	netdev, Paolo Abeni, Saeed Mahameed, Shay Drory, Leon Romanovsky


On Mon, 10 Apr 2023 16:07:49 +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@nvidia.com>
> 
> From Avihai,
> 
> Currently, Relaxed Ordering (RO) can't be used in VFs directly and in
> VFs assigned to QEMU, even if the PF supports RO. This is due to issues
> in reporting/emulation of PCI config space RO bit and due to current
> HCA capability behavior.
> 
> [...]

Applied, thanks!

[1/4] RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
      https://git.kernel.org/rdma/rdma/c/ed4b0661cce119
[2/4] RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
      https://git.kernel.org/rdma/rdma/c/d43b020b0f82c0
[3/4] net/mlx5: Update relaxed ordering read HCA capabilities
      https://git.kernel.org/rdma/rdma/c/ccbbfe0682f2ff
[4/4] RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
      https://git.kernel.org/rdma/rdma/c/bd4ba605c4a92b

Best regards,
-- 
Leon Romanovsky <leon@kernel.org>

