Linux-RDMA Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/8 v1] Remove FMR support from RDMA drivers
@ 2020-05-14 12:02 Max Gurtovoy
  2020-05-14 12:02 ` [PATCH 1/8] RDMA/mlx4: remove FMR support for memory registration Max Gurtovoy
                   ` (10 more replies)
  0 siblings, 11 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-14 12:02 UTC (permalink / raw)
  To: bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin, Max Gurtovoy

This series removes the support for FMR mode to register memory. This ancient
mode is unsafe and not maintained/tested in the last few years. It also doesn't
have any reasonable advantage over other memory registration methods such as
FRWR (that is implemented in all the recent RDMA adapters). This series should
be reviewed and approved by the maintainer of the effected drivers and I
suggest to test it as well.

The tests that I made for this series (fio benchmarks and fio verify data):
1. iSER initiator on ConnectX-4
2. iSER initiator on ConnectX-3
3. SRP initiator on ConnectX-4 (loopback to SRP target)
4. SRP initiator on ConnectX-3

Not tested:
1. RDS
2. mthca
3. rdmavt

Israel Rukshin (1):
  RDMA/iser: Remove support for FMR memory registration

Max Gurtovoy (7):
  RDMA/mlx4: remove FMR support for memory registration
  RDMA/rds: remove FMR support for memory registration
  RDMA/mthca: remove FMR support for memory registration
  RDMA/rdmavt: remove FMR memory registration
  RDMA/srp: remove support for FMR memory registration
  RDMA/core: remove FMR pool API
  RDMA/core: remove FMR device ops

 Documentation/driver-api/infiniband.rst      |   3 -
 Documentation/infiniband/core_locking.rst    |   2 -
 drivers/infiniband/core/Makefile             |   2 +-
 drivers/infiniband/core/device.c             |   4 -
 drivers/infiniband/core/fmr_pool.c           | 494 ---------------------------
 drivers/infiniband/core/verbs.c              |  48 ---
 drivers/infiniband/hw/mlx4/main.c            |  10 -
 drivers/infiniband/hw/mlx4/mlx4_ib.h         |  16 -
 drivers/infiniband/hw/mlx4/mr.c              |  93 -----
 drivers/infiniband/hw/mthca/mthca_dev.h      |  10 -
 drivers/infiniband/hw/mthca/mthca_mr.c       | 262 +-------------
 drivers/infiniband/hw/mthca/mthca_provider.c |  86 -----
 drivers/infiniband/sw/rdmavt/mr.c            | 154 ---------
 drivers/infiniband/sw/rdmavt/mr.h            |  15 -
 drivers/infiniband/sw/rdmavt/vt.c            |   4 -
 drivers/infiniband/ulp/iser/iscsi_iser.h     |  79 +----
 drivers/infiniband/ulp/iser/iser_initiator.c |  19 +-
 drivers/infiniband/ulp/iser/iser_memory.c    | 188 +---------
 drivers/infiniband/ulp/iser/iser_verbs.c     | 126 +------
 drivers/infiniband/ulp/srp/ib_srp.c          | 222 +-----------
 drivers/infiniband/ulp/srp/ib_srp.h          |  27 +-
 drivers/net/ethernet/mellanox/mlx4/mr.c      | 183 ----------
 include/linux/mlx4/device.h                  |  21 +-
 include/rdma/ib_fmr_pool.h                   |  93 -----
 include/rdma/ib_verbs.h                      |  45 ---
 net/rds/Makefile                             |   2 +-
 net/rds/ib.c                                 |  14 +-
 net/rds/ib.h                                 |   1 -
 net/rds/ib_cm.c                              |   4 +-
 net/rds/ib_fmr.c                             | 269 ---------------
 net/rds/ib_mr.h                              |  12 -
 net/rds/ib_rdma.c                            |  16 +-
 32 files changed, 77 insertions(+), 2447 deletions(-)
 delete mode 100644 drivers/infiniband/core/fmr_pool.c
 delete mode 100644 include/rdma/ib_fmr_pool.h
 delete mode 100644 net/rds/ib_fmr.c

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/8] RDMA/mlx4: remove FMR support for memory registration
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
@ 2020-05-14 12:02 ` Max Gurtovoy
  2020-05-14 12:02 ` [PATCH 2/8] RDMA/rds: " Max Gurtovoy
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-14 12:02 UTC (permalink / raw)
  To: bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin, Max Gurtovoy

HCA's that are driven by mlx4 driver support FRWR method to register
memory. Remove the ancient and unsafe FMR method.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/infiniband/hw/mlx4/main.c       |  10 --
 drivers/infiniband/hw/mlx4/mlx4_ib.h    |  16 ---
 drivers/infiniband/hw/mlx4/mr.c         |  93 ----------------
 drivers/net/ethernet/mellanox/mlx4/mr.c | 183 --------------------------------
 include/linux/mlx4/device.h             |  21 +---
 5 files changed, 2 insertions(+), 321 deletions(-)

diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 275722c..778dccf 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2600,13 +2600,6 @@ static void get_fw_ver_str(struct ib_device *device, char *str)
 	.modify_wq = mlx4_ib_modify_wq,
 };
 
-static const struct ib_device_ops mlx4_ib_dev_fmr_ops = {
-	.alloc_fmr = mlx4_ib_fmr_alloc,
-	.dealloc_fmr = mlx4_ib_fmr_dealloc,
-	.map_phys_fmr = mlx4_ib_map_phys_fmr,
-	.unmap_fmr = mlx4_ib_unmap_fmr,
-};
-
 static const struct ib_device_ops mlx4_ib_dev_mw_ops = {
 	.alloc_mw = mlx4_ib_alloc_mw,
 	.dealloc_mw = mlx4_ib_dealloc_mw,
@@ -2724,9 +2717,6 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
 		ib_set_device_ops(&ibdev->ib_dev, &mlx4_ib_dev_wq_ops);
 	}
 
-	if (!mlx4_is_slave(ibdev->dev))
-		ib_set_device_ops(&ibdev->ib_dev, &mlx4_ib_dev_fmr_ops);
-
 	if (dev->caps.flags & MLX4_DEV_CAP_FLAG_MEM_WINDOW ||
 	    dev->caps.bmme_flags & MLX4_BMME_FLAG_TYPE_2_WIN) {
 		ibdev->ib_dev.uverbs_cmd_mask |=
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index d188573..ebdeffd 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -146,11 +146,6 @@ struct mlx4_ib_mw {
 	struct mlx4_mw		mmw;
 };
 
-struct mlx4_ib_fmr {
-	struct ib_fmr           ibfmr;
-	struct mlx4_fmr         mfmr;
-};
-
 #define MAX_REGS_PER_FLOW 2
 
 struct mlx4_flow_reg_id {
@@ -679,11 +674,6 @@ static inline struct mlx4_ib_mw *to_mmw(struct ib_mw *ibmw)
 	return container_of(ibmw, struct mlx4_ib_mw, ibmw);
 }
 
-static inline struct mlx4_ib_fmr *to_mfmr(struct ib_fmr *ibfmr)
-{
-	return container_of(ibfmr, struct mlx4_ib_fmr, ibfmr);
-}
-
 static inline struct mlx4_ib_flow *to_mflow(struct ib_flow *ibflow)
 {
 	return container_of(ibflow, struct mlx4_ib_flow, ibflow);
@@ -794,12 +784,6 @@ int mlx4_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 int mlx4_ib_mad_init(struct mlx4_ib_dev *dev);
 void mlx4_ib_mad_cleanup(struct mlx4_ib_dev *dev);
 
-struct ib_fmr *mlx4_ib_fmr_alloc(struct ib_pd *pd, int mr_access_flags,
-				  struct ib_fmr_attr *fmr_attr);
-int mlx4_ib_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list, int npages,
-			 u64 iova);
-int mlx4_ib_unmap_fmr(struct list_head *fmr_list);
-int mlx4_ib_fmr_dealloc(struct ib_fmr *fmr);
 int __mlx4_ib_query_port(struct ib_device *ibdev, u8 port,
 			 struct ib_port_attr *props, int netw_view);
 int __mlx4_ib_query_pkey(struct ib_device *ibdev, u8 port, u16 index,
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index b0121c9..e2fb71b 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -698,99 +698,6 @@ struct ib_mr *mlx4_ib_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
 	return ERR_PTR(err);
 }
 
-struct ib_fmr *mlx4_ib_fmr_alloc(struct ib_pd *pd, int acc,
-				 struct ib_fmr_attr *fmr_attr)
-{
-	struct mlx4_ib_dev *dev = to_mdev(pd->device);
-	struct mlx4_ib_fmr *fmr;
-	int err = -ENOMEM;
-
-	fmr = kmalloc(sizeof *fmr, GFP_KERNEL);
-	if (!fmr)
-		return ERR_PTR(-ENOMEM);
-
-	err = mlx4_fmr_alloc(dev->dev, to_mpd(pd)->pdn, convert_access(acc),
-			     fmr_attr->max_pages, fmr_attr->max_maps,
-			     fmr_attr->page_shift, &fmr->mfmr);
-	if (err)
-		goto err_free;
-
-	err = mlx4_fmr_enable(to_mdev(pd->device)->dev, &fmr->mfmr);
-	if (err)
-		goto err_mr;
-
-	fmr->ibfmr.rkey = fmr->ibfmr.lkey = fmr->mfmr.mr.key;
-
-	return &fmr->ibfmr;
-
-err_mr:
-	(void) mlx4_mr_free(to_mdev(pd->device)->dev, &fmr->mfmr.mr);
-
-err_free:
-	kfree(fmr);
-
-	return ERR_PTR(err);
-}
-
-int mlx4_ib_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
-		      int npages, u64 iova)
-{
-	struct mlx4_ib_fmr *ifmr = to_mfmr(ibfmr);
-	struct mlx4_ib_dev *dev = to_mdev(ifmr->ibfmr.device);
-
-	return mlx4_map_phys_fmr(dev->dev, &ifmr->mfmr, page_list, npages, iova,
-				 &ifmr->ibfmr.lkey, &ifmr->ibfmr.rkey);
-}
-
-int mlx4_ib_unmap_fmr(struct list_head *fmr_list)
-{
-	struct ib_fmr *ibfmr;
-	int err;
-	struct mlx4_dev *mdev = NULL;
-
-	list_for_each_entry(ibfmr, fmr_list, list) {
-		if (mdev && to_mdev(ibfmr->device)->dev != mdev)
-			return -EINVAL;
-		mdev = to_mdev(ibfmr->device)->dev;
-	}
-
-	if (!mdev)
-		return 0;
-
-	list_for_each_entry(ibfmr, fmr_list, list) {
-		struct mlx4_ib_fmr *ifmr = to_mfmr(ibfmr);
-
-		mlx4_fmr_unmap(mdev, &ifmr->mfmr, &ifmr->ibfmr.lkey, &ifmr->ibfmr.rkey);
-	}
-
-	/*
-	 * Make sure all MPT status updates are visible before issuing
-	 * SYNC_TPT firmware command.
-	 */
-	wmb();
-
-	err = mlx4_SYNC_TPT(mdev);
-	if (err)
-		pr_warn("SYNC_TPT error %d when "
-		       "unmapping FMRs\n", err);
-
-	return 0;
-}
-
-int mlx4_ib_fmr_dealloc(struct ib_fmr *ibfmr)
-{
-	struct mlx4_ib_fmr *ifmr = to_mfmr(ibfmr);
-	struct mlx4_ib_dev *dev = to_mdev(ibfmr->device);
-	int err;
-
-	err = mlx4_fmr_free(dev->dev, &ifmr->mfmr);
-
-	if (!err)
-		kfree(ifmr);
-
-	return err;
-}
-
 static int mlx4_set_page(struct ib_mr *ibmr, u64 addr)
 {
 	struct mlx4_ib_mr *mr = to_mmr(ibmr);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c b/drivers/net/ethernet/mellanox/mlx4/mr.c
index 1a11bc0..d2986f1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -966,189 +966,6 @@ void mlx4_cleanup_mr_table(struct mlx4_dev *dev)
 	mlx4_bitmap_cleanup(&mr_table->mpt_bitmap);
 }
 
-static inline int mlx4_check_fmr(struct mlx4_fmr *fmr, u64 *page_list,
-				  int npages, u64 iova)
-{
-	int i, page_mask;
-
-	if (npages > fmr->max_pages)
-		return -EINVAL;
-
-	page_mask = (1 << fmr->page_shift) - 1;
-
-	/* We are getting page lists, so va must be page aligned. */
-	if (iova & page_mask)
-		return -EINVAL;
-
-	/* Trust the user not to pass misaligned data in page_list */
-	if (0)
-		for (i = 0; i < npages; ++i) {
-			if (page_list[i] & ~page_mask)
-				return -EINVAL;
-		}
-
-	if (fmr->maps >= fmr->max_maps)
-		return -EINVAL;
-
-	return 0;
-}
-
-int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list,
-		      int npages, u64 iova, u32 *lkey, u32 *rkey)
-{
-	u32 key;
-	int i, err;
-
-	err = mlx4_check_fmr(fmr, page_list, npages, iova);
-	if (err)
-		return err;
-
-	++fmr->maps;
-
-	key = key_to_hw_index(fmr->mr.key);
-	key += dev->caps.num_mpts;
-	*lkey = *rkey = fmr->mr.key = hw_index_to_key(key);
-
-	*(u8 *) fmr->mpt = MLX4_MPT_STATUS_SW;
-
-	/* Make sure MPT status is visible before writing MTT entries */
-	wmb();
-
-	dma_sync_single_for_cpu(&dev->persist->pdev->dev, fmr->dma_handle,
-				npages * sizeof(u64), DMA_TO_DEVICE);
-
-	for (i = 0; i < npages; ++i)
-		fmr->mtts[i] = cpu_to_be64(page_list[i] | MLX4_MTT_FLAG_PRESENT);
-
-	dma_sync_single_for_device(&dev->persist->pdev->dev, fmr->dma_handle,
-				   npages * sizeof(u64), DMA_TO_DEVICE);
-
-	fmr->mpt->key    = cpu_to_be32(key);
-	fmr->mpt->lkey   = cpu_to_be32(key);
-	fmr->mpt->length = cpu_to_be64(npages * (1ull << fmr->page_shift));
-	fmr->mpt->start  = cpu_to_be64(iova);
-
-	/* Make MTT entries are visible before setting MPT status */
-	wmb();
-
-	*(u8 *) fmr->mpt = MLX4_MPT_STATUS_HW;
-
-	/* Make sure MPT status is visible before consumer can use FMR */
-	wmb();
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(mlx4_map_phys_fmr);
-
-int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages,
-		   int max_maps, u8 page_shift, struct mlx4_fmr *fmr)
-{
-	struct mlx4_priv *priv = mlx4_priv(dev);
-	int err = -ENOMEM;
-
-	if (max_maps > dev->caps.max_fmr_maps)
-		return -EINVAL;
-
-	if (page_shift < (ffs(dev->caps.page_size_cap) - 1) || page_shift >= 32)
-		return -EINVAL;
-
-	/* All MTTs must fit in the same page */
-	if (max_pages * sizeof(*fmr->mtts) > PAGE_SIZE)
-		return -EINVAL;
-
-	fmr->page_shift = page_shift;
-	fmr->max_pages  = max_pages;
-	fmr->max_maps   = max_maps;
-	fmr->maps = 0;
-
-	err = mlx4_mr_alloc(dev, pd, 0, 0, access, max_pages,
-			    page_shift, &fmr->mr);
-	if (err)
-		return err;
-
-	fmr->mtts = mlx4_table_find(&priv->mr_table.mtt_table,
-				    fmr->mr.mtt.offset,
-				    &fmr->dma_handle);
-
-	if (!fmr->mtts) {
-		err = -ENOMEM;
-		goto err_free;
-	}
-
-	return 0;
-
-err_free:
-	(void) mlx4_mr_free(dev, &fmr->mr);
-	return err;
-}
-EXPORT_SYMBOL_GPL(mlx4_fmr_alloc);
-
-int mlx4_fmr_enable(struct mlx4_dev *dev, struct mlx4_fmr *fmr)
-{
-	struct mlx4_priv *priv = mlx4_priv(dev);
-	int err;
-
-	err = mlx4_mr_enable(dev, &fmr->mr);
-	if (err)
-		return err;
-
-	fmr->mpt = mlx4_table_find(&priv->mr_table.dmpt_table,
-				    key_to_hw_index(fmr->mr.key), NULL);
-	if (!fmr->mpt)
-		return -ENOMEM;
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(mlx4_fmr_enable);
-
-void mlx4_fmr_unmap(struct mlx4_dev *dev, struct mlx4_fmr *fmr,
-		    u32 *lkey, u32 *rkey)
-{
-	if (!fmr->maps)
-		return;
-
-	/* To unmap: it is sufficient to take back ownership from HW */
-	*(u8 *)fmr->mpt = MLX4_MPT_STATUS_SW;
-
-	/* Make sure MPT status is visible */
-	wmb();
-
-	fmr->maps = 0;
-}
-EXPORT_SYMBOL_GPL(mlx4_fmr_unmap);
-
-int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr)
-{
-	int ret;
-
-	if (fmr->maps)
-		return -EBUSY;
-	if (fmr->mr.enabled == MLX4_MPT_EN_HW) {
-		/* In case of FMR was enabled and unmapped
-		 * make sure to give ownership of MPT back to HW
-		 * so HW2SW_MPT command will success.
-		 */
-		*(u8 *)fmr->mpt = MLX4_MPT_STATUS_SW;
-		/* Make sure MPT status is visible before changing MPT fields */
-		wmb();
-		fmr->mpt->length = 0;
-		fmr->mpt->start  = 0;
-		/* Make sure MPT data is visible after changing MPT status */
-		wmb();
-		*(u8 *)fmr->mpt = MLX4_MPT_STATUS_HW;
-		/* make sure MPT status is visible */
-		wmb();
-	}
-
-	ret = mlx4_mr_free(dev, &fmr->mr);
-	if (ret)
-		return ret;
-	fmr->mr.enabled = MLX4_MPT_DISABLED;
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(mlx4_fmr_free);
-
 int mlx4_SYNC_TPT(struct mlx4_dev *dev)
 {
 	return mlx4_cmd(dev, 0, 0, 0, MLX4_CMD_SYNC_TPT,
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 20372de..dd53d96 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -707,17 +707,6 @@ struct mlx4_mw {
 	int			enabled;
 };
 
-struct mlx4_fmr {
-	struct mlx4_mr		mr;
-	struct mlx4_mpt_entry  *mpt;
-	__be64		       *mtts;
-	dma_addr_t		dma_handle;
-	int			max_pages;
-	int			max_maps;
-	int			maps;
-	u8			page_shift;
-};
-
 struct mlx4_uar {
 	unsigned long		pfn;
 	int			index;
@@ -1412,14 +1401,6 @@ void mlx4_handle_eth_header_mcast_prio(struct mlx4_net_trans_rule_hw_ctrl *ctrl,
 int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index);
 void mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, u16 vlan);
 
-int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list,
-		      int npages, u64 iova, u32 *lkey, u32 *rkey);
-int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages,
-		   int max_maps, u8 page_shift, struct mlx4_fmr *fmr);
-int mlx4_fmr_enable(struct mlx4_dev *dev, struct mlx4_fmr *fmr);
-void mlx4_fmr_unmap(struct mlx4_dev *dev, struct mlx4_fmr *fmr,
-		    u32 *lkey, u32 *rkey);
-int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr);
 int mlx4_SYNC_TPT(struct mlx4_dev *dev);
 int mlx4_test_interrupt(struct mlx4_dev *dev, int vector);
 int mlx4_test_async(struct mlx4_dev *dev);
@@ -1522,6 +1503,8 @@ struct mlx4_slaves_pport mlx4_phys_to_slaves_pport_actv(
 int mlx4_vf_get_enable_smi_admin(struct mlx4_dev *dev, int slave, int port);
 int mlx4_vf_set_enable_smi_admin(struct mlx4_dev *dev, int slave, int port,
 				 int enable);
+
+struct mlx4_mpt_entry;
 int mlx4_mr_hw_get_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr,
 		       struct mlx4_mpt_entry ***mpt_entry);
 int mlx4_mr_hw_write_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr,
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 2/8] RDMA/rds: remove FMR support for memory registration
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
  2020-05-14 12:02 ` [PATCH 1/8] RDMA/mlx4: remove FMR support for memory registration Max Gurtovoy
@ 2020-05-14 12:02 ` Max Gurtovoy
  2020-05-14 12:03 ` [PATCH 3/8] RDMA/mthca: " Max Gurtovoy
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-14 12:02 UTC (permalink / raw)
  To: bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin, Max Gurtovoy

Use FRWR method for memory registration by default and remove the ancient
and unsafe FMR method.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 net/rds/Makefile  |   2 +-
 net/rds/ib.c      |  14 +--
 net/rds/ib.h      |   1 -
 net/rds/ib_cm.c   |   4 +-
 net/rds/ib_fmr.c  | 269 ------------------------------------------------------
 net/rds/ib_mr.h   |  12 ---
 net/rds/ib_rdma.c |  16 +---
 7 files changed, 11 insertions(+), 307 deletions(-)
 delete mode 100644 net/rds/ib_fmr.c

diff --git a/net/rds/Makefile b/net/rds/Makefile
index e647f9d..8fdc118 100644
--- a/net/rds/Makefile
+++ b/net/rds/Makefile
@@ -7,7 +7,7 @@ rds-y :=	af_rds.o bind.o cong.o connection.o info.o message.o   \
 obj-$(CONFIG_RDS_RDMA) += rds_rdma.o
 rds_rdma-y :=	rdma_transport.o \
 			ib.o ib_cm.o ib_recv.o ib_ring.o ib_send.o ib_stats.o \
-			ib_sysctl.o ib_rdma.o ib_fmr.o ib_frmr.o
+			ib_sysctl.o ib_rdma.o ib_frmr.o
 
 
 obj-$(CONFIG_RDS_TCP) += rds_tcp.o
diff --git a/net/rds/ib.c b/net/rds/ib.c
index a792d8a..33a65c8 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -130,12 +130,15 @@ void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
 static void rds_ib_add_one(struct ib_device *device)
 {
 	struct rds_ib_device *rds_ibdev;
-	bool has_fr, has_fmr;
 
 	/* Only handle IB (no iWARP) devices */
 	if (device->node_type != RDMA_NODE_IB_CA)
 		return;
 
+	/* Device must support FRWR */
+	if (!(device->attrs.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS))
+		return;
+
 	rds_ibdev = kzalloc_node(sizeof(struct rds_ib_device), GFP_KERNEL,
 				 ibdev_to_node(device));
 	if (!rds_ibdev)
@@ -151,11 +154,6 @@ static void rds_ib_add_one(struct ib_device *device)
 	rds_ibdev->max_wrs = device->attrs.max_qp_wr;
 	rds_ibdev->max_sge = min(device->attrs.max_send_sge, RDS_IB_MAX_SGE);
 
-	has_fr = (device->attrs.device_cap_flags &
-		  IB_DEVICE_MEM_MGT_EXTENSIONS);
-	has_fmr = (device->ops.alloc_fmr && device->ops.dealloc_fmr &&
-		   device->ops.map_phys_fmr && device->ops.unmap_fmr);
-	rds_ibdev->use_fastreg = (has_fr && !has_fmr);
 	rds_ibdev->odp_capable =
 		!!(device->attrs.device_cap_flags &
 		   IB_DEVICE_ON_DEMAND_PAGING) &&
@@ -217,9 +215,7 @@ static void rds_ib_add_one(struct ib_device *device)
 		 rds_ibdev->fmr_max_remaps, rds_ibdev->max_1m_mrs,
 		 rds_ibdev->max_8k_mrs);
 
-	pr_info("RDS/IB: %s: %s supported and preferred\n",
-		device->name,
-		rds_ibdev->use_fastreg ? "FRMR" : "FMR");
+	pr_info("RDS/IB: %s: added\n", device->name);
 
 	down_write(&rds_ib_devices_lock);
 	list_add_tail_rcu(&rds_ibdev->list, &rds_ib_devices);
diff --git a/net/rds/ib.h b/net/rds/ib.h
index 0296f1f..e0d5991 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -247,7 +247,6 @@ struct rds_ib_device {
 	struct ib_device	*dev;
 	struct ib_pd		*pd;
 	struct dma_pool		*rid_hdrs_pool; /* RDS headers DMA pool */
-	u8			use_fastreg:1;
 	u8			odp_capable:1;
 
 	unsigned int		max_mrs;
diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index c71f432..c5c665e 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -526,10 +526,10 @@ static int rds_ib_setup_qp(struct rds_connection *conn)
 		return -EOPNOTSUPP;
 
 	/* The fr_queue_space is currently set to 512, to add extra space on
-	 * completion queue and send queue. This extra space is used for FRMR
+	 * completion queue and send queue. This extra space is used for FRWR
 	 * registration and invalidation work requests
 	 */
-	fr_queue_space = (rds_ibdev->use_fastreg ? RDS_IB_DEFAULT_FR_WR : 0);
+	fr_queue_space = RDS_IB_DEFAULT_FR_WR;
 
 	/* add the conn now so that connection establishment has the dev */
 	rds_ib_add_conn(rds_ibdev, conn);
diff --git a/net/rds/ib_fmr.c b/net/rds/ib_fmr.c
deleted file mode 100644
index 93c0437..0000000
--- a/net/rds/ib_fmr.c
+++ /dev/null
@@ -1,269 +0,0 @@
-/*
- * Copyright (c) 2016 Oracle.  All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses.  You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *      - Redistributions of source code must retain the above
- *        copyright notice, this list of conditions and the following
- *        disclaimer.
- *
- *      - Redistributions in binary form must reproduce the above
- *        copyright notice, this list of conditions and the following
- *        disclaimer in the documentation and/or other materials
- *        provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
- */
-
-#include "ib_mr.h"
-
-struct rds_ib_mr *rds_ib_alloc_fmr(struct rds_ib_device *rds_ibdev, int npages)
-{
-	struct rds_ib_mr_pool *pool;
-	struct rds_ib_mr *ibmr = NULL;
-	struct rds_ib_fmr *fmr;
-	int err = 0;
-
-	if (npages <= RDS_MR_8K_MSG_SIZE)
-		pool = rds_ibdev->mr_8k_pool;
-	else
-		pool = rds_ibdev->mr_1m_pool;
-
-	if (atomic_read(&pool->dirty_count) >= pool->max_items / 10)
-		queue_delayed_work(rds_ib_mr_wq, &pool->flush_worker, 10);
-
-	/* Switch pools if one of the pool is reaching upper limit */
-	if (atomic_read(&pool->dirty_count) >=  pool->max_items * 9 / 10) {
-		if (pool->pool_type == RDS_IB_MR_8K_POOL)
-			pool = rds_ibdev->mr_1m_pool;
-		else
-			pool = rds_ibdev->mr_8k_pool;
-	}
-
-	ibmr = rds_ib_try_reuse_ibmr(pool);
-	if (ibmr)
-		return ibmr;
-
-	ibmr = kzalloc_node(sizeof(*ibmr), GFP_KERNEL,
-			    rdsibdev_to_node(rds_ibdev));
-	if (!ibmr) {
-		err = -ENOMEM;
-		goto out_no_cigar;
-	}
-
-	fmr = &ibmr->u.fmr;
-	fmr->fmr = ib_alloc_fmr(rds_ibdev->pd,
-			(IB_ACCESS_LOCAL_WRITE |
-			 IB_ACCESS_REMOTE_READ |
-			 IB_ACCESS_REMOTE_WRITE |
-			 IB_ACCESS_REMOTE_ATOMIC),
-			&pool->fmr_attr);
-	if (IS_ERR(fmr->fmr)) {
-		err = PTR_ERR(fmr->fmr);
-		fmr->fmr = NULL;
-		pr_warn("RDS/IB: %s failed (err=%d)\n", __func__, err);
-		goto out_no_cigar;
-	}
-
-	ibmr->pool = pool;
-	if (pool->pool_type == RDS_IB_MR_8K_POOL)
-		rds_ib_stats_inc(s_ib_rdma_mr_8k_alloc);
-	else
-		rds_ib_stats_inc(s_ib_rdma_mr_1m_alloc);
-
-	return ibmr;
-
-out_no_cigar:
-	kfree(ibmr);
-	atomic_dec(&pool->item_count);
-
-	return ERR_PTR(err);
-}
-
-static int rds_ib_map_fmr(struct rds_ib_device *rds_ibdev,
-			  struct rds_ib_mr *ibmr, struct scatterlist *sg,
-			  unsigned int nents)
-{
-	struct ib_device *dev = rds_ibdev->dev;
-	struct rds_ib_fmr *fmr = &ibmr->u.fmr;
-	struct scatterlist *scat = sg;
-	u64 io_addr = 0;
-	u64 *dma_pages;
-	u32 len;
-	int page_cnt, sg_dma_len;
-	int i, j;
-	int ret;
-
-	sg_dma_len = ib_dma_map_sg(dev, sg, nents, DMA_BIDIRECTIONAL);
-	if (unlikely(!sg_dma_len)) {
-		pr_warn("RDS/IB: %s failed!\n", __func__);
-		return -EBUSY;
-	}
-
-	len = 0;
-	page_cnt = 0;
-
-	for (i = 0; i < sg_dma_len; ++i) {
-		unsigned int dma_len = sg_dma_len(&scat[i]);
-		u64 dma_addr = sg_dma_address(&scat[i]);
-
-		if (dma_addr & ~PAGE_MASK) {
-			if (i > 0) {
-				ib_dma_unmap_sg(dev, sg, nents,
-						DMA_BIDIRECTIONAL);
-				return -EINVAL;
-			} else {
-				++page_cnt;
-			}
-		}
-		if ((dma_addr + dma_len) & ~PAGE_MASK) {
-			if (i < sg_dma_len - 1) {
-				ib_dma_unmap_sg(dev, sg, nents,
-						DMA_BIDIRECTIONAL);
-				return -EINVAL;
-			} else {
-				++page_cnt;
-			}
-		}
-
-		len += dma_len;
-	}
-
-	page_cnt += len >> PAGE_SHIFT;
-	if (page_cnt > ibmr->pool->fmr_attr.max_pages) {
-		ib_dma_unmap_sg(dev, sg, nents, DMA_BIDIRECTIONAL);
-		return -EINVAL;
-	}
-
-	dma_pages = kmalloc_array_node(sizeof(u64), page_cnt, GFP_ATOMIC,
-				       rdsibdev_to_node(rds_ibdev));
-	if (!dma_pages) {
-		ib_dma_unmap_sg(dev, sg, nents, DMA_BIDIRECTIONAL);
-		return -ENOMEM;
-	}
-
-	page_cnt = 0;
-	for (i = 0; i < sg_dma_len; ++i) {
-		unsigned int dma_len = sg_dma_len(&scat[i]);
-		u64 dma_addr = sg_dma_address(&scat[i]);
-
-		for (j = 0; j < dma_len; j += PAGE_SIZE)
-			dma_pages[page_cnt++] =
-				(dma_addr & PAGE_MASK) + j;
-	}
-
-	ret = ib_map_phys_fmr(fmr->fmr, dma_pages, page_cnt, io_addr);
-	if (ret) {
-		ib_dma_unmap_sg(dev, sg, nents, DMA_BIDIRECTIONAL);
-		goto out;
-	}
-
-	/* Success - we successfully remapped the MR, so we can
-	 * safely tear down the old mapping.
-	 */
-	rds_ib_teardown_mr(ibmr);
-
-	ibmr->sg = scat;
-	ibmr->sg_len = nents;
-	ibmr->sg_dma_len = sg_dma_len;
-	ibmr->remap_count++;
-
-	if (ibmr->pool->pool_type == RDS_IB_MR_8K_POOL)
-		rds_ib_stats_inc(s_ib_rdma_mr_8k_used);
-	else
-		rds_ib_stats_inc(s_ib_rdma_mr_1m_used);
-	ret = 0;
-
-out:
-	kfree(dma_pages);
-
-	return ret;
-}
-
-struct rds_ib_mr *rds_ib_reg_fmr(struct rds_ib_device *rds_ibdev,
-				 struct scatterlist *sg,
-				 unsigned long nents,
-				 u32 *key)
-{
-	struct rds_ib_mr *ibmr = NULL;
-	struct rds_ib_fmr *fmr;
-	int ret;
-
-	ibmr = rds_ib_alloc_fmr(rds_ibdev, nents);
-	if (IS_ERR(ibmr))
-		return ibmr;
-
-	ibmr->device = rds_ibdev;
-	fmr = &ibmr->u.fmr;
-	ret = rds_ib_map_fmr(rds_ibdev, ibmr, sg, nents);
-	if (ret == 0)
-		*key = fmr->fmr->rkey;
-	else
-		rds_ib_free_mr(ibmr, 0);
-
-	return ibmr;
-}
-
-void rds_ib_unreg_fmr(struct list_head *list, unsigned int *nfreed,
-		      unsigned long *unpinned, unsigned int goal)
-{
-	struct rds_ib_mr *ibmr, *next;
-	struct rds_ib_fmr *fmr;
-	LIST_HEAD(fmr_list);
-	int ret = 0;
-	unsigned int freed = *nfreed;
-
-	/* String all ib_mr's onto one list and hand them to  ib_unmap_fmr */
-	list_for_each_entry(ibmr, list, unmap_list) {
-		fmr = &ibmr->u.fmr;
-		list_add(&fmr->fmr->list, &fmr_list);
-	}
-
-	ret = ib_unmap_fmr(&fmr_list);
-	if (ret)
-		pr_warn("RDS/IB: FMR invalidation failed (err=%d)\n", ret);
-
-	/* Now we can destroy the DMA mapping and unpin any pages */
-	list_for_each_entry_safe(ibmr, next, list, unmap_list) {
-		fmr = &ibmr->u.fmr;
-		*unpinned += ibmr->sg_len;
-		__rds_ib_teardown_mr(ibmr);
-		if (freed < goal ||
-		    ibmr->remap_count >= ibmr->pool->fmr_attr.max_maps) {
-			if (ibmr->pool->pool_type == RDS_IB_MR_8K_POOL)
-				rds_ib_stats_inc(s_ib_rdma_mr_8k_free);
-			else
-				rds_ib_stats_inc(s_ib_rdma_mr_1m_free);
-			list_del(&ibmr->unmap_list);
-			ib_dealloc_fmr(fmr->fmr);
-			kfree(ibmr);
-			freed++;
-		}
-	}
-	*nfreed = freed;
-}
-
-void rds_ib_free_fmr_list(struct rds_ib_mr *ibmr)
-{
-	struct rds_ib_mr_pool *pool = ibmr->pool;
-
-	if (ibmr->remap_count >= pool->fmr_attr.max_maps)
-		llist_add(&ibmr->llnode, &pool->drop_list);
-	else
-		llist_add(&ibmr->llnode, &pool->free_list);
-}
diff --git a/net/rds/ib_mr.h b/net/rds/ib_mr.h
index 0c8252d7..f365e1a 100644
--- a/net/rds/ib_mr.h
+++ b/net/rds/ib_mr.h
@@ -43,10 +43,6 @@
 #define RDS_MR_8K_SCALE			(256 / (RDS_MR_8K_MSG_SIZE + 1))
 #define RDS_MR_8K_POOL_SIZE		(RDS_MR_8K_SCALE * (8192 / 2))
 
-struct rds_ib_fmr {
-	struct ib_fmr		*fmr;
-};
-
 enum rds_ib_fr_state {
 	FRMR_IS_FREE,	/* mr invalidated & ready for use */
 	FRMR_IS_INUSE,	/* mr is in use or used & can be invalidated */
@@ -84,7 +80,6 @@ struct rds_ib_mr {
 
 	u8				odp:1;
 	union {
-		struct rds_ib_fmr	fmr;
 		struct rds_ib_frmr	frmr;
 		struct ib_mr		*mr;
 	} u;
@@ -110,7 +105,6 @@ struct rds_ib_mr_pool {
 	unsigned long		max_items_soft;
 	unsigned long		max_free_pinned;
 	struct ib_fmr_attr	fmr_attr;
-	bool			use_fastreg;
 };
 
 extern struct workqueue_struct *rds_ib_mr_wq;
@@ -136,15 +130,9 @@ void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents,
 
 void __rds_ib_teardown_mr(struct rds_ib_mr *);
 void rds_ib_teardown_mr(struct rds_ib_mr *);
-struct rds_ib_mr *rds_ib_alloc_fmr(struct rds_ib_device *, int);
 struct rds_ib_mr *rds_ib_reuse_mr(struct rds_ib_mr_pool *);
 int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *, int, struct rds_ib_mr **);
-struct rds_ib_mr *rds_ib_reg_fmr(struct rds_ib_device *, struct scatterlist *,
-				 unsigned long, u32 *);
 struct rds_ib_mr *rds_ib_try_reuse_ibmr(struct rds_ib_mr_pool *);
-void rds_ib_unreg_fmr(struct list_head *, unsigned int *,
-		      unsigned long *, unsigned int);
-void rds_ib_free_fmr_list(struct rds_ib_mr *);
 struct rds_ib_mr *rds_ib_reg_frmr(struct rds_ib_device *rds_ibdev,
 				  struct rds_ib_connection *ic,
 				  struct scatterlist *sg,
diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
index b34b24e..c55d77e 100644
--- a/net/rds/ib_rdma.c
+++ b/net/rds/ib_rdma.c
@@ -406,10 +406,7 @@ int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *pool,
 	if (list_empty(&unmap_list))
 		goto out;
 
-	if (pool->use_fastreg)
-		rds_ib_unreg_frmr(&unmap_list, &nfreed, &unpinned, free_goal);
-	else
-		rds_ib_unreg_fmr(&unmap_list, &nfreed, &unpinned, free_goal);
+	rds_ib_unreg_frmr(&unmap_list, &nfreed, &unpinned, free_goal);
 
 	if (!list_empty(&unmap_list)) {
 		unsigned long flags;
@@ -503,10 +500,7 @@ void rds_ib_free_mr(void *trans_private, int invalidate)
 	}
 
 	/* Return it to the pool's free list */
-	if (rds_ibdev->use_fastreg)
-		rds_ib_free_frmr_list(ibmr);
-	else
-		rds_ib_free_fmr_list(ibmr);
+	rds_ib_free_frmr_list(ibmr);
 
 	atomic_add(ibmr->sg_len, &pool->free_pinned);
 	atomic_inc(&pool->dirty_count);
@@ -622,10 +616,7 @@ void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents,
 		goto out;
 	}
 
-	if (rds_ibdev->use_fastreg)
-		ibmr = rds_ib_reg_frmr(rds_ibdev, ic, sg, nents, key_ret);
-	else
-		ibmr = rds_ib_reg_fmr(rds_ibdev, sg, nents, key_ret);
+	ibmr = rds_ib_reg_frmr(rds_ibdev, ic, sg, nents, key_ret);
 	if (IS_ERR(ibmr)) {
 		ret = PTR_ERR(ibmr);
 		pr_warn("RDS/IB: rds_ib_get_mr failed (errno=%d)\n", ret);
@@ -681,7 +672,6 @@ struct rds_ib_mr_pool *rds_ib_create_mr_pool(struct rds_ib_device *rds_ibdev,
 	pool->fmr_attr.max_maps = rds_ibdev->fmr_max_remaps;
 	pool->fmr_attr.page_shift = PAGE_SHIFT;
 	pool->max_items_soft = rds_ibdev->max_mrs * 3 / 4;
-	pool->use_fastreg = rds_ibdev->use_fastreg;
 
 	return pool;
 }
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 3/8] RDMA/mthca: remove FMR support for memory registration
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
  2020-05-14 12:02 ` [PATCH 1/8] RDMA/mlx4: remove FMR support for memory registration Max Gurtovoy
  2020-05-14 12:02 ` [PATCH 2/8] RDMA/rds: " Max Gurtovoy
@ 2020-05-14 12:03 ` Max Gurtovoy
  2020-05-14 12:03 ` [PATCH 4/8] RDMA/rdmavt: remove FMR " Max Gurtovoy
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-14 12:03 UTC (permalink / raw)
  To: bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin, Max Gurtovoy

Remove the ancient and unsafe FMR method.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/infiniband/hw/mthca/mthca_dev.h      |  10 -
 drivers/infiniband/hw/mthca/mthca_mr.c       | 262 +--------------------------
 drivers/infiniband/hw/mthca/mthca_provider.c |  86 ---------
 3 files changed, 1 insertion(+), 357 deletions(-)

diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 599794c..7550e9d 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -478,16 +478,6 @@ int mthca_mr_alloc_phys(struct mthca_dev *dev, u32 pd,
 			u32 access, struct mthca_mr *mr);
 void mthca_free_mr(struct mthca_dev *dev,  struct mthca_mr *mr);
 
-int mthca_fmr_alloc(struct mthca_dev *dev, u32 pd,
-		    u32 access, struct mthca_fmr *fmr);
-int mthca_tavor_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
-			     int list_len, u64 iova);
-void mthca_tavor_fmr_unmap(struct mthca_dev *dev, struct mthca_fmr *fmr);
-int mthca_arbel_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
-			     int list_len, u64 iova);
-void mthca_arbel_fmr_unmap(struct mthca_dev *dev, struct mthca_fmr *fmr);
-int mthca_free_fmr(struct mthca_dev *dev,  struct mthca_fmr *fmr);
-
 int mthca_map_eq_icm(struct mthca_dev *dev, u64 icm_virt);
 void mthca_unmap_eq_icm(struct mthca_dev *dev);
 
diff --git a/drivers/infiniband/hw/mthca/mthca_mr.c b/drivers/infiniband/hw/mthca/mthca_mr.c
index 4250b2c..ce0e086 100644
--- a/drivers/infiniband/hw/mthca/mthca_mr.c
+++ b/drivers/infiniband/hw/mthca/mthca_mr.c
@@ -541,7 +541,7 @@ int mthca_mr_alloc_phys(struct mthca_dev *dev, u32 pd,
 	return err;
 }
 
-/* Free mr or fmr */
+/* Free mr */
 static void mthca_free_region(struct mthca_dev *dev, u32 lkey)
 {
 	mthca_table_put(dev, dev->mr_table.mpt_table,
@@ -564,266 +564,6 @@ void mthca_free_mr(struct mthca_dev *dev, struct mthca_mr *mr)
 	mthca_free_mtt(dev, mr->mtt);
 }
 
-int mthca_fmr_alloc(struct mthca_dev *dev, u32 pd,
-		    u32 access, struct mthca_fmr *mr)
-{
-	struct mthca_mpt_entry *mpt_entry;
-	struct mthca_mailbox *mailbox;
-	u64 mtt_seg;
-	u32 key, idx;
-	int list_len = mr->attr.max_pages;
-	int err = -ENOMEM;
-	int i;
-
-	if (mr->attr.page_shift < 12 || mr->attr.page_shift >= 32)
-		return -EINVAL;
-
-	/* For Arbel, all MTTs must fit in the same page. */
-	if (mthca_is_memfree(dev) &&
-	    mr->attr.max_pages * sizeof *mr->mem.arbel.mtts > PAGE_SIZE)
-		return -EINVAL;
-
-	mr->maps = 0;
-
-	key = mthca_alloc(&dev->mr_table.mpt_alloc);
-	if (key == -1)
-		return -ENOMEM;
-	key = adjust_key(dev, key);
-
-	idx = key & (dev->limits.num_mpts - 1);
-	mr->ibmr.rkey = mr->ibmr.lkey = hw_index_to_key(dev, key);
-
-	if (mthca_is_memfree(dev)) {
-		err = mthca_table_get(dev, dev->mr_table.mpt_table, key);
-		if (err)
-			goto err_out_mpt_free;
-
-		mr->mem.arbel.mpt = mthca_table_find(dev->mr_table.mpt_table, key, NULL);
-		BUG_ON(!mr->mem.arbel.mpt);
-	} else
-		mr->mem.tavor.mpt = dev->mr_table.tavor_fmr.mpt_base +
-			sizeof *(mr->mem.tavor.mpt) * idx;
-
-	mr->mtt = __mthca_alloc_mtt(dev, list_len, dev->mr_table.fmr_mtt_buddy);
-	if (IS_ERR(mr->mtt)) {
-		err = PTR_ERR(mr->mtt);
-		goto err_out_table;
-	}
-
-	mtt_seg = mr->mtt->first_seg * dev->limits.mtt_seg_size;
-
-	if (mthca_is_memfree(dev)) {
-		mr->mem.arbel.mtts = mthca_table_find(dev->mr_table.mtt_table,
-						      mr->mtt->first_seg,
-						      &mr->mem.arbel.dma_handle);
-		BUG_ON(!mr->mem.arbel.mtts);
-	} else
-		mr->mem.tavor.mtts = dev->mr_table.tavor_fmr.mtt_base + mtt_seg;
-
-	mailbox = mthca_alloc_mailbox(dev, GFP_KERNEL);
-	if (IS_ERR(mailbox)) {
-		err = PTR_ERR(mailbox);
-		goto err_out_free_mtt;
-	}
-
-	mpt_entry = mailbox->buf;
-
-	mpt_entry->flags = cpu_to_be32(MTHCA_MPT_FLAG_SW_OWNS     |
-				       MTHCA_MPT_FLAG_MIO         |
-				       MTHCA_MPT_FLAG_REGION      |
-				       access);
-
-	mpt_entry->page_size = cpu_to_be32(mr->attr.page_shift - 12);
-	mpt_entry->key       = cpu_to_be32(key);
-	mpt_entry->pd        = cpu_to_be32(pd);
-	memset(&mpt_entry->start, 0,
-	       sizeof *mpt_entry - offsetof(struct mthca_mpt_entry, start));
-	mpt_entry->mtt_seg   = cpu_to_be64(dev->mr_table.mtt_base + mtt_seg);
-
-	if (0) {
-		mthca_dbg(dev, "Dumping MPT entry %08x:\n", mr->ibmr.lkey);
-		for (i = 0; i < sizeof (struct mthca_mpt_entry) / 4; ++i) {
-			if (i % 4 == 0)
-				printk("[%02x] ", i * 4);
-			printk(" %08x", be32_to_cpu(((__be32 *) mpt_entry)[i]));
-			if ((i + 1) % 4 == 0)
-				printk("\n");
-		}
-	}
-
-	err = mthca_SW2HW_MPT(dev, mailbox,
-			      key & (dev->limits.num_mpts - 1));
-	if (err) {
-		mthca_warn(dev, "SW2HW_MPT failed (%d)\n", err);
-		goto err_out_mailbox_free;
-	}
-
-	mthca_free_mailbox(dev, mailbox);
-	return 0;
-
-err_out_mailbox_free:
-	mthca_free_mailbox(dev, mailbox);
-
-err_out_free_mtt:
-	mthca_free_mtt(dev, mr->mtt);
-
-err_out_table:
-	mthca_table_put(dev, dev->mr_table.mpt_table, key);
-
-err_out_mpt_free:
-	mthca_free(&dev->mr_table.mpt_alloc, key);
-	return err;
-}
-
-int mthca_free_fmr(struct mthca_dev *dev, struct mthca_fmr *fmr)
-{
-	if (fmr->maps)
-		return -EBUSY;
-
-	mthca_free_region(dev, fmr->ibmr.lkey);
-	mthca_free_mtt(dev, fmr->mtt);
-
-	return 0;
-}
-
-static inline int mthca_check_fmr(struct mthca_fmr *fmr, u64 *page_list,
-				  int list_len, u64 iova)
-{
-	int i, page_mask;
-
-	if (list_len > fmr->attr.max_pages)
-		return -EINVAL;
-
-	page_mask = (1 << fmr->attr.page_shift) - 1;
-
-	/* We are getting page lists, so va must be page aligned. */
-	if (iova & page_mask)
-		return -EINVAL;
-
-	/* Trust the user not to pass misaligned data in page_list */
-	if (0)
-		for (i = 0; i < list_len; ++i) {
-			if (page_list[i] & ~page_mask)
-				return -EINVAL;
-		}
-
-	if (fmr->maps >= fmr->attr.max_maps)
-		return -EINVAL;
-
-	return 0;
-}
-
-
-int mthca_tavor_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
-			     int list_len, u64 iova)
-{
-	struct mthca_fmr *fmr = to_mfmr(ibfmr);
-	struct mthca_dev *dev = to_mdev(ibfmr->device);
-	struct mthca_mpt_entry mpt_entry;
-	u32 key;
-	int i, err;
-
-	err = mthca_check_fmr(fmr, page_list, list_len, iova);
-	if (err)
-		return err;
-
-	++fmr->maps;
-
-	key = tavor_key_to_hw_index(fmr->ibmr.lkey);
-	key += dev->limits.num_mpts;
-	fmr->ibmr.lkey = fmr->ibmr.rkey = tavor_hw_index_to_key(key);
-
-	writeb(MTHCA_MPT_STATUS_SW, fmr->mem.tavor.mpt);
-
-	for (i = 0; i < list_len; ++i) {
-		__be64 mtt_entry = cpu_to_be64(page_list[i] |
-					       MTHCA_MTT_FLAG_PRESENT);
-		mthca_write64_raw(mtt_entry, fmr->mem.tavor.mtts + i);
-	}
-
-	mpt_entry.lkey   = cpu_to_be32(key);
-	mpt_entry.length = cpu_to_be64(list_len * (1ull << fmr->attr.page_shift));
-	mpt_entry.start  = cpu_to_be64(iova);
-
-	__raw_writel((__force u32) mpt_entry.lkey, &fmr->mem.tavor.mpt->key);
-	memcpy_toio(&fmr->mem.tavor.mpt->start, &mpt_entry.start,
-		    offsetof(struct mthca_mpt_entry, window_count) -
-		    offsetof(struct mthca_mpt_entry, start));
-
-	writeb(MTHCA_MPT_STATUS_HW, fmr->mem.tavor.mpt);
-
-	return 0;
-}
-
-int mthca_arbel_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
-			     int list_len, u64 iova)
-{
-	struct mthca_fmr *fmr = to_mfmr(ibfmr);
-	struct mthca_dev *dev = to_mdev(ibfmr->device);
-	u32 key;
-	int i, err;
-
-	err = mthca_check_fmr(fmr, page_list, list_len, iova);
-	if (err)
-		return err;
-
-	++fmr->maps;
-
-	key = arbel_key_to_hw_index(fmr->ibmr.lkey);
-	if (dev->mthca_flags & MTHCA_FLAG_SINAI_OPT)
-		key += SINAI_FMR_KEY_INC;
-	else
-		key += dev->limits.num_mpts;
-	fmr->ibmr.lkey = fmr->ibmr.rkey = arbel_hw_index_to_key(key);
-
-	*(u8 *) fmr->mem.arbel.mpt = MTHCA_MPT_STATUS_SW;
-
-	wmb();
-
-	dma_sync_single_for_cpu(&dev->pdev->dev, fmr->mem.arbel.dma_handle,
-				list_len * sizeof(u64), DMA_TO_DEVICE);
-
-	for (i = 0; i < list_len; ++i)
-		fmr->mem.arbel.mtts[i] = cpu_to_be64(page_list[i] |
-						     MTHCA_MTT_FLAG_PRESENT);
-
-	dma_sync_single_for_device(&dev->pdev->dev, fmr->mem.arbel.dma_handle,
-				   list_len * sizeof(u64), DMA_TO_DEVICE);
-
-	fmr->mem.arbel.mpt->key    = cpu_to_be32(key);
-	fmr->mem.arbel.mpt->lkey   = cpu_to_be32(key);
-	fmr->mem.arbel.mpt->length = cpu_to_be64(list_len * (1ull << fmr->attr.page_shift));
-	fmr->mem.arbel.mpt->start  = cpu_to_be64(iova);
-
-	wmb();
-
-	*(u8 *) fmr->mem.arbel.mpt = MTHCA_MPT_STATUS_HW;
-
-	wmb();
-
-	return 0;
-}
-
-void mthca_tavor_fmr_unmap(struct mthca_dev *dev, struct mthca_fmr *fmr)
-{
-	if (!fmr->maps)
-		return;
-
-	fmr->maps = 0;
-
-	writeb(MTHCA_MPT_STATUS_SW, fmr->mem.tavor.mpt);
-}
-
-void mthca_arbel_fmr_unmap(struct mthca_dev *dev, struct mthca_fmr *fmr)
-{
-	if (!fmr->maps)
-		return;
-
-	fmr->maps = 0;
-
-	*(u8 *) fmr->mem.arbel.mpt = MTHCA_MPT_STATUS_SW;
-}
-
 int mthca_init_mr_table(struct mthca_dev *dev)
 {
 	phys_addr_t addr;
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 69a3e4f..9028df0 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -957,69 +957,6 @@ static int mthca_dereg_mr(struct ib_mr *mr, struct ib_udata *udata)
 	return 0;
 }
 
-static struct ib_fmr *mthca_alloc_fmr(struct ib_pd *pd, int mr_access_flags,
-				      struct ib_fmr_attr *fmr_attr)
-{
-	struct mthca_fmr *fmr;
-	int err;
-
-	fmr = kmalloc(sizeof *fmr, GFP_KERNEL);
-	if (!fmr)
-		return ERR_PTR(-ENOMEM);
-
-	memcpy(&fmr->attr, fmr_attr, sizeof *fmr_attr);
-	err = mthca_fmr_alloc(to_mdev(pd->device), to_mpd(pd)->pd_num,
-			     convert_access(mr_access_flags), fmr);
-
-	if (err) {
-		kfree(fmr);
-		return ERR_PTR(err);
-	}
-
-	return &fmr->ibmr;
-}
-
-static int mthca_dealloc_fmr(struct ib_fmr *fmr)
-{
-	struct mthca_fmr *mfmr = to_mfmr(fmr);
-	int err;
-
-	err = mthca_free_fmr(to_mdev(fmr->device), mfmr);
-	if (err)
-		return err;
-
-	kfree(mfmr);
-	return 0;
-}
-
-static int mthca_unmap_fmr(struct list_head *fmr_list)
-{
-	struct ib_fmr *fmr;
-	int err;
-	struct mthca_dev *mdev = NULL;
-
-	list_for_each_entry(fmr, fmr_list, list) {
-		if (mdev && to_mdev(fmr->device) != mdev)
-			return -EINVAL;
-		mdev = to_mdev(fmr->device);
-	}
-
-	if (!mdev)
-		return 0;
-
-	if (mthca_is_memfree(mdev)) {
-		list_for_each_entry(fmr, fmr_list, list)
-			mthca_arbel_fmr_unmap(mdev, to_mfmr(fmr));
-
-		wmb();
-	} else
-		list_for_each_entry(fmr, fmr_list, list)
-			mthca_tavor_fmr_unmap(mdev, to_mfmr(fmr));
-
-	err = mthca_SYNC_TPT(mdev);
-	return err;
-}
-
 static ssize_t hw_rev_show(struct device *device,
 			   struct device_attribute *attr, char *buf)
 {
@@ -1203,20 +1140,6 @@ static void get_dev_fw_str(struct ib_device *device, char *str)
 	INIT_RDMA_OBJ_SIZE(ib_srq, mthca_srq, ibsrq),
 };
 
-static const struct ib_device_ops mthca_dev_arbel_fmr_ops = {
-	.alloc_fmr = mthca_alloc_fmr,
-	.dealloc_fmr = mthca_dealloc_fmr,
-	.map_phys_fmr = mthca_arbel_map_phys_fmr,
-	.unmap_fmr = mthca_unmap_fmr,
-};
-
-static const struct ib_device_ops mthca_dev_tavor_fmr_ops = {
-	.alloc_fmr = mthca_alloc_fmr,
-	.dealloc_fmr = mthca_dealloc_fmr,
-	.map_phys_fmr = mthca_tavor_map_phys_fmr,
-	.unmap_fmr = mthca_unmap_fmr,
-};
-
 static const struct ib_device_ops mthca_dev_arbel_ops = {
 	.post_recv = mthca_arbel_post_receive,
 	.post_send = mthca_arbel_post_send,
@@ -1275,15 +1198,6 @@ int mthca_register_device(struct mthca_dev *dev)
 					  &mthca_dev_tavor_srq_ops);
 	}
 
-	if (dev->mthca_flags & MTHCA_FLAG_FMR) {
-		if (mthca_is_memfree(dev))
-			ib_set_device_ops(&dev->ib_dev,
-					  &mthca_dev_arbel_fmr_ops);
-		else
-			ib_set_device_ops(&dev->ib_dev,
-					  &mthca_dev_tavor_fmr_ops);
-	}
-
 	ib_set_device_ops(&dev->ib_dev, &mthca_dev_ops);
 
 	if (mthca_is_memfree(dev))
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 4/8] RDMA/rdmavt: remove FMR memory registration
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
                   ` (2 preceding siblings ...)
  2020-05-14 12:03 ` [PATCH 3/8] RDMA/mthca: " Max Gurtovoy
@ 2020-05-14 12:03 ` Max Gurtovoy
  2020-05-14 12:03 ` [PATCH 5/8] RDMA/iser: Remove support for " Max Gurtovoy
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-14 12:03 UTC (permalink / raw)
  To: bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin, Max Gurtovoy

Use FRWR method to register memory by default and remove the ancient and
unsafe FMR method.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 drivers/infiniband/sw/rdmavt/mr.c | 154 --------------------------------------
 drivers/infiniband/sw/rdmavt/mr.h |  15 ----
 drivers/infiniband/sw/rdmavt/vt.c |   4 -
 3 files changed, 173 deletions(-)

diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers/infiniband/sw/rdmavt/mr.c
index 72f6534..ddb0c0d 100644
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -714,160 +714,6 @@ int rvt_invalidate_rkey(struct rvt_qp *qp, u32 rkey)
 EXPORT_SYMBOL(rvt_invalidate_rkey);
 
 /**
- * rvt_alloc_fmr - allocate a fast memory region
- * @pd: the protection domain for this memory region
- * @mr_access_flags: access flags for this memory region
- * @fmr_attr: fast memory region attributes
- *
- * Return: the memory region on success, otherwise returns an errno.
- */
-struct ib_fmr *rvt_alloc_fmr(struct ib_pd *pd, int mr_access_flags,
-			     struct ib_fmr_attr *fmr_attr)
-{
-	struct rvt_fmr *fmr;
-	int m;
-	struct ib_fmr *ret;
-	int rval = -ENOMEM;
-
-	/* Allocate struct plus pointers to first level page tables. */
-	m = (fmr_attr->max_pages + RVT_SEGSZ - 1) / RVT_SEGSZ;
-	fmr = kzalloc(struct_size(fmr, mr.map, m), GFP_KERNEL);
-	if (!fmr)
-		goto bail;
-
-	rval = rvt_init_mregion(&fmr->mr, pd, fmr_attr->max_pages,
-				PERCPU_REF_INIT_ATOMIC);
-	if (rval)
-		goto bail;
-
-	/*
-	 * ib_alloc_fmr() will initialize fmr->ibfmr except for lkey &
-	 * rkey.
-	 */
-	rval = rvt_alloc_lkey(&fmr->mr, 0);
-	if (rval)
-		goto bail_mregion;
-	fmr->ibfmr.rkey = fmr->mr.lkey;
-	fmr->ibfmr.lkey = fmr->mr.lkey;
-	/*
-	 * Resources are allocated but no valid mapping (RKEY can't be
-	 * used).
-	 */
-	fmr->mr.access_flags = mr_access_flags;
-	fmr->mr.max_segs = fmr_attr->max_pages;
-	fmr->mr.page_shift = fmr_attr->page_shift;
-
-	ret = &fmr->ibfmr;
-done:
-	return ret;
-
-bail_mregion:
-	rvt_deinit_mregion(&fmr->mr);
-bail:
-	kfree(fmr);
-	ret = ERR_PTR(rval);
-	goto done;
-}
-
-/**
- * rvt_map_phys_fmr - set up a fast memory region
- * @ibfmr: the fast memory region to set up
- * @page_list: the list of pages to associate with the fast memory region
- * @list_len: the number of pages to associate with the fast memory region
- * @iova: the virtual address of the start of the fast memory region
- *
- * This may be called from interrupt context.
- *
- * Return: 0 on success
- */
-
-int rvt_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
-		     int list_len, u64 iova)
-{
-	struct rvt_fmr *fmr = to_ifmr(ibfmr);
-	struct rvt_lkey_table *rkt;
-	unsigned long flags;
-	int m, n;
-	unsigned long i;
-	u32 ps;
-	struct rvt_dev_info *rdi = ib_to_rvt(ibfmr->device);
-
-	i = atomic_long_read(&fmr->mr.refcount.count);
-	if (i > 2)
-		return -EBUSY;
-
-	if (list_len > fmr->mr.max_segs)
-		return -EINVAL;
-
-	rkt = &rdi->lkey_table;
-	spin_lock_irqsave(&rkt->lock, flags);
-	fmr->mr.user_base = iova;
-	fmr->mr.iova = iova;
-	ps = 1 << fmr->mr.page_shift;
-	fmr->mr.length = list_len * ps;
-	m = 0;
-	n = 0;
-	for (i = 0; i < list_len; i++) {
-		fmr->mr.map[m]->segs[n].vaddr = (void *)page_list[i];
-		fmr->mr.map[m]->segs[n].length = ps;
-		trace_rvt_mr_fmr_seg(&fmr->mr, m, n, (void *)page_list[i], ps);
-		if (++n == RVT_SEGSZ) {
-			m++;
-			n = 0;
-		}
-	}
-	spin_unlock_irqrestore(&rkt->lock, flags);
-	return 0;
-}
-
-/**
- * rvt_unmap_fmr - unmap fast memory regions
- * @fmr_list: the list of fast memory regions to unmap
- *
- * Return: 0 on success.
- */
-int rvt_unmap_fmr(struct list_head *fmr_list)
-{
-	struct rvt_fmr *fmr;
-	struct rvt_lkey_table *rkt;
-	unsigned long flags;
-	struct rvt_dev_info *rdi;
-
-	list_for_each_entry(fmr, fmr_list, ibfmr.list) {
-		rdi = ib_to_rvt(fmr->ibfmr.device);
-		rkt = &rdi->lkey_table;
-		spin_lock_irqsave(&rkt->lock, flags);
-		fmr->mr.user_base = 0;
-		fmr->mr.iova = 0;
-		fmr->mr.length = 0;
-		spin_unlock_irqrestore(&rkt->lock, flags);
-	}
-	return 0;
-}
-
-/**
- * rvt_dealloc_fmr - deallocate a fast memory region
- * @ibfmr: the fast memory region to deallocate
- *
- * Return: 0 on success.
- */
-int rvt_dealloc_fmr(struct ib_fmr *ibfmr)
-{
-	struct rvt_fmr *fmr = to_ifmr(ibfmr);
-	int ret = 0;
-
-	rvt_free_lkey(&fmr->mr);
-	rvt_put_mr(&fmr->mr); /* will set completion if last */
-	ret = rvt_check_refs(&fmr->mr, __func__);
-	if (ret)
-		goto out;
-	rvt_deinit_mregion(&fmr->mr);
-	kfree(fmr);
-out:
-	return ret;
-}
-
-/**
  * rvt_sge_adjacent - is isge compressible
  * @last_sge: last outgoing SGE written
  * @sge: SGE to check
diff --git a/drivers/infiniband/sw/rdmavt/mr.h b/drivers/infiniband/sw/rdmavt/mr.h
index 2c8d075..780fc63 100644
--- a/drivers/infiniband/sw/rdmavt/mr.h
+++ b/drivers/infiniband/sw/rdmavt/mr.h
@@ -49,10 +49,6 @@
  */
 
 #include <rdma/rdma_vt.h>
-struct rvt_fmr {
-	struct ib_fmr ibfmr;
-	struct rvt_mregion mr;        /* must be last */
-};
 
 struct rvt_mr {
 	struct ib_mr ibmr;
@@ -60,11 +56,6 @@ struct rvt_mr {
 	struct rvt_mregion mr;  /* must be last */
 };
 
-static inline struct rvt_fmr *to_ifmr(struct ib_fmr *ibfmr)
-{
-	return container_of(ibfmr, struct rvt_fmr, ibfmr);
-}
-
 static inline struct rvt_mr *to_imr(struct ib_mr *ibmr)
 {
 	return container_of(ibmr, struct rvt_mr, ibmr);
@@ -83,11 +74,5 @@ struct ib_mr *rvt_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
 			   u32 max_num_sg, struct ib_udata *udata);
 int rvt_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
 		  int sg_nents, unsigned int *sg_offset);
-struct ib_fmr *rvt_alloc_fmr(struct ib_pd *pd, int mr_access_flags,
-			     struct ib_fmr_attr *fmr_attr);
-int rvt_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
-		     int list_len, u64 iova);
-int rvt_unmap_fmr(struct list_head *fmr_list);
-int rvt_dealloc_fmr(struct ib_fmr *ibfmr);
 
 #endif          /* DEF_RVTMR_H */
diff --git a/drivers/infiniband/sw/rdmavt/vt.c b/drivers/infiniband/sw/rdmavt/vt.c
index 72b031a..f904bb3 100644
--- a/drivers/infiniband/sw/rdmavt/vt.c
+++ b/drivers/infiniband/sw/rdmavt/vt.c
@@ -378,7 +378,6 @@ enum {
 static const struct ib_device_ops rvt_dev_ops = {
 	.uverbs_abi_ver = RVT_UVERBS_ABI_VERSION,
 
-	.alloc_fmr = rvt_alloc_fmr,
 	.alloc_mr = rvt_alloc_mr,
 	.alloc_pd = rvt_alloc_pd,
 	.alloc_ucontext = rvt_alloc_ucontext,
@@ -387,7 +386,6 @@ enum {
 	.create_cq = rvt_create_cq,
 	.create_qp = rvt_create_qp,
 	.create_srq = rvt_create_srq,
-	.dealloc_fmr = rvt_dealloc_fmr,
 	.dealloc_pd = rvt_dealloc_pd,
 	.dealloc_ucontext = rvt_dealloc_ucontext,
 	.dereg_mr = rvt_dereg_mr,
@@ -399,7 +397,6 @@ enum {
 	.get_dma_mr = rvt_get_dma_mr,
 	.get_port_immutable = rvt_get_port_immutable,
 	.map_mr_sg = rvt_map_mr_sg,
-	.map_phys_fmr = rvt_map_phys_fmr,
 	.mmap = rvt_mmap,
 	.modify_ah = rvt_modify_ah,
 	.modify_device = rvt_modify_device,
@@ -420,7 +417,6 @@ enum {
 	.reg_user_mr = rvt_reg_user_mr,
 	.req_notify_cq = rvt_req_notify_cq,
 	.resize_cq = rvt_resize_cq,
-	.unmap_fmr = rvt_unmap_fmr,
 
 	INIT_RDMA_OBJ_SIZE(ib_ah, rvt_ah, ibah),
 	INIT_RDMA_OBJ_SIZE(ib_cq, rvt_cq, ibcq),
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 5/8] RDMA/iser: Remove support for FMR memory registration
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
                   ` (3 preceding siblings ...)
  2020-05-14 12:03 ` [PATCH 4/8] RDMA/rdmavt: remove FMR " Max Gurtovoy
@ 2020-05-14 12:03 ` Max Gurtovoy
  2020-05-14 12:03 ` [PATCH 6/8] RDMA/srp: remove " Max Gurtovoy
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-14 12:03 UTC (permalink / raw)
  To: bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin, Max Gurtovoy

From: Israel Rukshin <israelr@mellanox.com>

FMR is not supported on most recent RDMA devices (that use fast memory
registration mechanism). Also, FMR was recently removed from NFS/RDMA
ULP.

Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
---
 drivers/infiniband/ulp/iser/iscsi_iser.h     |  79 +----------
 drivers/infiniband/ulp/iser/iser_initiator.c |  19 ++-
 drivers/infiniband/ulp/iser/iser_memory.c    | 188 ++-------------------------
 drivers/infiniband/ulp/iser/iser_verbs.c     | 126 +++---------------
 4 files changed, 40 insertions(+), 372 deletions(-)

diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 029c001..1d77c7f 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -65,7 +65,6 @@
 #include <linux/in6.h>
 
 #include <rdma/ib_verbs.h>
-#include <rdma/ib_fmr_pool.h>
 #include <rdma/rdma_cm.h>
 
 #define DRV_NAME	"iser"
@@ -313,33 +312,6 @@ struct iser_comp {
 };
 
 /**
- * struct iser_reg_ops - Memory registration operations
- *     per-device registration schemes
- *
- * @alloc_reg_res:     Allocate registration resources
- * @free_reg_res:      Free registration resources
- * @reg_mem:           Register memory buffers
- * @unreg_mem:         Un-register memory buffers
- * @reg_desc_get:      Get a registration descriptor for pool
- * @reg_desc_put:      Get a registration descriptor to pool
- */
-struct iser_reg_ops {
-	int            (*alloc_reg_res)(struct ib_conn *ib_conn,
-					unsigned cmds_max,
-					unsigned int size);
-	void           (*free_reg_res)(struct ib_conn *ib_conn);
-	int            (*reg_mem)(struct iscsi_iser_task *iser_task,
-				  struct iser_data_buf *mem,
-				  struct iser_reg_resources *rsc,
-				  struct iser_mem_reg *reg);
-	void           (*unreg_mem)(struct iscsi_iser_task *iser_task,
-				    enum iser_data_dir cmd_dir);
-	struct iser_fr_desc * (*reg_desc_get)(struct ib_conn *ib_conn);
-	void           (*reg_desc_put)(struct ib_conn *ib_conn,
-				       struct iser_fr_desc *desc);
-};
-
-/**
  * struct iser_device - iSER device handle
  *
  * @ib_device:     RDMA device
@@ -351,8 +323,6 @@ struct iser_reg_ops {
  * @comps_used:    Number of completion contexts used, Min between online
  *                 cpus and device max completion vectors
  * @comps:         Dinamically allocated array of completion handlers
- * @reg_ops:       Registration ops
- * @remote_inv_sup: Remote invalidate is supported on this device
  */
 struct iser_device {
 	struct ib_device             *ib_device;
@@ -362,26 +332,18 @@ struct iser_device {
 	int                          refcount;
 	int			     comps_used;
 	struct iser_comp	     *comps;
-	const struct iser_reg_ops    *reg_ops;
-	bool                         remote_inv_sup;
 };
 
 /**
  * struct iser_reg_resources - Fast registration resources
  *
  * @mr:         memory region
- * @fmr_pool:   pool of fmrs
  * @sig_mr:     signature memory region
- * @page_vec:   fast reg page list used by fmr pool
  * @mr_valid:   is mr valid indicator
  */
 struct iser_reg_resources {
-	union {
-		struct ib_mr             *mr;
-		struct ib_fmr_pool       *fmr_pool;
-	};
+	struct ib_mr                     *mr;
 	struct ib_mr                     *sig_mr;
-	struct iser_page_vec             *page_vec;
 	u8				  mr_valid:1;
 };
 
@@ -403,7 +365,7 @@ struct iser_fr_desc {
  * struct iser_fr_pool - connection fast registration pool
  *
  * @list:                list of fastreg descriptors
- * @lock:                protects fmr/fastreg pool
+ * @lock:                protects fastreg pool
  * @size:                size of the pool
  */
 struct iser_fr_pool {
@@ -518,12 +480,6 @@ struct iscsi_iser_task {
 	struct iser_data_buf         prot[ISER_DIRS_NUM];
 };
 
-struct iser_page_vec {
-	u64 *pages;
-	int npages;
-	struct ib_mr fake_mr;
-};
-
 /**
  * struct iser_global - iSER global context
  *
@@ -548,8 +504,6 @@ struct iser_global {
 extern unsigned int iser_max_sectors;
 extern bool iser_always_reg;
 
-int iser_assign_reg_ops(struct iser_device *device);
-
 int iser_send_control(struct iscsi_conn *conn,
 		      struct iscsi_task *task);
 
@@ -591,22 +545,17 @@ void iser_finalize_rdma_unaligned_sg(struct iscsi_iser_task *iser_task,
 				     struct iser_data_buf *mem,
 				     enum iser_data_dir cmd_dir);
 
-int iser_reg_rdma_mem(struct iscsi_iser_task *task,
-		      enum iser_data_dir dir,
-		      bool all_imm);
-void iser_unreg_rdma_mem(struct iscsi_iser_task *task,
-			 enum iser_data_dir dir);
+int iser_reg_mem_fastreg(struct iscsi_iser_task *task,
+			 enum iser_data_dir dir,
+			 bool all_imm);
+void iser_unreg_mem_fastreg(struct iscsi_iser_task *task,
+			    enum iser_data_dir dir);
 
 int  iser_connect(struct iser_conn *iser_conn,
 		  struct sockaddr *src_addr,
 		  struct sockaddr *dst_addr,
 		  int non_blocking);
 
-void iser_unreg_mem_fmr(struct iscsi_iser_task *iser_task,
-			enum iser_data_dir cmd_dir);
-void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task,
-			    enum iser_data_dir cmd_dir);
-
 int  iser_post_recvl(struct iser_conn *iser_conn);
 int  iser_post_recvm(struct iser_conn *iser_conn, int count);
 int  iser_post_send(struct ib_conn *ib_conn, struct iser_tx_desc *tx_desc,
@@ -625,26 +574,12 @@ int  iser_initialize_task_headers(struct iscsi_task *task,
 			struct iser_tx_desc *tx_desc);
 int iser_alloc_rx_descriptors(struct iser_conn *iser_conn,
 			      struct iscsi_session *session);
-int iser_alloc_fmr_pool(struct ib_conn *ib_conn,
-			unsigned cmds_max,
-			unsigned int size);
-void iser_free_fmr_pool(struct ib_conn *ib_conn);
 int iser_alloc_fastreg_pool(struct ib_conn *ib_conn,
 			    unsigned cmds_max,
 			    unsigned int size);
 void iser_free_fastreg_pool(struct ib_conn *ib_conn);
 u8 iser_check_task_pi_status(struct iscsi_iser_task *iser_task,
 			     enum iser_data_dir cmd_dir, sector_t *sector);
-struct iser_fr_desc *
-iser_reg_desc_get_fr(struct ib_conn *ib_conn);
-void
-iser_reg_desc_put_fr(struct ib_conn *ib_conn,
-		     struct iser_fr_desc *desc);
-struct iser_fr_desc *
-iser_reg_desc_get_fmr(struct ib_conn *ib_conn);
-void
-iser_reg_desc_put_fmr(struct ib_conn *ib_conn,
-		      struct iser_fr_desc *desc);
 
 static inline struct iser_conn *
 to_iser_conn(struct ib_conn *ib_conn)
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index 4a7045b..27a6f75 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -72,7 +72,7 @@ static int iser_prepare_read_cmd(struct iscsi_task *task)
 			return err;
 	}
 
-	err = iser_reg_rdma_mem(iser_task, ISER_DIR_IN, false);
+	err = iser_reg_mem_fastreg(iser_task, ISER_DIR_IN, false);
 	if (err) {
 		iser_err("Failed to set up Data-IN RDMA\n");
 		return err;
@@ -126,8 +126,8 @@ static int iser_prepare_read_cmd(struct iscsi_task *task)
 			return err;
 	}
 
-	err = iser_reg_rdma_mem(iser_task, ISER_DIR_OUT,
-				buf_out->data_len == imm_sz);
+	err = iser_reg_mem_fastreg(iser_task, ISER_DIR_OUT,
+				   buf_out->data_len == imm_sz);
 	if (err != 0) {
 		iser_err("Failed to register write cmd RDMA mem\n");
 		return err;
@@ -250,8 +250,8 @@ int iser_alloc_rx_descriptors(struct iser_conn *iser_conn,
 	iser_conn->qp_max_recv_dtos_mask = session->cmds_max - 1; /* cmds_max is 2^N */
 	iser_conn->min_posted_rx = iser_conn->qp_max_recv_dtos >> 2;
 
-	if (device->reg_ops->alloc_reg_res(ib_conn, session->scsi_cmds_max,
-					   iser_conn->pages_per_mr))
+	if (iser_alloc_fastreg_pool(ib_conn, session->scsi_cmds_max,
+				    iser_conn->pages_per_mr))
 		goto create_rdma_reg_res_failed;
 
 	if (iser_alloc_login_buf(iser_conn))
@@ -293,7 +293,7 @@ int iser_alloc_rx_descriptors(struct iser_conn *iser_conn,
 rx_desc_alloc_fail:
 	iser_free_login_buf(iser_conn);
 alloc_login_buf_fail:
-	device->reg_ops->free_reg_res(ib_conn);
+	iser_free_fastreg_pool(ib_conn);
 create_rdma_reg_res_failed:
 	iser_err("failed allocating rx descriptors / data buffers\n");
 	return -ENOMEM;
@@ -306,8 +306,7 @@ void iser_free_rx_descriptors(struct iser_conn *iser_conn)
 	struct ib_conn *ib_conn = &iser_conn->ib_conn;
 	struct iser_device *device = ib_conn->device;
 
-	if (device->reg_ops->free_reg_res)
-		device->reg_ops->free_reg_res(ib_conn);
+	iser_free_fastreg_pool(ib_conn);
 
 	rx_desc = iser_conn->rx_descs;
 	for (i = 0; i < iser_conn->qp_max_recv_dtos; i++, rx_desc++)
@@ -768,7 +767,7 @@ void iser_task_rdma_finalize(struct iscsi_iser_task *iser_task)
 	int prot_count = scsi_prot_sg_count(iser_task->sc);
 
 	if (iser_task->dir[ISER_DIR_IN]) {
-		iser_unreg_rdma_mem(iser_task, ISER_DIR_IN);
+		iser_unreg_mem_fastreg(iser_task, ISER_DIR_IN);
 		iser_dma_unmap_task_data(iser_task,
 					 &iser_task->data[ISER_DIR_IN],
 					 DMA_FROM_DEVICE);
@@ -779,7 +778,7 @@ void iser_task_rdma_finalize(struct iscsi_iser_task *iser_task)
 	}
 
 	if (iser_task->dir[ISER_DIR_OUT]) {
-		iser_unreg_rdma_mem(iser_task, ISER_DIR_OUT);
+		iser_unreg_mem_fastreg(iser_task, ISER_DIR_OUT);
 		iser_dma_unmap_task_data(iser_task,
 					 &iser_task->data[ISER_DIR_OUT],
 					 DMA_TO_DEVICE);
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 999ef7c..d4e057f 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -38,62 +38,13 @@
 #include <linux/scatterlist.h>
 
 #include "iscsi_iser.h"
-static
-int iser_fast_reg_fmr(struct iscsi_iser_task *iser_task,
-		      struct iser_data_buf *mem,
-		      struct iser_reg_resources *rsc,
-		      struct iser_mem_reg *mem_reg);
-static
-int iser_fast_reg_mr(struct iscsi_iser_task *iser_task,
-		     struct iser_data_buf *mem,
-		     struct iser_reg_resources *rsc,
-		     struct iser_mem_reg *mem_reg);
-
-static const struct iser_reg_ops fastreg_ops = {
-	.alloc_reg_res	= iser_alloc_fastreg_pool,
-	.free_reg_res	= iser_free_fastreg_pool,
-	.reg_mem	= iser_fast_reg_mr,
-	.unreg_mem	= iser_unreg_mem_fastreg,
-	.reg_desc_get	= iser_reg_desc_get_fr,
-	.reg_desc_put	= iser_reg_desc_put_fr,
-};
-
-static const struct iser_reg_ops fmr_ops = {
-	.alloc_reg_res	= iser_alloc_fmr_pool,
-	.free_reg_res	= iser_free_fmr_pool,
-	.reg_mem	= iser_fast_reg_fmr,
-	.unreg_mem	= iser_unreg_mem_fmr,
-	.reg_desc_get	= iser_reg_desc_get_fmr,
-	.reg_desc_put	= iser_reg_desc_put_fmr,
-};
 
 void iser_reg_comp(struct ib_cq *cq, struct ib_wc *wc)
 {
 	iser_err_comp(wc, "memreg");
 }
 
-int iser_assign_reg_ops(struct iser_device *device)
-{
-	struct ib_device *ib_dev = device->ib_device;
-
-	/* Assign function handles  - based on FMR support */
-	if (ib_dev->ops.alloc_fmr && ib_dev->ops.dealloc_fmr &&
-	    ib_dev->ops.map_phys_fmr && ib_dev->ops.unmap_fmr) {
-		iser_info("FMR supported, using FMR for registration\n");
-		device->reg_ops = &fmr_ops;
-	} else if (ib_dev->attrs.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS) {
-		iser_info("FastReg supported, using FastReg for registration\n");
-		device->reg_ops = &fastreg_ops;
-		device->remote_inv_sup = iser_always_reg;
-	} else {
-		iser_err("IB device does not support FMRs nor FastRegs, can't register memory\n");
-		return -1;
-	}
-
-	return 0;
-}
-
-struct iser_fr_desc *
+static struct iser_fr_desc *
 iser_reg_desc_get_fr(struct ib_conn *ib_conn)
 {
 	struct iser_fr_pool *fr_pool = &ib_conn->fr_pool;
@@ -109,7 +60,7 @@ struct iser_fr_desc *
 	return desc;
 }
 
-void
+static void
 iser_reg_desc_put_fr(struct ib_conn *ib_conn,
 		     struct iser_fr_desc *desc)
 {
@@ -121,44 +72,6 @@ struct iser_fr_desc *
 	spin_unlock_irqrestore(&fr_pool->lock, flags);
 }
 
-struct iser_fr_desc *
-iser_reg_desc_get_fmr(struct ib_conn *ib_conn)
-{
-	struct iser_fr_pool *fr_pool = &ib_conn->fr_pool;
-
-	return list_first_entry(&fr_pool->list,
-				struct iser_fr_desc, list);
-}
-
-void
-iser_reg_desc_put_fmr(struct ib_conn *ib_conn,
-		      struct iser_fr_desc *desc)
-{
-}
-
-static void iser_data_buf_dump(struct iser_data_buf *data,
-			       struct ib_device *ibdev)
-{
-	struct scatterlist *sg;
-	int i;
-
-	for_each_sg(data->sg, sg, data->dma_nents, i)
-		iser_dbg("sg[%d] dma_addr:0x%lX page:0x%p "
-			 "off:0x%x sz:0x%x dma_len:0x%x\n",
-			 i, (unsigned long)sg_dma_address(sg),
-			 sg_page(sg), sg->offset, sg->length, sg_dma_len(sg));
-}
-
-static void iser_dump_page_vec(struct iser_page_vec *page_vec)
-{
-	int i;
-
-	iser_err("page vec npages %d data length %lld\n",
-		 page_vec->npages, page_vec->fake_mr.length);
-	for (i = 0; i < page_vec->npages; i++)
-		iser_err("vec[%d]: %llx\n", i, page_vec->pages[i]);
-}
-
 int iser_dma_map_task_data(struct iscsi_iser_task *iser_task,
 			    struct iser_data_buf *data,
 			    enum iser_data_dir iser_dir,
@@ -213,84 +126,9 @@ void iser_dma_unmap_task_data(struct iscsi_iser_task *iser_task,
 	return 0;
 }
 
-static int iser_set_page(struct ib_mr *mr, u64 addr)
-{
-	struct iser_page_vec *page_vec =
-		container_of(mr, struct iser_page_vec, fake_mr);
-
-	page_vec->pages[page_vec->npages++] = addr;
-
-	return 0;
-}
-
-static
-int iser_fast_reg_fmr(struct iscsi_iser_task *iser_task,
-		      struct iser_data_buf *mem,
-		      struct iser_reg_resources *rsc,
-		      struct iser_mem_reg *reg)
-{
-	struct ib_conn *ib_conn = &iser_task->iser_conn->ib_conn;
-	struct iser_device *device = ib_conn->device;
-	struct iser_page_vec *page_vec = rsc->page_vec;
-	struct ib_fmr_pool *fmr_pool = rsc->fmr_pool;
-	struct ib_pool_fmr *fmr;
-	int ret, plen;
-
-	page_vec->npages = 0;
-	page_vec->fake_mr.page_size = SZ_4K;
-	plen = ib_sg_to_pages(&page_vec->fake_mr, mem->sg,
-			      mem->dma_nents, NULL, iser_set_page);
-	if (unlikely(plen < mem->dma_nents)) {
-		iser_err("page vec too short to hold this SG\n");
-		iser_data_buf_dump(mem, device->ib_device);
-		iser_dump_page_vec(page_vec);
-		return -EINVAL;
-	}
-
-	fmr  = ib_fmr_pool_map_phys(fmr_pool, page_vec->pages,
-				    page_vec->npages, page_vec->pages[0]);
-	if (IS_ERR(fmr)) {
-		ret = PTR_ERR(fmr);
-		iser_err("ib_fmr_pool_map_phys failed: %d\n", ret);
-		return ret;
-	}
-
-	reg->sge.lkey = fmr->fmr->lkey;
-	reg->rkey = fmr->fmr->rkey;
-	reg->sge.addr = page_vec->fake_mr.iova;
-	reg->sge.length = page_vec->fake_mr.length;
-	reg->mem_h = fmr;
-
-	iser_dbg("fmr reg: lkey=0x%x, rkey=0x%x, addr=0x%llx,"
-		 " length=0x%x\n", reg->sge.lkey, reg->rkey,
-		 reg->sge.addr, reg->sge.length);
-
-	return 0;
-}
-
-/**
- * Unregister (previosuly registered using FMR) memory.
- * If memory is non-FMR does nothing.
- */
-void iser_unreg_mem_fmr(struct iscsi_iser_task *iser_task,
-			enum iser_data_dir cmd_dir)
-{
-	struct iser_mem_reg *reg = &iser_task->rdma_reg[cmd_dir];
-
-	if (!reg->mem_h)
-		return;
-
-	iser_dbg("PHYSICAL Mem.Unregister mem_h %p\n", reg->mem_h);
-
-	ib_fmr_pool_unmap((struct ib_pool_fmr *)reg->mem_h);
-
-	reg->mem_h = NULL;
-}
-
 void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task,
 			    enum iser_data_dir cmd_dir)
 {
-	struct iser_device *device = iser_task->iser_conn->ib_conn.device;
 	struct iser_mem_reg *reg = &iser_task->rdma_reg[cmd_dir];
 	struct iser_fr_desc *desc;
 	struct ib_mr_status mr_status;
@@ -312,7 +150,7 @@ void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task,
 		ib_check_mr_status(desc->rsc.sig_mr, IB_MR_CHECK_SIG_STATUS,
 				   &mr_status);
 	}
-	device->reg_ops->reg_desc_put(&iser_task->iser_conn->ib_conn, desc);
+	iser_reg_desc_put_fr(&iser_task->iser_conn->ib_conn, reg->mem_h);
 	reg->mem_h = NULL;
 }
 
@@ -509,15 +347,14 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task,
 	if (use_dma_key)
 		return iser_reg_dma(device, mem, reg);
 
-	return device->reg_ops->reg_mem(task, mem, &desc->rsc, reg);
+	return iser_fast_reg_mr(task, mem, &desc->rsc, reg);
 }
 
-int iser_reg_rdma_mem(struct iscsi_iser_task *task,
-		      enum iser_data_dir dir,
-		      bool all_imm)
+int iser_reg_mem_fastreg(struct iscsi_iser_task *task,
+			 enum iser_data_dir dir,
+			 bool all_imm)
 {
 	struct ib_conn *ib_conn = &task->iser_conn->ib_conn;
-	struct iser_device *device = ib_conn->device;
 	struct iser_data_buf *mem = &task->data[dir];
 	struct iser_mem_reg *reg = &task->rdma_reg[dir];
 	struct iser_fr_desc *desc = NULL;
@@ -528,7 +365,7 @@ int iser_reg_rdma_mem(struct iscsi_iser_task *task,
 		      scsi_get_prot_op(task->sc) == SCSI_PROT_NORMAL;
 
 	if (!use_dma_key) {
-		desc = device->reg_ops->reg_desc_get(ib_conn);
+		desc = iser_reg_desc_get_fr(ib_conn);
 		reg->mem_h = desc;
 	}
 
@@ -549,15 +386,8 @@ int iser_reg_rdma_mem(struct iscsi_iser_task *task,
 
 err_reg:
 	if (desc)
-		device->reg_ops->reg_desc_put(ib_conn, desc);
+		iser_reg_desc_put_fr(ib_conn, desc);
 
 	return err;
 }
 
-void iser_unreg_rdma_mem(struct iscsi_iser_task *task,
-			 enum iser_data_dir dir)
-{
-	struct iser_device *device = task->iser_conn->ib_conn.device;
-
-	device->reg_ops->unreg_mem(task, dir);
-}
diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 127887c..c1f44c4 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -68,11 +68,12 @@ static void iser_event_handler(struct ib_event_handler *handler,
 static int iser_create_device_ib_res(struct iser_device *device)
 {
 	struct ib_device *ib_dev = device->ib_device;
-	int ret, i, max_cqe;
+	int i, max_cqe;
 
-	ret = iser_assign_reg_ops(device);
-	if (ret)
-		return ret;
+	if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS)) {
+		iser_err("IB device does not support memory registrations\n");
+		return -1;
+	}
 
 	device->comps_used = min_t(int, num_online_cpus(),
 				 ib_dev->num_comp_vectors);
@@ -147,96 +148,6 @@ static void iser_free_device_ib_res(struct iser_device *device)
 	device->pd = NULL;
 }
 
-/**
- * iser_alloc_fmr_pool - Creates FMR pool and page_vector
- * @ib_conn: connection RDMA resources
- * @cmds_max: max number of SCSI commands for this connection
- * @size: max number of pages per map request
- *
- * Return: 0 on success, or errno code on failure
- */
-int iser_alloc_fmr_pool(struct ib_conn *ib_conn,
-			unsigned cmds_max,
-			unsigned int size)
-{
-	struct iser_device *device = ib_conn->device;
-	struct iser_fr_pool *fr_pool = &ib_conn->fr_pool;
-	struct iser_page_vec *page_vec;
-	struct iser_fr_desc *desc;
-	struct ib_fmr_pool *fmr_pool;
-	struct ib_fmr_pool_param params;
-	int ret;
-
-	INIT_LIST_HEAD(&fr_pool->list);
-	spin_lock_init(&fr_pool->lock);
-
-	desc = kzalloc(sizeof(*desc), GFP_KERNEL);
-	if (!desc)
-		return -ENOMEM;
-
-	page_vec = kmalloc(sizeof(*page_vec) + (sizeof(u64) * size),
-			   GFP_KERNEL);
-	if (!page_vec) {
-		ret = -ENOMEM;
-		goto err_frpl;
-	}
-
-	page_vec->pages = (u64 *)(page_vec + 1);
-
-	params.page_shift        = ilog2(SZ_4K);
-	params.max_pages_per_fmr = size;
-	/* make the pool size twice the max number of SCSI commands *
-	 * the ML is expected to queue, watermark for unmap at 50%  */
-	params.pool_size	 = cmds_max * 2;
-	params.dirty_watermark	 = cmds_max;
-	params.cache		 = 0;
-	params.flush_function	 = NULL;
-	params.access		 = (IB_ACCESS_LOCAL_WRITE  |
-				    IB_ACCESS_REMOTE_WRITE |
-				    IB_ACCESS_REMOTE_READ);
-
-	fmr_pool = ib_create_fmr_pool(device->pd, &params);
-	if (IS_ERR(fmr_pool)) {
-		ret = PTR_ERR(fmr_pool);
-		iser_err("FMR allocation failed, err %d\n", ret);
-		goto err_fmr;
-	}
-
-	desc->rsc.page_vec = page_vec;
-	desc->rsc.fmr_pool = fmr_pool;
-	list_add(&desc->list, &fr_pool->list);
-
-	return 0;
-
-err_fmr:
-	kfree(page_vec);
-err_frpl:
-	kfree(desc);
-
-	return ret;
-}
-
-/**
- * iser_free_fmr_pool - releases the FMR pool and page vec
- * @ib_conn: connection RDMA resources
- */
-void iser_free_fmr_pool(struct ib_conn *ib_conn)
-{
-	struct iser_fr_pool *fr_pool = &ib_conn->fr_pool;
-	struct iser_fr_desc *desc;
-
-	desc = list_first_entry(&fr_pool->list,
-				struct iser_fr_desc, list);
-	list_del(&desc->list);
-
-	iser_info("freeing conn %p fmr pool %p\n",
-		  ib_conn, desc->rsc.fmr_pool);
-
-	ib_destroy_fmr_pool(desc->rsc.fmr_pool);
-	kfree(desc->rsc.page_vec);
-	kfree(desc);
-}
-
 static struct iser_fr_desc *
 iser_create_fastreg_desc(struct iser_device *device,
 			 struct ib_pd *pd,
@@ -667,13 +578,12 @@ static void iser_connect_error(struct rdma_cm_id *cma_id)
 	u32 max_num_sg;
 
 	/*
-	 * FRs without SG_GAPS or FMRs can only map up to a (device) page per
-	 * entry, but if the first entry is misaligned we'll end up using two
-	 * entries (head and tail) for a single page worth data, so one
-	 * additional entry is required.
+	 * FRs without SG_GAPS can only map up to a (device) page per entry,
+	 * but if the first entry is misaligned we'll end up using two entries
+	 * (head and tail) for a single page worth data, so one additional
+	 * entry is required.
 	 */
-	if ((attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS) &&
-	    (attr->device_cap_flags & IB_DEVICE_SG_GAPS_REG))
+	if (attr->device_cap_flags & IB_DEVICE_SG_GAPS_REG)
 		reserved_mr_pages = 0;
 	else
 		reserved_mr_pages = 1;
@@ -684,14 +594,8 @@ static void iser_connect_error(struct rdma_cm_id *cma_id)
 		max_num_sg = attr->max_fast_reg_page_list_len;
 
 	sg_tablesize = DIV_ROUND_UP(max_sectors * SECTOR_SIZE, SZ_4K);
-	if (attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS)
-		sup_sg_tablesize =
-			min_t(
-			 uint, ISCSI_ISER_MAX_SG_TABLESIZE,
-			 max_num_sg - reserved_mr_pages);
-	else
-		sup_sg_tablesize = ISCSI_ISER_MAX_SG_TABLESIZE;
-
+	sup_sg_tablesize = min_t(uint, ISCSI_ISER_MAX_SG_TABLESIZE,
+				 max_num_sg - reserved_mr_pages);
 	iser_conn->scsi_sg_tablesize = min(sg_tablesize, sup_sg_tablesize);
 	iser_conn->pages_per_mr =
 		iser_conn->scsi_sg_tablesize + reserved_mr_pages;
@@ -755,7 +659,7 @@ static void iser_route_handler(struct rdma_cm_id *cma_id)
 	struct iser_cm_hdr req_hdr;
 	struct iser_conn *iser_conn = (struct iser_conn *)cma_id->context;
 	struct ib_conn *ib_conn = &iser_conn->ib_conn;
-	struct iser_device *device = ib_conn->device;
+	struct ib_device *ib_dev = ib_conn->device->ib_device;
 
 	if (iser_conn->state != ISER_CONN_PENDING)
 		/* bailout */
@@ -766,14 +670,14 @@ static void iser_route_handler(struct rdma_cm_id *cma_id)
 		goto failure;
 
 	memset(&conn_param, 0, sizeof conn_param);
-	conn_param.responder_resources = device->ib_device->attrs.max_qp_rd_atom;
+	conn_param.responder_resources = ib_dev->attrs.max_qp_rd_atom;
 	conn_param.initiator_depth     = 1;
 	conn_param.retry_count	       = 7;
 	conn_param.rnr_retry_count     = 6;
 
 	memset(&req_hdr, 0, sizeof(req_hdr));
 	req_hdr.flags = ISER_ZBVA_NOT_SUP;
-	if (!device->remote_inv_sup)
+	if (!iser_always_reg)
 		req_hdr.flags |= ISER_SEND_W_INV_NOT_SUP;
 	conn_param.private_data	= (void *)&req_hdr;
 	conn_param.private_data_len = sizeof(struct iser_cm_hdr);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 6/8] RDMA/srp: remove support for FMR memory registration
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
                   ` (4 preceding siblings ...)
  2020-05-14 12:03 ` [PATCH 5/8] RDMA/iser: Remove support for " Max Gurtovoy
@ 2020-05-14 12:03 ` Max Gurtovoy
  2020-05-14 14:02   ` Bart Van Assche
  2020-05-14 12:03 ` [PATCH 7/8] RDMA/core: remove FMR pool API Max Gurtovoy
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-14 12:03 UTC (permalink / raw)
  To: bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin, Max Gurtovoy

FMR is not supported on most recent RDMA devices (that use fast memory
registration mechanism). Also, FMR was recently removed from NFS/RDMA
ULP.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Israel Rukshin <israelr@mellanox.com>
---
 drivers/infiniband/ulp/srp/ib_srp.c | 222 +++---------------------------------
 drivers/infiniband/ulp/srp/ib_srp.h |  27 +----
 2 files changed, 22 insertions(+), 227 deletions(-)

diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index cd1181c..f276155 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -71,7 +71,6 @@
 static unsigned int cmd_sg_entries;
 static unsigned int indirect_sg_entries;
 static bool allow_ext_sg;
-static bool prefer_fr = true;
 static bool register_always = true;
 static bool never_register;
 static int topspin_workarounds = 1;
@@ -95,10 +94,6 @@
 MODULE_PARM_DESC(topspin_workarounds,
 		 "Enable workarounds for Topspin/Cisco SRP target bugs if != 0");
 
-module_param(prefer_fr, bool, 0444);
-MODULE_PARM_DESC(prefer_fr,
-"Whether to use fast registration if both FMR and fast registration are supported");
-
 module_param(register_always, bool, 0444);
 MODULE_PARM_DESC(register_always,
 		 "Use memory registration even for contiguous memory regions");
@@ -388,24 +383,6 @@ static int srp_new_cm_id(struct srp_rdma_ch *ch)
 		srp_new_ib_cm_id(ch);
 }
 
-static struct ib_fmr_pool *srp_alloc_fmr_pool(struct srp_target_port *target)
-{
-	struct srp_device *dev = target->srp_host->srp_dev;
-	struct ib_fmr_pool_param fmr_param;
-
-	memset(&fmr_param, 0, sizeof(fmr_param));
-	fmr_param.pool_size	    = target->mr_pool_size;
-	fmr_param.dirty_watermark   = fmr_param.pool_size / 4;
-	fmr_param.cache		    = 1;
-	fmr_param.max_pages_per_fmr = dev->max_pages_per_mr;
-	fmr_param.page_shift	    = ilog2(dev->mr_page_size);
-	fmr_param.access	    = (IB_ACCESS_LOCAL_WRITE |
-				       IB_ACCESS_REMOTE_WRITE |
-				       IB_ACCESS_REMOTE_READ);
-
-	return ib_create_fmr_pool(dev->pd, &fmr_param);
-}
-
 /**
  * srp_destroy_fr_pool() - free the resources owned by a pool
  * @pool: Fast registration pool to be destroyed.
@@ -556,7 +533,6 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch)
 	struct ib_qp_init_attr *init_attr;
 	struct ib_cq *recv_cq, *send_cq;
 	struct ib_qp *qp;
-	struct ib_fmr_pool *fmr_pool = NULL;
 	struct srp_fr_pool *fr_pool = NULL;
 	const int m = 1 + dev->use_fast_reg * target->mr_per_cmd * 2;
 	int ret;
@@ -619,14 +595,6 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch)
 				     "FR pool allocation failed (%d)\n", ret);
 			goto err_qp;
 		}
-	} else if (dev->use_fmr) {
-		fmr_pool = srp_alloc_fmr_pool(target);
-		if (IS_ERR(fmr_pool)) {
-			ret = PTR_ERR(fmr_pool);
-			shost_printk(KERN_WARNING, target->scsi_host, PFX
-				     "FMR pool allocation failed (%d)\n", ret);
-			goto err_qp;
-		}
 	}
 
 	if (ch->qp)
@@ -644,10 +612,6 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch)
 		if (ch->fr_pool)
 			srp_destroy_fr_pool(ch->fr_pool);
 		ch->fr_pool = fr_pool;
-	} else if (dev->use_fmr) {
-		if (ch->fmr_pool)
-			ib_destroy_fmr_pool(ch->fmr_pool);
-		ch->fmr_pool = fmr_pool;
 	}
 
 	kfree(init_attr);
@@ -702,9 +666,6 @@ static void srp_free_ch_ib(struct srp_target_port *target,
 	if (dev->use_fast_reg) {
 		if (ch->fr_pool)
 			srp_destroy_fr_pool(ch->fr_pool);
-	} else if (dev->use_fmr) {
-		if (ch->fmr_pool)
-			ib_destroy_fmr_pool(ch->fmr_pool);
 	}
 
 	srp_destroy_qp(ch);
@@ -1017,12 +978,8 @@ static void srp_free_req_data(struct srp_target_port *target,
 
 	for (i = 0; i < target->req_ring_size; ++i) {
 		req = &ch->req_ring[i];
-		if (dev->use_fast_reg) {
+		if (dev->use_fast_reg)
 			kfree(req->fr_list);
-		} else {
-			kfree(req->fmr_list);
-			kfree(req->map_page);
-		}
 		if (req->indirect_dma_addr) {
 			ib_dma_unmap_single(ibdev, req->indirect_dma_addr,
 					    target->indirect_size,
@@ -1056,16 +1013,8 @@ static int srp_alloc_req_data(struct srp_rdma_ch *ch)
 					GFP_KERNEL);
 		if (!mr_list)
 			goto out;
-		if (srp_dev->use_fast_reg) {
+		if (srp_dev->use_fast_reg)
 			req->fr_list = mr_list;
-		} else {
-			req->fmr_list = mr_list;
-			req->map_page = kmalloc_array(srp_dev->max_pages_per_mr,
-						      sizeof(void *),
-						      GFP_KERNEL);
-			if (!req->map_page)
-				goto out;
-		}
 		req->indirect_desc = kmalloc(target->indirect_size, GFP_KERNEL);
 		if (!req->indirect_desc)
 			goto out;
@@ -1272,11 +1221,6 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
 		if (req->nmdesc)
 			srp_fr_pool_put(ch->fr_pool, req->fr_list,
 					req->nmdesc);
-	} else if (dev->use_fmr) {
-		struct ib_pool_fmr **pfmr;
-
-		for (i = req->nmdesc, pfmr = req->fmr_list; i > 0; i--, pfmr++)
-			ib_fmr_pool_unmap(*pfmr);
 	}
 
 	ib_dma_unmap_sg(ibdev, scsi_sglist(scmnd), scsi_sg_count(scmnd),
@@ -1472,50 +1416,6 @@ static void srp_map_desc(struct srp_map_state *state, dma_addr_t dma_addr,
 	state->ndesc++;
 }
 
-static int srp_map_finish_fmr(struct srp_map_state *state,
-			      struct srp_rdma_ch *ch)
-{
-	struct srp_target_port *target = ch->target;
-	struct srp_device *dev = target->srp_host->srp_dev;
-	struct ib_pool_fmr *fmr;
-	u64 io_addr = 0;
-
-	if (state->fmr.next >= state->fmr.end) {
-		shost_printk(KERN_ERR, ch->target->scsi_host,
-			     PFX "Out of MRs (mr_per_cmd = %d)\n",
-			     ch->target->mr_per_cmd);
-		return -ENOMEM;
-	}
-
-	WARN_ON_ONCE(!dev->use_fmr);
-
-	if (state->npages == 0)
-		return 0;
-
-	if (state->npages == 1 && target->global_rkey) {
-		srp_map_desc(state, state->base_dma_addr, state->dma_len,
-			     target->global_rkey);
-		goto reset_state;
-	}
-
-	fmr = ib_fmr_pool_map_phys(ch->fmr_pool, state->pages,
-				   state->npages, io_addr);
-	if (IS_ERR(fmr))
-		return PTR_ERR(fmr);
-
-	*state->fmr.next++ = fmr;
-	state->nmdesc++;
-
-	srp_map_desc(state, state->base_dma_addr & ~dev->mr_page_mask,
-		     state->dma_len, fmr->fmr->rkey);
-
-reset_state:
-	state->npages = 0;
-	state->dma_len = 0;
-
-	return 0;
-}
-
 static void srp_reg_mr_err_done(struct ib_cq *cq, struct ib_wc *wc)
 {
 	srp_handle_qp_err(cq, wc, "FAST REG");
@@ -1606,74 +1506,6 @@ static int srp_map_finish_fr(struct srp_map_state *state,
 	return n;
 }
 
-static int srp_map_sg_entry(struct srp_map_state *state,
-			    struct srp_rdma_ch *ch,
-			    struct scatterlist *sg)
-{
-	struct srp_target_port *target = ch->target;
-	struct srp_device *dev = target->srp_host->srp_dev;
-	dma_addr_t dma_addr = sg_dma_address(sg);
-	unsigned int dma_len = sg_dma_len(sg);
-	unsigned int len = 0;
-	int ret;
-
-	WARN_ON_ONCE(!dma_len);
-
-	while (dma_len) {
-		unsigned offset = dma_addr & ~dev->mr_page_mask;
-
-		if (state->npages == dev->max_pages_per_mr ||
-		    (state->npages > 0 && offset != 0)) {
-			ret = srp_map_finish_fmr(state, ch);
-			if (ret)
-				return ret;
-		}
-
-		len = min_t(unsigned int, dma_len, dev->mr_page_size - offset);
-
-		if (!state->npages)
-			state->base_dma_addr = dma_addr;
-		state->pages[state->npages++] = dma_addr & dev->mr_page_mask;
-		state->dma_len += len;
-		dma_addr += len;
-		dma_len -= len;
-	}
-
-	/*
-	 * If the end of the MR is not on a page boundary then we need to
-	 * close it out and start a new one -- we can only merge at page
-	 * boundaries.
-	 */
-	ret = 0;
-	if ((dma_addr & ~dev->mr_page_mask) != 0)
-		ret = srp_map_finish_fmr(state, ch);
-	return ret;
-}
-
-static int srp_map_sg_fmr(struct srp_map_state *state, struct srp_rdma_ch *ch,
-			  struct srp_request *req, struct scatterlist *scat,
-			  int count)
-{
-	struct scatterlist *sg;
-	int i, ret;
-
-	state->pages = req->map_page;
-	state->fmr.next = req->fmr_list;
-	state->fmr.end = req->fmr_list + ch->target->mr_per_cmd;
-
-	for_each_sg(scat, sg, count, i) {
-		ret = srp_map_sg_entry(state, ch, sg);
-		if (ret)
-			return ret;
-	}
-
-	ret = srp_map_finish_fmr(state, ch);
-	if (ret)
-		return ret;
-
-	return 0;
-}
-
 static int srp_map_sg_fr(struct srp_map_state *state, struct srp_rdma_ch *ch,
 			 struct srp_request *req, struct scatterlist *scat,
 			 int count)
@@ -1733,7 +1565,6 @@ static int srp_map_idb(struct srp_rdma_ch *ch, struct srp_request *req,
 	struct srp_device *dev = target->srp_host->srp_dev;
 	struct srp_map_state state;
 	struct srp_direct_buf idb_desc;
-	u64 idb_pages[1];
 	struct scatterlist idb_sg[1];
 	int ret;
 
@@ -1756,14 +1587,6 @@ static int srp_map_idb(struct srp_rdma_ch *ch, struct srp_request *req,
 		if (ret < 0)
 			return ret;
 		WARN_ON_ONCE(ret < 1);
-	} else if (dev->use_fmr) {
-		state.pages = idb_pages;
-		state.pages[0] = (req->indirect_dma_addr &
-				  dev->mr_page_mask);
-		state.npages = 1;
-		ret = srp_map_finish_fmr(&state, ch);
-		if (ret < 0)
-			return ret;
 	} else {
 		return -EINVAL;
 	}
@@ -1787,9 +1610,6 @@ static void srp_check_mapping(struct srp_map_state *state,
 	if (dev->use_fast_reg)
 		for (i = 0, pfr = req->fr_list; i < state->nmdesc; i++, pfr++)
 			mr_len += (*pfr)->mr->length;
-	else if (dev->use_fmr)
-		for (i = 0; i < state->nmdesc; i++)
-			mr_len += be32_to_cpu(req->indirect_desc[i].len);
 	if (desc_len != scsi_bufflen(req->scmnd) ||
 	    mr_len > scsi_bufflen(req->scmnd))
 		pr_err("Inconsistent: scsi len %d <> desc len %lld <> mr len %lld; ndesc %d; nmdesc = %d\n",
@@ -1904,8 +1724,6 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_rdma_ch *ch,
 	state.desc = req->indirect_desc;
 	if (dev->use_fast_reg)
 		ret = srp_map_sg_fr(&state, ch, req, scat, count);
-	else if (dev->use_fmr)
-		ret = srp_map_sg_fmr(&state, ch, req, scat, count);
 	else
 		ret = srp_map_sg_dma(&state, ch, req, scat, count);
 	req->nmdesc = state.nmdesc;
@@ -3864,13 +3682,13 @@ static ssize_t srp_create_target(struct device *dev,
 		goto out;
 	}
 
-	if (!srp_dev->has_fmr && !srp_dev->has_fr && !target->allow_ext_sg &&
+	if (!srp_dev->has_fr && !target->allow_ext_sg &&
 	    target->cmd_sg_cnt < target->sg_tablesize) {
 		pr_warn("No MR pool and no external indirect descriptors, limiting sg_tablesize to cmd_sg_cnt\n");
 		target->sg_tablesize = target->cmd_sg_cnt;
 	}
 
-	if (srp_dev->use_fast_reg || srp_dev->use_fmr) {
+	if (srp_dev->use_fast_reg) {
 		bool gaps_reg = (ibdev->attrs.device_cap_flags &
 				 IB_DEVICE_SG_GAPS_REG);
 
@@ -3878,12 +3696,12 @@ static ssize_t srp_create_target(struct device *dev,
 				  (ilog2(srp_dev->mr_page_size) - 9);
 		if (!gaps_reg) {
 			/*
-			 * FR and FMR can only map one HCA page per entry. If
-			 * the start address is not aligned on a HCA page
-			 * boundary two entries will be used for the head and
-			 * the tail although these two entries combined
-			 * contain at most one HCA page of data. Hence the "+
-			 * 1" in the calculation below.
+			 * FR can only map one HCA page per entry. If the start
+			 * address is not aligned on a HCA page boundary two
+			 * entries will be used for the head and the tail
+			 * although these two entries combined contain at most
+			 * one HCA page of data. Hence the "+ 1" in the
+			 * calculation below.
 			 *
 			 * The indirect data buffer descriptor is contiguous
 			 * so the memory for that buffer will only be
@@ -4162,23 +3980,15 @@ static void srp_add_one(struct ib_device *device)
 	srp_dev->max_pages_per_mr = min_t(u64, SRP_MAX_PAGES_PER_MR,
 					  max_pages_per_mr);
 
-	srp_dev->has_fmr = (device->ops.alloc_fmr &&
-			    device->ops.dealloc_fmr &&
-			    device->ops.map_phys_fmr &&
-			    device->ops.unmap_fmr);
 	srp_dev->has_fr = (attr->device_cap_flags &
 			   IB_DEVICE_MEM_MGT_EXTENSIONS);
-	if (!never_register && !srp_dev->has_fmr && !srp_dev->has_fr) {
-		dev_warn(&device->dev, "neither FMR nor FR is supported\n");
-	} else if (!never_register &&
-		   attr->max_mr_size >= 2 * srp_dev->mr_page_size) {
-		srp_dev->use_fast_reg = (srp_dev->has_fr &&
-					 (!srp_dev->has_fmr || prefer_fr));
-		srp_dev->use_fmr = !srp_dev->use_fast_reg && srp_dev->has_fmr;
-	}
+	if (!never_register && !srp_dev->has_fr)
+		dev_warn(&device->dev, "FR is not supported\n");
+	else if (!never_register &&
+		 attr->max_mr_size >= 2 * srp_dev->mr_page_size)
+		srp_dev->use_fast_reg = srp_dev->has_fr;
 
-	if (never_register || !register_always ||
-	    (!srp_dev->has_fmr && !srp_dev->has_fr))
+	if (never_register || !register_always || !srp_dev->has_fr)
 		flags |= IB_PD_UNSAFE_GLOBAL_RKEY;
 
 	if (srp_dev->use_fast_reg) {
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 6fabcc2..6818cac 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -44,7 +44,6 @@
 #include <rdma/ib_verbs.h>
 #include <rdma/ib_sa.h>
 #include <rdma/ib_cm.h>
-#include <rdma/ib_fmr_pool.h>
 #include <rdma/rdma_cm.h>
 
 enum {
@@ -95,8 +94,7 @@ enum srp_iu_type {
 /*
  * @mr_page_mask: HCA memory registration page mask.
  * @mr_page_size: HCA memory registration page size.
- * @mr_max_size: Maximum size in bytes of a single FMR / FR registration
- *   request.
+ * @mr_max_size: Maximum size in bytes of a single FR registration request.
  */
 struct srp_device {
 	struct list_head	dev_list;
@@ -107,9 +105,7 @@ struct srp_device {
 	int			mr_page_size;
 	int			mr_max_size;
 	int			max_pages_per_mr;
-	bool			has_fmr;
 	bool			has_fr;
-	bool			use_fmr;
 	bool			use_fast_reg;
 };
 
@@ -127,11 +123,7 @@ struct srp_host {
 struct srp_request {
 	struct scsi_cmnd       *scmnd;
 	struct srp_iu	       *cmd;
-	union {
-		struct ib_pool_fmr **fmr_list;
-		struct srp_fr_desc **fr_list;
-	};
-	u64		       *map_page;
+	struct srp_fr_desc     **fr_list;
 	struct srp_direct_buf  *indirect_desc;
 	dma_addr_t		indirect_dma_addr;
 	short			nmdesc;
@@ -155,10 +147,7 @@ struct srp_rdma_ch {
 	struct ib_cq	       *send_cq;
 	struct ib_cq	       *recv_cq;
 	struct ib_qp	       *qp;
-	union {
-		struct ib_fmr_pool     *fmr_pool;
-		struct srp_fr_pool     *fr_pool;
-	};
+	struct srp_fr_pool     *fr_pool;
 	uint32_t		max_it_iu_len;
 	uint32_t		max_ti_iu_len;
 	u8			max_imm_sge;
@@ -319,20 +308,16 @@ struct srp_fr_pool {
  * @pages:	    Array with DMA addresses of pages being considered for
  *		    memory registration.
  * @base_dma_addr:  DMA address of the first page that has not yet been mapped.
- * @dma_len:	    Number of bytes that will be registered with the next
- *		    FMR or FR memory registration call.
+ * @dma_len:	    Number of bytes that will be registered with the next FR
+ *                  memory registration call.
  * @total_len:	    Total number of bytes in the sg-list being mapped.
  * @npages:	    Number of page addresses in the pages[] array.
- * @nmdesc:	    Number of FMR or FR memory descriptors used for mapping.
+ * @nmdesc:	    Number of FR memory descriptors used for mapping.
  * @ndesc:	    Number of SRP buffer descriptors that have been filled in.
  */
 struct srp_map_state {
 	union {
 		struct {
-			struct ib_pool_fmr **next;
-			struct ib_pool_fmr **end;
-		} fmr;
-		struct {
 			struct srp_fr_desc **next;
 			struct srp_fr_desc **end;
 		} fr;
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 7/8] RDMA/core: remove FMR pool API
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
                   ` (5 preceding siblings ...)
  2020-05-14 12:03 ` [PATCH 6/8] RDMA/srp: remove " Max Gurtovoy
@ 2020-05-14 12:03 ` Max Gurtovoy
  2020-05-14 12:03 ` [PATCH 8/8] RDMA/core: remove FMR device ops Max Gurtovoy
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-14 12:03 UTC (permalink / raw)
  To: bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin, Max Gurtovoy

This ancient and unsafe method for memory registration is no longer used
by any RDMA based ULP. Remove the FMR pool API from the core driver.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 Documentation/driver-api/infiniband.rst |   3 -
 drivers/infiniband/core/Makefile        |   2 +-
 drivers/infiniband/core/fmr_pool.c      | 494 --------------------------------
 include/rdma/ib_fmr_pool.h              |  93 ------
 4 files changed, 1 insertion(+), 591 deletions(-)
 delete mode 100644 drivers/infiniband/core/fmr_pool.c
 delete mode 100644 include/rdma/ib_fmr_pool.h

diff --git a/Documentation/driver-api/infiniband.rst b/Documentation/driver-api/infiniband.rst
index 1a3116f..30e142c 100644
--- a/Documentation/driver-api/infiniband.rst
+++ b/Documentation/driver-api/infiniband.rst
@@ -37,9 +37,6 @@ InfiniBand core interfaces
 .. kernel-doc:: drivers/infiniband/core/ud_header.c
     :export:
 
-.. kernel-doc:: drivers/infiniband/core/fmr_pool.c
-    :export:
-
 .. kernel-doc:: drivers/infiniband/core/umem.c
     :export:
 
diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index d1b14887..064cd34 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -8,7 +8,7 @@ obj-$(CONFIG_INFINIBAND_USER_MAD) +=	ib_umad.o
 obj-$(CONFIG_INFINIBAND_USER_ACCESS) += ib_uverbs.o $(user_access-y)
 
 ib_core-y :=			packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \
-				device.o fmr_pool.o cache.o netlink.o \
+				device.o cache.o netlink.o \
 				roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \
 				multicast.o mad.o smi.o agent.o mad_rmpp.o \
 				nldev.o restrack.o counters.o ib_core_uverbs.o \
diff --git a/drivers/infiniband/core/fmr_pool.c b/drivers/infiniband/core/fmr_pool.c
deleted file mode 100644
index e08aec4..0000000
--- a/drivers/infiniband/core/fmr_pool.c
+++ /dev/null
@@ -1,494 +0,0 @@
-/*
- * Copyright (c) 2004 Topspin Communications.  All rights reserved.
- * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses.  You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *      - Redistributions of source code must retain the above
- *        copyright notice, this list of conditions and the following
- *        disclaimer.
- *
- *      - Redistributions in binary form must reproduce the above
- *        copyright notice, this list of conditions and the following
- *        disclaimer in the documentation and/or other materials
- *        provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
- */
-
-#include <linux/errno.h>
-#include <linux/spinlock.h>
-#include <linux/export.h>
-#include <linux/slab.h>
-#include <linux/jhash.h>
-#include <linux/kthread.h>
-
-#include <rdma/ib_fmr_pool.h>
-
-#include "core_priv.h"
-
-#define PFX "fmr_pool: "
-
-enum {
-	IB_FMR_MAX_REMAPS = 32,
-
-	IB_FMR_HASH_BITS  = 8,
-	IB_FMR_HASH_SIZE  = 1 << IB_FMR_HASH_BITS,
-	IB_FMR_HASH_MASK  = IB_FMR_HASH_SIZE - 1
-};
-
-/*
- * If an FMR is not in use, then the list member will point to either
- * its pool's free_list (if the FMR can be mapped again; that is,
- * remap_count < pool->max_remaps) or its pool's dirty_list (if the
- * FMR needs to be unmapped before being remapped).  In either of
- * these cases it is a bug if the ref_count is not 0.  In other words,
- * if ref_count is > 0, then the list member must not be linked into
- * either free_list or dirty_list.
- *
- * The cache_node member is used to link the FMR into a cache bucket
- * (if caching is enabled).  This is independent of the reference
- * count of the FMR.  When a valid FMR is released, its ref_count is
- * decremented, and if ref_count reaches 0, the FMR is placed in
- * either free_list or dirty_list as appropriate.  However, it is not
- * removed from the cache and may be "revived" if a call to
- * ib_fmr_register_physical() occurs before the FMR is remapped.  In
- * this case we just increment the ref_count and remove the FMR from
- * free_list/dirty_list.
- *
- * Before we remap an FMR from free_list, we remove it from the cache
- * (to prevent another user from obtaining a stale FMR).  When an FMR
- * is released, we add it to the tail of the free list, so that our
- * cache eviction policy is "least recently used."
- *
- * All manipulation of ref_count, list and cache_node is protected by
- * pool_lock to maintain consistency.
- */
-
-struct ib_fmr_pool {
-	spinlock_t                pool_lock;
-
-	int                       pool_size;
-	int                       max_pages;
-	int			  max_remaps;
-	int                       dirty_watermark;
-	int                       dirty_len;
-	struct list_head          free_list;
-	struct list_head          dirty_list;
-	struct hlist_head        *cache_bucket;
-
-	void                     (*flush_function)(struct ib_fmr_pool *pool,
-						   void *              arg);
-	void                     *flush_arg;
-
-	struct kthread_worker	  *worker;
-	struct kthread_work	  work;
-
-	atomic_t                  req_ser;
-	atomic_t                  flush_ser;
-
-	wait_queue_head_t         force_wait;
-};
-
-static inline u32 ib_fmr_hash(u64 first_page)
-{
-	return jhash_2words((u32) first_page, (u32) (first_page >> 32), 0) &
-		(IB_FMR_HASH_SIZE - 1);
-}
-
-/* Caller must hold pool_lock */
-static inline struct ib_pool_fmr *ib_fmr_cache_lookup(struct ib_fmr_pool *pool,
-						      u64 *page_list,
-						      int  page_list_len,
-						      u64  io_virtual_address)
-{
-	struct hlist_head *bucket;
-	struct ib_pool_fmr *fmr;
-
-	if (!pool->cache_bucket)
-		return NULL;
-
-	bucket = pool->cache_bucket + ib_fmr_hash(*page_list);
-
-	hlist_for_each_entry(fmr, bucket, cache_node)
-		if (io_virtual_address == fmr->io_virtual_address &&
-		    page_list_len      == fmr->page_list_len      &&
-		    !memcmp(page_list, fmr->page_list,
-			    page_list_len * sizeof *page_list))
-			return fmr;
-
-	return NULL;
-}
-
-static void ib_fmr_batch_release(struct ib_fmr_pool *pool)
-{
-	int                 ret;
-	struct ib_pool_fmr *fmr;
-	LIST_HEAD(unmap_list);
-	LIST_HEAD(fmr_list);
-
-	spin_lock_irq(&pool->pool_lock);
-
-	list_for_each_entry(fmr, &pool->dirty_list, list) {
-		hlist_del_init(&fmr->cache_node);
-		fmr->remap_count = 0;
-		list_add_tail(&fmr->fmr->list, &fmr_list);
-	}
-
-	list_splice_init(&pool->dirty_list, &unmap_list);
-	pool->dirty_len = 0;
-
-	spin_unlock_irq(&pool->pool_lock);
-
-	if (list_empty(&unmap_list)) {
-		return;
-	}
-
-	ret = ib_unmap_fmr(&fmr_list);
-	if (ret)
-		pr_warn(PFX "ib_unmap_fmr returned %d\n", ret);
-
-	spin_lock_irq(&pool->pool_lock);
-	list_splice(&unmap_list, &pool->free_list);
-	spin_unlock_irq(&pool->pool_lock);
-}
-
-static void ib_fmr_cleanup_func(struct kthread_work *work)
-{
-	struct ib_fmr_pool *pool = container_of(work, struct ib_fmr_pool, work);
-
-	ib_fmr_batch_release(pool);
-	atomic_inc(&pool->flush_ser);
-	wake_up_interruptible(&pool->force_wait);
-
-	if (pool->flush_function)
-		pool->flush_function(pool, pool->flush_arg);
-
-	if (atomic_read(&pool->flush_ser) - atomic_read(&pool->req_ser) < 0)
-		kthread_queue_work(pool->worker, &pool->work);
-}
-
-/**
- * ib_create_fmr_pool - Create an FMR pool
- * @pd:Protection domain for FMRs
- * @params:FMR pool parameters
- *
- * Create a pool of FMRs.  Return value is pointer to new pool or
- * error code if creation failed.
- */
-struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd             *pd,
-				       struct ib_fmr_pool_param *params)
-{
-	struct ib_device   *device;
-	struct ib_fmr_pool *pool;
-	int i;
-	int ret;
-	int max_remaps;
-
-	if (!params)
-		return ERR_PTR(-EINVAL);
-
-	device = pd->device;
-	if (!device->ops.alloc_fmr    || !device->ops.dealloc_fmr  ||
-	    !device->ops.map_phys_fmr || !device->ops.unmap_fmr) {
-		dev_info(&device->dev, "Device does not support FMRs\n");
-		return ERR_PTR(-ENOSYS);
-	}
-
-	if (!device->attrs.max_map_per_fmr)
-		max_remaps = IB_FMR_MAX_REMAPS;
-	else
-		max_remaps = device->attrs.max_map_per_fmr;
-
-	pool = kmalloc(sizeof *pool, GFP_KERNEL);
-	if (!pool)
-		return ERR_PTR(-ENOMEM);
-
-	pool->cache_bucket   = NULL;
-	pool->flush_function = params->flush_function;
-	pool->flush_arg      = params->flush_arg;
-
-	INIT_LIST_HEAD(&pool->free_list);
-	INIT_LIST_HEAD(&pool->dirty_list);
-
-	if (params->cache) {
-		pool->cache_bucket =
-			kmalloc_array(IB_FMR_HASH_SIZE,
-				      sizeof(*pool->cache_bucket),
-				      GFP_KERNEL);
-		if (!pool->cache_bucket) {
-			ret = -ENOMEM;
-			goto out_free_pool;
-		}
-
-		for (i = 0; i < IB_FMR_HASH_SIZE; ++i)
-			INIT_HLIST_HEAD(pool->cache_bucket + i);
-	}
-
-	pool->pool_size       = 0;
-	pool->max_pages       = params->max_pages_per_fmr;
-	pool->max_remaps      = max_remaps;
-	pool->dirty_watermark = params->dirty_watermark;
-	pool->dirty_len       = 0;
-	spin_lock_init(&pool->pool_lock);
-	atomic_set(&pool->req_ser,   0);
-	atomic_set(&pool->flush_ser, 0);
-	init_waitqueue_head(&pool->force_wait);
-
-	pool->worker =
-		kthread_create_worker(0, "ib_fmr(%s)", dev_name(&device->dev));
-	if (IS_ERR(pool->worker)) {
-		pr_warn(PFX "couldn't start cleanup kthread worker\n");
-		ret = PTR_ERR(pool->worker);
-		goto out_free_pool;
-	}
-	kthread_init_work(&pool->work, ib_fmr_cleanup_func);
-
-	{
-		struct ib_pool_fmr *fmr;
-		struct ib_fmr_attr fmr_attr = {
-			.max_pages  = params->max_pages_per_fmr,
-			.max_maps   = pool->max_remaps,
-			.page_shift = params->page_shift
-		};
-		int bytes_per_fmr = sizeof *fmr;
-
-		if (pool->cache_bucket)
-			bytes_per_fmr += params->max_pages_per_fmr * sizeof (u64);
-
-		for (i = 0; i < params->pool_size; ++i) {
-			fmr = kmalloc(bytes_per_fmr, GFP_KERNEL);
-			if (!fmr)
-				goto out_fail;
-
-			fmr->pool             = pool;
-			fmr->remap_count      = 0;
-			fmr->ref_count        = 0;
-			INIT_HLIST_NODE(&fmr->cache_node);
-
-			fmr->fmr = ib_alloc_fmr(pd, params->access, &fmr_attr);
-			if (IS_ERR(fmr->fmr)) {
-				pr_warn(PFX "fmr_create failed for FMR %d\n",
-					i);
-				kfree(fmr);
-				goto out_fail;
-			}
-
-			list_add_tail(&fmr->list, &pool->free_list);
-			++pool->pool_size;
-		}
-	}
-
-	return pool;
-
- out_free_pool:
-	kfree(pool->cache_bucket);
-	kfree(pool);
-
-	return ERR_PTR(ret);
-
- out_fail:
-	ib_destroy_fmr_pool(pool);
-
-	return ERR_PTR(-ENOMEM);
-}
-EXPORT_SYMBOL(ib_create_fmr_pool);
-
-/**
- * ib_destroy_fmr_pool - Free FMR pool
- * @pool:FMR pool to free
- *
- * Destroy an FMR pool and free all associated resources.
- */
-void ib_destroy_fmr_pool(struct ib_fmr_pool *pool)
-{
-	struct ib_pool_fmr *fmr;
-	struct ib_pool_fmr *tmp;
-	LIST_HEAD(fmr_list);
-	int                 i;
-
-	kthread_destroy_worker(pool->worker);
-	ib_fmr_batch_release(pool);
-
-	i = 0;
-	list_for_each_entry_safe(fmr, tmp, &pool->free_list, list) {
-		if (fmr->remap_count) {
-			INIT_LIST_HEAD(&fmr_list);
-			list_add_tail(&fmr->fmr->list, &fmr_list);
-			ib_unmap_fmr(&fmr_list);
-		}
-		ib_dealloc_fmr(fmr->fmr);
-		list_del(&fmr->list);
-		kfree(fmr);
-		++i;
-	}
-
-	if (i < pool->pool_size)
-		pr_warn(PFX "pool still has %d regions registered\n",
-			pool->pool_size - i);
-
-	kfree(pool->cache_bucket);
-	kfree(pool);
-}
-EXPORT_SYMBOL(ib_destroy_fmr_pool);
-
-/**
- * ib_flush_fmr_pool - Invalidate all unmapped FMRs
- * @pool:FMR pool to flush
- *
- * Ensure that all unmapped FMRs are fully invalidated.
- */
-int ib_flush_fmr_pool(struct ib_fmr_pool *pool)
-{
-	int serial;
-	struct ib_pool_fmr *fmr, *next;
-
-	/*
-	 * The free_list holds FMRs that may have been used
-	 * but have not been remapped enough times to be dirty.
-	 * Put them on the dirty list now so that the cleanup
-	 * thread will reap them too.
-	 */
-	spin_lock_irq(&pool->pool_lock);
-	list_for_each_entry_safe(fmr, next, &pool->free_list, list) {
-		if (fmr->remap_count > 0)
-			list_move(&fmr->list, &pool->dirty_list);
-	}
-	spin_unlock_irq(&pool->pool_lock);
-
-	serial = atomic_inc_return(&pool->req_ser);
-	kthread_queue_work(pool->worker, &pool->work);
-
-	if (wait_event_interruptible(pool->force_wait,
-				     atomic_read(&pool->flush_ser) - serial >= 0))
-		return -EINTR;
-
-	return 0;
-}
-EXPORT_SYMBOL(ib_flush_fmr_pool);
-
-/**
- * ib_fmr_pool_map_phys - Map an FMR from an FMR pool.
- * @pool_handle: FMR pool to allocate FMR from
- * @page_list: List of pages to map
- * @list_len: Number of pages in @page_list
- * @io_virtual_address: I/O virtual address for new FMR
- */
-struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle,
-					 u64                *page_list,
-					 int                 list_len,
-					 u64                 io_virtual_address)
-{
-	struct ib_fmr_pool *pool = pool_handle;
-	struct ib_pool_fmr *fmr;
-	unsigned long       flags;
-	int                 result;
-
-	if (list_len < 1 || list_len > pool->max_pages)
-		return ERR_PTR(-EINVAL);
-
-	spin_lock_irqsave(&pool->pool_lock, flags);
-	fmr = ib_fmr_cache_lookup(pool,
-				  page_list,
-				  list_len,
-				  io_virtual_address);
-	if (fmr) {
-		/* found in cache */
-		++fmr->ref_count;
-		if (fmr->ref_count == 1) {
-			list_del(&fmr->list);
-		}
-
-		spin_unlock_irqrestore(&pool->pool_lock, flags);
-
-		return fmr;
-	}
-
-	if (list_empty(&pool->free_list)) {
-		spin_unlock_irqrestore(&pool->pool_lock, flags);
-		return ERR_PTR(-EAGAIN);
-	}
-
-	fmr = list_entry(pool->free_list.next, struct ib_pool_fmr, list);
-	list_del(&fmr->list);
-	hlist_del_init(&fmr->cache_node);
-	spin_unlock_irqrestore(&pool->pool_lock, flags);
-
-	result = ib_map_phys_fmr(fmr->fmr, page_list, list_len,
-				 io_virtual_address);
-
-	if (result) {
-		spin_lock_irqsave(&pool->pool_lock, flags);
-		list_add(&fmr->list, &pool->free_list);
-		spin_unlock_irqrestore(&pool->pool_lock, flags);
-
-		pr_warn(PFX "fmr_map returns %d\n", result);
-
-		return ERR_PTR(result);
-	}
-
-	++fmr->remap_count;
-	fmr->ref_count = 1;
-
-	if (pool->cache_bucket) {
-		fmr->io_virtual_address = io_virtual_address;
-		fmr->page_list_len      = list_len;
-		memcpy(fmr->page_list, page_list, list_len * sizeof(*page_list));
-
-		spin_lock_irqsave(&pool->pool_lock, flags);
-		hlist_add_head(&fmr->cache_node,
-			       pool->cache_bucket + ib_fmr_hash(fmr->page_list[0]));
-		spin_unlock_irqrestore(&pool->pool_lock, flags);
-	}
-
-	return fmr;
-}
-EXPORT_SYMBOL(ib_fmr_pool_map_phys);
-
-/**
- * ib_fmr_pool_unmap - Unmap FMR
- * @fmr:FMR to unmap
- *
- * Unmap an FMR.  The FMR mapping may remain valid until the FMR is
- * reused (or until ib_flush_fmr_pool() is called).
- */
-void ib_fmr_pool_unmap(struct ib_pool_fmr *fmr)
-{
-	struct ib_fmr_pool *pool;
-	unsigned long flags;
-
-	pool = fmr->pool;
-
-	spin_lock_irqsave(&pool->pool_lock, flags);
-
-	--fmr->ref_count;
-	if (!fmr->ref_count) {
-		if (fmr->remap_count < pool->max_remaps) {
-			list_add_tail(&fmr->list, &pool->free_list);
-		} else {
-			list_add_tail(&fmr->list, &pool->dirty_list);
-			if (++pool->dirty_len >= pool->dirty_watermark) {
-				atomic_inc(&pool->req_ser);
-				kthread_queue_work(pool->worker, &pool->work);
-			}
-		}
-	}
-
-	spin_unlock_irqrestore(&pool->pool_lock, flags);
-}
-EXPORT_SYMBOL(ib_fmr_pool_unmap);
diff --git a/include/rdma/ib_fmr_pool.h b/include/rdma/ib_fmr_pool.h
deleted file mode 100644
index 2fd9bfb..0000000
--- a/include/rdma/ib_fmr_pool.h
+++ /dev/null
@@ -1,93 +0,0 @@
-/*
- * Copyright (c) 2004 Topspin Corporation.  All rights reserved.
- * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses.  You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- *     Redistribution and use in source and binary forms, with or
- *     without modification, are permitted provided that the following
- *     conditions are met:
- *
- *      - Redistributions of source code must retain the above
- *        copyright notice, this list of conditions and the following
- *        disclaimer.
- *
- *      - Redistributions in binary form must reproduce the above
- *        copyright notice, this list of conditions and the following
- *        disclaimer in the documentation and/or other materials
- *        provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
- */
-
-#if !defined(IB_FMR_POOL_H)
-#define IB_FMR_POOL_H
-
-#include <rdma/ib_verbs.h>
-
-struct ib_fmr_pool;
-
-/**
- * struct ib_fmr_pool_param - Parameters for creating FMR pool
- * @max_pages_per_fmr:Maximum number of pages per map request.
- * @page_shift: Log2 of sizeof "pages" mapped by this fmr
- * @access:Access flags for FMRs in pool.
- * @pool_size:Number of FMRs to allocate for pool.
- * @dirty_watermark:Flush is triggered when @dirty_watermark dirty
- *     FMRs are present.
- * @flush_function:Callback called when unmapped FMRs are flushed and
- *     more FMRs are possibly available for mapping
- * @flush_arg:Context passed to user's flush function.
- * @cache:If set, FMRs may be reused after unmapping for identical map
- *     requests.
- */
-struct ib_fmr_pool_param {
-	int                     max_pages_per_fmr;
-	int                     page_shift;
-	enum ib_access_flags    access;
-	int                     pool_size;
-	int                     dirty_watermark;
-	void                  (*flush_function)(struct ib_fmr_pool *pool,
-						void               *arg);
-	void                   *flush_arg;
-	unsigned                cache:1;
-};
-
-struct ib_pool_fmr {
-	struct ib_fmr      *fmr;
-	struct ib_fmr_pool *pool;
-	struct list_head    list;
-	struct hlist_node   cache_node;
-	int                 ref_count;
-	int                 remap_count;
-	u64                 io_virtual_address;
-	int                 page_list_len;
-	u64                 page_list[];
-};
-
-struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd             *pd,
-				       struct ib_fmr_pool_param *params);
-
-void ib_destroy_fmr_pool(struct ib_fmr_pool *pool);
-
-int ib_flush_fmr_pool(struct ib_fmr_pool *pool);
-
-struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle,
-					 u64                *page_list,
-					 int                 list_len,
-					 u64                 io_virtual_address);
-
-void ib_fmr_pool_unmap(struct ib_pool_fmr *fmr);
-
-#endif /* IB_FMR_POOL_H */
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 8/8] RDMA/core: remove FMR device ops
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
                   ` (6 preceding siblings ...)
  2020-05-14 12:03 ` [PATCH 7/8] RDMA/core: remove FMR pool API Max Gurtovoy
@ 2020-05-14 12:03 ` Max Gurtovoy
  2020-05-14 15:13 ` [PATCH 0/8 v1] Remove FMR support from RDMA drivers Aron Silverton
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-14 12:03 UTC (permalink / raw)
  To: bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin, Max Gurtovoy

After removing FMR support from all the RDMA ULPs and providers, there
is no need to keep FMR operation for IB devices.

Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
 Documentation/infiniband/core_locking.rst |  2 --
 drivers/infiniband/core/device.c          |  4 ---
 drivers/infiniband/core/verbs.c           | 48 -------------------------------
 include/rdma/ib_verbs.h                   | 45 -----------------------------
 4 files changed, 99 deletions(-)

diff --git a/Documentation/infiniband/core_locking.rst b/Documentation/infiniband/core_locking.rst
index 8f76a8a..efd5e76 100644
--- a/Documentation/infiniband/core_locking.rst
+++ b/Documentation/infiniband/core_locking.rst
@@ -22,7 +22,6 @@ Sleeping and interrupt context
     - post_recv
     - poll_cq
     - req_notify_cq
-    - map_phys_fmr
 
   which may not sleep and must be callable from any context.
 
@@ -36,7 +35,6 @@ Sleeping and interrupt context
     - ib_post_send
     - ib_post_recv
     - ib_req_notify_cq
-    - ib_map_phys_fmr
 
   are therefore safe to call from any context.
 
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index d0b3d35..4acbef2 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2557,7 +2557,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
 	SET_DEVICE_OP(dev_ops, add_gid);
 	SET_DEVICE_OP(dev_ops, advise_mr);
 	SET_DEVICE_OP(dev_ops, alloc_dm);
-	SET_DEVICE_OP(dev_ops, alloc_fmr);
 	SET_DEVICE_OP(dev_ops, alloc_hw_stats);
 	SET_DEVICE_OP(dev_ops, alloc_mr);
 	SET_DEVICE_OP(dev_ops, alloc_mr_integrity);
@@ -2584,7 +2583,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
 	SET_DEVICE_OP(dev_ops, create_wq);
 	SET_DEVICE_OP(dev_ops, dealloc_dm);
 	SET_DEVICE_OP(dev_ops, dealloc_driver);
-	SET_DEVICE_OP(dev_ops, dealloc_fmr);
 	SET_DEVICE_OP(dev_ops, dealloc_mw);
 	SET_DEVICE_OP(dev_ops, dealloc_pd);
 	SET_DEVICE_OP(dev_ops, dealloc_ucontext);
@@ -2628,7 +2626,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
 	SET_DEVICE_OP(dev_ops, iw_rem_ref);
 	SET_DEVICE_OP(dev_ops, map_mr_sg);
 	SET_DEVICE_OP(dev_ops, map_mr_sg_pi);
-	SET_DEVICE_OP(dev_ops, map_phys_fmr);
 	SET_DEVICE_OP(dev_ops, mmap);
 	SET_DEVICE_OP(dev_ops, mmap_free);
 	SET_DEVICE_OP(dev_ops, modify_ah);
@@ -2662,7 +2659,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
 	SET_DEVICE_OP(dev_ops, resize_cq);
 	SET_DEVICE_OP(dev_ops, set_vf_guid);
 	SET_DEVICE_OP(dev_ops, set_vf_link_state);
-	SET_DEVICE_OP(dev_ops, unmap_fmr);
 
 	SET_OBJ_SIZE(dev_ops, ib_ah);
 	SET_OBJ_SIZE(dev_ops, ib_cq);
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 56a7133..fa6689b 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2160,54 +2160,6 @@ struct ib_mr *ib_alloc_mr_integrity(struct ib_pd *pd,
 }
 EXPORT_SYMBOL(ib_alloc_mr_integrity);
 
-/* "Fast" memory regions */
-
-struct ib_fmr *ib_alloc_fmr(struct ib_pd *pd,
-			    int mr_access_flags,
-			    struct ib_fmr_attr *fmr_attr)
-{
-	struct ib_fmr *fmr;
-
-	if (!pd->device->ops.alloc_fmr)
-		return ERR_PTR(-EOPNOTSUPP);
-
-	fmr = pd->device->ops.alloc_fmr(pd, mr_access_flags, fmr_attr);
-	if (!IS_ERR(fmr)) {
-		fmr->device = pd->device;
-		fmr->pd     = pd;
-		atomic_inc(&pd->usecnt);
-	}
-
-	return fmr;
-}
-EXPORT_SYMBOL(ib_alloc_fmr);
-
-int ib_unmap_fmr(struct list_head *fmr_list)
-{
-	struct ib_fmr *fmr;
-
-	if (list_empty(fmr_list))
-		return 0;
-
-	fmr = list_entry(fmr_list->next, struct ib_fmr, list);
-	return fmr->device->ops.unmap_fmr(fmr_list);
-}
-EXPORT_SYMBOL(ib_unmap_fmr);
-
-int ib_dealloc_fmr(struct ib_fmr *fmr)
-{
-	struct ib_pd *pd;
-	int ret;
-
-	pd = fmr->pd;
-	ret = fmr->device->ops.dealloc_fmr(fmr);
-	if (!ret)
-		atomic_dec(&pd->usecnt);
-
-	return ret;
-}
-EXPORT_SYMBOL(ib_dealloc_fmr);
-
 /* Multicast groups */
 
 static bool is_valid_mcast_lid(struct ib_qp *qp, u16 lid)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index bbc5cfb..1850b3a 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2453,12 +2453,6 @@ struct ib_device_ops {
 	struct ib_mw *(*alloc_mw)(struct ib_pd *pd, enum ib_mw_type type,
 				  struct ib_udata *udata);
 	int (*dealloc_mw)(struct ib_mw *mw);
-	struct ib_fmr *(*alloc_fmr)(struct ib_pd *pd, int mr_access_flags,
-				    struct ib_fmr_attr *fmr_attr);
-	int (*map_phys_fmr)(struct ib_fmr *fmr, u64 *page_list, int list_len,
-			    u64 iova);
-	int (*unmap_fmr)(struct list_head *fmr_list);
-	int (*dealloc_fmr)(struct ib_fmr *fmr);
 	int (*attach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid);
 	int (*detach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid);
 	struct ib_xrcd *(*alloc_xrcd)(struct ib_device *device,
@@ -4209,45 +4203,6 @@ static inline u32 ib_inc_rkey(u32 rkey)
 }
 
 /**
- * ib_alloc_fmr - Allocates a unmapped fast memory region.
- * @pd: The protection domain associated with the unmapped region.
- * @mr_access_flags: Specifies the memory access rights.
- * @fmr_attr: Attributes of the unmapped region.
- *
- * A fast memory region must be mapped before it can be used as part of
- * a work request.
- */
-struct ib_fmr *ib_alloc_fmr(struct ib_pd *pd,
-			    int mr_access_flags,
-			    struct ib_fmr_attr *fmr_attr);
-
-/**
- * ib_map_phys_fmr - Maps a list of physical pages to a fast memory region.
- * @fmr: The fast memory region to associate with the pages.
- * @page_list: An array of physical pages to map to the fast memory region.
- * @list_len: The number of pages in page_list.
- * @iova: The I/O virtual address to use with the mapped region.
- */
-static inline int ib_map_phys_fmr(struct ib_fmr *fmr,
-				  u64 *page_list, int list_len,
-				  u64 iova)
-{
-	return fmr->device->ops.map_phys_fmr(fmr, page_list, list_len, iova);
-}
-
-/**
- * ib_unmap_fmr - Removes the mapping from a list of fast memory regions.
- * @fmr_list: A linked list of fast memory regions to unmap.
- */
-int ib_unmap_fmr(struct list_head *fmr_list);
-
-/**
- * ib_dealloc_fmr - Deallocates a fast memory region.
- * @fmr: The fast memory region to deallocate.
- */
-int ib_dealloc_fmr(struct ib_fmr *fmr);
-
-/**
  * ib_attach_mcast - Attaches the specified QP to a multicast group.
  * @qp: QP to attach to the multicast group.  The QP must be type
  *   IB_QPT_UD.
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 6/8] RDMA/srp: remove support for FMR memory registration
  2020-05-14 12:03 ` [PATCH 6/8] RDMA/srp: remove " Max Gurtovoy
@ 2020-05-14 14:02   ` Bart Van Assche
  0 siblings, 0 replies; 33+ messages in thread
From: Bart Van Assche @ 2020-05-14 14:02 UTC (permalink / raw)
  To: Max Gurtovoy, jgg, linux-rdma, dledford, leon; +Cc: sagi, israelr, shlomin

On 2020-05-14 05:03, Max Gurtovoy wrote:
> FMR is not supported on most recent RDMA devices (that use fast memory
> registration mechanism). Also, FMR was recently removed from NFS/RDMA
> ULP.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
                   ` (7 preceding siblings ...)
  2020-05-14 12:03 ` [PATCH 8/8] RDMA/core: remove FMR device ops Max Gurtovoy
@ 2020-05-14 15:13 ` Aron Silverton
  2020-05-14 18:18   ` santosh.shilimkar
  2020-05-14 16:00 ` Gal Pressman
  2020-05-18 15:20 ` Dennis Dalessandro
  10 siblings, 1 reply; 33+ messages in thread
From: Aron Silverton @ 2020-05-14 15:13 UTC (permalink / raw)
  To: Max Gurtovoy
  Cc: bvanassche, Jason Gunthorpe, linux-rdma, dledford, leon, sagi,
	israelr, shlomin, Santosh Shilimkar

+Santosh

You probably meant to copy the RDS maintainer? Not sure if this should have
also been sent to netdev@vger.kernel.org.

Aron

> On May 14, 2020, at 7:02 AM, Max Gurtovoy <maxg@mellanox.com> wrote:
> 
> This series removes the support for FMR mode to register memory. This ancient
> mode is unsafe and not maintained/tested in the last few years. It also doesn't
> have any reasonable advantage over other memory registration methods such as
> FRWR (that is implemented in all the recent RDMA adapters). This series should
> be reviewed and approved by the maintainer of the effected drivers and I
> suggest to test it as well.
> 
> The tests that I made for this series (fio benchmarks and fio verify data):
> 1. iSER initiator on ConnectX-4
> 2. iSER initiator on ConnectX-3
> 3. SRP initiator on ConnectX-4 (loopback to SRP target)
> 4. SRP initiator on ConnectX-3
> 
> Not tested:
> 1. RDS
> 2. mthca
> 3. rdmavt
> 
> Israel Rukshin (1):
>  RDMA/iser: Remove support for FMR memory registration
> 
> Max Gurtovoy (7):
>  RDMA/mlx4: remove FMR support for memory registration
>  RDMA/rds: remove FMR support for memory registration
>  RDMA/mthca: remove FMR support for memory registration
>  RDMA/rdmavt: remove FMR memory registration
>  RDMA/srp: remove support for FMR memory registration
>  RDMA/core: remove FMR pool API
>  RDMA/core: remove FMR device ops
> 
> Documentation/driver-api/infiniband.rst      |   3 -
> Documentation/infiniband/core_locking.rst    |   2 -
> drivers/infiniband/core/Makefile             |   2 +-
> drivers/infiniband/core/device.c             |   4 -
> drivers/infiniband/core/fmr_pool.c           | 494 ---------------------------
> drivers/infiniband/core/verbs.c              |  48 ---
> drivers/infiniband/hw/mlx4/main.c            |  10 -
> drivers/infiniband/hw/mlx4/mlx4_ib.h         |  16 -
> drivers/infiniband/hw/mlx4/mr.c              |  93 -----
> drivers/infiniband/hw/mthca/mthca_dev.h      |  10 -
> drivers/infiniband/hw/mthca/mthca_mr.c       | 262 +-------------
> drivers/infiniband/hw/mthca/mthca_provider.c |  86 -----
> drivers/infiniband/sw/rdmavt/mr.c            | 154 ---------
> drivers/infiniband/sw/rdmavt/mr.h            |  15 -
> drivers/infiniband/sw/rdmavt/vt.c            |   4 -
> drivers/infiniband/ulp/iser/iscsi_iser.h     |  79 +----
> drivers/infiniband/ulp/iser/iser_initiator.c |  19 +-
> drivers/infiniband/ulp/iser/iser_memory.c    | 188 +---------
> drivers/infiniband/ulp/iser/iser_verbs.c     | 126 +------
> drivers/infiniband/ulp/srp/ib_srp.c          | 222 +-----------
> drivers/infiniband/ulp/srp/ib_srp.h          |  27 +-
> drivers/net/ethernet/mellanox/mlx4/mr.c      | 183 ----------
> include/linux/mlx4/device.h                  |  21 +-
> include/rdma/ib_fmr_pool.h                   |  93 -----
> include/rdma/ib_verbs.h                      |  45 ---
> net/rds/Makefile                             |   2 +-
> net/rds/ib.c                                 |  14 +-
> net/rds/ib.h                                 |   1 -
> net/rds/ib_cm.c                              |   4 +-
> net/rds/ib_fmr.c                             | 269 ---------------
> net/rds/ib_mr.h                              |  12 -
> net/rds/ib_rdma.c                            |  16 +-
> 32 files changed, 77 insertions(+), 2447 deletions(-)
> delete mode 100644 drivers/infiniband/core/fmr_pool.c
> delete mode 100644 include/rdma/ib_fmr_pool.h
> delete mode 100644 net/rds/ib_fmr.c
> 
> -- 
> 1.8.3.1
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
                   ` (8 preceding siblings ...)
  2020-05-14 15:13 ` [PATCH 0/8 v1] Remove FMR support from RDMA drivers Aron Silverton
@ 2020-05-14 16:00 ` Gal Pressman
  2020-05-17 10:37   ` Max Gurtovoy
  2020-05-18 15:20 ` Dennis Dalessandro
  10 siblings, 1 reply; 33+ messages in thread
From: Gal Pressman @ 2020-05-14 16:00 UTC (permalink / raw)
  To: Max Gurtovoy, bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin

On 14/05/2020 15:02, Max Gurtovoy wrote:
> This series removes the support for FMR mode to register memory. This ancient
> mode is unsafe and not maintained/tested in the last few years. It also doesn't
> have any reasonable advantage over other memory registration methods such as
> FRWR (that is implemented in all the recent RDMA adapters). This series should
> be reviewed and approved by the maintainer of the effected drivers and I
> suggest to test it as well.
> 
> The tests that I made for this series (fio benchmarks and fio verify data):
> 1. iSER initiator on ConnectX-4
> 2. iSER initiator on ConnectX-3
> 3. SRP initiator on ConnectX-4 (loopback to SRP target)
> 4. SRP initiator on ConnectX-3
> 
> Not tested:
> 1. RDS
> 2. mthca
> 3. rdmavt

I think there are a few leftovers:

From f289a67b47e03d268469211065bf114cbb1c7125 Mon Sep 17 00:00:00 2001
From: Gal Pressman <galpress@amazon.com>
Date: Wed, 13 May 2020 10:49:09 +0300
Subject: [PATCH] RDMA/mlx5: Remove FMR leftovers

Remove a few leftovers from FMR functionality which are no longer used.

Signed-off-by: Gal Pressman <galpress@amazon.com>
---
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 482b54eb9764..40c461017763 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -675,12 +675,6 @@ struct umr_common {
 	struct semaphore	sem;
 };
 
-enum {
-	MLX5_FMR_INVALID,
-	MLX5_FMR_VALID,
-	MLX5_FMR_BUSY,
-};
-
 struct mlx5_cache_ent {
 	struct list_head	head;
 	/* sync access to the cahce entry
@@ -1253,8 +1247,6 @@ int mlx5_query_mad_ifc_port(struct ib_device *ibdev, u8 port,
 			    struct ib_port_attr *props);
 int mlx5_ib_query_port(struct ib_device *ibdev, u8 port,
 		       struct ib_port_attr *props);
-int mlx5_ib_init_fmr(struct mlx5_ib_dev *dev);
-void mlx5_ib_cleanup_fmr(struct mlx5_ib_dev *dev);
 void mlx5_ib_cont_pages(struct ib_umem *umem, u64 addr,
 			unsigned long max_page_shift,
 			int *count, int *shift,
-- 
2.26.2

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-14 15:13 ` [PATCH 0/8 v1] Remove FMR support from RDMA drivers Aron Silverton
@ 2020-05-14 18:18   ` santosh.shilimkar
  2020-05-14 19:42     ` Max Gurtovoy
  2020-05-14 22:23     ` Sagi Grimberg
  0 siblings, 2 replies; 33+ messages in thread
From: santosh.shilimkar @ 2020-05-14 18:18 UTC (permalink / raw)
  To: Aron Silverton, Max Gurtovoy
  Cc: bvanassche, Jason Gunthorpe, linux-rdma, dledford, leon, sagi,
	israelr, shlomin



On 5/14/20 8:13 AM, Aron Silverton wrote:
> +Santosh
> 
> You probably meant to copy the RDS maintainer? Not sure if this should have
> also been sent to netdev@vger.kernel.org.
> 
Thanks Aron.

> 
>> On May 14, 2020, at 7:02 AM, Max Gurtovoy <maxg@mellanox.com> wrote:
>>
>> This series removes the support for FMR mode to register memory. This ancient
>> mode is unsafe and not maintained/tested in the last few years. It also doesn't
>> have any reasonable advantage over other memory registration methods such as
>> FRWR (that is implemented in all the recent RDMA adapters). This series should
>> be reviewed and approved by the maintainer of the effected drivers and I
>> suggest to test it as well.
>>
I know the security issue has been brought up before and this plan of 
removal of FMR support was on the cards but on RDS at least on CX3 we
got more throughput with FMR vs FRWR. And the reasons are well
understood as well why its the case.

Is it possible to keep core support still around so that HCA's which
supports FMR, ULPs can still can leverage it if they want.
 From RDS perspective, if the HCA like CX3 doesn't support both modes,
code prefers FMR vs FRWR and hence the question.

Also while re-posting the series, please copy me on the patches.

Regards,
Santosh


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-14 18:18   ` santosh.shilimkar
@ 2020-05-14 19:42     ` Max Gurtovoy
  2020-05-14 22:23     ` Sagi Grimberg
  1 sibling, 0 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-14 19:42 UTC (permalink / raw)
  To: santosh.shilimkar, Aron Silverton
  Cc: bvanassche, Jason Gunthorpe, linux-rdma, dledford, leon, sagi,
	israelr, shlomin


On 5/14/2020 9:18 PM, santosh.shilimkar@oracle.com wrote:
>
>
> On 5/14/20 8:13 AM, Aron Silverton wrote:
>> +Santosh
>>
>> You probably meant to copy the RDS maintainer? Not sure if this 
>> should have
>> also been sent to netdev@vger.kernel.org.
>>
> Thanks Aron.
>
>>
>>> On May 14, 2020, at 7:02 AM, Max Gurtovoy <maxg@mellanox.com> wrote:
>>>
>>> This series removes the support for FMR mode to register memory. 
>>> This ancient
>>> mode is unsafe and not maintained/tested in the last few years. It 
>>> also doesn't
>>> have any reasonable advantage over other memory registration methods 
>>> such as
>>> FRWR (that is implemented in all the recent RDMA adapters). This 
>>> series should
>>> be reviewed and approved by the maintainer of the effected drivers 
>>> and I
>>> suggest to test it as well.
>>>
> I know the security issue has been brought up before and this plan of 
> removal of FMR support was on the cards but on RDS at least on CX3 we
> got more throughput with FMR vs FRWR. And the reasons are well
> understood as well why its the case.

Please share the benchmark for both and the reasons that you understood 
for the difference.

We'll try to analyze how to improve ConnectX-3 FRWR


>
> Is it possible to keep core support still around so that HCA's which
> supports FMR, ULPs can still can leverage it if they want.
> From RDS perspective, if the HCA like CX3 doesn't support both modes,
> code prefers FMR vs FRWR and hence the question.

Well this series delete around 2400-2500 LOC that are probably almost 
not used and tested.

Lets try to understand the perf degradation.


>
> Also while re-posting the series, please copy me on the patches.

Sure.


>
> Regards,
> Santosh
>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-14 18:18   ` santosh.shilimkar
  2020-05-14 19:42     ` Max Gurtovoy
@ 2020-05-14 22:23     ` Sagi Grimberg
  2020-05-14 23:41       ` santosh.shilimkar
  2020-05-15  0:37       ` Max Gurtovoy
  1 sibling, 2 replies; 33+ messages in thread
From: Sagi Grimberg @ 2020-05-14 22:23 UTC (permalink / raw)
  To: santosh.shilimkar, Aron Silverton, Max Gurtovoy
  Cc: bvanassche, Jason Gunthorpe, linux-rdma, dledford, leon, israelr,
	shlomin


>> +Santosh
>>
>> You probably meant to copy the RDS maintainer? Not sure if this should 
>> have
>> also been sent to netdev@vger.kernel.org.
>>
> Thanks Aron.
> 
>>
>>> On May 14, 2020, at 7:02 AM, Max Gurtovoy <maxg@mellanox.com> wrote:
>>>
>>> This series removes the support for FMR mode to register memory. This 
>>> ancient
>>> mode is unsafe and not maintained/tested in the last few years. It 
>>> also doesn't
>>> have any reasonable advantage over other memory registration methods 
>>> such as
>>> FRWR (that is implemented in all the recent RDMA adapters). This 
>>> series should
>>> be reviewed and approved by the maintainer of the effected drivers and I
>>> suggest to test it as well.
>>>
> I know the security issue has been brought up before and this plan of 
> removal of FMR support was on the cards

Actually, what is unsafe is not necessarily fmrs, but rather the
fmr_pool interface. So Max, you can keep fmr if rds wants it, but
we can get rid of fmr pools.

> but on RDS at least on CX3 we
> got more throughput with FMR vs FRWR. And the reasons are well
> understood as well why its the case.

Looking at the rds code, it seems that rds doesn't do any fast
registration at all, rkeys are long lived and are only invalidated (or
unmaped) when need recycling or when a socket is torn down...

So I'm not clear exactly about the model here, but seems to me
its almost like rds needs something like phys_mr, which is static for
all of its lifetime. It seems that fmrs just create a hassle for
rds, unless I'm missing something...

Having said that, it surely isn't the most secure behavior...
At least its not the global dma rkey...

> Is it possible to keep core support still around so that HCA's which
> supports FMR, ULPs can still can leverage it if they want.
>  From RDS perspective, if the HCA like CX3 doesn't support both modes,
> code prefers FMR vs FRWR and hence the question.

Max can start by removing fmr_pools, fmrs can stay as there is nothing
fundamentally wrong with them. And apparently there are still users.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-14 22:23     ` Sagi Grimberg
@ 2020-05-14 23:41       ` santosh.shilimkar
  2020-05-15 16:52         ` Tom Talpey
  2020-05-15  0:37       ` Max Gurtovoy
  1 sibling, 1 reply; 33+ messages in thread
From: santosh.shilimkar @ 2020-05-14 23:41 UTC (permalink / raw)
  To: Sagi Grimberg, Aron Silverton, Max Gurtovoy
  Cc: bvanassche, Jason Gunthorpe, linux-rdma, dledford, leon, israelr,
	shlomin

On 5/14/20 3:23 PM, Sagi Grimberg wrote:
> 
>>> +Santosh
>>>
>>> You probably meant to copy the RDS maintainer? Not sure if this 
>>> should have
>>> also been sent to netdev@vger.kernel.org.
>>>
>> Thanks Aron.
>>
>>>
>>>> On May 14, 2020, at 7:02 AM, Max Gurtovoy <maxg@mellanox.com> wrote:
>>>>
>>>> This series removes the support for FMR mode to register memory. 
>>>> This ancient
>>>> mode is unsafe and not maintained/tested in the last few years. It 
>>>> also doesn't
>>>> have any reasonable advantage over other memory registration methods 
>>>> such as
>>>> FRWR (that is implemented in all the recent RDMA adapters). This 
>>>> series should
>>>> be reviewed and approved by the maintainer of the effected drivers 
>>>> and I
>>>> suggest to test it as well.
>>>>
>> I know the security issue has been brought up before and this plan of 
>> removal of FMR support was on the cards
> 
> Actually, what is unsafe is not necessarily fmrs, but rather the
> fmr_pool interface. So Max, you can keep fmr if rds wants it, but
> we can get rid of fmr pools.
>
Good point. We aren't using the fmr_pools.

>> but on RDS at least on CX3 we
>> got more throughput with FMR vs FRWR. And the reasons are well
>> understood as well why its the case.
> 
> Looking at the rds code, it seems that rds doesn't do any fast
> registration at all, rkeys are long lived and are only invalidated (or
> unmaped) when need recycling or when a socket is torn down...
> 
> So I'm not clear exactly about the model here, but seems to me
> its almost like rds needs something like phys_mr, which is static for
> all of its lifetime. It seems that fmrs just create a hassle for
> rds, unless I'm missing something...
> 
> Having said that, it surely isn't the most secure behavior...
> At least its not the global dma rkey...
>
There are couple of layers but you can see the FRWR code inside,
net/rds/ib_frmr.c. The MR allocation as well as free/invalidation
is managed from user-land instead of ULP data path. There are
couple of cases where some use_once semantics does MR invalidation
within kernel but thats only because userland indicated that MR
key can be invalidated after issued RDMA ops is complete.

>> Is it possible to keep core support still around so that HCA's which
>> supports FMR, ULPs can still can leverage it if they want.
>>  From RDS perspective, if the HCA like CX3 doesn't support both modes,
>> code prefers FMR vs FRWR and hence the question.
> 
> Max can start by removing fmr_pools, fmrs can stay as there is nothing
> fundamentally wrong with them. And apparently there are still users.

That will surely help if its an option. RDS don't use fmr_pools so no
issues there.

Regards,
Santosh



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-14 22:23     ` Sagi Grimberg
  2020-05-14 23:41       ` santosh.shilimkar
@ 2020-05-15  0:37       ` Max Gurtovoy
  1 sibling, 0 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-15  0:37 UTC (permalink / raw)
  To: Sagi Grimberg, santosh.shilimkar, Aron Silverton
  Cc: bvanassche, Jason Gunthorpe, linux-rdma, dledford, leon, israelr,
	shlomin


On 5/15/2020 1:23 AM, Sagi Grimberg wrote:
>
>>> +Santosh
>>>
>>> You probably meant to copy the RDS maintainer? Not sure if this 
>>> should have
>>> also been sent to netdev@vger.kernel.org.
>>>
>> Thanks Aron.
>>
>>>
>>>> On May 14, 2020, at 7:02 AM, Max Gurtovoy <maxg@mellanox.com> wrote:
>>>>
>>>> This series removes the support for FMR mode to register memory. 
>>>> This ancient
>>>> mode is unsafe and not maintained/tested in the last few years. It 
>>>> also doesn't
>>>> have any reasonable advantage over other memory registration 
>>>> methods such as
>>>> FRWR (that is implemented in all the recent RDMA adapters). This 
>>>> series should
>>>> be reviewed and approved by the maintainer of the effected drivers 
>>>> and I
>>>> suggest to test it as well.
>>>>
>> I know the security issue has been brought up before and this plan of 
>> removal of FMR support was on the cards
>
> Actually, what is unsafe is not necessarily fmrs, but rather the
> fmr_pool interface. So Max, you can keep fmr if rds wants it, but
> we can get rid of fmr pools.
>
>> but on RDS at least on CX3 we
>> got more throughput with FMR vs FRWR. And the reasons are well
>> understood as well why its the case.
>
> Looking at the rds code, it seems that rds doesn't do any fast
> registration at all, rkeys are long lived and are only invalidated (or
> unmaped) when need recycling or when a socket is torn down...
>
> So I'm not clear exactly about the model here, but seems to me
> its almost like rds needs something like phys_mr, which is static for
> all of its lifetime. It seems that fmrs just create a hassle for
> rds, unless I'm missing something...
>
> Having said that, it surely isn't the most secure behavior...
> At least its not the global dma rkey...
>
>> Is it possible to keep core support still around so that HCA's which
>> supports FMR, ULPs can still can leverage it if they want.
>>  From RDS perspective, if the HCA like CX3 doesn't support both modes,
>> code prefers FMR vs FRWR and hence the question.
>
> Max can start by removing fmr_pools, fmrs can stay as there is nothing
> fundamentally wrong with them. And apparently there are still users.

Ok we can start with this.

Please review patches 5, 6, 7 that are stand alone.

And I'll send a V2 for SRP/ISER/FMR_pool only.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-14 23:41       ` santosh.shilimkar
@ 2020-05-15 16:52         ` Tom Talpey
  2020-05-15 18:59           ` Sagi Grimberg
  0 siblings, 1 reply; 33+ messages in thread
From: Tom Talpey @ 2020-05-15 16:52 UTC (permalink / raw)
  To: santosh.shilimkar, Sagi Grimberg, Aron Silverton, Max Gurtovoy
  Cc: bvanassche, Jason Gunthorpe, linux-rdma, dledford, leon, israelr,
	shlomin

On 5/14/2020 7:41 PM, santosh.shilimkar@oracle.com wrote:
> On 5/14/20 3:23 PM, Sagi Grimberg wrote:
>>
>>>> +Santosh
>>>>
>>>> You probably meant to copy the RDS maintainer? Not sure if this 
>>>> should have
>>>> also been sent to netdev@vger.kernel.org.
>>>>
>>> Thanks Aron.
>>>
>>>>
>>>>> On May 14, 2020, at 7:02 AM, Max Gurtovoy <maxg@mellanox.com> wrote:
>>>>>
>>>>> This series removes the support for FMR mode to register memory. 
>>>>> This ancient
>>>>> mode is unsafe and not maintained/tested in the last few years. It 
>>>>> also doesn't
>>>>> have any reasonable advantage over other memory registration 
>>>>> methods such as
>>>>> FRWR (that is implemented in all the recent RDMA adapters). This 
>>>>> series should
>>>>> be reviewed and approved by the maintainer of the effected drivers 
>>>>> and I
>>>>> suggest to test it as well.
>>>>>
>>> I know the security issue has been brought up before and this plan of 
>>> removal of FMR support was on the cards
>>
>> Actually, what is unsafe is not necessarily fmrs, but rather the
>> fmr_pool interface. So Max, you can keep fmr if rds wants it, but
>> we can get rid of fmr pools.
>>
> Good point. We aren't using the fmr_pools.

It's not only the fmr_pools. The FMR API operates on a page granularity,
so a sub-page registration, or a non-page-aligned one, ends up exposing
data outside the buffer. This is done silently, and is a significant
vulnerability for most upper layers.

Tom.

>>> but on RDS at least on CX3 we
>>> got more throughput with FMR vs FRWR. And the reasons are well
>>> understood as well why its the case.
>>
>> Looking at the rds code, it seems that rds doesn't do any fast
>> registration at all, rkeys are long lived and are only invalidated (or
>> unmaped) when need recycling or when a socket is torn down...
>>
>> So I'm not clear exactly about the model here, but seems to me
>> its almost like rds needs something like phys_mr, which is static for
>> all of its lifetime. It seems that fmrs just create a hassle for
>> rds, unless I'm missing something...
>>
>> Having said that, it surely isn't the most secure behavior...
>> At least its not the global dma rkey...
>>
> There are couple of layers but you can see the FRWR code inside,
> net/rds/ib_frmr.c. The MR allocation as well as free/invalidation
> is managed from user-land instead of ULP data path. There are
> couple of cases where some use_once semantics does MR invalidation
> within kernel but thats only because userland indicated that MR
> key can be invalidated after issued RDMA ops is complete.
> 
>>> Is it possible to keep core support still around so that HCA's which
>>> supports FMR, ULPs can still can leverage it if they want.
>>>  From RDS perspective, if the HCA like CX3 doesn't support both modes,
>>> code prefers FMR vs FRWR and hence the question.
>>
>> Max can start by removing fmr_pools, fmrs can stay as there is nothing
>> fundamentally wrong with them. And apparently there are still users.
> 
> That will surely help if its an option. RDS don't use fmr_pools so no
> issues there.
> 
> Regards,
> Santosh
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-15 16:52         ` Tom Talpey
@ 2020-05-15 18:59           ` Sagi Grimberg
  2020-05-17 10:51             ` Max Gurtovoy
  0 siblings, 1 reply; 33+ messages in thread
From: Sagi Grimberg @ 2020-05-15 18:59 UTC (permalink / raw)
  To: Tom Talpey, santosh.shilimkar, Aron Silverton, Max Gurtovoy
  Cc: bvanassche, Jason Gunthorpe, linux-rdma, dledford, leon, israelr,
	shlomin


> It's not only the fmr_pools. The FMR API operates on a page granularity,
> so a sub-page registration, or a non-page-aligned one, ends up exposing
> data outside the buffer. This is done silently, and is a significant
> vulnerability for most upper layers.

You're right Tom, I forgot about that...
I guess I'd vote to remove it altogether then...

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-14 16:00 ` Gal Pressman
@ 2020-05-17 10:37   ` Max Gurtovoy
  0 siblings, 0 replies; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-17 10:37 UTC (permalink / raw)
  To: Gal Pressman, bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin


On 5/14/2020 7:00 PM, Gal Pressman wrote:
> On 14/05/2020 15:02, Max Gurtovoy wrote:
>> This series removes the support for FMR mode to register memory. This ancient
>> mode is unsafe and not maintained/tested in the last few years. It also doesn't
>> have any reasonable advantage over other memory registration methods such as
>> FRWR (that is implemented in all the recent RDMA adapters). This series should
>> be reviewed and approved by the maintainer of the effected drivers and I
>> suggest to test it as well.
>>
>> The tests that I made for this series (fio benchmarks and fio verify data):
>> 1. iSER initiator on ConnectX-4
>> 2. iSER initiator on ConnectX-3
>> 3. SRP initiator on ConnectX-4 (loopback to SRP target)
>> 4. SRP initiator on ConnectX-3
>>
>> Not tested:
>> 1. RDS
>> 2. mthca
>> 3. rdmavt
> I think there are a few leftovers:
>
>  From f289a67b47e03d268469211065bf114cbb1c7125 Mon Sep 17 00:00:00 2001
> From: Gal Pressman <galpress@amazon.com>
> Date: Wed, 13 May 2020 10:49:09 +0300
> Subject: [PATCH] RDMA/mlx5: Remove FMR leftovers
>
> Remove a few leftovers from FMR functionality which are no longer used.
>
> Signed-off-by: Gal Pressman <galpress@amazon.com>
> ---
>   drivers/infiniband/hw/mlx5/mlx5_ib.h | 8 --------
>   1 file changed, 8 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> index 482b54eb9764..40c461017763 100644
> --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
> +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> @@ -675,12 +675,6 @@ struct umr_common {
>   	struct semaphore	sem;
>   };
>   
> -enum {
> -	MLX5_FMR_INVALID,
> -	MLX5_FMR_VALID,
> -	MLX5_FMR_BUSY,
> -};
> -
>   struct mlx5_cache_ent {
>   	struct list_head	head;
>   	/* sync access to the cahce entry
> @@ -1253,8 +1247,6 @@ int mlx5_query_mad_ifc_port(struct ib_device *ibdev, u8 port,
>   			    struct ib_port_attr *props);
>   int mlx5_ib_query_port(struct ib_device *ibdev, u8 port,
>   		       struct ib_port_attr *props);
> -int mlx5_ib_init_fmr(struct mlx5_ib_dev *dev);
> -void mlx5_ib_cleanup_fmr(struct mlx5_ib_dev *dev);
>   void mlx5_ib_cont_pages(struct ib_umem *umem, u64 addr,
>   			unsigned long max_page_shift,
>   			int *count, int *shift,

Thanks Gal.

This is a dead code regardless to this series but I'll add it.


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-15 18:59           ` Sagi Grimberg
@ 2020-05-17 10:51             ` Max Gurtovoy
  2020-05-18 16:34               ` santosh.shilimkar
  0 siblings, 1 reply; 33+ messages in thread
From: Max Gurtovoy @ 2020-05-17 10:51 UTC (permalink / raw)
  To: Sagi Grimberg, Tom Talpey, santosh.shilimkar, Aron Silverton
  Cc: bvanassche, Jason Gunthorpe, linux-rdma, dledford, leon, israelr,
	shlomin


On 5/15/2020 9:59 PM, Sagi Grimberg wrote:
>
>> It's not only the fmr_pools. The FMR API operates on a page granularity,
>> so a sub-page registration, or a non-page-aligned one, ends up exposing
>> data outside the buffer. This is done silently, and is a significant
>> vulnerability for most upper layers.
>
> You're right Tom, I forgot about that...
> I guess I'd vote to remove it altogether then...

Santosh,

I don't really understand the mechanism you use from user/kernel 
regarding the registration.

But I saw you use wait_event and wait for completing the 
FRWR/INVALIDATION. For sure this is not optimal.

But can't really say what is the bottleneck in your case.

Can you share the numbers you get for FMR/FRWR benchmarks ?


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
                   ` (9 preceding siblings ...)
  2020-05-14 16:00 ` Gal Pressman
@ 2020-05-18 15:20 ` Dennis Dalessandro
  2020-05-18 18:10   ` Jason Gunthorpe
  10 siblings, 1 reply; 33+ messages in thread
From: Dennis Dalessandro @ 2020-05-18 15:20 UTC (permalink / raw)
  To: Max Gurtovoy, bvanassche, jgg, linux-rdma, dledford, leon
  Cc: sagi, israelr, shlomin, Marciniszyn, Mike

On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
> This series removes the support for FMR mode to register memory. This ancient
> mode is unsafe and not maintained/tested in the last few years. It also doesn't
> have any reasonable advantage over other memory registration methods such as
> FRWR (that is implemented in all the recent RDMA adapters). This series should
> be reviewed and approved by the maintainer of the effected drivers and I
> suggest to test it as well.
> 
> The tests that I made for this series (fio benchmarks and fio verify data):
> 1. iSER initiator on ConnectX-4
> 2. iSER initiator on ConnectX-3
> 3. SRP initiator on ConnectX-4 (loopback to SRP target)
> 4. SRP initiator on ConnectX-3
> 
> Not tested:
> 1. RDS
> 2. mthca
> 3. rdmavt

This will effectively kill qib which uses rdmavt. It's gonna have to be 
a NAK from me.

-Denny


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-17 10:51             ` Max Gurtovoy
@ 2020-05-18 16:34               ` santosh.shilimkar
  0 siblings, 0 replies; 33+ messages in thread
From: santosh.shilimkar @ 2020-05-18 16:34 UTC (permalink / raw)
  To: Max Gurtovoy, Sagi Grimberg, Tom Talpey, Aron Silverton
  Cc: bvanassche, Jason Gunthorpe, linux-rdma, dledford, leon, israelr,
	shlomin

On 5/17/20 3:51 AM, Max Gurtovoy wrote:
> 
> On 5/15/2020 9:59 PM, Sagi Grimberg wrote:
>>
>>> It's not only the fmr_pools. The FMR API operates on a page granularity,
>>> so a sub-page registration, or a non-page-aligned one, ends up exposing
>>> data outside the buffer. This is done silently, and is a significant
>>> vulnerability for most upper layers.
>>
>> You're right Tom, I forgot about that...
>> I guess I'd vote to remove it altogether then...
> 
> Santosh,
> 
> I don't really understand the mechanism you use from user/kernel 
> regarding the registration.
> 
> But I saw you use wait_event and wait for completing the 
> FRWR/INVALIDATION. For sure this is not optimal.
> 
> But can't really say what is the bottleneck in your case.
> 
> Can you share the numbers you get for FMR/FRWR benchmarks ?
> 
The upstream code for FRWR is behind the optimal code we have
been using. The invalidation path is removed from datapath
with proxy QP within the PD.

Anyway, please bounce your patches for me to review to make sure
nothing breaks. From various responses, it seems to be aligned to
move ahead with the patch-set, so lets go with it.

Regards,
Santosh

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-18 15:20 ` Dennis Dalessandro
@ 2020-05-18 18:10   ` Jason Gunthorpe
  2020-05-19 13:43     ` Dennis Dalessandro
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2020-05-18 18:10 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: Max Gurtovoy, bvanassche, linux-rdma, dledford, leon, sagi,
	israelr, shlomin, Marciniszyn, Mike

On Mon, May 18, 2020 at 11:20:04AM -0400, Dennis Dalessandro wrote:
> On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
> > This series removes the support for FMR mode to register memory. This ancient
> > mode is unsafe and not maintained/tested in the last few years. It also doesn't
> > have any reasonable advantage over other memory registration methods such as
> > FRWR (that is implemented in all the recent RDMA adapters). This series should
> > be reviewed and approved by the maintainer of the effected drivers and I
> > suggest to test it as well.
> > 
> > The tests that I made for this series (fio benchmarks and fio verify data):
> > 1. iSER initiator on ConnectX-4
> > 2. iSER initiator on ConnectX-3
> > 3. SRP initiator on ConnectX-4 (loopback to SRP target)
> > 4. SRP initiator on ConnectX-3
> > 
> > Not tested:
> > 1. RDS
> > 2. mthca
> > 3. rdmavt
> 
> This will effectively kill qib which uses rdmavt. It's gonna have to be a
> NAK from me.

Are you objecting the SRP and iSER changes too?

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-18 18:10   ` Jason Gunthorpe
@ 2020-05-19 13:43     ` Dennis Dalessandro
  2020-05-19 13:53       ` Jason Gunthorpe
  0 siblings, 1 reply; 33+ messages in thread
From: Dennis Dalessandro @ 2020-05-19 13:43 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Max Gurtovoy, bvanassche, linux-rdma, dledford, leon, sagi,
	israelr, shlomin, Marciniszyn, Mike

On 5/18/2020 2:10 PM, Jason Gunthorpe wrote:
> On Mon, May 18, 2020 at 11:20:04AM -0400, Dennis Dalessandro wrote:
>> On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
>>> This series removes the support for FMR mode to register memory. This ancient
>>> mode is unsafe and not maintained/tested in the last few years. It also doesn't
>>> have any reasonable advantage over other memory registration methods such as
>>> FRWR (that is implemented in all the recent RDMA adapters). This series should
>>> be reviewed and approved by the maintainer of the effected drivers and I
>>> suggest to test it as well.
>>>
>>> The tests that I made for this series (fio benchmarks and fio verify data):
>>> 1. iSER initiator on ConnectX-4
>>> 2. iSER initiator on ConnectX-3
>>> 3. SRP initiator on ConnectX-4 (loopback to SRP target)
>>> 4. SRP initiator on ConnectX-3
>>>
>>> Not tested:
>>> 1. RDS
>>> 2. mthca
>>> 3. rdmavt
>>
>> This will effectively kill qib which uses rdmavt. It's gonna have to be a
>> NAK from me.
> 
> Are you objecting the SRP and iSER changes too?

No, just want to keep basic verbs support at least. NFS already dropped, 
similarly we are ok with dropping it from SRP/iSER as a next step.

-Denny


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-19 13:43     ` Dennis Dalessandro
@ 2020-05-19 13:53       ` Jason Gunthorpe
  2020-05-19 14:19         ` Leon Romanovsky
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2020-05-19 13:53 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: Max Gurtovoy, bvanassche, linux-rdma, dledford, leon, sagi,
	israelr, shlomin, Marciniszyn, Mike

On Tue, May 19, 2020 at 09:43:14AM -0400, Dennis Dalessandro wrote:
> On 5/18/2020 2:10 PM, Jason Gunthorpe wrote:
> > On Mon, May 18, 2020 at 11:20:04AM -0400, Dennis Dalessandro wrote:
> > > On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
> > > > This series removes the support for FMR mode to register memory. This ancient
> > > > mode is unsafe and not maintained/tested in the last few years. It also doesn't
> > > > have any reasonable advantage over other memory registration methods such as
> > > > FRWR (that is implemented in all the recent RDMA adapters). This series should
> > > > be reviewed and approved by the maintainer of the effected drivers and I
> > > > suggest to test it as well.
> > > > 
> > > > The tests that I made for this series (fio benchmarks and fio verify data):
> > > > 1. iSER initiator on ConnectX-4
> > > > 2. iSER initiator on ConnectX-3
> > > > 3. SRP initiator on ConnectX-4 (loopback to SRP target)
> > > > 4. SRP initiator on ConnectX-3
> > > > 
> > > > Not tested:
> > > > 1. RDS
> > > > 2. mthca
> > > > 3. rdmavt
> > > 
> > > This will effectively kill qib which uses rdmavt. It's gonna have to be a
> > > NAK from me.
> > 
> > Are you objecting the SRP and iSER changes too?
> 
> No, just want to keep basic verbs support at least. NFS already dropped,
> similarly we are ok with dropping it from SRP/iSER as a next step.

So you see a major user in RDS for qib?

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-19 13:53       ` Jason Gunthorpe
@ 2020-05-19 14:19         ` Leon Romanovsky
  2020-05-19 14:26           ` Dennis Dalessandro
  0 siblings, 1 reply; 33+ messages in thread
From: Leon Romanovsky @ 2020-05-19 14:19 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Dennis Dalessandro, Max Gurtovoy, bvanassche, linux-rdma,
	dledford, sagi, israelr, shlomin, Marciniszyn, Mike

On Tue, May 19, 2020 at 10:53:52AM -0300, Jason Gunthorpe wrote:
> On Tue, May 19, 2020 at 09:43:14AM -0400, Dennis Dalessandro wrote:
> > On 5/18/2020 2:10 PM, Jason Gunthorpe wrote:
> > > On Mon, May 18, 2020 at 11:20:04AM -0400, Dennis Dalessandro wrote:
> > > > On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
> > > > > This series removes the support for FMR mode to register memory. This ancient
> > > > > mode is unsafe and not maintained/tested in the last few years. It also doesn't
> > > > > have any reasonable advantage over other memory registration methods such as
> > > > > FRWR (that is implemented in all the recent RDMA adapters). This series should
> > > > > be reviewed and approved by the maintainer of the effected drivers and I
> > > > > suggest to test it as well.
> > > > >
> > > > > The tests that I made for this series (fio benchmarks and fio verify data):
> > > > > 1. iSER initiator on ConnectX-4
> > > > > 2. iSER initiator on ConnectX-3
> > > > > 3. SRP initiator on ConnectX-4 (loopback to SRP target)
> > > > > 4. SRP initiator on ConnectX-3
> > > > >
> > > > > Not tested:
> > > > > 1. RDS
> > > > > 2. mthca
> > > > > 3. rdmavt
> > > >
> > > > This will effectively kill qib which uses rdmavt. It's gonna have to be a
> > > > NAK from me.
> > >
> > > Are you objecting the SRP and iSER changes too?
> >
> > No, just want to keep basic verbs support at least. NFS already dropped,
> > similarly we are ok with dropping it from SRP/iSER as a next step.
>
> So you see a major user in RDS for qib?

Didn't we agree to drop it from RDS too?

Thanks

>
> Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-19 14:19         ` Leon Romanovsky
@ 2020-05-19 14:26           ` Dennis Dalessandro
  2020-05-19 14:30             ` Jason Gunthorpe
  0 siblings, 1 reply; 33+ messages in thread
From: Dennis Dalessandro @ 2020-05-19 14:26 UTC (permalink / raw)
  To: Leon Romanovsky, Jason Gunthorpe
  Cc: Max Gurtovoy, bvanassche, linux-rdma, dledford, sagi, israelr,
	shlomin, Marciniszyn, Mike

On 5/19/2020 10:19 AM, Leon Romanovsky wrote:
> On Tue, May 19, 2020 at 10:53:52AM -0300, Jason Gunthorpe wrote:
>> On Tue, May 19, 2020 at 09:43:14AM -0400, Dennis Dalessandro wrote:
>>> On 5/18/2020 2:10 PM, Jason Gunthorpe wrote:
>>>> On Mon, May 18, 2020 at 11:20:04AM -0400, Dennis Dalessandro wrote:
>>>>> On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
>>>>>> This series removes the support for FMR mode to register memory. This ancient
>>>>>> mode is unsafe and not maintained/tested in the last few years. It also doesn't
>>>>>> have any reasonable advantage over other memory registration methods such as
>>>>>> FRWR (that is implemented in all the recent RDMA adapters). This series should
>>>>>> be reviewed and approved by the maintainer of the effected drivers and I
>>>>>> suggest to test it as well.
>>>>>>
>>>>>> The tests that I made for this series (fio benchmarks and fio verify data):
>>>>>> 1. iSER initiator on ConnectX-4
>>>>>> 2. iSER initiator on ConnectX-3
>>>>>> 3. SRP initiator on ConnectX-4 (loopback to SRP target)
>>>>>> 4. SRP initiator on ConnectX-3
>>>>>>
>>>>>> Not tested:
>>>>>> 1. RDS
>>>>>> 2. mthca
>>>>>> 3. rdmavt
>>>>>
>>>>> This will effectively kill qib which uses rdmavt. It's gonna have to be a
>>>>> NAK from me.
>>>>
>>>> Are you objecting the SRP and iSER changes too?
>>>
>>> No, just want to keep basic verbs support at least. NFS already dropped,
>>> similarly we are ok with dropping it from SRP/iSER as a next step.
>>
>> So you see a major user in RDS for qib?
> 
> Didn't we agree to drop it from RDS too?
> 

Just basic verbs application support is enough for qib I think. I don't 
see any major use of RDS.

-Denny

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-19 14:26           ` Dennis Dalessandro
@ 2020-05-19 14:30             ` Jason Gunthorpe
  2020-05-19 14:37               ` Dennis Dalessandro
  0 siblings, 1 reply; 33+ messages in thread
From: Jason Gunthorpe @ 2020-05-19 14:30 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: Leon Romanovsky, Max Gurtovoy, bvanassche, linux-rdma, dledford,
	sagi, israelr, shlomin, Marciniszyn, Mike

On Tue, May 19, 2020 at 10:26:37AM -0400, Dennis Dalessandro wrote:
> On 5/19/2020 10:19 AM, Leon Romanovsky wrote:
> > On Tue, May 19, 2020 at 10:53:52AM -0300, Jason Gunthorpe wrote:
> > > On Tue, May 19, 2020 at 09:43:14AM -0400, Dennis Dalessandro wrote:
> > > > On 5/18/2020 2:10 PM, Jason Gunthorpe wrote:
> > > > > On Mon, May 18, 2020 at 11:20:04AM -0400, Dennis Dalessandro wrote:
> > > > > > On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
> > > > > > > This series removes the support for FMR mode to register memory. This ancient
> > > > > > > mode is unsafe and not maintained/tested in the last few years. It also doesn't
> > > > > > > have any reasonable advantage over other memory registration methods such as
> > > > > > > FRWR (that is implemented in all the recent RDMA adapters). This series should
> > > > > > > be reviewed and approved by the maintainer of the effected drivers and I
> > > > > > > suggest to test it as well.
> > > > > > > 
> > > > > > > The tests that I made for this series (fio benchmarks and fio verify data):
> > > > > > > 1. iSER initiator on ConnectX-4
> > > > > > > 2. iSER initiator on ConnectX-3
> > > > > > > 3. SRP initiator on ConnectX-4 (loopback to SRP target)
> > > > > > > 4. SRP initiator on ConnectX-3
> > > > > > > 
> > > > > > > Not tested:
> > > > > > > 1. RDS
> > > > > > > 2. mthca
> > > > > > > 3. rdmavt
> > > > > > 
> > > > > > This will effectively kill qib which uses rdmavt. It's gonna have to be a
> > > > > > NAK from me.
> > > > > 
> > > > > Are you objecting the SRP and iSER changes too?
> > > > 
> > > > No, just want to keep basic verbs support at least. NFS already dropped,
> > > > similarly we are ok with dropping it from SRP/iSER as a next step.
> > > 
> > > So you see a major user in RDS for qib?
> > 
> > Didn't we agree to drop it from RDS too?
> > 
> 
> Just basic verbs application support is enough for qib I think. I don't see
> any major use of RDS.

Well, once the in-kernel users of an API are gone that API will be
purged. This is standard kernel policy.

So you can't NAK this series on the grounds you want to keep an API
without users, presumably for out of tree modules...

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-19 14:30             ` Jason Gunthorpe
@ 2020-05-19 14:37               ` Dennis Dalessandro
  2020-05-23 22:08                 ` Jason Gunthorpe
  0 siblings, 1 reply; 33+ messages in thread
From: Dennis Dalessandro @ 2020-05-19 14:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Max Gurtovoy, bvanassche, linux-rdma, dledford,
	sagi, israelr, shlomin, Marciniszyn, Mike

On 5/19/2020 10:30 AM, Jason Gunthorpe wrote:
> On Tue, May 19, 2020 at 10:26:37AM -0400, Dennis Dalessandro wrote:
>> On 5/19/2020 10:19 AM, Leon Romanovsky wrote:
>>> On Tue, May 19, 2020 at 10:53:52AM -0300, Jason Gunthorpe wrote:
>>>> On Tue, May 19, 2020 at 09:43:14AM -0400, Dennis Dalessandro wrote:
>>>>> On 5/18/2020 2:10 PM, Jason Gunthorpe wrote:
>>>>>> On Mon, May 18, 2020 at 11:20:04AM -0400, Dennis Dalessandro wrote:
>>>>>>> On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
>>>>>>>> This series removes the support for FMR mode to register memory. This ancient
>>>>>>>> mode is unsafe and not maintained/tested in the last few years. It also doesn't
>>>>>>>> have any reasonable advantage over other memory registration methods such as
>>>>>>>> FRWR (that is implemented in all the recent RDMA adapters). This series should
>>>>>>>> be reviewed and approved by the maintainer of the effected drivers and I
>>>>>>>> suggest to test it as well.
>>>>>>>>
>>>>>>>> The tests that I made for this series (fio benchmarks and fio verify data):
>>>>>>>> 1. iSER initiator on ConnectX-4
>>>>>>>> 2. iSER initiator on ConnectX-3
>>>>>>>> 3. SRP initiator on ConnectX-4 (loopback to SRP target)
>>>>>>>> 4. SRP initiator on ConnectX-3
>>>>>>>>
>>>>>>>> Not tested:
>>>>>>>> 1. RDS
>>>>>>>> 2. mthca
>>>>>>>> 3. rdmavt
>>>>>>>
>>>>>>> This will effectively kill qib which uses rdmavt. It's gonna have to be a
>>>>>>> NAK from me.
>>>>>>
>>>>>> Are you objecting the SRP and iSER changes too?
>>>>>
>>>>> No, just want to keep basic verbs support at least. NFS already dropped,
>>>>> similarly we are ok with dropping it from SRP/iSER as a next step.
>>>>
>>>> So you see a major user in RDS for qib?
>>>
>>> Didn't we agree to drop it from RDS too?
>>>
>>
>> Just basic verbs application support is enough for qib I think. I don't see
>> any major use of RDS.
> 
> Well, once the in-kernel users of an API are gone that API will be
> purged. This is standard kernel policy.
> 
> So you can't NAK this series on the grounds you want to keep an API
> without users, presumably for out of tree modules...
> 

Maybe I need to look at this again. I thought it would kill off user 
access as well. We don't need any kernel ULPs.

-Denny


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-19 14:37               ` Dennis Dalessandro
@ 2020-05-23 22:08                 ` Jason Gunthorpe
  2020-05-24  1:27                   ` Tom Talpey
  2020-05-26 15:38                   ` Dennis Dalessandro
  0 siblings, 2 replies; 33+ messages in thread
From: Jason Gunthorpe @ 2020-05-23 22:08 UTC (permalink / raw)
  To: Dennis Dalessandro
  Cc: Leon Romanovsky, Max Gurtovoy, bvanassche, linux-rdma, dledford,
	sagi, israelr, shlomin, Marciniszyn, Mike

On Tue, May 19, 2020 at 10:37:07AM -0400, Dennis Dalessandro wrote:
> On 5/19/2020 10:30 AM, Jason Gunthorpe wrote:
> > On Tue, May 19, 2020 at 10:26:37AM -0400, Dennis Dalessandro wrote:
> > > On 5/19/2020 10:19 AM, Leon Romanovsky wrote:
> > > > On Tue, May 19, 2020 at 10:53:52AM -0300, Jason Gunthorpe wrote:
> > > > > On Tue, May 19, 2020 at 09:43:14AM -0400, Dennis Dalessandro wrote:
> > > > > > On 5/18/2020 2:10 PM, Jason Gunthorpe wrote:
> > > > > > > On Mon, May 18, 2020 at 11:20:04AM -0400, Dennis Dalessandro wrote:
> > > > > > > > On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
> > > > > > > > > This series removes the support for FMR mode to register memory. This ancient
> > > > > > > > > mode is unsafe and not maintained/tested in the last few years. It also doesn't
> > > > > > > > > have any reasonable advantage over other memory registration methods such as
> > > > > > > > > FRWR (that is implemented in all the recent RDMA adapters). This series should
> > > > > > > > > be reviewed and approved by the maintainer of the effected drivers and I
> > > > > > > > > suggest to test it as well.
> > > > > > > > > 
> > > > > > > > > The tests that I made for this series (fio benchmarks and fio verify data):
> > > > > > > > > 1. iSER initiator on ConnectX-4
> > > > > > > > > 2. iSER initiator on ConnectX-3
> > > > > > > > > 3. SRP initiator on ConnectX-4 (loopback to SRP target)
> > > > > > > > > 4. SRP initiator on ConnectX-3
> > > > > > > > > 
> > > > > > > > > Not tested:
> > > > > > > > > 1. RDS
> > > > > > > > > 2. mthca
> > > > > > > > > 3. rdmavt
> > > > > > > > 
> > > > > > > > This will effectively kill qib which uses rdmavt. It's gonna have to be a
> > > > > > > > NAK from me.
> > > > > > > 
> > > > > > > Are you objecting the SRP and iSER changes too?
> > > > > > 
> > > > > > No, just want to keep basic verbs support at least. NFS already dropped,
> > > > > > similarly we are ok with dropping it from SRP/iSER as a next step.
> > > > > 
> > > > > So you see a major user in RDS for qib?
> > > > 
> > > > Didn't we agree to drop it from RDS too?
> > > > 
> > > 
> > > Just basic verbs application support is enough for qib I think. I don't see
> > > any major use of RDS.
> > 
> > Well, once the in-kernel users of an API are gone that API will be
> > purged. This is standard kernel policy.
> > 
> > So you can't NAK this series on the grounds you want to keep an API
> > without users, presumably for out of tree modules...
> > 
> 
> Maybe I need to look at this again. I thought it would kill off user access
> as well. We don't need any kernel ULPs.

Did you make a conclusion? Seems like everyone else is in agreement
here, if Max resends a v2 I'm inclined to take it unless RDS objects.

I did not think FMR or FRWR were available from userspace at all.

Jason

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-23 22:08                 ` Jason Gunthorpe
@ 2020-05-24  1:27                   ` Tom Talpey
  2020-05-26 15:38                   ` Dennis Dalessandro
  1 sibling, 0 replies; 33+ messages in thread
From: Tom Talpey @ 2020-05-24  1:27 UTC (permalink / raw)
  To: Jason Gunthorpe, Dennis Dalessandro
  Cc: Leon Romanovsky, Max Gurtovoy, bvanassche, linux-rdma, dledford,
	sagi, israelr, shlomin, Marciniszyn, Mike

On 5/23/2020 6:08 PM, Jason Gunthorpe wrote:
> On Tue, May 19, 2020 at 10:37:07AM -0400, Dennis Dalessandro wrote:
>> On 5/19/2020 10:30 AM, Jason Gunthorpe wrote:
>>> On Tue, May 19, 2020 at 10:26:37AM -0400, Dennis Dalessandro wrote:
>>>> On 5/19/2020 10:19 AM, Leon Romanovsky wrote:
>>>>> On Tue, May 19, 2020 at 10:53:52AM -0300, Jason Gunthorpe wrote:
>>>>>> On Tue, May 19, 2020 at 09:43:14AM -0400, Dennis Dalessandro wrote:
>>>>>>> On 5/18/2020 2:10 PM, Jason Gunthorpe wrote:
>>>>>>>> On Mon, May 18, 2020 at 11:20:04AM -0400, Dennis Dalessandro wrote:
>>>>>>>>> On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
>>>>>>>>>> This series removes the support for FMR mode to register memory. This ancient
>>>>>>>>>> mode is unsafe and not maintained/tested in the last few years. It also doesn't
>>>>>>>>>> have any reasonable advantage over other memory registration methods such as
>>>>>>>>>> FRWR (that is implemented in all the recent RDMA adapters). This series should
>>>>>>>>>> be reviewed and approved by the maintainer of the effected drivers and I
>>>>>>>>>> suggest to test it as well.
>>>>>>>>>>
>>>>>>>>>> The tests that I made for this series (fio benchmarks and fio verify data):
>>>>>>>>>> 1. iSER initiator on ConnectX-4
>>>>>>>>>> 2. iSER initiator on ConnectX-3
>>>>>>>>>> 3. SRP initiator on ConnectX-4 (loopback to SRP target)
>>>>>>>>>> 4. SRP initiator on ConnectX-3
>>>>>>>>>>
>>>>>>>>>> Not tested:
>>>>>>>>>> 1. RDS
>>>>>>>>>> 2. mthca
>>>>>>>>>> 3. rdmavt
>>>>>>>>>
>>>>>>>>> This will effectively kill qib which uses rdmavt. It's gonna have to be a
>>>>>>>>> NAK from me.
>>>>>>>>
>>>>>>>> Are you objecting the SRP and iSER changes too?
>>>>>>>
>>>>>>> No, just want to keep basic verbs support at least. NFS already dropped,
>>>>>>> similarly we are ok with dropping it from SRP/iSER as a next step.
>>>>>>
>>>>>> So you see a major user in RDS for qib?
>>>>>
>>>>> Didn't we agree to drop it from RDS too?
>>>>>
>>>>
>>>> Just basic verbs application support is enough for qib I think. I don't see
>>>> any major use of RDS.
>>>
>>> Well, once the in-kernel users of an API are gone that API will be
>>> purged. This is standard kernel policy.
>>>
>>> So you can't NAK this series on the grounds you want to keep an API
>>> without users, presumably for out of tree modules...
>>>
>>
>> Maybe I need to look at this again. I thought it would kill off user access
>> as well. We don't need any kernel ULPs.
> 
> Did you make a conclusion? Seems like everyone else is in agreement
> here, if Max resends a v2 I'm inclined to take it unless RDS objects.
> 
> I did not think FMR or FRWR were available from userspace at all.

I certainly hope they're not available from userspace, as they both take
physaddrs to reference memory!

Tom.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 0/8 v1] Remove FMR support from RDMA drivers
  2020-05-23 22:08                 ` Jason Gunthorpe
  2020-05-24  1:27                   ` Tom Talpey
@ 2020-05-26 15:38                   ` Dennis Dalessandro
  1 sibling, 0 replies; 33+ messages in thread
From: Dennis Dalessandro @ 2020-05-26 15:38 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Max Gurtovoy, bvanassche, linux-rdma, dledford,
	sagi, israelr, shlomin, Marciniszyn, Mike

On 5/23/2020 6:08 PM, Jason Gunthorpe wrote:
> On Tue, May 19, 2020 at 10:37:07AM -0400, Dennis Dalessandro wrote:
>> On 5/19/2020 10:30 AM, Jason Gunthorpe wrote:
>>> On Tue, May 19, 2020 at 10:26:37AM -0400, Dennis Dalessandro wrote:
>>>> On 5/19/2020 10:19 AM, Leon Romanovsky wrote:
>>>>> On Tue, May 19, 2020 at 10:53:52AM -0300, Jason Gunthorpe wrote:
>>>>>> On Tue, May 19, 2020 at 09:43:14AM -0400, Dennis Dalessandro wrote:
>>>>>>> On 5/18/2020 2:10 PM, Jason Gunthorpe wrote:
>>>>>>>> On Mon, May 18, 2020 at 11:20:04AM -0400, Dennis Dalessandro wrote:
>>>>>>>>> On 5/14/2020 8:02 AM, Max Gurtovoy wrote:
>>>>>>>>>> This series removes the support for FMR mode to register memory. This ancient
>>>>>>>>>> mode is unsafe and not maintained/tested in the last few years. It also doesn't
>>>>>>>>>> have any reasonable advantage over other memory registration methods such as
>>>>>>>>>> FRWR (that is implemented in all the recent RDMA adapters). This series should
>>>>>>>>>> be reviewed and approved by the maintainer of the effected drivers and I
>>>>>>>>>> suggest to test it as well.
>>>>>>>>>>
>>>>>>>>>> The tests that I made for this series (fio benchmarks and fio verify data):
>>>>>>>>>> 1. iSER initiator on ConnectX-4
>>>>>>>>>> 2. iSER initiator on ConnectX-3
>>>>>>>>>> 3. SRP initiator on ConnectX-4 (loopback to SRP target)
>>>>>>>>>> 4. SRP initiator on ConnectX-3
>>>>>>>>>>
>>>>>>>>>> Not tested:
>>>>>>>>>> 1. RDS
>>>>>>>>>> 2. mthca
>>>>>>>>>> 3. rdmavt
>>>>>>>>>
>>>>>>>>> This will effectively kill qib which uses rdmavt. It's gonna have to be a
>>>>>>>>> NAK from me.
>>>>>>>>
>>>>>>>> Are you objecting the SRP and iSER changes too?
>>>>>>>
>>>>>>> No, just want to keep basic verbs support at least. NFS already dropped,
>>>>>>> similarly we are ok with dropping it from SRP/iSER as a next step.
>>>>>>
>>>>>> So you see a major user in RDS for qib?
>>>>>
>>>>> Didn't we agree to drop it from RDS too?
>>>>>
>>>>
>>>> Just basic verbs application support is enough for qib I think. I don't see
>>>> any major use of RDS.
>>>
>>> Well, once the in-kernel users of an API are gone that API will be
>>> purged. This is standard kernel policy.
>>>
>>> So you can't NAK this series on the grounds you want to keep an API
>>> without users, presumably for out of tree modules...
>>>
>>
>> Maybe I need to look at this again. I thought it would kill off user access
>> as well. We don't need any kernel ULPs.
> 
> Did you make a conclusion? Seems like everyone else is in agreement
> here, if Max resends a v2 I'm inclined to take it unless RDS objects.
> 
> I did not think FMR or FRWR were available from userspace at all.

Yeah, looked it over again and agree it's OK. No issues here now. Thanks 
for checking.

-Denny



^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, back to index

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-14 12:02 [PATCH 0/8 v1] Remove FMR support from RDMA drivers Max Gurtovoy
2020-05-14 12:02 ` [PATCH 1/8] RDMA/mlx4: remove FMR support for memory registration Max Gurtovoy
2020-05-14 12:02 ` [PATCH 2/8] RDMA/rds: " Max Gurtovoy
2020-05-14 12:03 ` [PATCH 3/8] RDMA/mthca: " Max Gurtovoy
2020-05-14 12:03 ` [PATCH 4/8] RDMA/rdmavt: remove FMR " Max Gurtovoy
2020-05-14 12:03 ` [PATCH 5/8] RDMA/iser: Remove support for " Max Gurtovoy
2020-05-14 12:03 ` [PATCH 6/8] RDMA/srp: remove " Max Gurtovoy
2020-05-14 14:02   ` Bart Van Assche
2020-05-14 12:03 ` [PATCH 7/8] RDMA/core: remove FMR pool API Max Gurtovoy
2020-05-14 12:03 ` [PATCH 8/8] RDMA/core: remove FMR device ops Max Gurtovoy
2020-05-14 15:13 ` [PATCH 0/8 v1] Remove FMR support from RDMA drivers Aron Silverton
2020-05-14 18:18   ` santosh.shilimkar
2020-05-14 19:42     ` Max Gurtovoy
2020-05-14 22:23     ` Sagi Grimberg
2020-05-14 23:41       ` santosh.shilimkar
2020-05-15 16:52         ` Tom Talpey
2020-05-15 18:59           ` Sagi Grimberg
2020-05-17 10:51             ` Max Gurtovoy
2020-05-18 16:34               ` santosh.shilimkar
2020-05-15  0:37       ` Max Gurtovoy
2020-05-14 16:00 ` Gal Pressman
2020-05-17 10:37   ` Max Gurtovoy
2020-05-18 15:20 ` Dennis Dalessandro
2020-05-18 18:10   ` Jason Gunthorpe
2020-05-19 13:43     ` Dennis Dalessandro
2020-05-19 13:53       ` Jason Gunthorpe
2020-05-19 14:19         ` Leon Romanovsky
2020-05-19 14:26           ` Dennis Dalessandro
2020-05-19 14:30             ` Jason Gunthorpe
2020-05-19 14:37               ` Dennis Dalessandro
2020-05-23 22:08                 ` Jason Gunthorpe
2020-05-24  1:27                   ` Tom Talpey
2020-05-26 15:38                   ` Dennis Dalessandro

Linux-RDMA Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-rdma/0 linux-rdma/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-rdma linux-rdma/ https://lore.kernel.org/linux-rdma \
		linux-rdma@vger.kernel.org
	public-inbox-index linux-rdma

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-rdma


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git