* [PATCH 1/9] RDMA/mlx5: Remove FMR leftovers
2020-05-27 9:46 [PATCH 0/9 v2] Remove FMR support from RDMA drivers Max Gurtovoy
@ 2020-05-27 9:46 ` Max Gurtovoy
2020-05-27 13:58 ` Leon Romanovsky
2020-05-27 9:46 ` [PATCH 2/9] RDMA/mlx4: remove FMR support for memory registration Max Gurtovoy
` (8 subsequent siblings)
9 siblings, 1 reply; 14+ messages in thread
From: Max Gurtovoy @ 2020-05-27 9:46 UTC (permalink / raw)
To: jgg, dledford, leon, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom
Cc: aron.silverton, israelr, oren, shlomin, vladimirk, Max Gurtovoy
From: Gal Pressman <galpress@amazon.com>
Remove a few leftovers from FMR functionality which are no longer used.
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
drivers/infiniband/hw/mlx5/mlx5_ib.h | 8 --------
1 file changed, 8 deletions(-)
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index a4e5223..211ec84 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -702,12 +702,6 @@ struct umr_common {
struct semaphore sem;
};
-enum {
- MLX5_FMR_INVALID,
- MLX5_FMR_VALID,
- MLX5_FMR_BUSY,
-};
-
struct mlx5_cache_ent {
struct list_head head;
/* sync access to the cahce entry
@@ -1283,8 +1277,6 @@ int mlx5_query_mad_ifc_port(struct ib_device *ibdev, u8 port,
struct ib_port_attr *props);
int mlx5_ib_query_port(struct ib_device *ibdev, u8 port,
struct ib_port_attr *props);
-int mlx5_ib_init_fmr(struct mlx5_ib_dev *dev);
-void mlx5_ib_cleanup_fmr(struct mlx5_ib_dev *dev);
void mlx5_ib_cont_pages(struct ib_umem *umem, u64 addr,
unsigned long max_page_shift,
int *count, int *shift,
--
1.8.3.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 1/9] RDMA/mlx5: Remove FMR leftovers
2020-05-27 9:46 ` [PATCH 1/9] RDMA/mlx5: Remove FMR leftovers Max Gurtovoy
@ 2020-05-27 13:58 ` Leon Romanovsky
0 siblings, 0 replies; 14+ messages in thread
From: Leon Romanovsky @ 2020-05-27 13:58 UTC (permalink / raw)
To: Max Gurtovoy
Cc: jgg, dledford, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom, aron.silverton,
israelr, oren, shlomin, vladimirk
On Wed, May 27, 2020 at 12:46:26PM +0300, Max Gurtovoy wrote:
> From: Gal Pressman <galpress@amazon.com>
>
> Remove a few leftovers from FMR functionality which are no longer used.
>
> Signed-off-by: Gal Pressman <galpress@amazon.com>
> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> ---
> drivers/infiniband/hw/mlx5/mlx5_ib.h | 8 --------
> 1 file changed, 8 deletions(-)
Thanks,
Acked-by: Leon Romanovsky <leonro@mellanox.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/9] RDMA/mlx4: remove FMR support for memory registration
2020-05-27 9:46 [PATCH 0/9 v2] Remove FMR support from RDMA drivers Max Gurtovoy
2020-05-27 9:46 ` [PATCH 1/9] RDMA/mlx5: Remove FMR leftovers Max Gurtovoy
@ 2020-05-27 9:46 ` Max Gurtovoy
2020-05-27 14:03 ` Leon Romanovsky
2020-05-27 9:46 ` [PATCH 3/9] RDMA/rds: " Max Gurtovoy
` (7 subsequent siblings)
9 siblings, 1 reply; 14+ messages in thread
From: Max Gurtovoy @ 2020-05-27 9:46 UTC (permalink / raw)
To: jgg, dledford, leon, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom
Cc: aron.silverton, israelr, oren, shlomin, vladimirk, Max Gurtovoy
HCA's that are driven by mlx4 driver support FRWR method to register
memory. Remove the ancient and unsafe FMR method.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
drivers/infiniband/hw/mlx4/main.c | 10 --
drivers/infiniband/hw/mlx4/mlx4_ib.h | 16 ---
drivers/infiniband/hw/mlx4/mr.c | 93 ----------------
drivers/net/ethernet/mellanox/mlx4/mr.c | 183 --------------------------------
include/linux/mlx4/device.h | 21 +---
5 files changed, 2 insertions(+), 321 deletions(-)
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 275722c..778dccf 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -2600,13 +2600,6 @@ static void get_fw_ver_str(struct ib_device *device, char *str)
.modify_wq = mlx4_ib_modify_wq,
};
-static const struct ib_device_ops mlx4_ib_dev_fmr_ops = {
- .alloc_fmr = mlx4_ib_fmr_alloc,
- .dealloc_fmr = mlx4_ib_fmr_dealloc,
- .map_phys_fmr = mlx4_ib_map_phys_fmr,
- .unmap_fmr = mlx4_ib_unmap_fmr,
-};
-
static const struct ib_device_ops mlx4_ib_dev_mw_ops = {
.alloc_mw = mlx4_ib_alloc_mw,
.dealloc_mw = mlx4_ib_dealloc_mw,
@@ -2724,9 +2717,6 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
ib_set_device_ops(&ibdev->ib_dev, &mlx4_ib_dev_wq_ops);
}
- if (!mlx4_is_slave(ibdev->dev))
- ib_set_device_ops(&ibdev->ib_dev, &mlx4_ib_dev_fmr_ops);
-
if (dev->caps.flags & MLX4_DEV_CAP_FLAG_MEM_WINDOW ||
dev->caps.bmme_flags & MLX4_BMME_FLAG_TYPE_2_WIN) {
ibdev->ib_dev.uverbs_cmd_mask |=
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index d188573..ebdeffd 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -146,11 +146,6 @@ struct mlx4_ib_mw {
struct mlx4_mw mmw;
};
-struct mlx4_ib_fmr {
- struct ib_fmr ibfmr;
- struct mlx4_fmr mfmr;
-};
-
#define MAX_REGS_PER_FLOW 2
struct mlx4_flow_reg_id {
@@ -679,11 +674,6 @@ static inline struct mlx4_ib_mw *to_mmw(struct ib_mw *ibmw)
return container_of(ibmw, struct mlx4_ib_mw, ibmw);
}
-static inline struct mlx4_ib_fmr *to_mfmr(struct ib_fmr *ibfmr)
-{
- return container_of(ibfmr, struct mlx4_ib_fmr, ibfmr);
-}
-
static inline struct mlx4_ib_flow *to_mflow(struct ib_flow *ibflow)
{
return container_of(ibflow, struct mlx4_ib_flow, ibflow);
@@ -794,12 +784,6 @@ int mlx4_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
int mlx4_ib_mad_init(struct mlx4_ib_dev *dev);
void mlx4_ib_mad_cleanup(struct mlx4_ib_dev *dev);
-struct ib_fmr *mlx4_ib_fmr_alloc(struct ib_pd *pd, int mr_access_flags,
- struct ib_fmr_attr *fmr_attr);
-int mlx4_ib_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list, int npages,
- u64 iova);
-int mlx4_ib_unmap_fmr(struct list_head *fmr_list);
-int mlx4_ib_fmr_dealloc(struct ib_fmr *fmr);
int __mlx4_ib_query_port(struct ib_device *ibdev, u8 port,
struct ib_port_attr *props, int netw_view);
int __mlx4_ib_query_pkey(struct ib_device *ibdev, u8 port, u16 index,
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index b0121c9..e2fb71b 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -698,99 +698,6 @@ struct ib_mr *mlx4_ib_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
return ERR_PTR(err);
}
-struct ib_fmr *mlx4_ib_fmr_alloc(struct ib_pd *pd, int acc,
- struct ib_fmr_attr *fmr_attr)
-{
- struct mlx4_ib_dev *dev = to_mdev(pd->device);
- struct mlx4_ib_fmr *fmr;
- int err = -ENOMEM;
-
- fmr = kmalloc(sizeof *fmr, GFP_KERNEL);
- if (!fmr)
- return ERR_PTR(-ENOMEM);
-
- err = mlx4_fmr_alloc(dev->dev, to_mpd(pd)->pdn, convert_access(acc),
- fmr_attr->max_pages, fmr_attr->max_maps,
- fmr_attr->page_shift, &fmr->mfmr);
- if (err)
- goto err_free;
-
- err = mlx4_fmr_enable(to_mdev(pd->device)->dev, &fmr->mfmr);
- if (err)
- goto err_mr;
-
- fmr->ibfmr.rkey = fmr->ibfmr.lkey = fmr->mfmr.mr.key;
-
- return &fmr->ibfmr;
-
-err_mr:
- (void) mlx4_mr_free(to_mdev(pd->device)->dev, &fmr->mfmr.mr);
-
-err_free:
- kfree(fmr);
-
- return ERR_PTR(err);
-}
-
-int mlx4_ib_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
- int npages, u64 iova)
-{
- struct mlx4_ib_fmr *ifmr = to_mfmr(ibfmr);
- struct mlx4_ib_dev *dev = to_mdev(ifmr->ibfmr.device);
-
- return mlx4_map_phys_fmr(dev->dev, &ifmr->mfmr, page_list, npages, iova,
- &ifmr->ibfmr.lkey, &ifmr->ibfmr.rkey);
-}
-
-int mlx4_ib_unmap_fmr(struct list_head *fmr_list)
-{
- struct ib_fmr *ibfmr;
- int err;
- struct mlx4_dev *mdev = NULL;
-
- list_for_each_entry(ibfmr, fmr_list, list) {
- if (mdev && to_mdev(ibfmr->device)->dev != mdev)
- return -EINVAL;
- mdev = to_mdev(ibfmr->device)->dev;
- }
-
- if (!mdev)
- return 0;
-
- list_for_each_entry(ibfmr, fmr_list, list) {
- struct mlx4_ib_fmr *ifmr = to_mfmr(ibfmr);
-
- mlx4_fmr_unmap(mdev, &ifmr->mfmr, &ifmr->ibfmr.lkey, &ifmr->ibfmr.rkey);
- }
-
- /*
- * Make sure all MPT status updates are visible before issuing
- * SYNC_TPT firmware command.
- */
- wmb();
-
- err = mlx4_SYNC_TPT(mdev);
- if (err)
- pr_warn("SYNC_TPT error %d when "
- "unmapping FMRs\n", err);
-
- return 0;
-}
-
-int mlx4_ib_fmr_dealloc(struct ib_fmr *ibfmr)
-{
- struct mlx4_ib_fmr *ifmr = to_mfmr(ibfmr);
- struct mlx4_ib_dev *dev = to_mdev(ibfmr->device);
- int err;
-
- err = mlx4_fmr_free(dev->dev, &ifmr->mfmr);
-
- if (!err)
- kfree(ifmr);
-
- return err;
-}
-
static int mlx4_set_page(struct ib_mr *ibmr, u64 addr)
{
struct mlx4_ib_mr *mr = to_mmr(ibmr);
diff --git a/drivers/net/ethernet/mellanox/mlx4/mr.c b/drivers/net/ethernet/mellanox/mlx4/mr.c
index 1a11bc0..d2986f1 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mr.c
+++ b/drivers/net/ethernet/mellanox/mlx4/mr.c
@@ -966,189 +966,6 @@ void mlx4_cleanup_mr_table(struct mlx4_dev *dev)
mlx4_bitmap_cleanup(&mr_table->mpt_bitmap);
}
-static inline int mlx4_check_fmr(struct mlx4_fmr *fmr, u64 *page_list,
- int npages, u64 iova)
-{
- int i, page_mask;
-
- if (npages > fmr->max_pages)
- return -EINVAL;
-
- page_mask = (1 << fmr->page_shift) - 1;
-
- /* We are getting page lists, so va must be page aligned. */
- if (iova & page_mask)
- return -EINVAL;
-
- /* Trust the user not to pass misaligned data in page_list */
- if (0)
- for (i = 0; i < npages; ++i) {
- if (page_list[i] & ~page_mask)
- return -EINVAL;
- }
-
- if (fmr->maps >= fmr->max_maps)
- return -EINVAL;
-
- return 0;
-}
-
-int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list,
- int npages, u64 iova, u32 *lkey, u32 *rkey)
-{
- u32 key;
- int i, err;
-
- err = mlx4_check_fmr(fmr, page_list, npages, iova);
- if (err)
- return err;
-
- ++fmr->maps;
-
- key = key_to_hw_index(fmr->mr.key);
- key += dev->caps.num_mpts;
- *lkey = *rkey = fmr->mr.key = hw_index_to_key(key);
-
- *(u8 *) fmr->mpt = MLX4_MPT_STATUS_SW;
-
- /* Make sure MPT status is visible before writing MTT entries */
- wmb();
-
- dma_sync_single_for_cpu(&dev->persist->pdev->dev, fmr->dma_handle,
- npages * sizeof(u64), DMA_TO_DEVICE);
-
- for (i = 0; i < npages; ++i)
- fmr->mtts[i] = cpu_to_be64(page_list[i] | MLX4_MTT_FLAG_PRESENT);
-
- dma_sync_single_for_device(&dev->persist->pdev->dev, fmr->dma_handle,
- npages * sizeof(u64), DMA_TO_DEVICE);
-
- fmr->mpt->key = cpu_to_be32(key);
- fmr->mpt->lkey = cpu_to_be32(key);
- fmr->mpt->length = cpu_to_be64(npages * (1ull << fmr->page_shift));
- fmr->mpt->start = cpu_to_be64(iova);
-
- /* Make MTT entries are visible before setting MPT status */
- wmb();
-
- *(u8 *) fmr->mpt = MLX4_MPT_STATUS_HW;
-
- /* Make sure MPT status is visible before consumer can use FMR */
- wmb();
-
- return 0;
-}
-EXPORT_SYMBOL_GPL(mlx4_map_phys_fmr);
-
-int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages,
- int max_maps, u8 page_shift, struct mlx4_fmr *fmr)
-{
- struct mlx4_priv *priv = mlx4_priv(dev);
- int err = -ENOMEM;
-
- if (max_maps > dev->caps.max_fmr_maps)
- return -EINVAL;
-
- if (page_shift < (ffs(dev->caps.page_size_cap) - 1) || page_shift >= 32)
- return -EINVAL;
-
- /* All MTTs must fit in the same page */
- if (max_pages * sizeof(*fmr->mtts) > PAGE_SIZE)
- return -EINVAL;
-
- fmr->page_shift = page_shift;
- fmr->max_pages = max_pages;
- fmr->max_maps = max_maps;
- fmr->maps = 0;
-
- err = mlx4_mr_alloc(dev, pd, 0, 0, access, max_pages,
- page_shift, &fmr->mr);
- if (err)
- return err;
-
- fmr->mtts = mlx4_table_find(&priv->mr_table.mtt_table,
- fmr->mr.mtt.offset,
- &fmr->dma_handle);
-
- if (!fmr->mtts) {
- err = -ENOMEM;
- goto err_free;
- }
-
- return 0;
-
-err_free:
- (void) mlx4_mr_free(dev, &fmr->mr);
- return err;
-}
-EXPORT_SYMBOL_GPL(mlx4_fmr_alloc);
-
-int mlx4_fmr_enable(struct mlx4_dev *dev, struct mlx4_fmr *fmr)
-{
- struct mlx4_priv *priv = mlx4_priv(dev);
- int err;
-
- err = mlx4_mr_enable(dev, &fmr->mr);
- if (err)
- return err;
-
- fmr->mpt = mlx4_table_find(&priv->mr_table.dmpt_table,
- key_to_hw_index(fmr->mr.key), NULL);
- if (!fmr->mpt)
- return -ENOMEM;
-
- return 0;
-}
-EXPORT_SYMBOL_GPL(mlx4_fmr_enable);
-
-void mlx4_fmr_unmap(struct mlx4_dev *dev, struct mlx4_fmr *fmr,
- u32 *lkey, u32 *rkey)
-{
- if (!fmr->maps)
- return;
-
- /* To unmap: it is sufficient to take back ownership from HW */
- *(u8 *)fmr->mpt = MLX4_MPT_STATUS_SW;
-
- /* Make sure MPT status is visible */
- wmb();
-
- fmr->maps = 0;
-}
-EXPORT_SYMBOL_GPL(mlx4_fmr_unmap);
-
-int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr)
-{
- int ret;
-
- if (fmr->maps)
- return -EBUSY;
- if (fmr->mr.enabled == MLX4_MPT_EN_HW) {
- /* In case of FMR was enabled and unmapped
- * make sure to give ownership of MPT back to HW
- * so HW2SW_MPT command will success.
- */
- *(u8 *)fmr->mpt = MLX4_MPT_STATUS_SW;
- /* Make sure MPT status is visible before changing MPT fields */
- wmb();
- fmr->mpt->length = 0;
- fmr->mpt->start = 0;
- /* Make sure MPT data is visible after changing MPT status */
- wmb();
- *(u8 *)fmr->mpt = MLX4_MPT_STATUS_HW;
- /* make sure MPT status is visible */
- wmb();
- }
-
- ret = mlx4_mr_free(dev, &fmr->mr);
- if (ret)
- return ret;
- fmr->mr.enabled = MLX4_MPT_DISABLED;
-
- return 0;
-}
-EXPORT_SYMBOL_GPL(mlx4_fmr_free);
-
int mlx4_SYNC_TPT(struct mlx4_dev *dev)
{
return mlx4_cmd(dev, 0, 0, 0, MLX4_CMD_SYNC_TPT,
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 20372de..dd53d96 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -707,17 +707,6 @@ struct mlx4_mw {
int enabled;
};
-struct mlx4_fmr {
- struct mlx4_mr mr;
- struct mlx4_mpt_entry *mpt;
- __be64 *mtts;
- dma_addr_t dma_handle;
- int max_pages;
- int max_maps;
- int maps;
- u8 page_shift;
-};
-
struct mlx4_uar {
unsigned long pfn;
int index;
@@ -1412,14 +1401,6 @@ void mlx4_handle_eth_header_mcast_prio(struct mlx4_net_trans_rule_hw_ctrl *ctrl,
int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index);
void mlx4_unregister_vlan(struct mlx4_dev *dev, u8 port, u16 vlan);
-int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list,
- int npages, u64 iova, u32 *lkey, u32 *rkey);
-int mlx4_fmr_alloc(struct mlx4_dev *dev, u32 pd, u32 access, int max_pages,
- int max_maps, u8 page_shift, struct mlx4_fmr *fmr);
-int mlx4_fmr_enable(struct mlx4_dev *dev, struct mlx4_fmr *fmr);
-void mlx4_fmr_unmap(struct mlx4_dev *dev, struct mlx4_fmr *fmr,
- u32 *lkey, u32 *rkey);
-int mlx4_fmr_free(struct mlx4_dev *dev, struct mlx4_fmr *fmr);
int mlx4_SYNC_TPT(struct mlx4_dev *dev);
int mlx4_test_interrupt(struct mlx4_dev *dev, int vector);
int mlx4_test_async(struct mlx4_dev *dev);
@@ -1522,6 +1503,8 @@ struct mlx4_slaves_pport mlx4_phys_to_slaves_pport_actv(
int mlx4_vf_get_enable_smi_admin(struct mlx4_dev *dev, int slave, int port);
int mlx4_vf_set_enable_smi_admin(struct mlx4_dev *dev, int slave, int port,
int enable);
+
+struct mlx4_mpt_entry;
int mlx4_mr_hw_get_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr,
struct mlx4_mpt_entry ***mpt_entry);
int mlx4_mr_hw_write_mpt(struct mlx4_dev *dev, struct mlx4_mr *mmr,
--
1.8.3.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 2/9] RDMA/mlx4: remove FMR support for memory registration
2020-05-27 9:46 ` [PATCH 2/9] RDMA/mlx4: remove FMR support for memory registration Max Gurtovoy
@ 2020-05-27 14:03 ` Leon Romanovsky
0 siblings, 0 replies; 14+ messages in thread
From: Leon Romanovsky @ 2020-05-27 14:03 UTC (permalink / raw)
To: Max Gurtovoy
Cc: jgg, dledford, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom, aron.silverton,
israelr, oren, shlomin, vladimirk
On Wed, May 27, 2020 at 12:46:27PM +0300, Max Gurtovoy wrote:
> HCA's that are driven by mlx4 driver support FRWR method to register
> memory. Remove the ancient and unsafe FMR method.
>
> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
> ---
> drivers/infiniband/hw/mlx4/main.c | 10 --
> drivers/infiniband/hw/mlx4/mlx4_ib.h | 16 ---
> drivers/infiniband/hw/mlx4/mr.c | 93 ----------------
> drivers/net/ethernet/mellanox/mlx4/mr.c | 183 --------------------------------
> include/linux/mlx4/device.h | 21 +---
> 5 files changed, 2 insertions(+), 321 deletions(-)
I don't see this line removed and all "max_fmr_maps" related.
➜ kernel git:(rdma-next) git grep fmr drivers/infiniband/hw/mlx4/ | head -1
drivers/infiniband/hw/mlx4/main.c: props->max_map_per_fmr = dev->dev->caps.max_fmr_maps;
Thanks
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 3/9] RDMA/rds: remove FMR support for memory registration
2020-05-27 9:46 [PATCH 0/9 v2] Remove FMR support from RDMA drivers Max Gurtovoy
2020-05-27 9:46 ` [PATCH 1/9] RDMA/mlx5: Remove FMR leftovers Max Gurtovoy
2020-05-27 9:46 ` [PATCH 2/9] RDMA/mlx4: remove FMR support for memory registration Max Gurtovoy
@ 2020-05-27 9:46 ` Max Gurtovoy
2020-05-27 9:46 ` [PATCH 4/9] RDMA/mthca: " Max Gurtovoy
` (6 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Max Gurtovoy @ 2020-05-27 9:46 UTC (permalink / raw)
To: jgg, dledford, leon, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom
Cc: aron.silverton, israelr, oren, shlomin, vladimirk, Max Gurtovoy
Use FRWR method for memory registration by default and remove the ancient
and unsafe FMR method.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
net/rds/Makefile | 2 +-
net/rds/ib.c | 14 +--
net/rds/ib.h | 1 -
net/rds/ib_cm.c | 4 +-
net/rds/ib_fmr.c | 269 ------------------------------------------------------
net/rds/ib_mr.h | 12 ---
net/rds/ib_rdma.c | 16 +---
7 files changed, 11 insertions(+), 307 deletions(-)
delete mode 100644 net/rds/ib_fmr.c
diff --git a/net/rds/Makefile b/net/rds/Makefile
index e647f9d..8fdc118 100644
--- a/net/rds/Makefile
+++ b/net/rds/Makefile
@@ -7,7 +7,7 @@ rds-y := af_rds.o bind.o cong.o connection.o info.o message.o \
obj-$(CONFIG_RDS_RDMA) += rds_rdma.o
rds_rdma-y := rdma_transport.o \
ib.o ib_cm.o ib_recv.o ib_ring.o ib_send.o ib_stats.o \
- ib_sysctl.o ib_rdma.o ib_fmr.o ib_frmr.o
+ ib_sysctl.o ib_rdma.o ib_frmr.o
obj-$(CONFIG_RDS_TCP) += rds_tcp.o
diff --git a/net/rds/ib.c b/net/rds/ib.c
index a792d8a..33a65c8 100644
--- a/net/rds/ib.c
+++ b/net/rds/ib.c
@@ -130,12 +130,15 @@ void rds_ib_dev_put(struct rds_ib_device *rds_ibdev)
static void rds_ib_add_one(struct ib_device *device)
{
struct rds_ib_device *rds_ibdev;
- bool has_fr, has_fmr;
/* Only handle IB (no iWARP) devices */
if (device->node_type != RDMA_NODE_IB_CA)
return;
+ /* Device must support FRWR */
+ if (!(device->attrs.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS))
+ return;
+
rds_ibdev = kzalloc_node(sizeof(struct rds_ib_device), GFP_KERNEL,
ibdev_to_node(device));
if (!rds_ibdev)
@@ -151,11 +154,6 @@ static void rds_ib_add_one(struct ib_device *device)
rds_ibdev->max_wrs = device->attrs.max_qp_wr;
rds_ibdev->max_sge = min(device->attrs.max_send_sge, RDS_IB_MAX_SGE);
- has_fr = (device->attrs.device_cap_flags &
- IB_DEVICE_MEM_MGT_EXTENSIONS);
- has_fmr = (device->ops.alloc_fmr && device->ops.dealloc_fmr &&
- device->ops.map_phys_fmr && device->ops.unmap_fmr);
- rds_ibdev->use_fastreg = (has_fr && !has_fmr);
rds_ibdev->odp_capable =
!!(device->attrs.device_cap_flags &
IB_DEVICE_ON_DEMAND_PAGING) &&
@@ -217,9 +215,7 @@ static void rds_ib_add_one(struct ib_device *device)
rds_ibdev->fmr_max_remaps, rds_ibdev->max_1m_mrs,
rds_ibdev->max_8k_mrs);
- pr_info("RDS/IB: %s: %s supported and preferred\n",
- device->name,
- rds_ibdev->use_fastreg ? "FRMR" : "FMR");
+ pr_info("RDS/IB: %s: added\n", device->name);
down_write(&rds_ib_devices_lock);
list_add_tail_rcu(&rds_ibdev->list, &rds_ib_devices);
diff --git a/net/rds/ib.h b/net/rds/ib.h
index 0296f1f..e0d5991 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -247,7 +247,6 @@ struct rds_ib_device {
struct ib_device *dev;
struct ib_pd *pd;
struct dma_pool *rid_hdrs_pool; /* RDS headers DMA pool */
- u8 use_fastreg:1;
u8 odp_capable:1;
unsigned int max_mrs;
diff --git a/net/rds/ib_cm.c b/net/rds/ib_cm.c
index c71f432..c5c665e 100644
--- a/net/rds/ib_cm.c
+++ b/net/rds/ib_cm.c
@@ -526,10 +526,10 @@ static int rds_ib_setup_qp(struct rds_connection *conn)
return -EOPNOTSUPP;
/* The fr_queue_space is currently set to 512, to add extra space on
- * completion queue and send queue. This extra space is used for FRMR
+ * completion queue and send queue. This extra space is used for FRWR
* registration and invalidation work requests
*/
- fr_queue_space = (rds_ibdev->use_fastreg ? RDS_IB_DEFAULT_FR_WR : 0);
+ fr_queue_space = RDS_IB_DEFAULT_FR_WR;
/* add the conn now so that connection establishment has the dev */
rds_ib_add_conn(rds_ibdev, conn);
diff --git a/net/rds/ib_fmr.c b/net/rds/ib_fmr.c
deleted file mode 100644
index 93c0437..0000000
--- a/net/rds/ib_fmr.c
+++ /dev/null
@@ -1,269 +0,0 @@
-/*
- * Copyright (c) 2016 Oracle. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- * Redistribution and use in source and binary forms, with or
- * without modification, are permitted provided that the following
- * conditions are met:
- *
- * - Redistributions of source code must retain the above
- * copyright notice, this list of conditions and the following
- * disclaimer.
- *
- * - Redistributions in binary form must reproduce the above
- * copyright notice, this list of conditions and the following
- * disclaimer in the documentation and/or other materials
- * provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
- */
-
-#include "ib_mr.h"
-
-struct rds_ib_mr *rds_ib_alloc_fmr(struct rds_ib_device *rds_ibdev, int npages)
-{
- struct rds_ib_mr_pool *pool;
- struct rds_ib_mr *ibmr = NULL;
- struct rds_ib_fmr *fmr;
- int err = 0;
-
- if (npages <= RDS_MR_8K_MSG_SIZE)
- pool = rds_ibdev->mr_8k_pool;
- else
- pool = rds_ibdev->mr_1m_pool;
-
- if (atomic_read(&pool->dirty_count) >= pool->max_items / 10)
- queue_delayed_work(rds_ib_mr_wq, &pool->flush_worker, 10);
-
- /* Switch pools if one of the pool is reaching upper limit */
- if (atomic_read(&pool->dirty_count) >= pool->max_items * 9 / 10) {
- if (pool->pool_type == RDS_IB_MR_8K_POOL)
- pool = rds_ibdev->mr_1m_pool;
- else
- pool = rds_ibdev->mr_8k_pool;
- }
-
- ibmr = rds_ib_try_reuse_ibmr(pool);
- if (ibmr)
- return ibmr;
-
- ibmr = kzalloc_node(sizeof(*ibmr), GFP_KERNEL,
- rdsibdev_to_node(rds_ibdev));
- if (!ibmr) {
- err = -ENOMEM;
- goto out_no_cigar;
- }
-
- fmr = &ibmr->u.fmr;
- fmr->fmr = ib_alloc_fmr(rds_ibdev->pd,
- (IB_ACCESS_LOCAL_WRITE |
- IB_ACCESS_REMOTE_READ |
- IB_ACCESS_REMOTE_WRITE |
- IB_ACCESS_REMOTE_ATOMIC),
- &pool->fmr_attr);
- if (IS_ERR(fmr->fmr)) {
- err = PTR_ERR(fmr->fmr);
- fmr->fmr = NULL;
- pr_warn("RDS/IB: %s failed (err=%d)\n", __func__, err);
- goto out_no_cigar;
- }
-
- ibmr->pool = pool;
- if (pool->pool_type == RDS_IB_MR_8K_POOL)
- rds_ib_stats_inc(s_ib_rdma_mr_8k_alloc);
- else
- rds_ib_stats_inc(s_ib_rdma_mr_1m_alloc);
-
- return ibmr;
-
-out_no_cigar:
- kfree(ibmr);
- atomic_dec(&pool->item_count);
-
- return ERR_PTR(err);
-}
-
-static int rds_ib_map_fmr(struct rds_ib_device *rds_ibdev,
- struct rds_ib_mr *ibmr, struct scatterlist *sg,
- unsigned int nents)
-{
- struct ib_device *dev = rds_ibdev->dev;
- struct rds_ib_fmr *fmr = &ibmr->u.fmr;
- struct scatterlist *scat = sg;
- u64 io_addr = 0;
- u64 *dma_pages;
- u32 len;
- int page_cnt, sg_dma_len;
- int i, j;
- int ret;
-
- sg_dma_len = ib_dma_map_sg(dev, sg, nents, DMA_BIDIRECTIONAL);
- if (unlikely(!sg_dma_len)) {
- pr_warn("RDS/IB: %s failed!\n", __func__);
- return -EBUSY;
- }
-
- len = 0;
- page_cnt = 0;
-
- for (i = 0; i < sg_dma_len; ++i) {
- unsigned int dma_len = sg_dma_len(&scat[i]);
- u64 dma_addr = sg_dma_address(&scat[i]);
-
- if (dma_addr & ~PAGE_MASK) {
- if (i > 0) {
- ib_dma_unmap_sg(dev, sg, nents,
- DMA_BIDIRECTIONAL);
- return -EINVAL;
- } else {
- ++page_cnt;
- }
- }
- if ((dma_addr + dma_len) & ~PAGE_MASK) {
- if (i < sg_dma_len - 1) {
- ib_dma_unmap_sg(dev, sg, nents,
- DMA_BIDIRECTIONAL);
- return -EINVAL;
- } else {
- ++page_cnt;
- }
- }
-
- len += dma_len;
- }
-
- page_cnt += len >> PAGE_SHIFT;
- if (page_cnt > ibmr->pool->fmr_attr.max_pages) {
- ib_dma_unmap_sg(dev, sg, nents, DMA_BIDIRECTIONAL);
- return -EINVAL;
- }
-
- dma_pages = kmalloc_array_node(sizeof(u64), page_cnt, GFP_ATOMIC,
- rdsibdev_to_node(rds_ibdev));
- if (!dma_pages) {
- ib_dma_unmap_sg(dev, sg, nents, DMA_BIDIRECTIONAL);
- return -ENOMEM;
- }
-
- page_cnt = 0;
- for (i = 0; i < sg_dma_len; ++i) {
- unsigned int dma_len = sg_dma_len(&scat[i]);
- u64 dma_addr = sg_dma_address(&scat[i]);
-
- for (j = 0; j < dma_len; j += PAGE_SIZE)
- dma_pages[page_cnt++] =
- (dma_addr & PAGE_MASK) + j;
- }
-
- ret = ib_map_phys_fmr(fmr->fmr, dma_pages, page_cnt, io_addr);
- if (ret) {
- ib_dma_unmap_sg(dev, sg, nents, DMA_BIDIRECTIONAL);
- goto out;
- }
-
- /* Success - we successfully remapped the MR, so we can
- * safely tear down the old mapping.
- */
- rds_ib_teardown_mr(ibmr);
-
- ibmr->sg = scat;
- ibmr->sg_len = nents;
- ibmr->sg_dma_len = sg_dma_len;
- ibmr->remap_count++;
-
- if (ibmr->pool->pool_type == RDS_IB_MR_8K_POOL)
- rds_ib_stats_inc(s_ib_rdma_mr_8k_used);
- else
- rds_ib_stats_inc(s_ib_rdma_mr_1m_used);
- ret = 0;
-
-out:
- kfree(dma_pages);
-
- return ret;
-}
-
-struct rds_ib_mr *rds_ib_reg_fmr(struct rds_ib_device *rds_ibdev,
- struct scatterlist *sg,
- unsigned long nents,
- u32 *key)
-{
- struct rds_ib_mr *ibmr = NULL;
- struct rds_ib_fmr *fmr;
- int ret;
-
- ibmr = rds_ib_alloc_fmr(rds_ibdev, nents);
- if (IS_ERR(ibmr))
- return ibmr;
-
- ibmr->device = rds_ibdev;
- fmr = &ibmr->u.fmr;
- ret = rds_ib_map_fmr(rds_ibdev, ibmr, sg, nents);
- if (ret == 0)
- *key = fmr->fmr->rkey;
- else
- rds_ib_free_mr(ibmr, 0);
-
- return ibmr;
-}
-
-void rds_ib_unreg_fmr(struct list_head *list, unsigned int *nfreed,
- unsigned long *unpinned, unsigned int goal)
-{
- struct rds_ib_mr *ibmr, *next;
- struct rds_ib_fmr *fmr;
- LIST_HEAD(fmr_list);
- int ret = 0;
- unsigned int freed = *nfreed;
-
- /* String all ib_mr's onto one list and hand them to ib_unmap_fmr */
- list_for_each_entry(ibmr, list, unmap_list) {
- fmr = &ibmr->u.fmr;
- list_add(&fmr->fmr->list, &fmr_list);
- }
-
- ret = ib_unmap_fmr(&fmr_list);
- if (ret)
- pr_warn("RDS/IB: FMR invalidation failed (err=%d)\n", ret);
-
- /* Now we can destroy the DMA mapping and unpin any pages */
- list_for_each_entry_safe(ibmr, next, list, unmap_list) {
- fmr = &ibmr->u.fmr;
- *unpinned += ibmr->sg_len;
- __rds_ib_teardown_mr(ibmr);
- if (freed < goal ||
- ibmr->remap_count >= ibmr->pool->fmr_attr.max_maps) {
- if (ibmr->pool->pool_type == RDS_IB_MR_8K_POOL)
- rds_ib_stats_inc(s_ib_rdma_mr_8k_free);
- else
- rds_ib_stats_inc(s_ib_rdma_mr_1m_free);
- list_del(&ibmr->unmap_list);
- ib_dealloc_fmr(fmr->fmr);
- kfree(ibmr);
- freed++;
- }
- }
- *nfreed = freed;
-}
-
-void rds_ib_free_fmr_list(struct rds_ib_mr *ibmr)
-{
- struct rds_ib_mr_pool *pool = ibmr->pool;
-
- if (ibmr->remap_count >= pool->fmr_attr.max_maps)
- llist_add(&ibmr->llnode, &pool->drop_list);
- else
- llist_add(&ibmr->llnode, &pool->free_list);
-}
diff --git a/net/rds/ib_mr.h b/net/rds/ib_mr.h
index 0c8252d7..f365e1a 100644
--- a/net/rds/ib_mr.h
+++ b/net/rds/ib_mr.h
@@ -43,10 +43,6 @@
#define RDS_MR_8K_SCALE (256 / (RDS_MR_8K_MSG_SIZE + 1))
#define RDS_MR_8K_POOL_SIZE (RDS_MR_8K_SCALE * (8192 / 2))
-struct rds_ib_fmr {
- struct ib_fmr *fmr;
-};
-
enum rds_ib_fr_state {
FRMR_IS_FREE, /* mr invalidated & ready for use */
FRMR_IS_INUSE, /* mr is in use or used & can be invalidated */
@@ -84,7 +80,6 @@ struct rds_ib_mr {
u8 odp:1;
union {
- struct rds_ib_fmr fmr;
struct rds_ib_frmr frmr;
struct ib_mr *mr;
} u;
@@ -110,7 +105,6 @@ struct rds_ib_mr_pool {
unsigned long max_items_soft;
unsigned long max_free_pinned;
struct ib_fmr_attr fmr_attr;
- bool use_fastreg;
};
extern struct workqueue_struct *rds_ib_mr_wq;
@@ -136,15 +130,9 @@ void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents,
void __rds_ib_teardown_mr(struct rds_ib_mr *);
void rds_ib_teardown_mr(struct rds_ib_mr *);
-struct rds_ib_mr *rds_ib_alloc_fmr(struct rds_ib_device *, int);
struct rds_ib_mr *rds_ib_reuse_mr(struct rds_ib_mr_pool *);
int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *, int, struct rds_ib_mr **);
-struct rds_ib_mr *rds_ib_reg_fmr(struct rds_ib_device *, struct scatterlist *,
- unsigned long, u32 *);
struct rds_ib_mr *rds_ib_try_reuse_ibmr(struct rds_ib_mr_pool *);
-void rds_ib_unreg_fmr(struct list_head *, unsigned int *,
- unsigned long *, unsigned int);
-void rds_ib_free_fmr_list(struct rds_ib_mr *);
struct rds_ib_mr *rds_ib_reg_frmr(struct rds_ib_device *rds_ibdev,
struct rds_ib_connection *ic,
struct scatterlist *sg,
diff --git a/net/rds/ib_rdma.c b/net/rds/ib_rdma.c
index b34b24e..c55d77e 100644
--- a/net/rds/ib_rdma.c
+++ b/net/rds/ib_rdma.c
@@ -406,10 +406,7 @@ int rds_ib_flush_mr_pool(struct rds_ib_mr_pool *pool,
if (list_empty(&unmap_list))
goto out;
- if (pool->use_fastreg)
- rds_ib_unreg_frmr(&unmap_list, &nfreed, &unpinned, free_goal);
- else
- rds_ib_unreg_fmr(&unmap_list, &nfreed, &unpinned, free_goal);
+ rds_ib_unreg_frmr(&unmap_list, &nfreed, &unpinned, free_goal);
if (!list_empty(&unmap_list)) {
unsigned long flags;
@@ -503,10 +500,7 @@ void rds_ib_free_mr(void *trans_private, int invalidate)
}
/* Return it to the pool's free list */
- if (rds_ibdev->use_fastreg)
- rds_ib_free_frmr_list(ibmr);
- else
- rds_ib_free_fmr_list(ibmr);
+ rds_ib_free_frmr_list(ibmr);
atomic_add(ibmr->sg_len, &pool->free_pinned);
atomic_inc(&pool->dirty_count);
@@ -622,10 +616,7 @@ void *rds_ib_get_mr(struct scatterlist *sg, unsigned long nents,
goto out;
}
- if (rds_ibdev->use_fastreg)
- ibmr = rds_ib_reg_frmr(rds_ibdev, ic, sg, nents, key_ret);
- else
- ibmr = rds_ib_reg_fmr(rds_ibdev, sg, nents, key_ret);
+ ibmr = rds_ib_reg_frmr(rds_ibdev, ic, sg, nents, key_ret);
if (IS_ERR(ibmr)) {
ret = PTR_ERR(ibmr);
pr_warn("RDS/IB: rds_ib_get_mr failed (errno=%d)\n", ret);
@@ -681,7 +672,6 @@ struct rds_ib_mr_pool *rds_ib_create_mr_pool(struct rds_ib_device *rds_ibdev,
pool->fmr_attr.max_maps = rds_ibdev->fmr_max_remaps;
pool->fmr_attr.page_shift = PAGE_SHIFT;
pool->max_items_soft = rds_ibdev->max_mrs * 3 / 4;
- pool->use_fastreg = rds_ibdev->use_fastreg;
return pool;
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 4/9] RDMA/mthca: remove FMR support for memory registration
2020-05-27 9:46 [PATCH 0/9 v2] Remove FMR support from RDMA drivers Max Gurtovoy
` (2 preceding siblings ...)
2020-05-27 9:46 ` [PATCH 3/9] RDMA/rds: " Max Gurtovoy
@ 2020-05-27 9:46 ` Max Gurtovoy
2020-05-27 9:46 ` [PATCH 5/9] RDMA/rdmavt: remove FMR " Max Gurtovoy
` (5 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Max Gurtovoy @ 2020-05-27 9:46 UTC (permalink / raw)
To: jgg, dledford, leon, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom
Cc: aron.silverton, israelr, oren, shlomin, vladimirk, Max Gurtovoy
Remove the ancient and unsafe FMR method.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
drivers/infiniband/hw/mthca/mthca_dev.h | 10 -
drivers/infiniband/hw/mthca/mthca_mr.c | 262 +--------------------------
drivers/infiniband/hw/mthca/mthca_provider.c | 86 ---------
3 files changed, 1 insertion(+), 357 deletions(-)
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index 599794c..7550e9d 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -478,16 +478,6 @@ int mthca_mr_alloc_phys(struct mthca_dev *dev, u32 pd,
u32 access, struct mthca_mr *mr);
void mthca_free_mr(struct mthca_dev *dev, struct mthca_mr *mr);
-int mthca_fmr_alloc(struct mthca_dev *dev, u32 pd,
- u32 access, struct mthca_fmr *fmr);
-int mthca_tavor_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
- int list_len, u64 iova);
-void mthca_tavor_fmr_unmap(struct mthca_dev *dev, struct mthca_fmr *fmr);
-int mthca_arbel_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
- int list_len, u64 iova);
-void mthca_arbel_fmr_unmap(struct mthca_dev *dev, struct mthca_fmr *fmr);
-int mthca_free_fmr(struct mthca_dev *dev, struct mthca_fmr *fmr);
-
int mthca_map_eq_icm(struct mthca_dev *dev, u64 icm_virt);
void mthca_unmap_eq_icm(struct mthca_dev *dev);
diff --git a/drivers/infiniband/hw/mthca/mthca_mr.c b/drivers/infiniband/hw/mthca/mthca_mr.c
index 4250b2c..ce0e086 100644
--- a/drivers/infiniband/hw/mthca/mthca_mr.c
+++ b/drivers/infiniband/hw/mthca/mthca_mr.c
@@ -541,7 +541,7 @@ int mthca_mr_alloc_phys(struct mthca_dev *dev, u32 pd,
return err;
}
-/* Free mr or fmr */
+/* Free mr */
static void mthca_free_region(struct mthca_dev *dev, u32 lkey)
{
mthca_table_put(dev, dev->mr_table.mpt_table,
@@ -564,266 +564,6 @@ void mthca_free_mr(struct mthca_dev *dev, struct mthca_mr *mr)
mthca_free_mtt(dev, mr->mtt);
}
-int mthca_fmr_alloc(struct mthca_dev *dev, u32 pd,
- u32 access, struct mthca_fmr *mr)
-{
- struct mthca_mpt_entry *mpt_entry;
- struct mthca_mailbox *mailbox;
- u64 mtt_seg;
- u32 key, idx;
- int list_len = mr->attr.max_pages;
- int err = -ENOMEM;
- int i;
-
- if (mr->attr.page_shift < 12 || mr->attr.page_shift >= 32)
- return -EINVAL;
-
- /* For Arbel, all MTTs must fit in the same page. */
- if (mthca_is_memfree(dev) &&
- mr->attr.max_pages * sizeof *mr->mem.arbel.mtts > PAGE_SIZE)
- return -EINVAL;
-
- mr->maps = 0;
-
- key = mthca_alloc(&dev->mr_table.mpt_alloc);
- if (key == -1)
- return -ENOMEM;
- key = adjust_key(dev, key);
-
- idx = key & (dev->limits.num_mpts - 1);
- mr->ibmr.rkey = mr->ibmr.lkey = hw_index_to_key(dev, key);
-
- if (mthca_is_memfree(dev)) {
- err = mthca_table_get(dev, dev->mr_table.mpt_table, key);
- if (err)
- goto err_out_mpt_free;
-
- mr->mem.arbel.mpt = mthca_table_find(dev->mr_table.mpt_table, key, NULL);
- BUG_ON(!mr->mem.arbel.mpt);
- } else
- mr->mem.tavor.mpt = dev->mr_table.tavor_fmr.mpt_base +
- sizeof *(mr->mem.tavor.mpt) * idx;
-
- mr->mtt = __mthca_alloc_mtt(dev, list_len, dev->mr_table.fmr_mtt_buddy);
- if (IS_ERR(mr->mtt)) {
- err = PTR_ERR(mr->mtt);
- goto err_out_table;
- }
-
- mtt_seg = mr->mtt->first_seg * dev->limits.mtt_seg_size;
-
- if (mthca_is_memfree(dev)) {
- mr->mem.arbel.mtts = mthca_table_find(dev->mr_table.mtt_table,
- mr->mtt->first_seg,
- &mr->mem.arbel.dma_handle);
- BUG_ON(!mr->mem.arbel.mtts);
- } else
- mr->mem.tavor.mtts = dev->mr_table.tavor_fmr.mtt_base + mtt_seg;
-
- mailbox = mthca_alloc_mailbox(dev, GFP_KERNEL);
- if (IS_ERR(mailbox)) {
- err = PTR_ERR(mailbox);
- goto err_out_free_mtt;
- }
-
- mpt_entry = mailbox->buf;
-
- mpt_entry->flags = cpu_to_be32(MTHCA_MPT_FLAG_SW_OWNS |
- MTHCA_MPT_FLAG_MIO |
- MTHCA_MPT_FLAG_REGION |
- access);
-
- mpt_entry->page_size = cpu_to_be32(mr->attr.page_shift - 12);
- mpt_entry->key = cpu_to_be32(key);
- mpt_entry->pd = cpu_to_be32(pd);
- memset(&mpt_entry->start, 0,
- sizeof *mpt_entry - offsetof(struct mthca_mpt_entry, start));
- mpt_entry->mtt_seg = cpu_to_be64(dev->mr_table.mtt_base + mtt_seg);
-
- if (0) {
- mthca_dbg(dev, "Dumping MPT entry %08x:\n", mr->ibmr.lkey);
- for (i = 0; i < sizeof (struct mthca_mpt_entry) / 4; ++i) {
- if (i % 4 == 0)
- printk("[%02x] ", i * 4);
- printk(" %08x", be32_to_cpu(((__be32 *) mpt_entry)[i]));
- if ((i + 1) % 4 == 0)
- printk("\n");
- }
- }
-
- err = mthca_SW2HW_MPT(dev, mailbox,
- key & (dev->limits.num_mpts - 1));
- if (err) {
- mthca_warn(dev, "SW2HW_MPT failed (%d)\n", err);
- goto err_out_mailbox_free;
- }
-
- mthca_free_mailbox(dev, mailbox);
- return 0;
-
-err_out_mailbox_free:
- mthca_free_mailbox(dev, mailbox);
-
-err_out_free_mtt:
- mthca_free_mtt(dev, mr->mtt);
-
-err_out_table:
- mthca_table_put(dev, dev->mr_table.mpt_table, key);
-
-err_out_mpt_free:
- mthca_free(&dev->mr_table.mpt_alloc, key);
- return err;
-}
-
-int mthca_free_fmr(struct mthca_dev *dev, struct mthca_fmr *fmr)
-{
- if (fmr->maps)
- return -EBUSY;
-
- mthca_free_region(dev, fmr->ibmr.lkey);
- mthca_free_mtt(dev, fmr->mtt);
-
- return 0;
-}
-
-static inline int mthca_check_fmr(struct mthca_fmr *fmr, u64 *page_list,
- int list_len, u64 iova)
-{
- int i, page_mask;
-
- if (list_len > fmr->attr.max_pages)
- return -EINVAL;
-
- page_mask = (1 << fmr->attr.page_shift) - 1;
-
- /* We are getting page lists, so va must be page aligned. */
- if (iova & page_mask)
- return -EINVAL;
-
- /* Trust the user not to pass misaligned data in page_list */
- if (0)
- for (i = 0; i < list_len; ++i) {
- if (page_list[i] & ~page_mask)
- return -EINVAL;
- }
-
- if (fmr->maps >= fmr->attr.max_maps)
- return -EINVAL;
-
- return 0;
-}
-
-
-int mthca_tavor_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
- int list_len, u64 iova)
-{
- struct mthca_fmr *fmr = to_mfmr(ibfmr);
- struct mthca_dev *dev = to_mdev(ibfmr->device);
- struct mthca_mpt_entry mpt_entry;
- u32 key;
- int i, err;
-
- err = mthca_check_fmr(fmr, page_list, list_len, iova);
- if (err)
- return err;
-
- ++fmr->maps;
-
- key = tavor_key_to_hw_index(fmr->ibmr.lkey);
- key += dev->limits.num_mpts;
- fmr->ibmr.lkey = fmr->ibmr.rkey = tavor_hw_index_to_key(key);
-
- writeb(MTHCA_MPT_STATUS_SW, fmr->mem.tavor.mpt);
-
- for (i = 0; i < list_len; ++i) {
- __be64 mtt_entry = cpu_to_be64(page_list[i] |
- MTHCA_MTT_FLAG_PRESENT);
- mthca_write64_raw(mtt_entry, fmr->mem.tavor.mtts + i);
- }
-
- mpt_entry.lkey = cpu_to_be32(key);
- mpt_entry.length = cpu_to_be64(list_len * (1ull << fmr->attr.page_shift));
- mpt_entry.start = cpu_to_be64(iova);
-
- __raw_writel((__force u32) mpt_entry.lkey, &fmr->mem.tavor.mpt->key);
- memcpy_toio(&fmr->mem.tavor.mpt->start, &mpt_entry.start,
- offsetof(struct mthca_mpt_entry, window_count) -
- offsetof(struct mthca_mpt_entry, start));
-
- writeb(MTHCA_MPT_STATUS_HW, fmr->mem.tavor.mpt);
-
- return 0;
-}
-
-int mthca_arbel_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
- int list_len, u64 iova)
-{
- struct mthca_fmr *fmr = to_mfmr(ibfmr);
- struct mthca_dev *dev = to_mdev(ibfmr->device);
- u32 key;
- int i, err;
-
- err = mthca_check_fmr(fmr, page_list, list_len, iova);
- if (err)
- return err;
-
- ++fmr->maps;
-
- key = arbel_key_to_hw_index(fmr->ibmr.lkey);
- if (dev->mthca_flags & MTHCA_FLAG_SINAI_OPT)
- key += SINAI_FMR_KEY_INC;
- else
- key += dev->limits.num_mpts;
- fmr->ibmr.lkey = fmr->ibmr.rkey = arbel_hw_index_to_key(key);
-
- *(u8 *) fmr->mem.arbel.mpt = MTHCA_MPT_STATUS_SW;
-
- wmb();
-
- dma_sync_single_for_cpu(&dev->pdev->dev, fmr->mem.arbel.dma_handle,
- list_len * sizeof(u64), DMA_TO_DEVICE);
-
- for (i = 0; i < list_len; ++i)
- fmr->mem.arbel.mtts[i] = cpu_to_be64(page_list[i] |
- MTHCA_MTT_FLAG_PRESENT);
-
- dma_sync_single_for_device(&dev->pdev->dev, fmr->mem.arbel.dma_handle,
- list_len * sizeof(u64), DMA_TO_DEVICE);
-
- fmr->mem.arbel.mpt->key = cpu_to_be32(key);
- fmr->mem.arbel.mpt->lkey = cpu_to_be32(key);
- fmr->mem.arbel.mpt->length = cpu_to_be64(list_len * (1ull << fmr->attr.page_shift));
- fmr->mem.arbel.mpt->start = cpu_to_be64(iova);
-
- wmb();
-
- *(u8 *) fmr->mem.arbel.mpt = MTHCA_MPT_STATUS_HW;
-
- wmb();
-
- return 0;
-}
-
-void mthca_tavor_fmr_unmap(struct mthca_dev *dev, struct mthca_fmr *fmr)
-{
- if (!fmr->maps)
- return;
-
- fmr->maps = 0;
-
- writeb(MTHCA_MPT_STATUS_SW, fmr->mem.tavor.mpt);
-}
-
-void mthca_arbel_fmr_unmap(struct mthca_dev *dev, struct mthca_fmr *fmr)
-{
- if (!fmr->maps)
- return;
-
- fmr->maps = 0;
-
- *(u8 *) fmr->mem.arbel.mpt = MTHCA_MPT_STATUS_SW;
-}
-
int mthca_init_mr_table(struct mthca_dev *dev)
{
phys_addr_t addr;
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 69a3e4f..9028df0 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -957,69 +957,6 @@ static int mthca_dereg_mr(struct ib_mr *mr, struct ib_udata *udata)
return 0;
}
-static struct ib_fmr *mthca_alloc_fmr(struct ib_pd *pd, int mr_access_flags,
- struct ib_fmr_attr *fmr_attr)
-{
- struct mthca_fmr *fmr;
- int err;
-
- fmr = kmalloc(sizeof *fmr, GFP_KERNEL);
- if (!fmr)
- return ERR_PTR(-ENOMEM);
-
- memcpy(&fmr->attr, fmr_attr, sizeof *fmr_attr);
- err = mthca_fmr_alloc(to_mdev(pd->device), to_mpd(pd)->pd_num,
- convert_access(mr_access_flags), fmr);
-
- if (err) {
- kfree(fmr);
- return ERR_PTR(err);
- }
-
- return &fmr->ibmr;
-}
-
-static int mthca_dealloc_fmr(struct ib_fmr *fmr)
-{
- struct mthca_fmr *mfmr = to_mfmr(fmr);
- int err;
-
- err = mthca_free_fmr(to_mdev(fmr->device), mfmr);
- if (err)
- return err;
-
- kfree(mfmr);
- return 0;
-}
-
-static int mthca_unmap_fmr(struct list_head *fmr_list)
-{
- struct ib_fmr *fmr;
- int err;
- struct mthca_dev *mdev = NULL;
-
- list_for_each_entry(fmr, fmr_list, list) {
- if (mdev && to_mdev(fmr->device) != mdev)
- return -EINVAL;
- mdev = to_mdev(fmr->device);
- }
-
- if (!mdev)
- return 0;
-
- if (mthca_is_memfree(mdev)) {
- list_for_each_entry(fmr, fmr_list, list)
- mthca_arbel_fmr_unmap(mdev, to_mfmr(fmr));
-
- wmb();
- } else
- list_for_each_entry(fmr, fmr_list, list)
- mthca_tavor_fmr_unmap(mdev, to_mfmr(fmr));
-
- err = mthca_SYNC_TPT(mdev);
- return err;
-}
-
static ssize_t hw_rev_show(struct device *device,
struct device_attribute *attr, char *buf)
{
@@ -1203,20 +1140,6 @@ static void get_dev_fw_str(struct ib_device *device, char *str)
INIT_RDMA_OBJ_SIZE(ib_srq, mthca_srq, ibsrq),
};
-static const struct ib_device_ops mthca_dev_arbel_fmr_ops = {
- .alloc_fmr = mthca_alloc_fmr,
- .dealloc_fmr = mthca_dealloc_fmr,
- .map_phys_fmr = mthca_arbel_map_phys_fmr,
- .unmap_fmr = mthca_unmap_fmr,
-};
-
-static const struct ib_device_ops mthca_dev_tavor_fmr_ops = {
- .alloc_fmr = mthca_alloc_fmr,
- .dealloc_fmr = mthca_dealloc_fmr,
- .map_phys_fmr = mthca_tavor_map_phys_fmr,
- .unmap_fmr = mthca_unmap_fmr,
-};
-
static const struct ib_device_ops mthca_dev_arbel_ops = {
.post_recv = mthca_arbel_post_receive,
.post_send = mthca_arbel_post_send,
@@ -1275,15 +1198,6 @@ int mthca_register_device(struct mthca_dev *dev)
&mthca_dev_tavor_srq_ops);
}
- if (dev->mthca_flags & MTHCA_FLAG_FMR) {
- if (mthca_is_memfree(dev))
- ib_set_device_ops(&dev->ib_dev,
- &mthca_dev_arbel_fmr_ops);
- else
- ib_set_device_ops(&dev->ib_dev,
- &mthca_dev_tavor_fmr_ops);
- }
-
ib_set_device_ops(&dev->ib_dev, &mthca_dev_ops);
if (mthca_is_memfree(dev))
--
1.8.3.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 5/9] RDMA/rdmavt: remove FMR memory registration
2020-05-27 9:46 [PATCH 0/9 v2] Remove FMR support from RDMA drivers Max Gurtovoy
` (3 preceding siblings ...)
2020-05-27 9:46 ` [PATCH 4/9] RDMA/mthca: " Max Gurtovoy
@ 2020-05-27 9:46 ` Max Gurtovoy
2020-05-27 15:56 ` Dennis Dalessandro
2020-05-27 9:46 ` [PATCH 6/9] RDMA/iser: Remove support for " Max Gurtovoy
` (4 subsequent siblings)
9 siblings, 1 reply; 14+ messages in thread
From: Max Gurtovoy @ 2020-05-27 9:46 UTC (permalink / raw)
To: jgg, dledford, leon, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom
Cc: aron.silverton, israelr, oren, shlomin, vladimirk, Max Gurtovoy
Use FRWR method to register memory by default and remove the ancient and
unsafe FMR method.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
drivers/infiniband/sw/rdmavt/mr.c | 154 --------------------------------------
drivers/infiniband/sw/rdmavt/mr.h | 15 ----
drivers/infiniband/sw/rdmavt/vt.c | 4 -
3 files changed, 173 deletions(-)
diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers/infiniband/sw/rdmavt/mr.c
index 72f6534..ddb0c0d 100644
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -714,160 +714,6 @@ int rvt_invalidate_rkey(struct rvt_qp *qp, u32 rkey)
EXPORT_SYMBOL(rvt_invalidate_rkey);
/**
- * rvt_alloc_fmr - allocate a fast memory region
- * @pd: the protection domain for this memory region
- * @mr_access_flags: access flags for this memory region
- * @fmr_attr: fast memory region attributes
- *
- * Return: the memory region on success, otherwise returns an errno.
- */
-struct ib_fmr *rvt_alloc_fmr(struct ib_pd *pd, int mr_access_flags,
- struct ib_fmr_attr *fmr_attr)
-{
- struct rvt_fmr *fmr;
- int m;
- struct ib_fmr *ret;
- int rval = -ENOMEM;
-
- /* Allocate struct plus pointers to first level page tables. */
- m = (fmr_attr->max_pages + RVT_SEGSZ - 1) / RVT_SEGSZ;
- fmr = kzalloc(struct_size(fmr, mr.map, m), GFP_KERNEL);
- if (!fmr)
- goto bail;
-
- rval = rvt_init_mregion(&fmr->mr, pd, fmr_attr->max_pages,
- PERCPU_REF_INIT_ATOMIC);
- if (rval)
- goto bail;
-
- /*
- * ib_alloc_fmr() will initialize fmr->ibfmr except for lkey &
- * rkey.
- */
- rval = rvt_alloc_lkey(&fmr->mr, 0);
- if (rval)
- goto bail_mregion;
- fmr->ibfmr.rkey = fmr->mr.lkey;
- fmr->ibfmr.lkey = fmr->mr.lkey;
- /*
- * Resources are allocated but no valid mapping (RKEY can't be
- * used).
- */
- fmr->mr.access_flags = mr_access_flags;
- fmr->mr.max_segs = fmr_attr->max_pages;
- fmr->mr.page_shift = fmr_attr->page_shift;
-
- ret = &fmr->ibfmr;
-done:
- return ret;
-
-bail_mregion:
- rvt_deinit_mregion(&fmr->mr);
-bail:
- kfree(fmr);
- ret = ERR_PTR(rval);
- goto done;
-}
-
-/**
- * rvt_map_phys_fmr - set up a fast memory region
- * @ibfmr: the fast memory region to set up
- * @page_list: the list of pages to associate with the fast memory region
- * @list_len: the number of pages to associate with the fast memory region
- * @iova: the virtual address of the start of the fast memory region
- *
- * This may be called from interrupt context.
- *
- * Return: 0 on success
- */
-
-int rvt_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
- int list_len, u64 iova)
-{
- struct rvt_fmr *fmr = to_ifmr(ibfmr);
- struct rvt_lkey_table *rkt;
- unsigned long flags;
- int m, n;
- unsigned long i;
- u32 ps;
- struct rvt_dev_info *rdi = ib_to_rvt(ibfmr->device);
-
- i = atomic_long_read(&fmr->mr.refcount.count);
- if (i > 2)
- return -EBUSY;
-
- if (list_len > fmr->mr.max_segs)
- return -EINVAL;
-
- rkt = &rdi->lkey_table;
- spin_lock_irqsave(&rkt->lock, flags);
- fmr->mr.user_base = iova;
- fmr->mr.iova = iova;
- ps = 1 << fmr->mr.page_shift;
- fmr->mr.length = list_len * ps;
- m = 0;
- n = 0;
- for (i = 0; i < list_len; i++) {
- fmr->mr.map[m]->segs[n].vaddr = (void *)page_list[i];
- fmr->mr.map[m]->segs[n].length = ps;
- trace_rvt_mr_fmr_seg(&fmr->mr, m, n, (void *)page_list[i], ps);
- if (++n == RVT_SEGSZ) {
- m++;
- n = 0;
- }
- }
- spin_unlock_irqrestore(&rkt->lock, flags);
- return 0;
-}
-
-/**
- * rvt_unmap_fmr - unmap fast memory regions
- * @fmr_list: the list of fast memory regions to unmap
- *
- * Return: 0 on success.
- */
-int rvt_unmap_fmr(struct list_head *fmr_list)
-{
- struct rvt_fmr *fmr;
- struct rvt_lkey_table *rkt;
- unsigned long flags;
- struct rvt_dev_info *rdi;
-
- list_for_each_entry(fmr, fmr_list, ibfmr.list) {
- rdi = ib_to_rvt(fmr->ibfmr.device);
- rkt = &rdi->lkey_table;
- spin_lock_irqsave(&rkt->lock, flags);
- fmr->mr.user_base = 0;
- fmr->mr.iova = 0;
- fmr->mr.length = 0;
- spin_unlock_irqrestore(&rkt->lock, flags);
- }
- return 0;
-}
-
-/**
- * rvt_dealloc_fmr - deallocate a fast memory region
- * @ibfmr: the fast memory region to deallocate
- *
- * Return: 0 on success.
- */
-int rvt_dealloc_fmr(struct ib_fmr *ibfmr)
-{
- struct rvt_fmr *fmr = to_ifmr(ibfmr);
- int ret = 0;
-
- rvt_free_lkey(&fmr->mr);
- rvt_put_mr(&fmr->mr); /* will set completion if last */
- ret = rvt_check_refs(&fmr->mr, __func__);
- if (ret)
- goto out;
- rvt_deinit_mregion(&fmr->mr);
- kfree(fmr);
-out:
- return ret;
-}
-
-/**
* rvt_sge_adjacent - is isge compressible
* @last_sge: last outgoing SGE written
* @sge: SGE to check
diff --git a/drivers/infiniband/sw/rdmavt/mr.h b/drivers/infiniband/sw/rdmavt/mr.h
index 2c8d075..780fc63 100644
--- a/drivers/infiniband/sw/rdmavt/mr.h
+++ b/drivers/infiniband/sw/rdmavt/mr.h
@@ -49,10 +49,6 @@
*/
#include <rdma/rdma_vt.h>
-struct rvt_fmr {
- struct ib_fmr ibfmr;
- struct rvt_mregion mr; /* must be last */
-};
struct rvt_mr {
struct ib_mr ibmr;
@@ -60,11 +56,6 @@ struct rvt_mr {
struct rvt_mregion mr; /* must be last */
};
-static inline struct rvt_fmr *to_ifmr(struct ib_fmr *ibfmr)
-{
- return container_of(ibfmr, struct rvt_fmr, ibfmr);
-}
-
static inline struct rvt_mr *to_imr(struct ib_mr *ibmr)
{
return container_of(ibmr, struct rvt_mr, ibmr);
@@ -83,11 +74,5 @@ struct ib_mr *rvt_alloc_mr(struct ib_pd *pd, enum ib_mr_type mr_type,
u32 max_num_sg, struct ib_udata *udata);
int rvt_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
int sg_nents, unsigned int *sg_offset);
-struct ib_fmr *rvt_alloc_fmr(struct ib_pd *pd, int mr_access_flags,
- struct ib_fmr_attr *fmr_attr);
-int rvt_map_phys_fmr(struct ib_fmr *ibfmr, u64 *page_list,
- int list_len, u64 iova);
-int rvt_unmap_fmr(struct list_head *fmr_list);
-int rvt_dealloc_fmr(struct ib_fmr *ibfmr);
#endif /* DEF_RVTMR_H */
diff --git a/drivers/infiniband/sw/rdmavt/vt.c b/drivers/infiniband/sw/rdmavt/vt.c
index 72b031a..f904bb3 100644
--- a/drivers/infiniband/sw/rdmavt/vt.c
+++ b/drivers/infiniband/sw/rdmavt/vt.c
@@ -378,7 +378,6 @@ enum {
static const struct ib_device_ops rvt_dev_ops = {
.uverbs_abi_ver = RVT_UVERBS_ABI_VERSION,
- .alloc_fmr = rvt_alloc_fmr,
.alloc_mr = rvt_alloc_mr,
.alloc_pd = rvt_alloc_pd,
.alloc_ucontext = rvt_alloc_ucontext,
@@ -387,7 +386,6 @@ enum {
.create_cq = rvt_create_cq,
.create_qp = rvt_create_qp,
.create_srq = rvt_create_srq,
- .dealloc_fmr = rvt_dealloc_fmr,
.dealloc_pd = rvt_dealloc_pd,
.dealloc_ucontext = rvt_dealloc_ucontext,
.dereg_mr = rvt_dereg_mr,
@@ -399,7 +397,6 @@ enum {
.get_dma_mr = rvt_get_dma_mr,
.get_port_immutable = rvt_get_port_immutable,
.map_mr_sg = rvt_map_mr_sg,
- .map_phys_fmr = rvt_map_phys_fmr,
.mmap = rvt_mmap,
.modify_ah = rvt_modify_ah,
.modify_device = rvt_modify_device,
@@ -420,7 +417,6 @@ enum {
.reg_user_mr = rvt_reg_user_mr,
.req_notify_cq = rvt_req_notify_cq,
.resize_cq = rvt_resize_cq,
- .unmap_fmr = rvt_unmap_fmr,
INIT_RDMA_OBJ_SIZE(ib_ah, rvt_ah, ibah),
INIT_RDMA_OBJ_SIZE(ib_cq, rvt_cq, ibcq),
--
1.8.3.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 5/9] RDMA/rdmavt: remove FMR memory registration
2020-05-27 9:46 ` [PATCH 5/9] RDMA/rdmavt: remove FMR " Max Gurtovoy
@ 2020-05-27 15:56 ` Dennis Dalessandro
0 siblings, 0 replies; 14+ messages in thread
From: Dennis Dalessandro @ 2020-05-27 15:56 UTC (permalink / raw)
To: Max Gurtovoy, jgg, dledford, leon, galpress, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom
Cc: aron.silverton, israelr, oren, shlomin, vladimirk
On 5/27/2020 5:46 AM, Max Gurtovoy wrote:
> Use FRWR method to register memory by default and remove the ancient and
> unsafe FMR method.
>
> Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
See the expected failures.
Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 6/9] RDMA/iser: Remove support for FMR memory registration
2020-05-27 9:46 [PATCH 0/9 v2] Remove FMR support from RDMA drivers Max Gurtovoy
` (4 preceding siblings ...)
2020-05-27 9:46 ` [PATCH 5/9] RDMA/rdmavt: remove FMR " Max Gurtovoy
@ 2020-05-27 9:46 ` Max Gurtovoy
2020-05-27 9:46 ` [PATCH 7/9] RDMA/srp: remove " Max Gurtovoy
` (3 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Max Gurtovoy @ 2020-05-27 9:46 UTC (permalink / raw)
To: jgg, dledford, leon, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom
Cc: aron.silverton, israelr, oren, shlomin, vladimirk, Max Gurtovoy
From: Israel Rukshin <israelr@mellanox.com>
FMR is not supported on most recent RDMA devices (that use fast memory
registration mechanism). Also, FMR was recently removed from NFS/RDMA
ULP.
Signed-off-by: Israel Rukshin <israelr@mellanox.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
---
drivers/infiniband/ulp/iser/iscsi_iser.h | 79 +----------
drivers/infiniband/ulp/iser/iser_initiator.c | 19 ++-
drivers/infiniband/ulp/iser/iser_memory.c | 188 ++-------------------------
drivers/infiniband/ulp/iser/iser_verbs.c | 126 +++---------------
4 files changed, 40 insertions(+), 372 deletions(-)
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.h b/drivers/infiniband/ulp/iser/iscsi_iser.h
index 029c001..1d77c7f 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.h
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.h
@@ -65,7 +65,6 @@
#include <linux/in6.h>
#include <rdma/ib_verbs.h>
-#include <rdma/ib_fmr_pool.h>
#include <rdma/rdma_cm.h>
#define DRV_NAME "iser"
@@ -313,33 +312,6 @@ struct iser_comp {
};
/**
- * struct iser_reg_ops - Memory registration operations
- * per-device registration schemes
- *
- * @alloc_reg_res: Allocate registration resources
- * @free_reg_res: Free registration resources
- * @reg_mem: Register memory buffers
- * @unreg_mem: Un-register memory buffers
- * @reg_desc_get: Get a registration descriptor for pool
- * @reg_desc_put: Get a registration descriptor to pool
- */
-struct iser_reg_ops {
- int (*alloc_reg_res)(struct ib_conn *ib_conn,
- unsigned cmds_max,
- unsigned int size);
- void (*free_reg_res)(struct ib_conn *ib_conn);
- int (*reg_mem)(struct iscsi_iser_task *iser_task,
- struct iser_data_buf *mem,
- struct iser_reg_resources *rsc,
- struct iser_mem_reg *reg);
- void (*unreg_mem)(struct iscsi_iser_task *iser_task,
- enum iser_data_dir cmd_dir);
- struct iser_fr_desc * (*reg_desc_get)(struct ib_conn *ib_conn);
- void (*reg_desc_put)(struct ib_conn *ib_conn,
- struct iser_fr_desc *desc);
-};
-
-/**
* struct iser_device - iSER device handle
*
* @ib_device: RDMA device
@@ -351,8 +323,6 @@ struct iser_reg_ops {
* @comps_used: Number of completion contexts used, Min between online
* cpus and device max completion vectors
* @comps: Dinamically allocated array of completion handlers
- * @reg_ops: Registration ops
- * @remote_inv_sup: Remote invalidate is supported on this device
*/
struct iser_device {
struct ib_device *ib_device;
@@ -362,26 +332,18 @@ struct iser_device {
int refcount;
int comps_used;
struct iser_comp *comps;
- const struct iser_reg_ops *reg_ops;
- bool remote_inv_sup;
};
/**
* struct iser_reg_resources - Fast registration resources
*
* @mr: memory region
- * @fmr_pool: pool of fmrs
* @sig_mr: signature memory region
- * @page_vec: fast reg page list used by fmr pool
* @mr_valid: is mr valid indicator
*/
struct iser_reg_resources {
- union {
- struct ib_mr *mr;
- struct ib_fmr_pool *fmr_pool;
- };
+ struct ib_mr *mr;
struct ib_mr *sig_mr;
- struct iser_page_vec *page_vec;
u8 mr_valid:1;
};
@@ -403,7 +365,7 @@ struct iser_fr_desc {
* struct iser_fr_pool - connection fast registration pool
*
* @list: list of fastreg descriptors
- * @lock: protects fmr/fastreg pool
+ * @lock: protects fastreg pool
* @size: size of the pool
*/
struct iser_fr_pool {
@@ -518,12 +480,6 @@ struct iscsi_iser_task {
struct iser_data_buf prot[ISER_DIRS_NUM];
};
-struct iser_page_vec {
- u64 *pages;
- int npages;
- struct ib_mr fake_mr;
-};
-
/**
* struct iser_global - iSER global context
*
@@ -548,8 +504,6 @@ struct iser_global {
extern unsigned int iser_max_sectors;
extern bool iser_always_reg;
-int iser_assign_reg_ops(struct iser_device *device);
-
int iser_send_control(struct iscsi_conn *conn,
struct iscsi_task *task);
@@ -591,22 +545,17 @@ void iser_finalize_rdma_unaligned_sg(struct iscsi_iser_task *iser_task,
struct iser_data_buf *mem,
enum iser_data_dir cmd_dir);
-int iser_reg_rdma_mem(struct iscsi_iser_task *task,
- enum iser_data_dir dir,
- bool all_imm);
-void iser_unreg_rdma_mem(struct iscsi_iser_task *task,
- enum iser_data_dir dir);
+int iser_reg_mem_fastreg(struct iscsi_iser_task *task,
+ enum iser_data_dir dir,
+ bool all_imm);
+void iser_unreg_mem_fastreg(struct iscsi_iser_task *task,
+ enum iser_data_dir dir);
int iser_connect(struct iser_conn *iser_conn,
struct sockaddr *src_addr,
struct sockaddr *dst_addr,
int non_blocking);
-void iser_unreg_mem_fmr(struct iscsi_iser_task *iser_task,
- enum iser_data_dir cmd_dir);
-void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task,
- enum iser_data_dir cmd_dir);
-
int iser_post_recvl(struct iser_conn *iser_conn);
int iser_post_recvm(struct iser_conn *iser_conn, int count);
int iser_post_send(struct ib_conn *ib_conn, struct iser_tx_desc *tx_desc,
@@ -625,26 +574,12 @@ int iser_initialize_task_headers(struct iscsi_task *task,
struct iser_tx_desc *tx_desc);
int iser_alloc_rx_descriptors(struct iser_conn *iser_conn,
struct iscsi_session *session);
-int iser_alloc_fmr_pool(struct ib_conn *ib_conn,
- unsigned cmds_max,
- unsigned int size);
-void iser_free_fmr_pool(struct ib_conn *ib_conn);
int iser_alloc_fastreg_pool(struct ib_conn *ib_conn,
unsigned cmds_max,
unsigned int size);
void iser_free_fastreg_pool(struct ib_conn *ib_conn);
u8 iser_check_task_pi_status(struct iscsi_iser_task *iser_task,
enum iser_data_dir cmd_dir, sector_t *sector);
-struct iser_fr_desc *
-iser_reg_desc_get_fr(struct ib_conn *ib_conn);
-void
-iser_reg_desc_put_fr(struct ib_conn *ib_conn,
- struct iser_fr_desc *desc);
-struct iser_fr_desc *
-iser_reg_desc_get_fmr(struct ib_conn *ib_conn);
-void
-iser_reg_desc_put_fmr(struct ib_conn *ib_conn,
- struct iser_fr_desc *desc);
static inline struct iser_conn *
to_iser_conn(struct ib_conn *ib_conn)
diff --git a/drivers/infiniband/ulp/iser/iser_initiator.c b/drivers/infiniband/ulp/iser/iser_initiator.c
index 4a7045b..27a6f75 100644
--- a/drivers/infiniband/ulp/iser/iser_initiator.c
+++ b/drivers/infiniband/ulp/iser/iser_initiator.c
@@ -72,7 +72,7 @@ static int iser_prepare_read_cmd(struct iscsi_task *task)
return err;
}
- err = iser_reg_rdma_mem(iser_task, ISER_DIR_IN, false);
+ err = iser_reg_mem_fastreg(iser_task, ISER_DIR_IN, false);
if (err) {
iser_err("Failed to set up Data-IN RDMA\n");
return err;
@@ -126,8 +126,8 @@ static int iser_prepare_read_cmd(struct iscsi_task *task)
return err;
}
- err = iser_reg_rdma_mem(iser_task, ISER_DIR_OUT,
- buf_out->data_len == imm_sz);
+ err = iser_reg_mem_fastreg(iser_task, ISER_DIR_OUT,
+ buf_out->data_len == imm_sz);
if (err != 0) {
iser_err("Failed to register write cmd RDMA mem\n");
return err;
@@ -250,8 +250,8 @@ int iser_alloc_rx_descriptors(struct iser_conn *iser_conn,
iser_conn->qp_max_recv_dtos_mask = session->cmds_max - 1; /* cmds_max is 2^N */
iser_conn->min_posted_rx = iser_conn->qp_max_recv_dtos >> 2;
- if (device->reg_ops->alloc_reg_res(ib_conn, session->scsi_cmds_max,
- iser_conn->pages_per_mr))
+ if (iser_alloc_fastreg_pool(ib_conn, session->scsi_cmds_max,
+ iser_conn->pages_per_mr))
goto create_rdma_reg_res_failed;
if (iser_alloc_login_buf(iser_conn))
@@ -293,7 +293,7 @@ int iser_alloc_rx_descriptors(struct iser_conn *iser_conn,
rx_desc_alloc_fail:
iser_free_login_buf(iser_conn);
alloc_login_buf_fail:
- device->reg_ops->free_reg_res(ib_conn);
+ iser_free_fastreg_pool(ib_conn);
create_rdma_reg_res_failed:
iser_err("failed allocating rx descriptors / data buffers\n");
return -ENOMEM;
@@ -306,8 +306,7 @@ void iser_free_rx_descriptors(struct iser_conn *iser_conn)
struct ib_conn *ib_conn = &iser_conn->ib_conn;
struct iser_device *device = ib_conn->device;
- if (device->reg_ops->free_reg_res)
- device->reg_ops->free_reg_res(ib_conn);
+ iser_free_fastreg_pool(ib_conn);
rx_desc = iser_conn->rx_descs;
for (i = 0; i < iser_conn->qp_max_recv_dtos; i++, rx_desc++)
@@ -768,7 +767,7 @@ void iser_task_rdma_finalize(struct iscsi_iser_task *iser_task)
int prot_count = scsi_prot_sg_count(iser_task->sc);
if (iser_task->dir[ISER_DIR_IN]) {
- iser_unreg_rdma_mem(iser_task, ISER_DIR_IN);
+ iser_unreg_mem_fastreg(iser_task, ISER_DIR_IN);
iser_dma_unmap_task_data(iser_task,
&iser_task->data[ISER_DIR_IN],
DMA_FROM_DEVICE);
@@ -779,7 +778,7 @@ void iser_task_rdma_finalize(struct iscsi_iser_task *iser_task)
}
if (iser_task->dir[ISER_DIR_OUT]) {
- iser_unreg_rdma_mem(iser_task, ISER_DIR_OUT);
+ iser_unreg_mem_fastreg(iser_task, ISER_DIR_OUT);
iser_dma_unmap_task_data(iser_task,
&iser_task->data[ISER_DIR_OUT],
DMA_TO_DEVICE);
diff --git a/drivers/infiniband/ulp/iser/iser_memory.c b/drivers/infiniband/ulp/iser/iser_memory.c
index 999ef7c..d4e057f 100644
--- a/drivers/infiniband/ulp/iser/iser_memory.c
+++ b/drivers/infiniband/ulp/iser/iser_memory.c
@@ -38,62 +38,13 @@
#include <linux/scatterlist.h>
#include "iscsi_iser.h"
-static
-int iser_fast_reg_fmr(struct iscsi_iser_task *iser_task,
- struct iser_data_buf *mem,
- struct iser_reg_resources *rsc,
- struct iser_mem_reg *mem_reg);
-static
-int iser_fast_reg_mr(struct iscsi_iser_task *iser_task,
- struct iser_data_buf *mem,
- struct iser_reg_resources *rsc,
- struct iser_mem_reg *mem_reg);
-
-static const struct iser_reg_ops fastreg_ops = {
- .alloc_reg_res = iser_alloc_fastreg_pool,
- .free_reg_res = iser_free_fastreg_pool,
- .reg_mem = iser_fast_reg_mr,
- .unreg_mem = iser_unreg_mem_fastreg,
- .reg_desc_get = iser_reg_desc_get_fr,
- .reg_desc_put = iser_reg_desc_put_fr,
-};
-
-static const struct iser_reg_ops fmr_ops = {
- .alloc_reg_res = iser_alloc_fmr_pool,
- .free_reg_res = iser_free_fmr_pool,
- .reg_mem = iser_fast_reg_fmr,
- .unreg_mem = iser_unreg_mem_fmr,
- .reg_desc_get = iser_reg_desc_get_fmr,
- .reg_desc_put = iser_reg_desc_put_fmr,
-};
void iser_reg_comp(struct ib_cq *cq, struct ib_wc *wc)
{
iser_err_comp(wc, "memreg");
}
-int iser_assign_reg_ops(struct iser_device *device)
-{
- struct ib_device *ib_dev = device->ib_device;
-
- /* Assign function handles - based on FMR support */
- if (ib_dev->ops.alloc_fmr && ib_dev->ops.dealloc_fmr &&
- ib_dev->ops.map_phys_fmr && ib_dev->ops.unmap_fmr) {
- iser_info("FMR supported, using FMR for registration\n");
- device->reg_ops = &fmr_ops;
- } else if (ib_dev->attrs.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS) {
- iser_info("FastReg supported, using FastReg for registration\n");
- device->reg_ops = &fastreg_ops;
- device->remote_inv_sup = iser_always_reg;
- } else {
- iser_err("IB device does not support FMRs nor FastRegs, can't register memory\n");
- return -1;
- }
-
- return 0;
-}
-
-struct iser_fr_desc *
+static struct iser_fr_desc *
iser_reg_desc_get_fr(struct ib_conn *ib_conn)
{
struct iser_fr_pool *fr_pool = &ib_conn->fr_pool;
@@ -109,7 +60,7 @@ struct iser_fr_desc *
return desc;
}
-void
+static void
iser_reg_desc_put_fr(struct ib_conn *ib_conn,
struct iser_fr_desc *desc)
{
@@ -121,44 +72,6 @@ struct iser_fr_desc *
spin_unlock_irqrestore(&fr_pool->lock, flags);
}
-struct iser_fr_desc *
-iser_reg_desc_get_fmr(struct ib_conn *ib_conn)
-{
- struct iser_fr_pool *fr_pool = &ib_conn->fr_pool;
-
- return list_first_entry(&fr_pool->list,
- struct iser_fr_desc, list);
-}
-
-void
-iser_reg_desc_put_fmr(struct ib_conn *ib_conn,
- struct iser_fr_desc *desc)
-{
-}
-
-static void iser_data_buf_dump(struct iser_data_buf *data,
- struct ib_device *ibdev)
-{
- struct scatterlist *sg;
- int i;
-
- for_each_sg(data->sg, sg, data->dma_nents, i)
- iser_dbg("sg[%d] dma_addr:0x%lX page:0x%p "
- "off:0x%x sz:0x%x dma_len:0x%x\n",
- i, (unsigned long)sg_dma_address(sg),
- sg_page(sg), sg->offset, sg->length, sg_dma_len(sg));
-}
-
-static void iser_dump_page_vec(struct iser_page_vec *page_vec)
-{
- int i;
-
- iser_err("page vec npages %d data length %lld\n",
- page_vec->npages, page_vec->fake_mr.length);
- for (i = 0; i < page_vec->npages; i++)
- iser_err("vec[%d]: %llx\n", i, page_vec->pages[i]);
-}
-
int iser_dma_map_task_data(struct iscsi_iser_task *iser_task,
struct iser_data_buf *data,
enum iser_data_dir iser_dir,
@@ -213,84 +126,9 @@ void iser_dma_unmap_task_data(struct iscsi_iser_task *iser_task,
return 0;
}
-static int iser_set_page(struct ib_mr *mr, u64 addr)
-{
- struct iser_page_vec *page_vec =
- container_of(mr, struct iser_page_vec, fake_mr);
-
- page_vec->pages[page_vec->npages++] = addr;
-
- return 0;
-}
-
-static
-int iser_fast_reg_fmr(struct iscsi_iser_task *iser_task,
- struct iser_data_buf *mem,
- struct iser_reg_resources *rsc,
- struct iser_mem_reg *reg)
-{
- struct ib_conn *ib_conn = &iser_task->iser_conn->ib_conn;
- struct iser_device *device = ib_conn->device;
- struct iser_page_vec *page_vec = rsc->page_vec;
- struct ib_fmr_pool *fmr_pool = rsc->fmr_pool;
- struct ib_pool_fmr *fmr;
- int ret, plen;
-
- page_vec->npages = 0;
- page_vec->fake_mr.page_size = SZ_4K;
- plen = ib_sg_to_pages(&page_vec->fake_mr, mem->sg,
- mem->dma_nents, NULL, iser_set_page);
- if (unlikely(plen < mem->dma_nents)) {
- iser_err("page vec too short to hold this SG\n");
- iser_data_buf_dump(mem, device->ib_device);
- iser_dump_page_vec(page_vec);
- return -EINVAL;
- }
-
- fmr = ib_fmr_pool_map_phys(fmr_pool, page_vec->pages,
- page_vec->npages, page_vec->pages[0]);
- if (IS_ERR(fmr)) {
- ret = PTR_ERR(fmr);
- iser_err("ib_fmr_pool_map_phys failed: %d\n", ret);
- return ret;
- }
-
- reg->sge.lkey = fmr->fmr->lkey;
- reg->rkey = fmr->fmr->rkey;
- reg->sge.addr = page_vec->fake_mr.iova;
- reg->sge.length = page_vec->fake_mr.length;
- reg->mem_h = fmr;
-
- iser_dbg("fmr reg: lkey=0x%x, rkey=0x%x, addr=0x%llx,"
- " length=0x%x\n", reg->sge.lkey, reg->rkey,
- reg->sge.addr, reg->sge.length);
-
- return 0;
-}
-
-/**
- * Unregister (previosuly registered using FMR) memory.
- * If memory is non-FMR does nothing.
- */
-void iser_unreg_mem_fmr(struct iscsi_iser_task *iser_task,
- enum iser_data_dir cmd_dir)
-{
- struct iser_mem_reg *reg = &iser_task->rdma_reg[cmd_dir];
-
- if (!reg->mem_h)
- return;
-
- iser_dbg("PHYSICAL Mem.Unregister mem_h %p\n", reg->mem_h);
-
- ib_fmr_pool_unmap((struct ib_pool_fmr *)reg->mem_h);
-
- reg->mem_h = NULL;
-}
-
void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task,
enum iser_data_dir cmd_dir)
{
- struct iser_device *device = iser_task->iser_conn->ib_conn.device;
struct iser_mem_reg *reg = &iser_task->rdma_reg[cmd_dir];
struct iser_fr_desc *desc;
struct ib_mr_status mr_status;
@@ -312,7 +150,7 @@ void iser_unreg_mem_fastreg(struct iscsi_iser_task *iser_task,
ib_check_mr_status(desc->rsc.sig_mr, IB_MR_CHECK_SIG_STATUS,
&mr_status);
}
- device->reg_ops->reg_desc_put(&iser_task->iser_conn->ib_conn, desc);
+ iser_reg_desc_put_fr(&iser_task->iser_conn->ib_conn, reg->mem_h);
reg->mem_h = NULL;
}
@@ -509,15 +347,14 @@ static int iser_fast_reg_mr(struct iscsi_iser_task *iser_task,
if (use_dma_key)
return iser_reg_dma(device, mem, reg);
- return device->reg_ops->reg_mem(task, mem, &desc->rsc, reg);
+ return iser_fast_reg_mr(task, mem, &desc->rsc, reg);
}
-int iser_reg_rdma_mem(struct iscsi_iser_task *task,
- enum iser_data_dir dir,
- bool all_imm)
+int iser_reg_mem_fastreg(struct iscsi_iser_task *task,
+ enum iser_data_dir dir,
+ bool all_imm)
{
struct ib_conn *ib_conn = &task->iser_conn->ib_conn;
- struct iser_device *device = ib_conn->device;
struct iser_data_buf *mem = &task->data[dir];
struct iser_mem_reg *reg = &task->rdma_reg[dir];
struct iser_fr_desc *desc = NULL;
@@ -528,7 +365,7 @@ int iser_reg_rdma_mem(struct iscsi_iser_task *task,
scsi_get_prot_op(task->sc) == SCSI_PROT_NORMAL;
if (!use_dma_key) {
- desc = device->reg_ops->reg_desc_get(ib_conn);
+ desc = iser_reg_desc_get_fr(ib_conn);
reg->mem_h = desc;
}
@@ -549,15 +386,8 @@ int iser_reg_rdma_mem(struct iscsi_iser_task *task,
err_reg:
if (desc)
- device->reg_ops->reg_desc_put(ib_conn, desc);
+ iser_reg_desc_put_fr(ib_conn, desc);
return err;
}
-void iser_unreg_rdma_mem(struct iscsi_iser_task *task,
- enum iser_data_dir dir)
-{
- struct iser_device *device = task->iser_conn->ib_conn.device;
-
- device->reg_ops->unreg_mem(task, dir);
-}
diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 127887c..c1f44c4 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -68,11 +68,12 @@ static void iser_event_handler(struct ib_event_handler *handler,
static int iser_create_device_ib_res(struct iser_device *device)
{
struct ib_device *ib_dev = device->ib_device;
- int ret, i, max_cqe;
+ int i, max_cqe;
- ret = iser_assign_reg_ops(device);
- if (ret)
- return ret;
+ if (!(ib_dev->attrs.device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS)) {
+ iser_err("IB device does not support memory registrations\n");
+ return -1;
+ }
device->comps_used = min_t(int, num_online_cpus(),
ib_dev->num_comp_vectors);
@@ -147,96 +148,6 @@ static void iser_free_device_ib_res(struct iser_device *device)
device->pd = NULL;
}
-/**
- * iser_alloc_fmr_pool - Creates FMR pool and page_vector
- * @ib_conn: connection RDMA resources
- * @cmds_max: max number of SCSI commands for this connection
- * @size: max number of pages per map request
- *
- * Return: 0 on success, or errno code on failure
- */
-int iser_alloc_fmr_pool(struct ib_conn *ib_conn,
- unsigned cmds_max,
- unsigned int size)
-{
- struct iser_device *device = ib_conn->device;
- struct iser_fr_pool *fr_pool = &ib_conn->fr_pool;
- struct iser_page_vec *page_vec;
- struct iser_fr_desc *desc;
- struct ib_fmr_pool *fmr_pool;
- struct ib_fmr_pool_param params;
- int ret;
-
- INIT_LIST_HEAD(&fr_pool->list);
- spin_lock_init(&fr_pool->lock);
-
- desc = kzalloc(sizeof(*desc), GFP_KERNEL);
- if (!desc)
- return -ENOMEM;
-
- page_vec = kmalloc(sizeof(*page_vec) + (sizeof(u64) * size),
- GFP_KERNEL);
- if (!page_vec) {
- ret = -ENOMEM;
- goto err_frpl;
- }
-
- page_vec->pages = (u64 *)(page_vec + 1);
-
- params.page_shift = ilog2(SZ_4K);
- params.max_pages_per_fmr = size;
- /* make the pool size twice the max number of SCSI commands *
- * the ML is expected to queue, watermark for unmap at 50% */
- params.pool_size = cmds_max * 2;
- params.dirty_watermark = cmds_max;
- params.cache = 0;
- params.flush_function = NULL;
- params.access = (IB_ACCESS_LOCAL_WRITE |
- IB_ACCESS_REMOTE_WRITE |
- IB_ACCESS_REMOTE_READ);
-
- fmr_pool = ib_create_fmr_pool(device->pd, ¶ms);
- if (IS_ERR(fmr_pool)) {
- ret = PTR_ERR(fmr_pool);
- iser_err("FMR allocation failed, err %d\n", ret);
- goto err_fmr;
- }
-
- desc->rsc.page_vec = page_vec;
- desc->rsc.fmr_pool = fmr_pool;
- list_add(&desc->list, &fr_pool->list);
-
- return 0;
-
-err_fmr:
- kfree(page_vec);
-err_frpl:
- kfree(desc);
-
- return ret;
-}
-
-/**
- * iser_free_fmr_pool - releases the FMR pool and page vec
- * @ib_conn: connection RDMA resources
- */
-void iser_free_fmr_pool(struct ib_conn *ib_conn)
-{
- struct iser_fr_pool *fr_pool = &ib_conn->fr_pool;
- struct iser_fr_desc *desc;
-
- desc = list_first_entry(&fr_pool->list,
- struct iser_fr_desc, list);
- list_del(&desc->list);
-
- iser_info("freeing conn %p fmr pool %p\n",
- ib_conn, desc->rsc.fmr_pool);
-
- ib_destroy_fmr_pool(desc->rsc.fmr_pool);
- kfree(desc->rsc.page_vec);
- kfree(desc);
-}
-
static struct iser_fr_desc *
iser_create_fastreg_desc(struct iser_device *device,
struct ib_pd *pd,
@@ -667,13 +578,12 @@ static void iser_connect_error(struct rdma_cm_id *cma_id)
u32 max_num_sg;
/*
- * FRs without SG_GAPS or FMRs can only map up to a (device) page per
- * entry, but if the first entry is misaligned we'll end up using two
- * entries (head and tail) for a single page worth data, so one
- * additional entry is required.
+ * FRs without SG_GAPS can only map up to a (device) page per entry,
+ * but if the first entry is misaligned we'll end up using two entries
+ * (head and tail) for a single page worth data, so one additional
+ * entry is required.
*/
- if ((attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS) &&
- (attr->device_cap_flags & IB_DEVICE_SG_GAPS_REG))
+ if (attr->device_cap_flags & IB_DEVICE_SG_GAPS_REG)
reserved_mr_pages = 0;
else
reserved_mr_pages = 1;
@@ -684,14 +594,8 @@ static void iser_connect_error(struct rdma_cm_id *cma_id)
max_num_sg = attr->max_fast_reg_page_list_len;
sg_tablesize = DIV_ROUND_UP(max_sectors * SECTOR_SIZE, SZ_4K);
- if (attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS)
- sup_sg_tablesize =
- min_t(
- uint, ISCSI_ISER_MAX_SG_TABLESIZE,
- max_num_sg - reserved_mr_pages);
- else
- sup_sg_tablesize = ISCSI_ISER_MAX_SG_TABLESIZE;
-
+ sup_sg_tablesize = min_t(uint, ISCSI_ISER_MAX_SG_TABLESIZE,
+ max_num_sg - reserved_mr_pages);
iser_conn->scsi_sg_tablesize = min(sg_tablesize, sup_sg_tablesize);
iser_conn->pages_per_mr =
iser_conn->scsi_sg_tablesize + reserved_mr_pages;
@@ -755,7 +659,7 @@ static void iser_route_handler(struct rdma_cm_id *cma_id)
struct iser_cm_hdr req_hdr;
struct iser_conn *iser_conn = (struct iser_conn *)cma_id->context;
struct ib_conn *ib_conn = &iser_conn->ib_conn;
- struct iser_device *device = ib_conn->device;
+ struct ib_device *ib_dev = ib_conn->device->ib_device;
if (iser_conn->state != ISER_CONN_PENDING)
/* bailout */
@@ -766,14 +670,14 @@ static void iser_route_handler(struct rdma_cm_id *cma_id)
goto failure;
memset(&conn_param, 0, sizeof conn_param);
- conn_param.responder_resources = device->ib_device->attrs.max_qp_rd_atom;
+ conn_param.responder_resources = ib_dev->attrs.max_qp_rd_atom;
conn_param.initiator_depth = 1;
conn_param.retry_count = 7;
conn_param.rnr_retry_count = 6;
memset(&req_hdr, 0, sizeof(req_hdr));
req_hdr.flags = ISER_ZBVA_NOT_SUP;
- if (!device->remote_inv_sup)
+ if (!iser_always_reg)
req_hdr.flags |= ISER_SEND_W_INV_NOT_SUP;
conn_param.private_data = (void *)&req_hdr;
conn_param.private_data_len = sizeof(struct iser_cm_hdr);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 7/9] RDMA/srp: remove support for FMR memory registration
2020-05-27 9:46 [PATCH 0/9 v2] Remove FMR support from RDMA drivers Max Gurtovoy
` (5 preceding siblings ...)
2020-05-27 9:46 ` [PATCH 6/9] RDMA/iser: Remove support for " Max Gurtovoy
@ 2020-05-27 9:46 ` Max Gurtovoy
2020-05-27 9:46 ` [PATCH 8/9] RDMA/core: remove FMR pool API Max Gurtovoy
` (2 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Max Gurtovoy @ 2020-05-27 9:46 UTC (permalink / raw)
To: jgg, dledford, leon, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom
Cc: aron.silverton, israelr, oren, shlomin, vladimirk, Max Gurtovoy
FMR is not supported on most recent RDMA devices (that use fast memory
registration mechanism). Also, FMR was recently removed from NFS/RDMA
ULP.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
Reviewed-by: Israel Rukshin <israelr@mellanox.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
---
drivers/infiniband/ulp/srp/ib_srp.c | 222 +++---------------------------------
drivers/infiniband/ulp/srp/ib_srp.h | 27 +----
2 files changed, 22 insertions(+), 227 deletions(-)
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index cd1181c..f276155 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -71,7 +71,6 @@
static unsigned int cmd_sg_entries;
static unsigned int indirect_sg_entries;
static bool allow_ext_sg;
-static bool prefer_fr = true;
static bool register_always = true;
static bool never_register;
static int topspin_workarounds = 1;
@@ -95,10 +94,6 @@
MODULE_PARM_DESC(topspin_workarounds,
"Enable workarounds for Topspin/Cisco SRP target bugs if != 0");
-module_param(prefer_fr, bool, 0444);
-MODULE_PARM_DESC(prefer_fr,
-"Whether to use fast registration if both FMR and fast registration are supported");
-
module_param(register_always, bool, 0444);
MODULE_PARM_DESC(register_always,
"Use memory registration even for contiguous memory regions");
@@ -388,24 +383,6 @@ static int srp_new_cm_id(struct srp_rdma_ch *ch)
srp_new_ib_cm_id(ch);
}
-static struct ib_fmr_pool *srp_alloc_fmr_pool(struct srp_target_port *target)
-{
- struct srp_device *dev = target->srp_host->srp_dev;
- struct ib_fmr_pool_param fmr_param;
-
- memset(&fmr_param, 0, sizeof(fmr_param));
- fmr_param.pool_size = target->mr_pool_size;
- fmr_param.dirty_watermark = fmr_param.pool_size / 4;
- fmr_param.cache = 1;
- fmr_param.max_pages_per_fmr = dev->max_pages_per_mr;
- fmr_param.page_shift = ilog2(dev->mr_page_size);
- fmr_param.access = (IB_ACCESS_LOCAL_WRITE |
- IB_ACCESS_REMOTE_WRITE |
- IB_ACCESS_REMOTE_READ);
-
- return ib_create_fmr_pool(dev->pd, &fmr_param);
-}
-
/**
* srp_destroy_fr_pool() - free the resources owned by a pool
* @pool: Fast registration pool to be destroyed.
@@ -556,7 +533,6 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch)
struct ib_qp_init_attr *init_attr;
struct ib_cq *recv_cq, *send_cq;
struct ib_qp *qp;
- struct ib_fmr_pool *fmr_pool = NULL;
struct srp_fr_pool *fr_pool = NULL;
const int m = 1 + dev->use_fast_reg * target->mr_per_cmd * 2;
int ret;
@@ -619,14 +595,6 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch)
"FR pool allocation failed (%d)\n", ret);
goto err_qp;
}
- } else if (dev->use_fmr) {
- fmr_pool = srp_alloc_fmr_pool(target);
- if (IS_ERR(fmr_pool)) {
- ret = PTR_ERR(fmr_pool);
- shost_printk(KERN_WARNING, target->scsi_host, PFX
- "FMR pool allocation failed (%d)\n", ret);
- goto err_qp;
- }
}
if (ch->qp)
@@ -644,10 +612,6 @@ static int srp_create_ch_ib(struct srp_rdma_ch *ch)
if (ch->fr_pool)
srp_destroy_fr_pool(ch->fr_pool);
ch->fr_pool = fr_pool;
- } else if (dev->use_fmr) {
- if (ch->fmr_pool)
- ib_destroy_fmr_pool(ch->fmr_pool);
- ch->fmr_pool = fmr_pool;
}
kfree(init_attr);
@@ -702,9 +666,6 @@ static void srp_free_ch_ib(struct srp_target_port *target,
if (dev->use_fast_reg) {
if (ch->fr_pool)
srp_destroy_fr_pool(ch->fr_pool);
- } else if (dev->use_fmr) {
- if (ch->fmr_pool)
- ib_destroy_fmr_pool(ch->fmr_pool);
}
srp_destroy_qp(ch);
@@ -1017,12 +978,8 @@ static void srp_free_req_data(struct srp_target_port *target,
for (i = 0; i < target->req_ring_size; ++i) {
req = &ch->req_ring[i];
- if (dev->use_fast_reg) {
+ if (dev->use_fast_reg)
kfree(req->fr_list);
- } else {
- kfree(req->fmr_list);
- kfree(req->map_page);
- }
if (req->indirect_dma_addr) {
ib_dma_unmap_single(ibdev, req->indirect_dma_addr,
target->indirect_size,
@@ -1056,16 +1013,8 @@ static int srp_alloc_req_data(struct srp_rdma_ch *ch)
GFP_KERNEL);
if (!mr_list)
goto out;
- if (srp_dev->use_fast_reg) {
+ if (srp_dev->use_fast_reg)
req->fr_list = mr_list;
- } else {
- req->fmr_list = mr_list;
- req->map_page = kmalloc_array(srp_dev->max_pages_per_mr,
- sizeof(void *),
- GFP_KERNEL);
- if (!req->map_page)
- goto out;
- }
req->indirect_desc = kmalloc(target->indirect_size, GFP_KERNEL);
if (!req->indirect_desc)
goto out;
@@ -1272,11 +1221,6 @@ static void srp_unmap_data(struct scsi_cmnd *scmnd,
if (req->nmdesc)
srp_fr_pool_put(ch->fr_pool, req->fr_list,
req->nmdesc);
- } else if (dev->use_fmr) {
- struct ib_pool_fmr **pfmr;
-
- for (i = req->nmdesc, pfmr = req->fmr_list; i > 0; i--, pfmr++)
- ib_fmr_pool_unmap(*pfmr);
}
ib_dma_unmap_sg(ibdev, scsi_sglist(scmnd), scsi_sg_count(scmnd),
@@ -1472,50 +1416,6 @@ static void srp_map_desc(struct srp_map_state *state, dma_addr_t dma_addr,
state->ndesc++;
}
-static int srp_map_finish_fmr(struct srp_map_state *state,
- struct srp_rdma_ch *ch)
-{
- struct srp_target_port *target = ch->target;
- struct srp_device *dev = target->srp_host->srp_dev;
- struct ib_pool_fmr *fmr;
- u64 io_addr = 0;
-
- if (state->fmr.next >= state->fmr.end) {
- shost_printk(KERN_ERR, ch->target->scsi_host,
- PFX "Out of MRs (mr_per_cmd = %d)\n",
- ch->target->mr_per_cmd);
- return -ENOMEM;
- }
-
- WARN_ON_ONCE(!dev->use_fmr);
-
- if (state->npages == 0)
- return 0;
-
- if (state->npages == 1 && target->global_rkey) {
- srp_map_desc(state, state->base_dma_addr, state->dma_len,
- target->global_rkey);
- goto reset_state;
- }
-
- fmr = ib_fmr_pool_map_phys(ch->fmr_pool, state->pages,
- state->npages, io_addr);
- if (IS_ERR(fmr))
- return PTR_ERR(fmr);
-
- *state->fmr.next++ = fmr;
- state->nmdesc++;
-
- srp_map_desc(state, state->base_dma_addr & ~dev->mr_page_mask,
- state->dma_len, fmr->fmr->rkey);
-
-reset_state:
- state->npages = 0;
- state->dma_len = 0;
-
- return 0;
-}
-
static void srp_reg_mr_err_done(struct ib_cq *cq, struct ib_wc *wc)
{
srp_handle_qp_err(cq, wc, "FAST REG");
@@ -1606,74 +1506,6 @@ static int srp_map_finish_fr(struct srp_map_state *state,
return n;
}
-static int srp_map_sg_entry(struct srp_map_state *state,
- struct srp_rdma_ch *ch,
- struct scatterlist *sg)
-{
- struct srp_target_port *target = ch->target;
- struct srp_device *dev = target->srp_host->srp_dev;
- dma_addr_t dma_addr = sg_dma_address(sg);
- unsigned int dma_len = sg_dma_len(sg);
- unsigned int len = 0;
- int ret;
-
- WARN_ON_ONCE(!dma_len);
-
- while (dma_len) {
- unsigned offset = dma_addr & ~dev->mr_page_mask;
-
- if (state->npages == dev->max_pages_per_mr ||
- (state->npages > 0 && offset != 0)) {
- ret = srp_map_finish_fmr(state, ch);
- if (ret)
- return ret;
- }
-
- len = min_t(unsigned int, dma_len, dev->mr_page_size - offset);
-
- if (!state->npages)
- state->base_dma_addr = dma_addr;
- state->pages[state->npages++] = dma_addr & dev->mr_page_mask;
- state->dma_len += len;
- dma_addr += len;
- dma_len -= len;
- }
-
- /*
- * If the end of the MR is not on a page boundary then we need to
- * close it out and start a new one -- we can only merge at page
- * boundaries.
- */
- ret = 0;
- if ((dma_addr & ~dev->mr_page_mask) != 0)
- ret = srp_map_finish_fmr(state, ch);
- return ret;
-}
-
-static int srp_map_sg_fmr(struct srp_map_state *state, struct srp_rdma_ch *ch,
- struct srp_request *req, struct scatterlist *scat,
- int count)
-{
- struct scatterlist *sg;
- int i, ret;
-
- state->pages = req->map_page;
- state->fmr.next = req->fmr_list;
- state->fmr.end = req->fmr_list + ch->target->mr_per_cmd;
-
- for_each_sg(scat, sg, count, i) {
- ret = srp_map_sg_entry(state, ch, sg);
- if (ret)
- return ret;
- }
-
- ret = srp_map_finish_fmr(state, ch);
- if (ret)
- return ret;
-
- return 0;
-}
-
static int srp_map_sg_fr(struct srp_map_state *state, struct srp_rdma_ch *ch,
struct srp_request *req, struct scatterlist *scat,
int count)
@@ -1733,7 +1565,6 @@ static int srp_map_idb(struct srp_rdma_ch *ch, struct srp_request *req,
struct srp_device *dev = target->srp_host->srp_dev;
struct srp_map_state state;
struct srp_direct_buf idb_desc;
- u64 idb_pages[1];
struct scatterlist idb_sg[1];
int ret;
@@ -1756,14 +1587,6 @@ static int srp_map_idb(struct srp_rdma_ch *ch, struct srp_request *req,
if (ret < 0)
return ret;
WARN_ON_ONCE(ret < 1);
- } else if (dev->use_fmr) {
- state.pages = idb_pages;
- state.pages[0] = (req->indirect_dma_addr &
- dev->mr_page_mask);
- state.npages = 1;
- ret = srp_map_finish_fmr(&state, ch);
- if (ret < 0)
- return ret;
} else {
return -EINVAL;
}
@@ -1787,9 +1610,6 @@ static void srp_check_mapping(struct srp_map_state *state,
if (dev->use_fast_reg)
for (i = 0, pfr = req->fr_list; i < state->nmdesc; i++, pfr++)
mr_len += (*pfr)->mr->length;
- else if (dev->use_fmr)
- for (i = 0; i < state->nmdesc; i++)
- mr_len += be32_to_cpu(req->indirect_desc[i].len);
if (desc_len != scsi_bufflen(req->scmnd) ||
mr_len > scsi_bufflen(req->scmnd))
pr_err("Inconsistent: scsi len %d <> desc len %lld <> mr len %lld; ndesc %d; nmdesc = %d\n",
@@ -1904,8 +1724,6 @@ static int srp_map_data(struct scsi_cmnd *scmnd, struct srp_rdma_ch *ch,
state.desc = req->indirect_desc;
if (dev->use_fast_reg)
ret = srp_map_sg_fr(&state, ch, req, scat, count);
- else if (dev->use_fmr)
- ret = srp_map_sg_fmr(&state, ch, req, scat, count);
else
ret = srp_map_sg_dma(&state, ch, req, scat, count);
req->nmdesc = state.nmdesc;
@@ -3864,13 +3682,13 @@ static ssize_t srp_create_target(struct device *dev,
goto out;
}
- if (!srp_dev->has_fmr && !srp_dev->has_fr && !target->allow_ext_sg &&
+ if (!srp_dev->has_fr && !target->allow_ext_sg &&
target->cmd_sg_cnt < target->sg_tablesize) {
pr_warn("No MR pool and no external indirect descriptors, limiting sg_tablesize to cmd_sg_cnt\n");
target->sg_tablesize = target->cmd_sg_cnt;
}
- if (srp_dev->use_fast_reg || srp_dev->use_fmr) {
+ if (srp_dev->use_fast_reg) {
bool gaps_reg = (ibdev->attrs.device_cap_flags &
IB_DEVICE_SG_GAPS_REG);
@@ -3878,12 +3696,12 @@ static ssize_t srp_create_target(struct device *dev,
(ilog2(srp_dev->mr_page_size) - 9);
if (!gaps_reg) {
/*
- * FR and FMR can only map one HCA page per entry. If
- * the start address is not aligned on a HCA page
- * boundary two entries will be used for the head and
- * the tail although these two entries combined
- * contain at most one HCA page of data. Hence the "+
- * 1" in the calculation below.
+ * FR can only map one HCA page per entry. If the start
+ * address is not aligned on a HCA page boundary two
+ * entries will be used for the head and the tail
+ * although these two entries combined contain at most
+ * one HCA page of data. Hence the "+ 1" in the
+ * calculation below.
*
* The indirect data buffer descriptor is contiguous
* so the memory for that buffer will only be
@@ -4162,23 +3980,15 @@ static void srp_add_one(struct ib_device *device)
srp_dev->max_pages_per_mr = min_t(u64, SRP_MAX_PAGES_PER_MR,
max_pages_per_mr);
- srp_dev->has_fmr = (device->ops.alloc_fmr &&
- device->ops.dealloc_fmr &&
- device->ops.map_phys_fmr &&
- device->ops.unmap_fmr);
srp_dev->has_fr = (attr->device_cap_flags &
IB_DEVICE_MEM_MGT_EXTENSIONS);
- if (!never_register && !srp_dev->has_fmr && !srp_dev->has_fr) {
- dev_warn(&device->dev, "neither FMR nor FR is supported\n");
- } else if (!never_register &&
- attr->max_mr_size >= 2 * srp_dev->mr_page_size) {
- srp_dev->use_fast_reg = (srp_dev->has_fr &&
- (!srp_dev->has_fmr || prefer_fr));
- srp_dev->use_fmr = !srp_dev->use_fast_reg && srp_dev->has_fmr;
- }
+ if (!never_register && !srp_dev->has_fr)
+ dev_warn(&device->dev, "FR is not supported\n");
+ else if (!never_register &&
+ attr->max_mr_size >= 2 * srp_dev->mr_page_size)
+ srp_dev->use_fast_reg = srp_dev->has_fr;
- if (never_register || !register_always ||
- (!srp_dev->has_fmr && !srp_dev->has_fr))
+ if (never_register || !register_always || !srp_dev->has_fr)
flags |= IB_PD_UNSAFE_GLOBAL_RKEY;
if (srp_dev->use_fast_reg) {
diff --git a/drivers/infiniband/ulp/srp/ib_srp.h b/drivers/infiniband/ulp/srp/ib_srp.h
index 6fabcc2..6818cac 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.h
+++ b/drivers/infiniband/ulp/srp/ib_srp.h
@@ -44,7 +44,6 @@
#include <rdma/ib_verbs.h>
#include <rdma/ib_sa.h>
#include <rdma/ib_cm.h>
-#include <rdma/ib_fmr_pool.h>
#include <rdma/rdma_cm.h>
enum {
@@ -95,8 +94,7 @@ enum srp_iu_type {
/*
* @mr_page_mask: HCA memory registration page mask.
* @mr_page_size: HCA memory registration page size.
- * @mr_max_size: Maximum size in bytes of a single FMR / FR registration
- * request.
+ * @mr_max_size: Maximum size in bytes of a single FR registration request.
*/
struct srp_device {
struct list_head dev_list;
@@ -107,9 +105,7 @@ struct srp_device {
int mr_page_size;
int mr_max_size;
int max_pages_per_mr;
- bool has_fmr;
bool has_fr;
- bool use_fmr;
bool use_fast_reg;
};
@@ -127,11 +123,7 @@ struct srp_host {
struct srp_request {
struct scsi_cmnd *scmnd;
struct srp_iu *cmd;
- union {
- struct ib_pool_fmr **fmr_list;
- struct srp_fr_desc **fr_list;
- };
- u64 *map_page;
+ struct srp_fr_desc **fr_list;
struct srp_direct_buf *indirect_desc;
dma_addr_t indirect_dma_addr;
short nmdesc;
@@ -155,10 +147,7 @@ struct srp_rdma_ch {
struct ib_cq *send_cq;
struct ib_cq *recv_cq;
struct ib_qp *qp;
- union {
- struct ib_fmr_pool *fmr_pool;
- struct srp_fr_pool *fr_pool;
- };
+ struct srp_fr_pool *fr_pool;
uint32_t max_it_iu_len;
uint32_t max_ti_iu_len;
u8 max_imm_sge;
@@ -319,20 +308,16 @@ struct srp_fr_pool {
* @pages: Array with DMA addresses of pages being considered for
* memory registration.
* @base_dma_addr: DMA address of the first page that has not yet been mapped.
- * @dma_len: Number of bytes that will be registered with the next
- * FMR or FR memory registration call.
+ * @dma_len: Number of bytes that will be registered with the next FR
+ * memory registration call.
* @total_len: Total number of bytes in the sg-list being mapped.
* @npages: Number of page addresses in the pages[] array.
- * @nmdesc: Number of FMR or FR memory descriptors used for mapping.
+ * @nmdesc: Number of FR memory descriptors used for mapping.
* @ndesc: Number of SRP buffer descriptors that have been filled in.
*/
struct srp_map_state {
union {
struct {
- struct ib_pool_fmr **next;
- struct ib_pool_fmr **end;
- } fmr;
- struct {
struct srp_fr_desc **next;
struct srp_fr_desc **end;
} fr;
--
1.8.3.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 8/9] RDMA/core: remove FMR pool API
2020-05-27 9:46 [PATCH 0/9 v2] Remove FMR support from RDMA drivers Max Gurtovoy
` (6 preceding siblings ...)
2020-05-27 9:46 ` [PATCH 7/9] RDMA/srp: remove " Max Gurtovoy
@ 2020-05-27 9:46 ` Max Gurtovoy
2020-05-27 9:46 ` [PATCH 9/9] RDMA/core: remove FMR device ops Max Gurtovoy
2020-05-28 1:06 ` [PATCH 0/9 v2] Remove FMR support from RDMA drivers Jason Gunthorpe
9 siblings, 0 replies; 14+ messages in thread
From: Max Gurtovoy @ 2020-05-27 9:46 UTC (permalink / raw)
To: jgg, dledford, leon, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom
Cc: aron.silverton, israelr, oren, shlomin, vladimirk, Max Gurtovoy
This ancient and unsafe method for memory registration is no longer used
by any RDMA based ULP. Remove the FMR pool API from the core driver.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
Documentation/driver-api/infiniband.rst | 3 -
drivers/infiniband/core/Makefile | 2 +-
drivers/infiniband/core/fmr_pool.c | 494 --------------------------------
include/rdma/ib_fmr_pool.h | 93 ------
4 files changed, 1 insertion(+), 591 deletions(-)
delete mode 100644 drivers/infiniband/core/fmr_pool.c
delete mode 100644 include/rdma/ib_fmr_pool.h
diff --git a/Documentation/driver-api/infiniband.rst b/Documentation/driver-api/infiniband.rst
index 1a3116f..30e142c 100644
--- a/Documentation/driver-api/infiniband.rst
+++ b/Documentation/driver-api/infiniband.rst
@@ -37,9 +37,6 @@ InfiniBand core interfaces
.. kernel-doc:: drivers/infiniband/core/ud_header.c
:export:
-.. kernel-doc:: drivers/infiniband/core/fmr_pool.c
- :export:
-
.. kernel-doc:: drivers/infiniband/core/umem.c
:export:
diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index d1b14887..064cd34 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -8,7 +8,7 @@ obj-$(CONFIG_INFINIBAND_USER_MAD) += ib_umad.o
obj-$(CONFIG_INFINIBAND_USER_ACCESS) += ib_uverbs.o $(user_access-y)
ib_core-y := packer.o ud_header.o verbs.o cq.o rw.o sysfs.o \
- device.o fmr_pool.o cache.o netlink.o \
+ device.o cache.o netlink.o \
roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \
multicast.o mad.o smi.o agent.o mad_rmpp.o \
nldev.o restrack.o counters.o ib_core_uverbs.o \
diff --git a/drivers/infiniband/core/fmr_pool.c b/drivers/infiniband/core/fmr_pool.c
deleted file mode 100644
index e08aec4..0000000
--- a/drivers/infiniband/core/fmr_pool.c
+++ /dev/null
@@ -1,494 +0,0 @@
-/*
- * Copyright (c) 2004 Topspin Communications. All rights reserved.
- * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- * Redistribution and use in source and binary forms, with or
- * without modification, are permitted provided that the following
- * conditions are met:
- *
- * - Redistributions of source code must retain the above
- * copyright notice, this list of conditions and the following
- * disclaimer.
- *
- * - Redistributions in binary form must reproduce the above
- * copyright notice, this list of conditions and the following
- * disclaimer in the documentation and/or other materials
- * provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
- */
-
-#include <linux/errno.h>
-#include <linux/spinlock.h>
-#include <linux/export.h>
-#include <linux/slab.h>
-#include <linux/jhash.h>
-#include <linux/kthread.h>
-
-#include <rdma/ib_fmr_pool.h>
-
-#include "core_priv.h"
-
-#define PFX "fmr_pool: "
-
-enum {
- IB_FMR_MAX_REMAPS = 32,
-
- IB_FMR_HASH_BITS = 8,
- IB_FMR_HASH_SIZE = 1 << IB_FMR_HASH_BITS,
- IB_FMR_HASH_MASK = IB_FMR_HASH_SIZE - 1
-};
-
-/*
- * If an FMR is not in use, then the list member will point to either
- * its pool's free_list (if the FMR can be mapped again; that is,
- * remap_count < pool->max_remaps) or its pool's dirty_list (if the
- * FMR needs to be unmapped before being remapped). In either of
- * these cases it is a bug if the ref_count is not 0. In other words,
- * if ref_count is > 0, then the list member must not be linked into
- * either free_list or dirty_list.
- *
- * The cache_node member is used to link the FMR into a cache bucket
- * (if caching is enabled). This is independent of the reference
- * count of the FMR. When a valid FMR is released, its ref_count is
- * decremented, and if ref_count reaches 0, the FMR is placed in
- * either free_list or dirty_list as appropriate. However, it is not
- * removed from the cache and may be "revived" if a call to
- * ib_fmr_register_physical() occurs before the FMR is remapped. In
- * this case we just increment the ref_count and remove the FMR from
- * free_list/dirty_list.
- *
- * Before we remap an FMR from free_list, we remove it from the cache
- * (to prevent another user from obtaining a stale FMR). When an FMR
- * is released, we add it to the tail of the free list, so that our
- * cache eviction policy is "least recently used."
- *
- * All manipulation of ref_count, list and cache_node is protected by
- * pool_lock to maintain consistency.
- */
-
-struct ib_fmr_pool {
- spinlock_t pool_lock;
-
- int pool_size;
- int max_pages;
- int max_remaps;
- int dirty_watermark;
- int dirty_len;
- struct list_head free_list;
- struct list_head dirty_list;
- struct hlist_head *cache_bucket;
-
- void (*flush_function)(struct ib_fmr_pool *pool,
- void * arg);
- void *flush_arg;
-
- struct kthread_worker *worker;
- struct kthread_work work;
-
- atomic_t req_ser;
- atomic_t flush_ser;
-
- wait_queue_head_t force_wait;
-};
-
-static inline u32 ib_fmr_hash(u64 first_page)
-{
- return jhash_2words((u32) first_page, (u32) (first_page >> 32), 0) &
- (IB_FMR_HASH_SIZE - 1);
-}
-
-/* Caller must hold pool_lock */
-static inline struct ib_pool_fmr *ib_fmr_cache_lookup(struct ib_fmr_pool *pool,
- u64 *page_list,
- int page_list_len,
- u64 io_virtual_address)
-{
- struct hlist_head *bucket;
- struct ib_pool_fmr *fmr;
-
- if (!pool->cache_bucket)
- return NULL;
-
- bucket = pool->cache_bucket + ib_fmr_hash(*page_list);
-
- hlist_for_each_entry(fmr, bucket, cache_node)
- if (io_virtual_address == fmr->io_virtual_address &&
- page_list_len == fmr->page_list_len &&
- !memcmp(page_list, fmr->page_list,
- page_list_len * sizeof *page_list))
- return fmr;
-
- return NULL;
-}
-
-static void ib_fmr_batch_release(struct ib_fmr_pool *pool)
-{
- int ret;
- struct ib_pool_fmr *fmr;
- LIST_HEAD(unmap_list);
- LIST_HEAD(fmr_list);
-
- spin_lock_irq(&pool->pool_lock);
-
- list_for_each_entry(fmr, &pool->dirty_list, list) {
- hlist_del_init(&fmr->cache_node);
- fmr->remap_count = 0;
- list_add_tail(&fmr->fmr->list, &fmr_list);
- }
-
- list_splice_init(&pool->dirty_list, &unmap_list);
- pool->dirty_len = 0;
-
- spin_unlock_irq(&pool->pool_lock);
-
- if (list_empty(&unmap_list)) {
- return;
- }
-
- ret = ib_unmap_fmr(&fmr_list);
- if (ret)
- pr_warn(PFX "ib_unmap_fmr returned %d\n", ret);
-
- spin_lock_irq(&pool->pool_lock);
- list_splice(&unmap_list, &pool->free_list);
- spin_unlock_irq(&pool->pool_lock);
-}
-
-static void ib_fmr_cleanup_func(struct kthread_work *work)
-{
- struct ib_fmr_pool *pool = container_of(work, struct ib_fmr_pool, work);
-
- ib_fmr_batch_release(pool);
- atomic_inc(&pool->flush_ser);
- wake_up_interruptible(&pool->force_wait);
-
- if (pool->flush_function)
- pool->flush_function(pool, pool->flush_arg);
-
- if (atomic_read(&pool->flush_ser) - atomic_read(&pool->req_ser) < 0)
- kthread_queue_work(pool->worker, &pool->work);
-}
-
-/**
- * ib_create_fmr_pool - Create an FMR pool
- * @pd:Protection domain for FMRs
- * @params:FMR pool parameters
- *
- * Create a pool of FMRs. Return value is pointer to new pool or
- * error code if creation failed.
- */
-struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd,
- struct ib_fmr_pool_param *params)
-{
- struct ib_device *device;
- struct ib_fmr_pool *pool;
- int i;
- int ret;
- int max_remaps;
-
- if (!params)
- return ERR_PTR(-EINVAL);
-
- device = pd->device;
- if (!device->ops.alloc_fmr || !device->ops.dealloc_fmr ||
- !device->ops.map_phys_fmr || !device->ops.unmap_fmr) {
- dev_info(&device->dev, "Device does not support FMRs\n");
- return ERR_PTR(-ENOSYS);
- }
-
- if (!device->attrs.max_map_per_fmr)
- max_remaps = IB_FMR_MAX_REMAPS;
- else
- max_remaps = device->attrs.max_map_per_fmr;
-
- pool = kmalloc(sizeof *pool, GFP_KERNEL);
- if (!pool)
- return ERR_PTR(-ENOMEM);
-
- pool->cache_bucket = NULL;
- pool->flush_function = params->flush_function;
- pool->flush_arg = params->flush_arg;
-
- INIT_LIST_HEAD(&pool->free_list);
- INIT_LIST_HEAD(&pool->dirty_list);
-
- if (params->cache) {
- pool->cache_bucket =
- kmalloc_array(IB_FMR_HASH_SIZE,
- sizeof(*pool->cache_bucket),
- GFP_KERNEL);
- if (!pool->cache_bucket) {
- ret = -ENOMEM;
- goto out_free_pool;
- }
-
- for (i = 0; i < IB_FMR_HASH_SIZE; ++i)
- INIT_HLIST_HEAD(pool->cache_bucket + i);
- }
-
- pool->pool_size = 0;
- pool->max_pages = params->max_pages_per_fmr;
- pool->max_remaps = max_remaps;
- pool->dirty_watermark = params->dirty_watermark;
- pool->dirty_len = 0;
- spin_lock_init(&pool->pool_lock);
- atomic_set(&pool->req_ser, 0);
- atomic_set(&pool->flush_ser, 0);
- init_waitqueue_head(&pool->force_wait);
-
- pool->worker =
- kthread_create_worker(0, "ib_fmr(%s)", dev_name(&device->dev));
- if (IS_ERR(pool->worker)) {
- pr_warn(PFX "couldn't start cleanup kthread worker\n");
- ret = PTR_ERR(pool->worker);
- goto out_free_pool;
- }
- kthread_init_work(&pool->work, ib_fmr_cleanup_func);
-
- {
- struct ib_pool_fmr *fmr;
- struct ib_fmr_attr fmr_attr = {
- .max_pages = params->max_pages_per_fmr,
- .max_maps = pool->max_remaps,
- .page_shift = params->page_shift
- };
- int bytes_per_fmr = sizeof *fmr;
-
- if (pool->cache_bucket)
- bytes_per_fmr += params->max_pages_per_fmr * sizeof (u64);
-
- for (i = 0; i < params->pool_size; ++i) {
- fmr = kmalloc(bytes_per_fmr, GFP_KERNEL);
- if (!fmr)
- goto out_fail;
-
- fmr->pool = pool;
- fmr->remap_count = 0;
- fmr->ref_count = 0;
- INIT_HLIST_NODE(&fmr->cache_node);
-
- fmr->fmr = ib_alloc_fmr(pd, params->access, &fmr_attr);
- if (IS_ERR(fmr->fmr)) {
- pr_warn(PFX "fmr_create failed for FMR %d\n",
- i);
- kfree(fmr);
- goto out_fail;
- }
-
- list_add_tail(&fmr->list, &pool->free_list);
- ++pool->pool_size;
- }
- }
-
- return pool;
-
- out_free_pool:
- kfree(pool->cache_bucket);
- kfree(pool);
-
- return ERR_PTR(ret);
-
- out_fail:
- ib_destroy_fmr_pool(pool);
-
- return ERR_PTR(-ENOMEM);
-}
-EXPORT_SYMBOL(ib_create_fmr_pool);
-
-/**
- * ib_destroy_fmr_pool - Free FMR pool
- * @pool:FMR pool to free
- *
- * Destroy an FMR pool and free all associated resources.
- */
-void ib_destroy_fmr_pool(struct ib_fmr_pool *pool)
-{
- struct ib_pool_fmr *fmr;
- struct ib_pool_fmr *tmp;
- LIST_HEAD(fmr_list);
- int i;
-
- kthread_destroy_worker(pool->worker);
- ib_fmr_batch_release(pool);
-
- i = 0;
- list_for_each_entry_safe(fmr, tmp, &pool->free_list, list) {
- if (fmr->remap_count) {
- INIT_LIST_HEAD(&fmr_list);
- list_add_tail(&fmr->fmr->list, &fmr_list);
- ib_unmap_fmr(&fmr_list);
- }
- ib_dealloc_fmr(fmr->fmr);
- list_del(&fmr->list);
- kfree(fmr);
- ++i;
- }
-
- if (i < pool->pool_size)
- pr_warn(PFX "pool still has %d regions registered\n",
- pool->pool_size - i);
-
- kfree(pool->cache_bucket);
- kfree(pool);
-}
-EXPORT_SYMBOL(ib_destroy_fmr_pool);
-
-/**
- * ib_flush_fmr_pool - Invalidate all unmapped FMRs
- * @pool:FMR pool to flush
- *
- * Ensure that all unmapped FMRs are fully invalidated.
- */
-int ib_flush_fmr_pool(struct ib_fmr_pool *pool)
-{
- int serial;
- struct ib_pool_fmr *fmr, *next;
-
- /*
- * The free_list holds FMRs that may have been used
- * but have not been remapped enough times to be dirty.
- * Put them on the dirty list now so that the cleanup
- * thread will reap them too.
- */
- spin_lock_irq(&pool->pool_lock);
- list_for_each_entry_safe(fmr, next, &pool->free_list, list) {
- if (fmr->remap_count > 0)
- list_move(&fmr->list, &pool->dirty_list);
- }
- spin_unlock_irq(&pool->pool_lock);
-
- serial = atomic_inc_return(&pool->req_ser);
- kthread_queue_work(pool->worker, &pool->work);
-
- if (wait_event_interruptible(pool->force_wait,
- atomic_read(&pool->flush_ser) - serial >= 0))
- return -EINTR;
-
- return 0;
-}
-EXPORT_SYMBOL(ib_flush_fmr_pool);
-
-/**
- * ib_fmr_pool_map_phys - Map an FMR from an FMR pool.
- * @pool_handle: FMR pool to allocate FMR from
- * @page_list: List of pages to map
- * @list_len: Number of pages in @page_list
- * @io_virtual_address: I/O virtual address for new FMR
- */
-struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle,
- u64 *page_list,
- int list_len,
- u64 io_virtual_address)
-{
- struct ib_fmr_pool *pool = pool_handle;
- struct ib_pool_fmr *fmr;
- unsigned long flags;
- int result;
-
- if (list_len < 1 || list_len > pool->max_pages)
- return ERR_PTR(-EINVAL);
-
- spin_lock_irqsave(&pool->pool_lock, flags);
- fmr = ib_fmr_cache_lookup(pool,
- page_list,
- list_len,
- io_virtual_address);
- if (fmr) {
- /* found in cache */
- ++fmr->ref_count;
- if (fmr->ref_count == 1) {
- list_del(&fmr->list);
- }
-
- spin_unlock_irqrestore(&pool->pool_lock, flags);
-
- return fmr;
- }
-
- if (list_empty(&pool->free_list)) {
- spin_unlock_irqrestore(&pool->pool_lock, flags);
- return ERR_PTR(-EAGAIN);
- }
-
- fmr = list_entry(pool->free_list.next, struct ib_pool_fmr, list);
- list_del(&fmr->list);
- hlist_del_init(&fmr->cache_node);
- spin_unlock_irqrestore(&pool->pool_lock, flags);
-
- result = ib_map_phys_fmr(fmr->fmr, page_list, list_len,
- io_virtual_address);
-
- if (result) {
- spin_lock_irqsave(&pool->pool_lock, flags);
- list_add(&fmr->list, &pool->free_list);
- spin_unlock_irqrestore(&pool->pool_lock, flags);
-
- pr_warn(PFX "fmr_map returns %d\n", result);
-
- return ERR_PTR(result);
- }
-
- ++fmr->remap_count;
- fmr->ref_count = 1;
-
- if (pool->cache_bucket) {
- fmr->io_virtual_address = io_virtual_address;
- fmr->page_list_len = list_len;
- memcpy(fmr->page_list, page_list, list_len * sizeof(*page_list));
-
- spin_lock_irqsave(&pool->pool_lock, flags);
- hlist_add_head(&fmr->cache_node,
- pool->cache_bucket + ib_fmr_hash(fmr->page_list[0]));
- spin_unlock_irqrestore(&pool->pool_lock, flags);
- }
-
- return fmr;
-}
-EXPORT_SYMBOL(ib_fmr_pool_map_phys);
-
-/**
- * ib_fmr_pool_unmap - Unmap FMR
- * @fmr:FMR to unmap
- *
- * Unmap an FMR. The FMR mapping may remain valid until the FMR is
- * reused (or until ib_flush_fmr_pool() is called).
- */
-void ib_fmr_pool_unmap(struct ib_pool_fmr *fmr)
-{
- struct ib_fmr_pool *pool;
- unsigned long flags;
-
- pool = fmr->pool;
-
- spin_lock_irqsave(&pool->pool_lock, flags);
-
- --fmr->ref_count;
- if (!fmr->ref_count) {
- if (fmr->remap_count < pool->max_remaps) {
- list_add_tail(&fmr->list, &pool->free_list);
- } else {
- list_add_tail(&fmr->list, &pool->dirty_list);
- if (++pool->dirty_len >= pool->dirty_watermark) {
- atomic_inc(&pool->req_ser);
- kthread_queue_work(pool->worker, &pool->work);
- }
- }
- }
-
- spin_unlock_irqrestore(&pool->pool_lock, flags);
-}
-EXPORT_SYMBOL(ib_fmr_pool_unmap);
diff --git a/include/rdma/ib_fmr_pool.h b/include/rdma/ib_fmr_pool.h
deleted file mode 100644
index 2fd9bfb..0000000
--- a/include/rdma/ib_fmr_pool.h
+++ /dev/null
@@ -1,93 +0,0 @@
-/*
- * Copyright (c) 2004 Topspin Corporation. All rights reserved.
- * Copyright (c) 2005 Sun Microsystems, Inc. All rights reserved.
- *
- * This software is available to you under a choice of one of two
- * licenses. You may choose to be licensed under the terms of the GNU
- * General Public License (GPL) Version 2, available from the file
- * COPYING in the main directory of this source tree, or the
- * OpenIB.org BSD license below:
- *
- * Redistribution and use in source and binary forms, with or
- * without modification, are permitted provided that the following
- * conditions are met:
- *
- * - Redistributions of source code must retain the above
- * copyright notice, this list of conditions and the following
- * disclaimer.
- *
- * - Redistributions in binary form must reproduce the above
- * copyright notice, this list of conditions and the following
- * disclaimer in the documentation and/or other materials
- * provided with the distribution.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
- * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
- * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
- * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
- * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
- * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
- * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
- * SOFTWARE.
- */
-
-#if !defined(IB_FMR_POOL_H)
-#define IB_FMR_POOL_H
-
-#include <rdma/ib_verbs.h>
-
-struct ib_fmr_pool;
-
-/**
- * struct ib_fmr_pool_param - Parameters for creating FMR pool
- * @max_pages_per_fmr:Maximum number of pages per map request.
- * @page_shift: Log2 of sizeof "pages" mapped by this fmr
- * @access:Access flags for FMRs in pool.
- * @pool_size:Number of FMRs to allocate for pool.
- * @dirty_watermark:Flush is triggered when @dirty_watermark dirty
- * FMRs are present.
- * @flush_function:Callback called when unmapped FMRs are flushed and
- * more FMRs are possibly available for mapping
- * @flush_arg:Context passed to user's flush function.
- * @cache:If set, FMRs may be reused after unmapping for identical map
- * requests.
- */
-struct ib_fmr_pool_param {
- int max_pages_per_fmr;
- int page_shift;
- enum ib_access_flags access;
- int pool_size;
- int dirty_watermark;
- void (*flush_function)(struct ib_fmr_pool *pool,
- void *arg);
- void *flush_arg;
- unsigned cache:1;
-};
-
-struct ib_pool_fmr {
- struct ib_fmr *fmr;
- struct ib_fmr_pool *pool;
- struct list_head list;
- struct hlist_node cache_node;
- int ref_count;
- int remap_count;
- u64 io_virtual_address;
- int page_list_len;
- u64 page_list[];
-};
-
-struct ib_fmr_pool *ib_create_fmr_pool(struct ib_pd *pd,
- struct ib_fmr_pool_param *params);
-
-void ib_destroy_fmr_pool(struct ib_fmr_pool *pool);
-
-int ib_flush_fmr_pool(struct ib_fmr_pool *pool);
-
-struct ib_pool_fmr *ib_fmr_pool_map_phys(struct ib_fmr_pool *pool_handle,
- u64 *page_list,
- int list_len,
- u64 io_virtual_address);
-
-void ib_fmr_pool_unmap(struct ib_pool_fmr *fmr);
-
-#endif /* IB_FMR_POOL_H */
--
1.8.3.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH 9/9] RDMA/core: remove FMR device ops
2020-05-27 9:46 [PATCH 0/9 v2] Remove FMR support from RDMA drivers Max Gurtovoy
` (7 preceding siblings ...)
2020-05-27 9:46 ` [PATCH 8/9] RDMA/core: remove FMR pool API Max Gurtovoy
@ 2020-05-27 9:46 ` Max Gurtovoy
2020-05-28 1:06 ` [PATCH 0/9 v2] Remove FMR support from RDMA drivers Jason Gunthorpe
9 siblings, 0 replies; 14+ messages in thread
From: Max Gurtovoy @ 2020-05-27 9:46 UTC (permalink / raw)
To: jgg, dledford, leon, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom
Cc: aron.silverton, israelr, oren, shlomin, vladimirk, Max Gurtovoy
After removing FMR support from all the RDMA ULPs and providers, there
is no need to keep FMR operation for IB devices.
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>
---
Documentation/infiniband/core_locking.rst | 2 --
drivers/infiniband/core/device.c | 4 ---
drivers/infiniband/core/verbs.c | 48 -------------------------------
include/rdma/ib_verbs.h | 45 -----------------------------
4 files changed, 99 deletions(-)
diff --git a/Documentation/infiniband/core_locking.rst b/Documentation/infiniband/core_locking.rst
index 8f76a8a..efd5e76 100644
--- a/Documentation/infiniband/core_locking.rst
+++ b/Documentation/infiniband/core_locking.rst
@@ -22,7 +22,6 @@ Sleeping and interrupt context
- post_recv
- poll_cq
- req_notify_cq
- - map_phys_fmr
which may not sleep and must be callable from any context.
@@ -36,7 +35,6 @@ Sleeping and interrupt context
- ib_post_send
- ib_post_recv
- ib_req_notify_cq
- - ib_map_phys_fmr
are therefore safe to call from any context.
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index d0b3d35..4acbef2 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2557,7 +2557,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, add_gid);
SET_DEVICE_OP(dev_ops, advise_mr);
SET_DEVICE_OP(dev_ops, alloc_dm);
- SET_DEVICE_OP(dev_ops, alloc_fmr);
SET_DEVICE_OP(dev_ops, alloc_hw_stats);
SET_DEVICE_OP(dev_ops, alloc_mr);
SET_DEVICE_OP(dev_ops, alloc_mr_integrity);
@@ -2584,7 +2583,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, create_wq);
SET_DEVICE_OP(dev_ops, dealloc_dm);
SET_DEVICE_OP(dev_ops, dealloc_driver);
- SET_DEVICE_OP(dev_ops, dealloc_fmr);
SET_DEVICE_OP(dev_ops, dealloc_mw);
SET_DEVICE_OP(dev_ops, dealloc_pd);
SET_DEVICE_OP(dev_ops, dealloc_ucontext);
@@ -2628,7 +2626,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, iw_rem_ref);
SET_DEVICE_OP(dev_ops, map_mr_sg);
SET_DEVICE_OP(dev_ops, map_mr_sg_pi);
- SET_DEVICE_OP(dev_ops, map_phys_fmr);
SET_DEVICE_OP(dev_ops, mmap);
SET_DEVICE_OP(dev_ops, mmap_free);
SET_DEVICE_OP(dev_ops, modify_ah);
@@ -2662,7 +2659,6 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
SET_DEVICE_OP(dev_ops, resize_cq);
SET_DEVICE_OP(dev_ops, set_vf_guid);
SET_DEVICE_OP(dev_ops, set_vf_link_state);
- SET_DEVICE_OP(dev_ops, unmap_fmr);
SET_OBJ_SIZE(dev_ops, ib_ah);
SET_OBJ_SIZE(dev_ops, ib_cq);
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 56a7133..fa6689b 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2160,54 +2160,6 @@ struct ib_mr *ib_alloc_mr_integrity(struct ib_pd *pd,
}
EXPORT_SYMBOL(ib_alloc_mr_integrity);
-/* "Fast" memory regions */
-
-struct ib_fmr *ib_alloc_fmr(struct ib_pd *pd,
- int mr_access_flags,
- struct ib_fmr_attr *fmr_attr)
-{
- struct ib_fmr *fmr;
-
- if (!pd->device->ops.alloc_fmr)
- return ERR_PTR(-EOPNOTSUPP);
-
- fmr = pd->device->ops.alloc_fmr(pd, mr_access_flags, fmr_attr);
- if (!IS_ERR(fmr)) {
- fmr->device = pd->device;
- fmr->pd = pd;
- atomic_inc(&pd->usecnt);
- }
-
- return fmr;
-}
-EXPORT_SYMBOL(ib_alloc_fmr);
-
-int ib_unmap_fmr(struct list_head *fmr_list)
-{
- struct ib_fmr *fmr;
-
- if (list_empty(fmr_list))
- return 0;
-
- fmr = list_entry(fmr_list->next, struct ib_fmr, list);
- return fmr->device->ops.unmap_fmr(fmr_list);
-}
-EXPORT_SYMBOL(ib_unmap_fmr);
-
-int ib_dealloc_fmr(struct ib_fmr *fmr)
-{
- struct ib_pd *pd;
- int ret;
-
- pd = fmr->pd;
- ret = fmr->device->ops.dealloc_fmr(fmr);
- if (!ret)
- atomic_dec(&pd->usecnt);
-
- return ret;
-}
-EXPORT_SYMBOL(ib_dealloc_fmr);
-
/* Multicast groups */
static bool is_valid_mcast_lid(struct ib_qp *qp, u16 lid)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index bbc5cfb..1850b3a 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2453,12 +2453,6 @@ struct ib_device_ops {
struct ib_mw *(*alloc_mw)(struct ib_pd *pd, enum ib_mw_type type,
struct ib_udata *udata);
int (*dealloc_mw)(struct ib_mw *mw);
- struct ib_fmr *(*alloc_fmr)(struct ib_pd *pd, int mr_access_flags,
- struct ib_fmr_attr *fmr_attr);
- int (*map_phys_fmr)(struct ib_fmr *fmr, u64 *page_list, int list_len,
- u64 iova);
- int (*unmap_fmr)(struct list_head *fmr_list);
- int (*dealloc_fmr)(struct ib_fmr *fmr);
int (*attach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid);
int (*detach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid);
struct ib_xrcd *(*alloc_xrcd)(struct ib_device *device,
@@ -4209,45 +4203,6 @@ static inline u32 ib_inc_rkey(u32 rkey)
}
/**
- * ib_alloc_fmr - Allocates a unmapped fast memory region.
- * @pd: The protection domain associated with the unmapped region.
- * @mr_access_flags: Specifies the memory access rights.
- * @fmr_attr: Attributes of the unmapped region.
- *
- * A fast memory region must be mapped before it can be used as part of
- * a work request.
- */
-struct ib_fmr *ib_alloc_fmr(struct ib_pd *pd,
- int mr_access_flags,
- struct ib_fmr_attr *fmr_attr);
-
-/**
- * ib_map_phys_fmr - Maps a list of physical pages to a fast memory region.
- * @fmr: The fast memory region to associate with the pages.
- * @page_list: An array of physical pages to map to the fast memory region.
- * @list_len: The number of pages in page_list.
- * @iova: The I/O virtual address to use with the mapped region.
- */
-static inline int ib_map_phys_fmr(struct ib_fmr *fmr,
- u64 *page_list, int list_len,
- u64 iova)
-{
- return fmr->device->ops.map_phys_fmr(fmr, page_list, list_len, iova);
-}
-
-/**
- * ib_unmap_fmr - Removes the mapping from a list of fast memory regions.
- * @fmr_list: A linked list of fast memory regions to unmap.
- */
-int ib_unmap_fmr(struct list_head *fmr_list);
-
-/**
- * ib_dealloc_fmr - Deallocates a fast memory region.
- * @fmr: The fast memory region to deallocate.
- */
-int ib_dealloc_fmr(struct ib_fmr *fmr);
-
-/**
* ib_attach_mcast - Attaches the specified QP to a multicast group.
* @qp: QP to attach to the multicast group. The QP must be type
* IB_QPT_UD.
--
1.8.3.1
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH 0/9 v2] Remove FMR support from RDMA drivers
2020-05-27 9:46 [PATCH 0/9 v2] Remove FMR support from RDMA drivers Max Gurtovoy
` (8 preceding siblings ...)
2020-05-27 9:46 ` [PATCH 9/9] RDMA/core: remove FMR device ops Max Gurtovoy
@ 2020-05-28 1:06 ` Jason Gunthorpe
9 siblings, 0 replies; 14+ messages in thread
From: Jason Gunthorpe @ 2020-05-28 1:06 UTC (permalink / raw)
To: Max Gurtovoy
Cc: dledford, leon, galpress, dennis.dalessandro, netdev, sagi,
linux-rdma, bvanassche, santosh.shilimkar, tom, aron.silverton,
israelr, oren, shlomin, vladimirk
On Wed, May 27, 2020 at 12:46:25PM +0300, Max Gurtovoy wrote:
> This series removes the support for FMR mode to register memory. This ancient
> mode is unsafe (rkeys that are usually exposed for caching purposes and the API
> is limited to page granularity mappings) and not maintained/tested in the last
> few years. It also doesn't have any reasonable advantage over other memory
> registration methods such as FRWR (that is implemented in all the recent RDMA
> adapters). This series should be reviewed and approved by the maintainer of the
> effected drivers and I suggest to test it as well.
>
> The tests that I made for this series (fio benchmarks and fio verify data):
> 1. iSER initiator on ConnectX-4
> 2. iSER initiator on ConnectX-3
> 3. SRP initiator on ConnectX-4 (loopback to SRP target)
> 4. SRP initiator on ConnectX-3
I looked at this for a bit, and it seems like there is still a fair
amount more to remove, I came up with this:
https://github.com/jgunthorpe/linux/commits/for_max
Can you check?
Thanks
Jason
^ permalink raw reply [flat|nested] 14+ messages in thread