linux-rdma.vger.kernel.org archive mirror
From: Leon Romanovsky <leon@kernel.org>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: linux-rdma@vger.kernel.org, Jason Gunthorpe <jgg@mellanox.com>,
	Artemy Kovalyov <artemyko@mellanox.com>
Subject: Re: [PATCH 6/6] RDMA/mlx5: Add missing synchronize_srcu() for MW cases
Date: Thu, 3 Oct 2019 11:54:49 +0300	[thread overview]
Message-ID: <20191003085449.GN5855@unreal> (raw)
In-Reply-To: <20191001153821.23621-7-jgg@ziepe.ca>

On Tue, Oct 01, 2019 at 12:38:21PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> While MR uses 'live' as the SRCU 'update', the MW case uses the xarray
> directly; xa_erase() causes the MW to become inaccessible to the
> pagefault thread.
>
> Thus whenever a MW is removed from the xarray we must synchronize_srcu()
> before freeing it.
>
> This must be done before freeing the mkey as re-use of the mkey while the
> pagefault thread is using the stale mkey is undesirable.
>
> Add the missing synchronize_srcu() calls to the MW and DEVX indirect mkey
> paths and delete the bogus protection against double destroy in
> mlx5_core_destroy_mkey().
>
> Fixes: 534fd7aac56a ("IB/mlx5: Manage indirection mkey upon DEVX flow for ODP")
> Fixes: 6aec21f6a832 ("IB/mlx5: Page faults handling infrastructure")
> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> ---
>  drivers/infiniband/hw/mlx5/devx.c            | 58 ++++++--------------
>  drivers/infiniband/hw/mlx5/mlx5_ib.h         |  1 -
>  drivers/infiniband/hw/mlx5/mr.c              | 21 +++++--
>  drivers/net/ethernet/mellanox/mlx5/core/mr.c |  8 +--
>  4 files changed, 33 insertions(+), 55 deletions(-)
>
> diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
> index 59022b7441448f..d609f4659afb7a 100644
> --- a/drivers/infiniband/hw/mlx5/devx.c
> +++ b/drivers/infiniband/hw/mlx5/devx.c
> @@ -1298,29 +1298,6 @@ static int devx_handle_mkey_create(struct mlx5_ib_dev *dev,
>  	return 0;
>  }
>
> -static void devx_free_indirect_mkey(struct rcu_head *rcu)
> -{
> -	kfree(container_of(rcu, struct devx_obj, devx_mr.rcu));
> -}
> -
> -/* This function to delete from the radix tree needs to be called before
> - * destroying the underlying mkey. Otherwise a race might occur in case that
> - * other thread will get the same mkey before this one will be deleted,
> - * in that case it will fail via inserting to the tree its own data.
> - *
> - * Note:
> - * An error in the destroy is not expected unless there is some other indirect
> - * mkey which points to this one. In a kernel cleanup flow it will be just
> - * destroyed in the iterative destruction call. In a user flow, in case
> - * the application didn't close in the expected order it's its own problem,
> - * the mkey won't be part of the tree, in both cases the kernel is safe.
> - */
> -static void devx_cleanup_mkey(struct devx_obj *obj)
> -{
> -	xa_erase(&obj->ib_dev->mdev->priv.mkey_table,
> -		 mlx5_base_mkey(obj->devx_mr.mmkey.key));
> -}
> -
>  static void devx_cleanup_subscription(struct mlx5_ib_dev *dev,
>  				      struct devx_event_subscription *sub)
>  {
> @@ -1362,8 +1339,16 @@ static int devx_obj_cleanup(struct ib_uobject *uobject,
>  	int ret;
>
>  	dev = mlx5_udata_to_mdev(&attrs->driver_udata);
> -	if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY)
> -		devx_cleanup_mkey(obj);
> +	if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) {
> +		/*
> +		 * The pagefault_single_data_segment() does commands against
> +		 * the mmkey, we must wait for that to stop before freeing the
> +		 * mkey, as another allocation could get the same mkey #.
> +		 */
> +		xa_erase(&obj->ib_dev->mdev->priv.mkey_table,
> +			 mlx5_base_mkey(obj->devx_mr.mmkey.key));
> +		synchronize_srcu(&dev->mr_srcu);
> +	}
>
>  	if (obj->flags & DEVX_OBJ_FLAGS_DCT)
>  		ret = mlx5_core_destroy_dct(obj->ib_dev->mdev, &obj->core_dct);
> @@ -1382,12 +1367,6 @@ static int devx_obj_cleanup(struct ib_uobject *uobject,
>  		devx_cleanup_subscription(dev, sub_entry);
>  	mutex_unlock(&devx_event_table->event_xa_lock);
>
> -	if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) {
> -		call_srcu(&dev->mr_srcu, &obj->devx_mr.rcu,
> -			  devx_free_indirect_mkey);
> -		return ret;
> -	}
> -
>  	kfree(obj);
>  	return ret;
>  }
> @@ -1491,26 +1470,21 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_OBJ_CREATE)(
>  				   &obj_id);
>  	WARN_ON(obj->dinlen > MLX5_MAX_DESTROY_INBOX_SIZE_DW * sizeof(u32));
>
> -	if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) {
> -		err = devx_handle_mkey_indirect(obj, dev, cmd_in, cmd_out);
> -		if (err)
> -			goto obj_destroy;
> -	}
> -
>  	err = uverbs_copy_to(attrs, MLX5_IB_ATTR_DEVX_OBJ_CREATE_CMD_OUT, cmd_out, cmd_out_len);
>  	if (err)
> -		goto err_copy;
> +		goto obj_destroy;
>
>  	if (opcode == MLX5_CMD_OP_CREATE_GENERAL_OBJECT)
>  		obj_type = MLX5_GET(general_obj_in_cmd_hdr, cmd_in, obj_type);
> -
>  	obj->obj_id = get_enc_obj_id(opcode | obj_type << 16, obj_id);
>
> +	if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY) {
> +		err = devx_handle_mkey_indirect(obj, dev, cmd_in, cmd_out);
> +		if (err)
> +			goto obj_destroy;
> +	}
>  	return 0;
>
> -err_copy:
> -	if (obj->flags & DEVX_OBJ_FLAGS_INDIRECT_MKEY)
> -		devx_cleanup_mkey(obj);
>  obj_destroy:
>  	if (obj->flags & DEVX_OBJ_FLAGS_DCT)
>  		mlx5_core_destroy_dct(obj->ib_dev->mdev, &obj->core_dct);
> diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> index 15e42825cc976e..1a98ee2e01c4b9 100644
> --- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
> +++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
> @@ -639,7 +639,6 @@ struct mlx5_ib_mw {
>  struct mlx5_ib_devx_mr {
>  	struct mlx5_core_mkey	mmkey;
>  	int			ndescs;
> -	struct rcu_head		rcu;
>  };
>
>  struct mlx5_ib_umr_context {
> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index 3a27bddfcf31f5..630599311586ec 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1962,14 +1962,25 @@ struct ib_mw *mlx5_ib_alloc_mw(struct ib_pd *pd, enum ib_mw_type type,
>
>  int mlx5_ib_dealloc_mw(struct ib_mw *mw)
>  {
> +	struct mlx5_ib_dev *dev = to_mdev(mw->device);
>  	struct mlx5_ib_mw *mmw = to_mmw(mw);
>  	int err;
>
> -	err =  mlx5_core_destroy_mkey((to_mdev(mw->device))->mdev,
> -				      &mmw->mmkey);
> -	if (!err)
> -		kfree(mmw);
> -	return err;
> +	if (IS_ENABLED(CONFIG_INFINIBAND_ON_DEMAND_PAGING)) {
> +		xa_erase(&dev->mdev->priv.mkey_table,
> +			 mlx5_base_mkey(mmw->mmkey.key));
> +		/*
> +		 * pagefault_single_data_segment() may be accessing mmw under
> +		 * SRCU if the user bound an ODP MR to this MW.
> +		 */
> +		synchronize_srcu(&dev->mr_srcu);
> +	}
> +
> +	err = mlx5_core_destroy_mkey(dev->mdev, &mmw->mmkey);
> +	if (err)
> +		return err;
> +	kfree(mmw);

You are skipping kfree() when mlx5_core_destroy_mkey() returns an error.
IMHO, that is right for -ENOENT, but not for mlx5_cmd_exec() failures.

Thanks


Thread overview: 14+ messages
2019-10-01 15:38 [PATCH -rc 0/6] Bug fixes for odp Jason Gunthorpe
2019-10-01 15:38 ` [PATCH 1/6] RDMA/mlx5: Do not allow rereg of a ODP MR Jason Gunthorpe
2019-10-01 15:38 ` [PATCH 2/6] RDMA/mlx5: Fix a race with mlx5_ib_update_xlt on an implicit MR Jason Gunthorpe
2019-10-02  8:18   ` Leon Romanovsky
2019-10-02 14:39     ` Jason Gunthorpe
2019-10-02 15:41       ` Leon Romanovsky
2019-10-03 12:48         ` Jason Gunthorpe
2019-10-01 15:38 ` [PATCH 3/6] RDMA/odp: Lift umem_mutex out of ib_umem_odp_unmap_dma_pages() Jason Gunthorpe
2019-10-01 15:38 ` [PATCH 4/6] RDMA/mlx5: Order num_pending_prefetch properly with synchronize_srcu Jason Gunthorpe
2019-10-01 15:38 ` [PATCH 5/6] RDMA/mlx5: Put live in the correct place for ODP MRs Jason Gunthorpe
2019-10-01 15:38 ` [PATCH 6/6] RDMA/mlx5: Add missing synchronize_srcu() for MW cases Jason Gunthorpe
2019-10-03  8:54   ` Leon Romanovsky [this message]
2019-10-03 12:33     ` Jason Gunthorpe
2019-10-04 18:55 ` [PATCH -rc 0/6] Bug fixes for odp Jason Gunthorpe
