From: Jason Gunthorpe <jgg@ziepe.ca>
To: Jianxin Xiong <jianxin.xiong@intel.com>
Cc: linux-rdma@vger.kernel.org, dri-devel@lists.freedesktop.org,
	Doug Ledford <dledford@redhat.com>,
	Leon Romanovsky <leon@kernel.org>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	Christian Koenig <christian.koenig@amd.com>,
	Daniel Vetter <daniel.vetter@intel.com>
Subject: Re: [PATCH v8 4/5] RDMA/mlx5: Support dma-buf based userspace memory region
Date: Thu, 5 Nov 2020 20:25:15 -0400
Message-ID: <20201106002515.GM36674@ziepe.ca>
In-Reply-To: <1604616489-69267-5-git-send-email-jianxin.xiong@intel.com>

On Thu, Nov 05, 2020 at 02:48:08PM -0800, Jianxin Xiong wrote:
> @@ -966,7 +969,10 @@ static struct mlx5_ib_mr *alloc_mr_from_cache(struct ib_pd *pd,
>  	struct mlx5_ib_mr *mr;
>  	unsigned int page_size;
>  
> -	page_size = mlx5_umem_find_best_pgsz(umem, mkc, log_page_size, 0, iova);
> +	if (umem->is_dmabuf)
> +		page_size = ib_umem_find_best_pgsz(umem, PAGE_SIZE, iova);

You said the sgl is not set here, so why doesn't this crash? It is
certainly wrong to call this function without an SGL.
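
One way to avoid touching the unset SGL at all would be to skip the
page-size scan for the dmabuf case entirely (a sketch only, not
necessarily the right final answer, keeping your branch structure):

	if (umem->is_dmabuf)
		/*
		 * No SGL is available yet, so create the mkey with a
		 * fixed PAGE_SIZE layout and populate it only once the
		 * pages are actually mapped in the fault handler.
		 */
		page_size = PAGE_SIZE;
	else
		page_size = mlx5_umem_find_best_pgsz(umem, mkc,
						     log_page_size, 0, iova);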

> +/**
> + * mlx5_ib_fence_dmabuf_mr - Stop all access to the dmabuf MR
> + * @mr: to fence
> + *
> + * On return no parallel threads will be touching this MR and no DMA will be
> + * active.
> + */
> +void mlx5_ib_fence_dmabuf_mr(struct mlx5_ib_mr *mr)
> +{
> +	struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
> +
> +	/* Prevent new page faults and prefetch requests from succeeding */
> +	xa_erase(&mr->dev->odp_mkeys, mlx5_base_mkey(mr->mmkey.key));
> +
> +	/* Wait for all running page-fault handlers to finish. */
> +	synchronize_srcu(&mr->dev->odp_srcu);
> +
> +	wait_event(mr->q_deferred_work, !atomic_read(&mr->num_deferred_work));
> +
> +	dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
> +	mlx5_mr_cache_invalidate(mr);
> +	umem_dmabuf->private = NULL;
> +	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);
> +
> +	if (!mr->cache_ent) {
> +		mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey);
> +		WARN_ON(mr->descs);
> +	}
> +}

I would expect this to call ib_umem_dmabuf_unmap_pages()?

Who calls it on the dereg path?

This looks quite strange to me: the series calls
ib_umem_dmabuf_unmap_pages() only from the invalidate callback?
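
ie I'd expect the fence path above to drop the mapping while everything
is quiesced, along these lines (a sketch reusing your hunk, assuming the
ib_umem_dmabuf_unmap_pages() from patch 1 takes the umem_dmabuf and is
called under the reservation lock):

	dma_resv_lock(umem_dmabuf->attach->dmabuf->resv, NULL);
	mlx5_mr_cache_invalidate(mr);
	/* Drop the DMA mapping now that no page faults can race */
	ib_umem_dmabuf_unmap_pages(umem_dmabuf);
	umem_dmabuf->private = NULL;
	dma_resv_unlock(umem_dmabuf->attach->dmabuf->resv);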

I feel uneasy about how this seems to assume everything works sanely. We
can have parallel page faults, so pagefault_dmabuf_mr() can be called
multiple times after an invalidation, yet it doesn't protect itself
against calling ib_umem_dmabuf_map_pages() twice.

Perhaps the umem code should keep track of the current map state and
exit early if there is already an SGL. A NULL vs. non-NULL sgl check
would do and seems quite reasonable.
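
ie something like this in the umem code (a sketch, assuming the v8
ib_umem_dmabuf keeps a sgt pointer that stays NULL while unmapped and
that map_pages() takes the ib_umem_dmabuf directly):

int ib_umem_dmabuf_map_pages(struct ib_umem_dmabuf *umem_dmabuf)
{
	struct sg_table *sgt;

	dma_resv_assert_held(umem_dmabuf->attach->dmabuf->resv);

	/* Already mapped by an earlier page fault, nothing to do */
	if (umem_dmabuf->sgt)
		return 0;

	sgt = dma_buf_map_attachment(umem_dmabuf->attach,
				     DMA_BIDIRECTIONAL);
	if (IS_ERR(sgt))
		return PTR_ERR(sgt);

	umem_dmabuf->sgt = sgt;
	return 0;
}

Then unmap only has to undo this when sgt is non-NULL, and a second
fault after an invalidation becomes a no-op.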

> @@ -810,22 +871,31 @@ static int pagefault_mr(struct mlx5_ib_mr *mr, u64 io_virt, size_t bcnt,
>  			u32 *bytes_mapped, u32 flags)
>  {
>  	struct ib_umem_odp *odp = to_ib_umem_odp(mr->umem);
> +	struct ib_umem_dmabuf *umem_dmabuf = to_ib_umem_dmabuf(mr->umem);
>  
>  	lockdep_assert_held(&mr->dev->odp_srcu);
>  	if (unlikely(io_virt < mr->mmkey.iova))
>  		return -EFAULT;
>  
> -	if (!odp->is_implicit_odp) {
> +	if (is_dmabuf_mr(mr) || !odp->is_implicit_odp) {
>  		u64 user_va;
> +		u64 end;
>  
>  		if (check_add_overflow(io_virt - mr->mmkey.iova,
> -				       (u64)odp->umem.address, &user_va))
> +				       (u64)mr->umem->address, &user_va))
>  			return -EFAULT;
> -		if (unlikely(user_va >= ib_umem_end(odp) ||
> -			     ib_umem_end(odp) - user_va < bcnt))
> +		if (is_dmabuf_mr(mr))
> +			end = mr->umem->address + mr->umem->length;
> +		else
> +			end = ib_umem_end(odp);
> +		if (unlikely(user_va >= end || end - user_va < bcnt))
>  			return -EFAULT;
> -		return pagefault_real_mr(mr, odp, user_va, bcnt, bytes_mapped,
> -					 flags);
> +		if (is_dmabuf_mr(mr))
> +			return pagefault_dmabuf_mr(mr, umem_dmabuf, user_va,
> +						   bcnt, bytes_mapped, flags);

But this doesn't care about user_va or bcnt; it just triggers the whole
thing to be remapped, so why calculate them at all?
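
If the remap is always whole-MR then the dmabuf branch could be
dispatched before any of the VA math, something like (a sketch; the
user_va parameter of pagefault_dmabuf_mr() would go away):

	/* dmabuf MRs always remap the whole umem, skip the VA math */
	if (is_dmabuf_mr(mr))
		return pagefault_dmabuf_mr(mr, umem_dmabuf, bcnt,
					   bytes_mapped, flags);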

Jason
