From: Jason Gunthorpe <jgg@ziepe.ca>
To: "Xiong, Jianxin" <jianxin.xiong@intel.com>
Cc: "linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"dri-devel@lists.freedesktop.org" <dri-devel@lists.freedesktop.org>,
	Doug Ledford <dledford@redhat.com>,
	Leon Romanovsky <leon@kernel.org>,
	Sumit Semwal <sumit.semwal@linaro.org>,
	Christian Koenig <christian.koenig@amd.com>,
	"Vetter, Daniel" <daniel.vetter@intel.com>
Subject: Re: [PATCH v8 4/5] RDMA/mlx5: Support dma-buf based userspace memory region
Date: Fri, 6 Nov 2020 08:48:33 -0400
Message-ID: <20201106124833.GN36674@ziepe.ca>
In-Reply-To: <MW3PR11MB45556A1524ABE605698B9A8EE5ED0@MW3PR11MB4555.namprd11.prod.outlook.com>

On Fri, Nov 06, 2020 at 01:11:38AM +0000, Xiong, Jianxin wrote:
> > On Thu, Nov 05, 2020 at 02:48:08PM -0800, Jianxin Xiong wrote:
> > > @@ -966,7 +969,10 @@ static struct mlx5_ib_mr *alloc_mr_from_cache(struct ib_pd *pd,
> > >  	struct mlx5_ib_mr *mr;
> > >  	unsigned int page_size;
> > >
> > > -	page_size = mlx5_umem_find_best_pgsz(umem, mkc, log_page_size, 0, iova);
> > > +	if (umem->is_dmabuf)
> > > +		page_size = ib_umem_find_best_pgsz(umem, PAGE_SIZE, iova);
> >
> > You said the sgl is not set here, why doesn't this crash? It is
> > certainly wrong to call this function without a SGL.
>
> The sgl is NULL, and nmap is 0. The 'for_each_sg' loop is just skipped
> and won't crash.

Just wire this to 4k, it is clearer than calling some no-op pgsz
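[Editorial sketch of the "wire this to 4k" suggestion above, as a
standalone toy model. The types and names below are illustrative
stand-ins, not the real mlx5/ib_umem kernel API: a dmabuf umem has no
scatterlist at this point, so the page-size selection can simply be
hardwired rather than routed through a helper that degenerates to a
no-op.]

```c
#include <assert.h>
#include <stdbool.h>

/* Toy constant standing in for the kernel's PAGE_SIZE. */
#define TOY_PAGE_SIZE 4096UL

/* Stand-in for struct ib_umem; only the field used here. */
struct toy_umem {
	bool is_dmabuf;
};

/*
 * Toy model of the page-size selection. A dmabuf umem has no SGL yet
 * at registration time, so scanning it for the best page size would be
 * a no-op; just hardwire 4k. Pinned umems use the SGL scan result,
 * passed in here as a precomputed value for simplicity.
 */
static unsigned long pick_page_size(const struct toy_umem *umem,
				    unsigned long best_sgl_pgsz)
{
	if (umem->is_dmabuf)
		return TOY_PAGE_SIZE;	/* hardwired, per the review comment */
	return best_sgl_pgsz;		/* normal pinned-umem path */
}
```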
> > > +	if (!mr->cache_ent) {
> > > +		mlx5_core_destroy_mkey(mr->dev->mdev, &mr->mmkey);
> > > +		WARN_ON(mr->descs);
> > > +	}
> > > +}
> >
> > I would expect this to call ib_umem_dmabuf_unmap_pages() ?
> >
> > Who calls it on the dereg path?
> >
> > This looks quite strange to me, it calls ib_umem_dmabuf_unmap_pages()
> > only from the invalidate callback?
>
> It is also called from ib_umem_dmabuf_release().

Hmm, that is not how the other APIs work; the unmap should be paired
with the map in the caller, and the sequence for destroy should be:

  invalidate
  unmap
  destroy_mkey
  release_umem

I have another series coming that makes the other three destroy flows
much closer to that ideal.

> > I feel uneasy how this seems to assume everything works sanely, we
> > can have parallel page faults so pagefault_dmabuf_mr() can be called
> > multiple times after an invalidation, and it doesn't protect itself
> > against calling ib_umem_dmabuf_map_pages() twice.
> >
> > Perhaps the umem code should keep track of the current map state and
> > exit if there is already a sgl. NULL or not NULL sgl would do and
> > seems quite reasonable.
>
> ib_umem_dmabuf_map() already checks the sgl and will do nothing if it
> is already set.

How? What I see in patch 1 is an unconditional call to
dma_buf_map_attachment()?

> > > +	if (is_dmabuf_mr(mr))
> > > +		return pagefault_dmabuf_mr(mr, umem_dmabuf, user_va,
> > > +					   bcnt, bytes_mapped, flags);
> >
> > But this doesn't care about user_va or bcnt, it just triggers the
> > whole thing to be remapped, so why calculate it?
>
> The range check is still needed, in order to catch application
> errors of using incorrect address or count in verbs command. Passing
> the values further in is to allow pagefault_dmabuf_mr to generate
> a return value and set bytes_mapped in a way consistent with the page
> fault handler chain.

The HW validates the range. The range check in the ODP case is to
protect against a HW bug that would cause the kernel to malfunction.
For dmabuf you don't need to do it.

Jason
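[Editorial sketch of the map-state tracking discussed above: with
parallel page faults, a second caller must not map the attachment
again if an sgt already exists, and unmap must pair with map. This is
a standalone toy model; the struct fields and function names are
stand-ins, not the kernel's real ib_umem_dmabuf/dma-buf API.]

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for struct sg_table / struct ib_umem_dmabuf. */
struct toy_sg_table { int dummy; };

struct toy_umem_dmabuf {
	struct toy_sg_table *sgt;	/* non-NULL once pages are mapped */
	int map_calls;			/* counts real mapping operations */
};

/*
 * Idempotent map: if a previous fault already produced an sgt, a
 * concurrent or repeated caller does nothing. Returns 0 on success.
 * (In the kernel this would run under the dma_resv lock; locking is
 * omitted from this toy.)
 */
static int toy_map_pages(struct toy_umem_dmabuf *ud)
{
	static struct toy_sg_table table;	/* toy storage for the sgt */

	if (ud->sgt)			/* already mapped: exit early */
		return 0;
	ud->map_calls++;		/* stands in for dma_buf_map_attachment() */
	ud->sgt = &table;
	return 0;
}

/* Unmap pairs with map: drop the sgt so a later fault can remap. */
static void toy_unmap_pages(struct toy_umem_dmabuf *ud)
{
	ud->sgt = NULL;
}
```

Used this way, a double fault after one invalidation maps only once,
and only an explicit unmap (from the invalidate callback or the
destroy path) re-arms the mapping.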