All of lore.kernel.org
* [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
@ 2021-02-23 10:59 ` Daniel Vetter
  0 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-23 10:59 UTC (permalink / raw)
  To: DRI Development
  Cc: Intel Graphics Development, Daniel Vetter, Christian König,
	Jason Gunthorpe, Suren Baghdasaryan, Matthew Wilcox, John Stultz,
	Daniel Vetter, Sumit Semwal, linux-media, linaro-mm-sig

tl;dr: DMA buffers aren't normal memory; there is no guarantee that you
can use them as such (e.g. that get_user_pages works, or that they're
accounted like any other normal memory).

Since some userspace only runs on integrated devices, where all
buffers are actually resident system memory, there's a huge temptation
to assume that a struct page is always present and usable, like for any
other pagecache-backed mmap. This has the potential to result in a uapi
nightmare.

To close this gap, require that DMA buffer mmaps are VM_PFNMAP, which
blocks get_user_pages and all the other struct page based
infrastructure for everyone. In spirit this is the uapi counterpart to
the kernel-internal CONFIG_DMABUF_DEBUG.

Motivated by a recent patch which wanted to switch the system dma-buf
heap to vm_insert_page instead of vm_insert_pfn.

v2:

Jason brought up that we also want to guarantee that all ptes have the
pte_special flag set, to catch fast get_user_pages (on architectures
that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.

From auditing the various functions to insert pfn pte entries
(vm_insert_pfn_prot, remap_pfn_range and all its callers like
dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
this should be the correct flag to check for.
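
For illustration, a minimal exporter-side mmap sketch (not taken from
any real driver; struct example_buffer and its pfn field are purely
hypothetical) that already satisfies this, since remap_pfn_range()
marks the vma VM_PFNMAP:

	/* sketch only; assumes <linux/dma-buf.h> and <linux/mm.h> */
	struct example_buffer {
		unsigned long pfn;	/* hypothetical exporter state */
	};

	static int example_dmabuf_mmap(struct dma_buf *dmabuf,
				       struct vm_area_struct *vma)
	{
		struct example_buffer *buf = dmabuf->priv;

		/*
		 * remap_pfn_range() sets VM_IO | VM_PFNMAP on the vma,
		 * so get_user_pages() and the rest of the struct page
		 * based infrastructure will refuse to touch it.
		 */
		return remap_pfn_range(vma, vma->vm_start, buf->pfn,
				       vma->vm_end - vma->vm_start,
				       vma->vm_page_prot);
	}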

References: https://lore.kernel.org/lkml/CAKMK7uHi+mG0z0HUmNt13QCCvutuRVjpcR0NjRL12k-WbWzkRg@mail.gmail.com/
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/dma-buf/dma-buf.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index f264b70c383e..06cb1d2e9fdc 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -127,6 +127,7 @@ static struct file_system_type dma_buf_fs_type = {
 static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
 {
 	struct dma_buf *dmabuf;
+	int ret;
 
 	if (!is_dma_buf_file(file))
 		return -EINVAL;
@@ -142,7 +143,11 @@ static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
 	    dmabuf->size >> PAGE_SHIFT)
 		return -EINVAL;
 
-	return dmabuf->ops->mmap(dmabuf, vma);
+	ret = dmabuf->ops->mmap(dmabuf, vma);
+
+	WARN_ON(!(vma->vm_flags & VM_PFNMAP));
+
+	return ret;
 }
 
 static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
@@ -1244,6 +1249,8 @@ EXPORT_SYMBOL_GPL(dma_buf_end_cpu_access);
 int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
 		 unsigned long pgoff)
 {
+	int ret;
+
 	if (WARN_ON(!dmabuf || !vma))
 		return -EINVAL;
 
@@ -1264,7 +1271,11 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
 	vma_set_file(vma, dmabuf->file);
 	vma->vm_pgoff = pgoff;
 
-	return dmabuf->ops->mmap(dmabuf, vma);
+	ret = dmabuf->ops->mmap(dmabuf, vma);
+
+	WARN_ON(!(vma->vm_flags & VM_PFNMAP));
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(dma_buf_mmap);
 
-- 
2.30.0


^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [PATCH 2/2] drm/vgem: use shmem helpers
  2021-02-23 10:59 ` Daniel Vetter
@ 2021-02-23 10:59   ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-23 10:59 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Intel Graphics Development, Christian König,
	Melissa Wen, Daniel Vetter, Chris Wilson

Aside from deleting lots of code the real motivation here is to switch
the mmap over to VM_PFNMAP, to be more consistent with what real gpu
drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
work, and even if you try and there's a struct page behind that,
touching it and mucking around with its refcount can upset drivers
real bad.
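
For context, the reason VM_PFNMAP blocks this: the core
get_user_pages() vma check rejects pfn mappings outright, roughly
(paraphrased, not the exact upstream code):

	if (vma->vm_flags & (VM_IO | VM_PFNMAP))
		return -EFAULT;	/* gup never hands out these pages */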

Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/vgem/vgem_drv.c | 280 +-------------------------------
 1 file changed, 3 insertions(+), 277 deletions(-)

diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index a0e75f1d5d01..88b3d125a610 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -40,6 +40,7 @@
 #include <drm/drm_file.h>
 #include <drm/drm_ioctl.h>
 #include <drm/drm_managed.h>
+#include <drm/drm_gem_shmem_helper.h>
 #include <drm/drm_prime.h>
 
 #include "vgem_drv.h"
@@ -50,27 +51,11 @@
 #define DRIVER_MAJOR	1
 #define DRIVER_MINOR	0
 
-static const struct drm_gem_object_funcs vgem_gem_object_funcs;
-
 static struct vgem_device {
 	struct drm_device drm;
 	struct platform_device *platform;
 } *vgem_device;
 
-static void vgem_gem_free_object(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
-
-	kvfree(vgem_obj->pages);
-	mutex_destroy(&vgem_obj->pages_lock);
-
-	if (obj->import_attach)
-		drm_prime_gem_destroy(obj, vgem_obj->table);
-
-	drm_gem_object_release(obj);
-	kfree(vgem_obj);
-}
-
 static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
@@ -159,265 +144,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
 	kfree(vfile);
 }
 
-static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
-						unsigned long size)
-{
-	struct drm_vgem_gem_object *obj;
-	int ret;
-
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
-	if (!obj)
-		return ERR_PTR(-ENOMEM);
-
-	obj->base.funcs = &vgem_gem_object_funcs;
-
-	ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
-	if (ret) {
-		kfree(obj);
-		return ERR_PTR(ret);
-	}
-
-	mutex_init(&obj->pages_lock);
-
-	return obj;
-}
-
-static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
-{
-	drm_gem_object_release(&obj->base);
-	kfree(obj);
-}
-
-static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
-					      struct drm_file *file,
-					      unsigned int *handle,
-					      unsigned long size)
-{
-	struct drm_vgem_gem_object *obj;
-	int ret;
-
-	obj = __vgem_gem_create(dev, size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	ret = drm_gem_handle_create(file, &obj->base, handle);
-	if (ret) {
-		drm_gem_object_put(&obj->base);
-		return ERR_PTR(ret);
-	}
-
-	return &obj->base;
-}
-
-static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
-				struct drm_mode_create_dumb *args)
-{
-	struct drm_gem_object *gem_object;
-	u64 pitch, size;
-
-	pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
-	size = args->height * pitch;
-	if (size == 0)
-		return -EINVAL;
-
-	gem_object = vgem_gem_create(dev, file, &args->handle, size);
-	if (IS_ERR(gem_object))
-		return PTR_ERR(gem_object);
-
-	args->size = gem_object->size;
-	args->pitch = pitch;
-
-	drm_gem_object_put(gem_object);
-
-	DRM_DEBUG("Created object of size %llu\n", args->size);
-
-	return 0;
-}
-
 static struct drm_ioctl_desc vgem_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
 };
 
-static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
-{
-	unsigned long flags = vma->vm_flags;
-	int ret;
-
-	ret = drm_gem_mmap(filp, vma);
-	if (ret)
-		return ret;
-
-	/* Keep the WC mmaping set by drm_gem_mmap() but our pages
-	 * are ordinary and not special.
-	 */
-	vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
-	return 0;
-}
-
-static const struct file_operations vgem_driver_fops = {
-	.owner		= THIS_MODULE,
-	.open		= drm_open,
-	.mmap		= vgem_mmap,
-	.poll		= drm_poll,
-	.read		= drm_read,
-	.unlocked_ioctl = drm_ioctl,
-	.compat_ioctl	= drm_compat_ioctl,
-	.release	= drm_release,
-};
-
-static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
-{
-	mutex_lock(&bo->pages_lock);
-	if (bo->pages_pin_count++ == 0) {
-		struct page **pages;
-
-		pages = drm_gem_get_pages(&bo->base);
-		if (IS_ERR(pages)) {
-			bo->pages_pin_count--;
-			mutex_unlock(&bo->pages_lock);
-			return pages;
-		}
-
-		bo->pages = pages;
-	}
-	mutex_unlock(&bo->pages_lock);
-
-	return bo->pages;
-}
-
-static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
-{
-	mutex_lock(&bo->pages_lock);
-	if (--bo->pages_pin_count == 0) {
-		drm_gem_put_pages(&bo->base, bo->pages, true, true);
-		bo->pages = NULL;
-	}
-	mutex_unlock(&bo->pages_lock);
-}
-
-static int vgem_prime_pin(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-	long n_pages = obj->size >> PAGE_SHIFT;
-	struct page **pages;
-
-	pages = vgem_pin_pages(bo);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
-	/* Flush the object from the CPU cache so that importers can rely
-	 * on coherent indirect access via the exported dma-address.
-	 */
-	drm_clflush_pages(pages, n_pages);
-
-	return 0;
-}
-
-static void vgem_prime_unpin(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	vgem_unpin_pages(bo);
-}
-
-static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
-}
-
-static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
-						struct dma_buf *dma_buf)
-{
-	struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
-
-	return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
-}
-
-static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
-			struct dma_buf_attachment *attach, struct sg_table *sg)
-{
-	struct drm_vgem_gem_object *obj;
-	int npages;
-
-	obj = __vgem_gem_create(dev, attach->dmabuf->size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
-
-	obj->table = sg;
-	obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
-	if (!obj->pages) {
-		__vgem_gem_destroy(obj);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	obj->pages_pin_count++; /* perma-pinned */
-	drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
-	return &obj->base;
-}
-
-static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-	long n_pages = obj->size >> PAGE_SHIFT;
-	struct page **pages;
-	void *vaddr;
-
-	pages = vgem_pin_pages(bo);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
-	vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
-	if (!vaddr)
-		return -ENOMEM;
-	dma_buf_map_set_vaddr(map, vaddr);
-
-	return 0;
-}
-
-static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	vunmap(map->vaddr);
-	vgem_unpin_pages(bo);
-}
-
-static int vgem_prime_mmap(struct drm_gem_object *obj,
-			   struct vm_area_struct *vma)
-{
-	int ret;
-
-	if (obj->size < vma->vm_end - vma->vm_start)
-		return -EINVAL;
-
-	if (!obj->filp)
-		return -ENODEV;
-
-	ret = call_mmap(obj->filp, vma);
-	if (ret)
-		return ret;
-
-	vma_set_file(vma, obj->filp);
-	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
-	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
-
-	return 0;
-}
-
-static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
-	.free = vgem_gem_free_object,
-	.pin = vgem_prime_pin,
-	.unpin = vgem_prime_unpin,
-	.get_sg_table = vgem_prime_get_sg_table,
-	.vmap = vgem_prime_vmap,
-	.vunmap = vgem_prime_vunmap,
-	.vm_ops = &vgem_gem_vm_ops,
-};
+DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
 
 static const struct drm_driver vgem_driver = {
 	.driver_features		= DRIVER_GEM | DRIVER_RENDER,
@@ -427,13 +159,7 @@ static const struct drm_driver vgem_driver = {
 	.num_ioctls 			= ARRAY_SIZE(vgem_ioctls),
 	.fops				= &vgem_driver_fops,
 
-	.dumb_create			= vgem_gem_dumb_create,
-
-	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
-	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-	.gem_prime_import = vgem_prime_import,
-	.gem_prime_import_sg_table = vgem_prime_import_sg_table,
-	.gem_prime_mmap = vgem_prime_mmap,
+	DRM_GEM_SHMEM_DRIVER_OPS,
 
 	.name	= DRIVER_NAME,
 	.desc	= DRIVER_DESC,
-- 
2.30.0

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-23 10:59 ` Daniel Vetter
                   ` (2 preceding siblings ...)
  (?)
@ 2021-02-23 11:19 ` Patchwork
  -1 siblings, 0 replies; 110+ messages in thread
From: Patchwork @ 2021-02-23 11:19 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap
URL   : https://patchwork.freedesktop.org/series/87313/
State : failure

== Summary ==

CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
Kernel: arch/x86/boot/bzImage is ready  (#1)
  MODPOST Module.symvers
ERROR: modpost: "drm_gem_shmem_prime_import_sg_table" [drivers/gpu/drm/vgem/vgem.ko] undefined!
ERROR: modpost: "drm_gem_shmem_dumb_create" [drivers/gpu/drm/vgem/vgem.ko] undefined!
scripts/Makefile.modpost:111: recipe for target 'Module.symvers' failed
make[1]: *** [Module.symvers] Error 1
make[1]: *** Deleting file 'Module.symvers'
Makefile:1391: recipe for target 'modules' failed
make: *** [modules] Error 2
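
The undefined drm_gem_shmem_* symbols suggest the shmem helper library
isn't enabled in this build configuration. A likely fix (an assumption,
not something stated in this thread) is to have the vgem Kconfig entry
select the helper, roughly:

	config DRM_VGEM
		tristate "Virtual GEM provider"
		depends on DRM
		select DRM_GEM_SHMEM_HELPER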


^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH 2/2] drm/vgem: use shmem helpers
  2021-02-23 10:59   ` [Intel-gfx] " Daniel Vetter
@ 2021-02-23 11:19     ` Thomas Zimmermann
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Zimmermann @ 2021-02-23 11:19 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Melissa Wen, Daniel Vetter, Intel Graphics Development,
	Christian König, Chris Wilson


Hi

Am 23.02.21 um 11:59 schrieb Daniel Vetter:
> Aside from deleting lots of code the real motivation here is to switch
> the mmap over to VM_PFNMAP, to be more consistent with what real gpu
> drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
> work, and even if you try and there's a struct page behind that,
> touching it and mucking around with its refcount can upset drivers
> real bad.
> 
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Melissa Wen <melissa.srw@gmail.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/vgem/vgem_drv.c | 280 +-------------------------------
>   1 file changed, 3 insertions(+), 277 deletions(-)
> 
> diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
> index a0e75f1d5d01..88b3d125a610 100644
> --- a/drivers/gpu/drm/vgem/vgem_drv.c
> +++ b/drivers/gpu/drm/vgem/vgem_drv.c
> @@ -40,6 +40,7 @@
>   #include <drm/drm_file.h>
>   #include <drm/drm_ioctl.h>
>   #include <drm/drm_managed.h>
> +#include <drm/drm_gem_shmem_helper.h>

This should be between file.h and ioctl.h

>   #include <drm/drm_prime.h>
>   
>   #include "vgem_drv.h"
> @@ -50,27 +51,11 @@
>   #define DRIVER_MAJOR	1
>   #define DRIVER_MINOR	0
>   
> -static const struct drm_gem_object_funcs vgem_gem_object_funcs;
> -
>   static struct vgem_device {
>   	struct drm_device drm;
>   	struct platform_device *platform;
>   } *vgem_device;
>   
> -static void vgem_gem_free_object(struct drm_gem_object *obj)
> -{
> -	struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
> -
> -	kvfree(vgem_obj->pages);
> -	mutex_destroy(&vgem_obj->pages_lock);
> -
> -	if (obj->import_attach)
> -		drm_prime_gem_destroy(obj, vgem_obj->table);
> -
> -	drm_gem_object_release(obj);
> -	kfree(vgem_obj);
> -}
> -
>   static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)

From a quick grep it looks like you should be able to remove this
function and vgem_gem_vm_ops as well.

The rest of the patch looks good to me.

Acked-by: Thomas Zimmermann <tzimmermann@suse.de>

>   {
>   	struct vm_area_struct *vma = vmf->vma;
> @@ -159,265 +144,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
>   	kfree(vfile);
>   }
>   
> -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
> -						unsigned long size)
> -{
> -	struct drm_vgem_gem_object *obj;
> -	int ret;
> -
> -	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> -	if (!obj)
> -		return ERR_PTR(-ENOMEM);
> -
> -	obj->base.funcs = &vgem_gem_object_funcs;
> -
> -	ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
> -	if (ret) {
> -		kfree(obj);
> -		return ERR_PTR(ret);
> -	}
> -
> -	mutex_init(&obj->pages_lock);
> -
> -	return obj;
> -}
> -
> -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
> -{
> -	drm_gem_object_release(&obj->base);
> -	kfree(obj);
> -}
> -
> -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> -					      struct drm_file *file,
> -					      unsigned int *handle,
> -					      unsigned long size)
> -{
> -	struct drm_vgem_gem_object *obj;
> -	int ret;
> -
> -	obj = __vgem_gem_create(dev, size);
> -	if (IS_ERR(obj))
> -		return ERR_CAST(obj);
> -
> -	ret = drm_gem_handle_create(file, &obj->base, handle);
> -	if (ret) {
> -		drm_gem_object_put(&obj->base);
> -		return ERR_PTR(ret);
> -	}
> -
> -	return &obj->base;
> -}
> -
> -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
> -				struct drm_mode_create_dumb *args)
> -{
> -	struct drm_gem_object *gem_object;
> -	u64 pitch, size;
> -
> -	pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
> -	size = args->height * pitch;
> -	if (size == 0)
> -		return -EINVAL;
> -
> -	gem_object = vgem_gem_create(dev, file, &args->handle, size);
> -	if (IS_ERR(gem_object))
> -		return PTR_ERR(gem_object);
> -
> -	args->size = gem_object->size;
> -	args->pitch = pitch;
> -
> -	drm_gem_object_put(gem_object);
> -
> -	DRM_DEBUG("Created object of size %llu\n", args->size);
> -
> -	return 0;
> -}
> -
>   static struct drm_ioctl_desc vgem_ioctls[] = {
>   	DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
>   	DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
>   };
>   
> -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
> -{
> -	unsigned long flags = vma->vm_flags;
> -	int ret;
> -
> -	ret = drm_gem_mmap(filp, vma);
> -	if (ret)
> -		return ret;
> -
> -	/* Keep the WC mmaping set by drm_gem_mmap() but our pages
> -	 * are ordinary and not special.
> -	 */
> -	vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
> -	return 0;
> -}
> -
> -static const struct file_operations vgem_driver_fops = {
> -	.owner		= THIS_MODULE,
> -	.open		= drm_open,
> -	.mmap		= vgem_mmap,
> -	.poll		= drm_poll,
> -	.read		= drm_read,
> -	.unlocked_ioctl = drm_ioctl,
> -	.compat_ioctl	= drm_compat_ioctl,
> -	.release	= drm_release,
> -};
> -
> -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
> -{
> -	mutex_lock(&bo->pages_lock);
> -	if (bo->pages_pin_count++ == 0) {
> -		struct page **pages;
> -
> -		pages = drm_gem_get_pages(&bo->base);
> -		if (IS_ERR(pages)) {
> -			bo->pages_pin_count--;
> -			mutex_unlock(&bo->pages_lock);
> -			return pages;
> -		}
> -
> -		bo->pages = pages;
> -	}
> -	mutex_unlock(&bo->pages_lock);
> -
> -	return bo->pages;
> -}
> -
> -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
> -{
> -	mutex_lock(&bo->pages_lock);
> -	if (--bo->pages_pin_count == 0) {
> -		drm_gem_put_pages(&bo->base, bo->pages, true, true);
> -		bo->pages = NULL;
> -	}
> -	mutex_unlock(&bo->pages_lock);
> -}
> -
> -static int vgem_prime_pin(struct drm_gem_object *obj)
> -{
> -	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> -	long n_pages = obj->size >> PAGE_SHIFT;
> -	struct page **pages;
> -
> -	pages = vgem_pin_pages(bo);
> -	if (IS_ERR(pages))
> -		return PTR_ERR(pages);
> -
> -	/* Flush the object from the CPU cache so that importers can rely
> -	 * on coherent indirect access via the exported dma-address.
> -	 */
> -	drm_clflush_pages(pages, n_pages);
> -
> -	return 0;
> -}
> -
> -static void vgem_prime_unpin(struct drm_gem_object *obj)
> -{
> -	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> -
> -	vgem_unpin_pages(bo);
> -}
> -
> -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
> -{
> -	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> -
> -	return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
> -}
> -
> -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
> -						struct dma_buf *dma_buf)
> -{
> -	struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
> -
> -	return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
> -}
> -
> -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
> -			struct dma_buf_attachment *attach, struct sg_table *sg)
> -{
> -	struct drm_vgem_gem_object *obj;
> -	int npages;
> -
> -	obj = __vgem_gem_create(dev, attach->dmabuf->size);
> -	if (IS_ERR(obj))
> -		return ERR_CAST(obj);
> -
> -	npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
> -
> -	obj->table = sg;
> -	obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
> -	if (!obj->pages) {
> -		__vgem_gem_destroy(obj);
> -		return ERR_PTR(-ENOMEM);
> -	}
> -
> -	obj->pages_pin_count++; /* perma-pinned */
> -	drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
> -	return &obj->base;
> -}
> -
> -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
> -{
> -	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> -	long n_pages = obj->size >> PAGE_SHIFT;
> -	struct page **pages;
> -	void *vaddr;
> -
> -	pages = vgem_pin_pages(bo);
> -	if (IS_ERR(pages))
> -		return PTR_ERR(pages);
> -
> -	vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
> -	if (!vaddr)
> -		return -ENOMEM;
> -	dma_buf_map_set_vaddr(map, vaddr);
> -
> -	return 0;
> -}
> -
> -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
> -{
> -	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> -
> -	vunmap(map->vaddr);
> -	vgem_unpin_pages(bo);
> -}
> -
> -static int vgem_prime_mmap(struct drm_gem_object *obj,
> -			   struct vm_area_struct *vma)
> -{
> -	int ret;
> -
> -	if (obj->size < vma->vm_end - vma->vm_start)
> -		return -EINVAL;
> -
> -	if (!obj->filp)
> -		return -ENODEV;
> -
> -	ret = call_mmap(obj->filp, vma);
> -	if (ret)
> -		return ret;
> -
> -	vma_set_file(vma, obj->filp);
> -	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> -	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
> -
> -	return 0;
> -}
> -
> -static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
> -	.free = vgem_gem_free_object,
> -	.pin = vgem_prime_pin,
> -	.unpin = vgem_prime_unpin,
> -	.get_sg_table = vgem_prime_get_sg_table,
> -	.vmap = vgem_prime_vmap,
> -	.vunmap = vgem_prime_vunmap,
> -	.vm_ops = &vgem_gem_vm_ops,
> -};
> +DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
>   
>   static const struct drm_driver vgem_driver = {
>   	.driver_features		= DRIVER_GEM | DRIVER_RENDER,
> @@ -427,13 +159,7 @@ static const struct drm_driver vgem_driver = {
>   	.num_ioctls 			= ARRAY_SIZE(vgem_ioctls),
>   	.fops				= &vgem_driver_fops,
>   
> -	.dumb_create			= vgem_gem_dumb_create,
> -
> -	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
> -	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
> -	.gem_prime_import = vgem_prime_import,
> -	.gem_prime_import_sg_table = vgem_prime_import_sg_table,
> -	.gem_prime_mmap = vgem_prime_mmap,
> +	DRM_GEM_SHMEM_DRIVER_OPS,
>   
>   	.name	= DRIVER_NAME,
>   	.desc	= DRIVER_DESC,
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



^ permalink raw reply	[flat|nested] 110+ messages in thread

* [PATCH] drm/vgem: use shmem helpers
  2021-02-23 10:59   ` [Intel-gfx] " Daniel Vetter
@ 2021-02-23 11:51     ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-23 11:51 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Intel Graphics Development, Christian König,
	Melissa Wen, Thomas Zimmermann, Daniel Vetter, Chris Wilson

Aside from deleting lots of code the real motivation here is to switch
the mmap over to VM_PFNMAP, to be more consistent with what real gpu
drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
work, and even if you try and there's a struct page behind that,
touching it and mucking around with its refcount can upset drivers
real bad.

v2: Review from Thomas:
- sort #include
- drop more dead code that I didn't spot somehow

Cc: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/vgem/vgem_drv.c | 340 +-------------------------------
 1 file changed, 3 insertions(+), 337 deletions(-)

diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index a0e75f1d5d01..b1b3a5ffc542 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -38,6 +38,7 @@
 
 #include <drm/drm_drv.h>
 #include <drm/drm_file.h>
+#include <drm/drm_gem_shmem_helper.h>
 #include <drm/drm_ioctl.h>
 #include <drm/drm_managed.h>
 #include <drm/drm_prime.h>
@@ -50,87 +51,11 @@
 #define DRIVER_MAJOR	1
 #define DRIVER_MINOR	0
 
-static const struct drm_gem_object_funcs vgem_gem_object_funcs;
-
 static struct vgem_device {
 	struct drm_device drm;
 	struct platform_device *platform;
 } *vgem_device;
 
-static void vgem_gem_free_object(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
-
-	kvfree(vgem_obj->pages);
-	mutex_destroy(&vgem_obj->pages_lock);
-
-	if (obj->import_attach)
-		drm_prime_gem_destroy(obj, vgem_obj->table);
-
-	drm_gem_object_release(obj);
-	kfree(vgem_obj);
-}
-
-static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
-{
-	struct vm_area_struct *vma = vmf->vma;
-	struct drm_vgem_gem_object *obj = vma->vm_private_data;
-	/* We don't use vmf->pgoff since that has the fake offset */
-	unsigned long vaddr = vmf->address;
-	vm_fault_t ret = VM_FAULT_SIGBUS;
-	loff_t num_pages;
-	pgoff_t page_offset;
-	page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
-
-	num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
-
-	if (page_offset >= num_pages)
-		return VM_FAULT_SIGBUS;
-
-	mutex_lock(&obj->pages_lock);
-	if (obj->pages) {
-		get_page(obj->pages[page_offset]);
-		vmf->page = obj->pages[page_offset];
-		ret = 0;
-	}
-	mutex_unlock(&obj->pages_lock);
-	if (ret) {
-		struct page *page;
-
-		page = shmem_read_mapping_page(
-					file_inode(obj->base.filp)->i_mapping,
-					page_offset);
-		if (!IS_ERR(page)) {
-			vmf->page = page;
-			ret = 0;
-		} else switch (PTR_ERR(page)) {
-			case -ENOSPC:
-			case -ENOMEM:
-				ret = VM_FAULT_OOM;
-				break;
-			case -EBUSY:
-				ret = VM_FAULT_RETRY;
-				break;
-			case -EFAULT:
-			case -EINVAL:
-				ret = VM_FAULT_SIGBUS;
-				break;
-			default:
-				WARN_ON(PTR_ERR(page));
-				ret = VM_FAULT_SIGBUS;
-				break;
-		}
-
-	}
-	return ret;
-}
-
-static const struct vm_operations_struct vgem_gem_vm_ops = {
-	.fault = vgem_gem_fault,
-	.open = drm_gem_vm_open,
-	.close = drm_gem_vm_close,
-};
-
 static int vgem_open(struct drm_device *dev, struct drm_file *file)
 {
 	struct vgem_file *vfile;
@@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
 	kfree(vfile);
 }
 
-static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
-						unsigned long size)
-{
-	struct drm_vgem_gem_object *obj;
-	int ret;
-
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
-	if (!obj)
-		return ERR_PTR(-ENOMEM);
-
-	obj->base.funcs = &vgem_gem_object_funcs;
-
-	ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
-	if (ret) {
-		kfree(obj);
-		return ERR_PTR(ret);
-	}
-
-	mutex_init(&obj->pages_lock);
-
-	return obj;
-}
-
-static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
-{
-	drm_gem_object_release(&obj->base);
-	kfree(obj);
-}
-
-static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
-					      struct drm_file *file,
-					      unsigned int *handle,
-					      unsigned long size)
-{
-	struct drm_vgem_gem_object *obj;
-	int ret;
-
-	obj = __vgem_gem_create(dev, size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	ret = drm_gem_handle_create(file, &obj->base, handle);
-	if (ret) {
-		drm_gem_object_put(&obj->base);
-		return ERR_PTR(ret);
-	}
-
-	return &obj->base;
-}
-
-static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
-				struct drm_mode_create_dumb *args)
-{
-	struct drm_gem_object *gem_object;
-	u64 pitch, size;
-
-	pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
-	size = args->height * pitch;
-	if (size == 0)
-		return -EINVAL;
-
-	gem_object = vgem_gem_create(dev, file, &args->handle, size);
-	if (IS_ERR(gem_object))
-		return PTR_ERR(gem_object);
-
-	args->size = gem_object->size;
-	args->pitch = pitch;
-
-	drm_gem_object_put(gem_object);
-
-	DRM_DEBUG("Created object of size %llu\n", args->size);
-
-	return 0;
-}
-
 static struct drm_ioctl_desc vgem_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
 };
 
-static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
-{
-	unsigned long flags = vma->vm_flags;
-	int ret;
-
-	ret = drm_gem_mmap(filp, vma);
-	if (ret)
-		return ret;
-
-	/* Keep the WC mmaping set by drm_gem_mmap() but our pages
-	 * are ordinary and not special.
-	 */
-	vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
-	return 0;
-}
-
-static const struct file_operations vgem_driver_fops = {
-	.owner		= THIS_MODULE,
-	.open		= drm_open,
-	.mmap		= vgem_mmap,
-	.poll		= drm_poll,
-	.read		= drm_read,
-	.unlocked_ioctl = drm_ioctl,
-	.compat_ioctl	= drm_compat_ioctl,
-	.release	= drm_release,
-};
-
-static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
-{
-	mutex_lock(&bo->pages_lock);
-	if (bo->pages_pin_count++ == 0) {
-		struct page **pages;
-
-		pages = drm_gem_get_pages(&bo->base);
-		if (IS_ERR(pages)) {
-			bo->pages_pin_count--;
-			mutex_unlock(&bo->pages_lock);
-			return pages;
-		}
-
-		bo->pages = pages;
-	}
-	mutex_unlock(&bo->pages_lock);
-
-	return bo->pages;
-}
-
-static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
-{
-	mutex_lock(&bo->pages_lock);
-	if (--bo->pages_pin_count == 0) {
-		drm_gem_put_pages(&bo->base, bo->pages, true, true);
-		bo->pages = NULL;
-	}
-	mutex_unlock(&bo->pages_lock);
-}
-
-static int vgem_prime_pin(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-	long n_pages = obj->size >> PAGE_SHIFT;
-	struct page **pages;
-
-	pages = vgem_pin_pages(bo);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
-	/* Flush the object from the CPU cache so that importers can rely
-	 * on coherent indirect access via the exported dma-address.
-	 */
-	drm_clflush_pages(pages, n_pages);
-
-	return 0;
-}
-
-static void vgem_prime_unpin(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	vgem_unpin_pages(bo);
-}
-
-static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
-}
-
-static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
-						struct dma_buf *dma_buf)
-{
-	struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
-
-	return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
-}
-
-static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
-			struct dma_buf_attachment *attach, struct sg_table *sg)
-{
-	struct drm_vgem_gem_object *obj;
-	int npages;
-
-	obj = __vgem_gem_create(dev, attach->dmabuf->size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
-
-	obj->table = sg;
-	obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
-	if (!obj->pages) {
-		__vgem_gem_destroy(obj);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	obj->pages_pin_count++; /* perma-pinned */
-	drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
-	return &obj->base;
-}
-
-static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-	long n_pages = obj->size >> PAGE_SHIFT;
-	struct page **pages;
-	void *vaddr;
-
-	pages = vgem_pin_pages(bo);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
-	vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
-	if (!vaddr)
-		return -ENOMEM;
-	dma_buf_map_set_vaddr(map, vaddr);
-
-	return 0;
-}
-
-static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	vunmap(map->vaddr);
-	vgem_unpin_pages(bo);
-}
-
-static int vgem_prime_mmap(struct drm_gem_object *obj,
-			   struct vm_area_struct *vma)
-{
-	int ret;
-
-	if (obj->size < vma->vm_end - vma->vm_start)
-		return -EINVAL;
-
-	if (!obj->filp)
-		return -ENODEV;
-
-	ret = call_mmap(obj->filp, vma);
-	if (ret)
-		return ret;
-
-	vma_set_file(vma, obj->filp);
-	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
-	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
-
-	return 0;
-}
-
-static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
-	.free = vgem_gem_free_object,
-	.pin = vgem_prime_pin,
-	.unpin = vgem_prime_unpin,
-	.get_sg_table = vgem_prime_get_sg_table,
-	.vmap = vgem_prime_vmap,
-	.vunmap = vgem_prime_vunmap,
-	.vm_ops = &vgem_gem_vm_ops,
-};
+DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
 
 static const struct drm_driver vgem_driver = {
 	.driver_features		= DRIVER_GEM | DRIVER_RENDER,
@@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = {
 	.num_ioctls 			= ARRAY_SIZE(vgem_ioctls),
 	.fops				= &vgem_driver_fops,
 
-	.dumb_create			= vgem_gem_dumb_create,
-
-	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
-	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-	.gem_prime_import = vgem_prime_import,
-	.gem_prime_import_sg_table = vgem_prime_import_sg_table,
-	.gem_prime_mmap = vgem_prime_mmap,
+	DRM_GEM_SHMEM_DRIVER_OPS,
 
 	.name	= DRIVER_NAME,
 	.desc	= DRIVER_DESC,
-- 
2.30.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 110+ messages in thread
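
Aside, for readers unfamiliar with the helper the patch switches to:
DEFINE_DRM_GEM_FOPS(vgem_driver_fops) is what replaces the hand-rolled
vgem_driver_fops/vgem_mmap() deleted above. As an illustration (quoted
from memory of include/drm/drm_gem.h around this kernel version, not
taken from this patch, so treat the exact field list as approximate),
the macro expands to roughly:

/* Approximate expansion of DEFINE_DRM_GEM_FOPS() from drm_gem.h */
#define DEFINE_DRM_GEM_FOPS(name) \
	static const struct file_operations name = {\
		.owner		= THIS_MODULE,\
		.open		= drm_open,\
		.release	= drm_release,\
		.unlocked_ioctl	= drm_ioctl,\
		.compat_ioctl	= drm_compat_ioctl,\
		.poll		= drm_poll,\
		.read		= drm_read,\
		.llseek		= noop_llseek,\
		.mmap		= drm_gem_mmap,\
	}

The relevant part for this series is .mmap = drm_gem_mmap, which
dispatches to the GEM object's vm_ops; with DRM_GEM_SHMEM_DRIVER_OPS
the objects are shmem-helper objects, so the shmem helpers' fault
handling takes over from the deleted vgem_gem_fault().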

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev2)
  2021-02-23 10:59 ` Daniel Vetter
                   ` (3 preceding siblings ...)
  (?)
@ 2021-02-23 13:11 ` Patchwork
  -1 siblings, 0 replies; 110+ messages in thread
From: Patchwork @ 2021-02-23 13:11 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev2)
URL   : https://patchwork.freedesktop.org/series/87313/
State : failure

== Summary ==

CALL    scripts/checksyscalls.sh
  CALL    scripts/atomic/check-atomics.sh
  DESCEND  objtool
  CHK     include/generated/compile.h
Kernel: arch/x86/boot/bzImage is ready  (#1)
  MODPOST Module.symvers
ERROR: modpost: "drm_gem_shmem_prime_import_sg_table" [drivers/gpu/drm/vgem/vgem.ko] undefined!
ERROR: modpost: "drm_gem_shmem_dumb_create" [drivers/gpu/drm/vgem/vgem.ko] undefined!
scripts/Makefile.modpost:111: recipe for target 'Module.symvers' failed
make[1]: *** [Module.symvers] Error 1
make[1]: *** Deleting file 'Module.symvers'
Makefile:1391: recipe for target 'modules' failed
make: *** [modules] Error 2
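
The two undefined drm_gem_shmem_* symbols suggest that the vgem Kconfig
entry does not yet select the shmem helper library it now depends on. A
plausible fix is sketched below; this is an assumption on top of the
thread, not something posted here, and the file location and prompt
text are paraphrased:

config DRM_VGEM
	tristate "Virtual GEM provider"
	depends on DRM
	# assumed fix: pull in the drm_gem_shmem_* helpers referenced by
	# DRM_GEM_SHMEM_DRIVER_OPS and the dumb_create/prime import paths
	select DRM_GEM_SHMEM_HELPER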


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH 2/2] drm/vgem: use shmem helpers
  2021-02-23 10:59   ` [Intel-gfx] " Daniel Vetter
  (?)
@ 2021-02-23 14:21     ` kernel test robot
  -1 siblings, 0 replies; 110+ messages in thread
From: kernel test robot @ 2021-02-23 14:21 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: kbuild-all, Daniel Vetter, Intel Graphics Development,
	Chris Wilson, Melissa Wen, Christian König

[-- Attachment #1: Type: text/plain, Size: 2076 bytes --]

Hi Daniel,

I love your patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on drm-tip/drm-tip linus/master next-20210223]
[cannot apply to drm/drm-next v5.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Daniel-Vetter/dma-buf-Require-VM_PFNMAP-vma-for-mmap/20210223-190209
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: openrisc-randconfig-r026-20210223 (attached as .config)
compiler: or1k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/5c544c63e333016d58d3e6f4802093906ef5456e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Daniel-Vetter/dma-buf-Require-VM_PFNMAP-vma-for-mmap/20210223-190209
        git checkout 5c544c63e333016d58d3e6f4802093906ef5456e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=openrisc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   or1k-linux-ld: arch/openrisc/kernel/entry.o: in function `_external_irq_handler':
   (.text+0x83c): undefined reference to `printk'
   (.text+0x83c): relocation truncated to fit: R_OR1K_INSN_REL_26 against undefined symbol `printk'
>> or1k-linux-ld: drivers/gpu/drm/vgem/vgem_drv.o:(.rodata+0x44): undefined reference to `drm_gem_shmem_prime_import_sg_table'
>> or1k-linux-ld: drivers/gpu/drm/vgem/vgem_drv.o:(.rodata+0x4c): undefined reference to `drm_gem_shmem_dumb_create'

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 25334 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH 2/2] drm/vgem: use shmem helpers
  2021-02-23 10:59   ` [Intel-gfx] " Daniel Vetter
  (?)
@ 2021-02-23 15:07     ` kernel test robot
  -1 siblings, 0 replies; 110+ messages in thread
From: kernel test robot @ 2021-02-23 15:07 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: kbuild-all, Daniel Vetter, Intel Graphics Development,
	Chris Wilson, Melissa Wen, Christian König

[-- Attachment #1: Type: text/plain, Size: 1876 bytes --]

Hi Daniel,

I love your patch! Yet something to improve:

[auto build test ERROR on drm-intel/for-linux-next]
[also build test ERROR on drm-tip/drm-tip linus/master next-20210223]
[cannot apply to tegra-drm/drm/tegra/for-next drm-exynos/exynos-drm-next drm/drm-next v5.11]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Daniel-Vetter/dma-buf-Require-VM_PFNMAP-vma-for-mmap/20210223-190209
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: microblaze-randconfig-r013-20210223 (attached as .config)
compiler: microblaze-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/5c544c63e333016d58d3e6f4802093906ef5456e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Daniel-Vetter/dma-buf-Require-VM_PFNMAP-vma-for-mmap/20210223-190209
        git checkout 5c544c63e333016d58d3e6f4802093906ef5456e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=microblaze 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "drm_gem_shmem_prime_import_sg_table" [drivers/gpu/drm/vgem/vgem.ko] undefined!
>> ERROR: modpost: "drm_gem_shmem_dumb_create" [drivers/gpu/drm/vgem/vgem.ko] undefined!

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26303 bytes --]

[-- Attachment #3: Type: text/plain, Size: 160 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-23 10:59 ` Daniel Vetter
  (?)
@ 2021-02-24  7:46   ` Thomas Hellström (Intel)
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-02-24  7:46 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Matthew Wilcox, linaro-mm-sig,
	Jason Gunthorpe, John Stultz, Daniel Vetter, Suren Baghdasaryan,
	Christian König, linux-media


On 2/23/21 11:59 AM, Daniel Vetter wrote:
> tldr; DMA buffers aren't normal memory, expecting that you can use
> them like that (like calling get_user_pages works, or that they're
> accounting like any other normal memory) cannot be guaranteed.
>
> Since some userspace only runs on integrated devices, where all
> buffers are actually all resident system memory, there's a huge
> temptation to assume that a struct page is always present and useable
> like for any more pagecache backed mmap. This has the potential to
> result in a uapi nightmare.
>
> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> blocks get_user_pages and all the other struct page based
> infrastructure for everyone. In spirit this is the uapi counterpart to
> the kernel-internal CONFIG_DMABUF_DEBUG.
>
> Motivated by a recent patch which wanted to swich the system dma-buf
> heap to vm_insert_page instead of vm_insert_pfn.
>
> v2:
>
> Jason brought up that we also want to guarantee that all ptes have the
> pte_special flag set, to catch fast get_user_pages (on architectures
> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>
>  From auditing the various functions to insert pfn pte entires
> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> this should be the correct flag to check for.
>
If we require VM_PFNMAP for ordinary page mappings, we also need to 
disallow COW mappings, since COW will not work on architectures that 
don't have CONFIG_ARCH_HAS_PTE_SPECIAL (see the docs for vm_normal_page()).
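
As an illustration of what such a check could look like (a sketch, not
part of this mail or of the series; the exporter-side placement and the
availability of is_cow_mapping() outside mm/ are assumptions):

#include <linux/dma-buf.h>
#include <linux/mm.h>	/* is_cow_mapping(); may first need exporting from mm */

/* Sketch: reject COW (MAP_PRIVATE + writable) mappings of a dma-buf up
 * front, since a VM_PFNMAP pte cannot be COWed sanely on architectures
 * without CONFIG_ARCH_HAS_PTE_SPECIAL.
 */
static int example_dmabuf_mmap(struct dma_buf *dmabuf,
			       struct vm_area_struct *vma)
{
	if (is_cow_mapping(vma->vm_flags))
		return -EINVAL;

	/* the exporter must set up a VM_PFNMAP mapping, e.g. via
	 * remap_pfn_range() or vmf_insert_pfn() from its fault handler
	 */
	return dmabuf->ops->mmap(dmabuf, vma);
}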

Also worth noting is the comment in ttm_bo_mmap_vma_setup() about 
possible performance implications of x86 + PAT + VM_PFNMAP + normal 
pages. That's a very old comment, though, and might not be valid anymore.

/Thomas



^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-24  7:46   ` Thomas Hellström (Intel)
  (?)
@ 2021-02-24  8:45     ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-24  8:45 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: DRI Development, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
>
>
> On 2/23/21 11:59 AM, Daniel Vetter wrote:
> > tldr; DMA buffers aren't normal memory, expecting that you can use
> > them like that (like calling get_user_pages works, or that they're
> > accounting like any other normal memory) cannot be guaranteed.
> >
> > Since some userspace only runs on integrated devices, where all
> > buffers are actually all resident system memory, there's a huge
> > temptation to assume that a struct page is always present and useable
> > like for any more pagecache backed mmap. This has the potential to
> > result in a uapi nightmare.
> >
> > To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> > blocks get_user_pages and all the other struct page based
> > infrastructure for everyone. In spirit this is the uapi counterpart to
> > the kernel-internal CONFIG_DMABUF_DEBUG.
> >
> > Motivated by a recent patch which wanted to swich the system dma-buf
> > heap to vm_insert_page instead of vm_insert_pfn.
> >
> > v2:
> >
> > Jason brought up that we also want to guarantee that all ptes have the
> > pte_special flag set, to catch fast get_user_pages (on architectures
> > that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> > still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> >
> >  From auditing the various functions to insert pfn pte entires
> > (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> > dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> > this should be the correct flag to check for.
> >
> If we require VM_PFNMAP, for ordinary page mappings, we also need to
> disallow COW mappings, since it will not work on architectures that
> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).

Hm I figured everyone just uses MAP_SHARED for buffer objects since
COW really makes absolutely no sense. How would we enforce this?

> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
> possible performance implications with x86 + PAT + VM_PFNMAP + normal
> pages. That's a very old comment, though, and might not be valid anymore.

I think that's why ttm has a page cache for these, because it indeed
sucks. The PAT changes on pages are rather expensive.

There is still an issue for iomem mappings, because the PAT validation
does a linear walk of the resource tree (lol) for every vm_insert_pfn.
But for i915 at least this is fixed by using the io_mapping
infrastructure, which does the PAT reservation only once when you set
up the mapping area at driver load.
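
For reference, a minimal sketch of that io_mapping pattern (generic API
from <linux/io-mapping.h>; the example_* names and the bar_base/bar_size
parameters are made up, and this is not i915 code):

#include <linux/io-mapping.h>
#include <linux/io.h>

static struct io_mapping *example_iomap;

/* Done once at driver load: the WC (PAT) reservation for the whole
 * aperture happens here, a single time.
 */
static int example_init(resource_size_t bar_base, resource_size_t bar_size)
{
	example_iomap = io_mapping_create_wc(bar_base, bar_size);
	return example_iomap ? 0 : -ENOMEM;
}

/* Per access: temporary WC mapping of one chunk, no further PAT lookups. */
static void example_read(void *dst, unsigned long offset, size_t len)
{
	void __iomem *vaddr = io_mapping_map_wc(example_iomap, offset, len);

	memcpy_fromio(dst, vaddr, len);
	io_mapping_unmap(vaddr);
}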

Also TTM uses VM_PFNMAP right now for everything, so it can't be a
problem that hurts much :-)
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-24  8:45     ` Daniel Vetter
  (?)
@ 2021-02-24  9:15       ` Thomas Hellström (Intel)
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-02-24  9:15 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK


On 2/24/21 9:45 AM, Daniel Vetter wrote:
> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>>
>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
>>> tldr; DMA buffers aren't normal memory, expecting that you can use
>>> them like that (like calling get_user_pages works, or that they're
>>> accounting like any other normal memory) cannot be guaranteed.
>>>
>>> Since some userspace only runs on integrated devices, where all
>>> buffers are actually all resident system memory, there's a huge
>>> temptation to assume that a struct page is always present and useable
>>> like for any more pagecache backed mmap. This has the potential to
>>> result in a uapi nightmare.
>>>
>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
>>> blocks get_user_pages and all the other struct page based
>>> infrastructure for everyone. In spirit this is the uapi counterpart to
>>> the kernel-internal CONFIG_DMABUF_DEBUG.
>>>
>>> Motivated by a recent patch which wanted to swich the system dma-buf
>>> heap to vm_insert_page instead of vm_insert_pfn.
>>>
>>> v2:
>>>
>>> Jason brought up that we also want to guarantee that all ptes have the
>>> pte_special flag set, to catch fast get_user_pages (on architectures
>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>>>
>>>   From auditing the various functions to insert pfn pte entires
>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
>>> this should be the correct flag to check for.
>>>
>> If we require VM_PFNMAP, for ordinary page mappings, we also need to
>> disallow COW mappings, since it will not work on architectures that
>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
> Hm I figured everyone just uses MAP_SHARED for buffer objects since
> COW really makes absolutely no sense. How would we enforce this?

Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that 
or allowing MIXEDMAP.
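
(A minimal sketch of the first option, purely for illustration; dma_buf_mmap_checked() is a made-up name, and the COW test is open-coded in case is_cow_mapping() isn't visible from this code.)

#include <linux/dma-buf.h>
#include <linux/mm.h>

/* Open-coded is_cow_mapping(): a private mapping that may gain write access. */
static bool vma_is_cow(const struct vm_area_struct *vma)
{
	return (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

static int dma_buf_mmap_checked(struct dma_buf *dmabuf,
				struct vm_area_struct *vma)
{
	if (vma_is_cow(vma))
		return -EINVAL;	/* COW of a dma-buf makes no sense */

	return dmabuf->ops->mmap(dmabuf, vma);
}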

>> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
>> pages. That's a very old comment, though, and might not be valid anymore.
> I think that's why ttm has a page cache for these, because it indeed
> sucks. The PAT changes on pages are rather expensive.

IIRC the page cache was implemented because of the slowness of the 
caching mode transition itself, more specifically the wbinvd() call + 
global TLB flush.

>
> There is still an issue for iomem mappings, because the PAT validation
> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
> But for i915 at least this is fixed by using the io_mapping
> infrastructure, which does the PAT reservation only once when you set
> up the mapping area at driver load.

Yes, I guess that was the issue that the comment describes, but the 
issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.

>
> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> problem that hurts much :-)

Hmm, both 5.11 and drm-tip appear to still use MIXEDMAP?

https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554

> -Daniel

/Thomas



^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-24  9:15       ` Thomas Hellström (Intel)
  (?)
@ 2021-02-24  9:31         ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-24  9:31 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: DRI Development, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
>
>
> On 2/24/21 9:45 AM, Daniel Vetter wrote:
> > On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
> > <thomas_os@shipmail.org> wrote:
> >>
> >> On 2/23/21 11:59 AM, Daniel Vetter wrote:
> >>> tldr; DMA buffers aren't normal memory, expecting that you can use
> >>> them like that (like calling get_user_pages works, or that they're
> >>> accounting like any other normal memory) cannot be guaranteed.
> >>>
> >>> Since some userspace only runs on integrated devices, where all
> >>> buffers are actually all resident system memory, there's a huge
> >>> temptation to assume that a struct page is always present and useable
> >>> like for any more pagecache backed mmap. This has the potential to
> >>> result in a uapi nightmare.
> >>>
> >>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> >>> blocks get_user_pages and all the other struct page based
> >>> infrastructure for everyone. In spirit this is the uapi counterpart to
> >>> the kernel-internal CONFIG_DMABUF_DEBUG.
> >>>
> >>> Motivated by a recent patch which wanted to swich the system dma-buf
> >>> heap to vm_insert_page instead of vm_insert_pfn.
> >>>
> >>> v2:
> >>>
> >>> Jason brought up that we also want to guarantee that all ptes have the
> >>> pte_special flag set, to catch fast get_user_pages (on architectures
> >>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> >>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> >>>
> >>>   From auditing the various functions to insert pfn pte entires
> >>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> >>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> >>> this should be the correct flag to check for.
> >>>
> >> If we require VM_PFNMAP, for ordinary page mappings, we also need to
> >> disallow COW mappings, since it will not work on architectures that
> >> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
> > Hm I figured everyone just uses MAP_SHARED for buffer objects since
> > COW really makes absolutely no sense. How would we enforce this?
>
> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
> or allowing MIXEDMAP.
>
> >> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
> >> possible performance implications with x86 + PAT + VM_PFNMAP + normal
> >> pages. That's a very old comment, though, and might not be valid anymore.
> > I think that's why ttm has a page cache for these, because it indeed
> > sucks. The PAT changes on pages are rather expensive.
>
> IIRC the page cache was implemented because of the slowness of the
> caching mode transition itself, more specifically the wbinvd() call +
> global TLB flush.
>
> >
> > There is still an issue for iomem mappings, because the PAT validation
> > does a linear walk of the resource tree (lol) for every vm_insert_pfn.
> > But for i915 at least this is fixed by using the io_mapping
> > infrastructure, which does the PAT reservation only once when you set
> > up the mapping area at driver load.
>
> Yes, I guess that was the issue that the comment describes, but the
> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
>
> >
> > Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> > problem that hurts much :-)
>
> Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP?
>
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554

Uh, that's bad, because mixed maps pointing at struct page won't stop
gup. At least afaik.

Christian, do we need to patch this up, and maybe fix up ttm fault
handler to use io_mapping so the vm_insert_pfn stuff is fast?
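
(For illustration, something like this, with made-up names and completely untested; the point is that vmf_insert_pfn() on a VM_PFNMAP vma just writes a pfn pte without any struct page refcounting, so gup stays locked out.)

#include <linux/mm.h>

struct my_bo {
	unsigned long first_pfn;	/* backing pfn of the buffer's first page */
};

static vm_fault_t my_bo_fault(struct vm_fault *vmf)
{
	struct my_bo *bo = vmf->vma->vm_private_data;
	unsigned long pfn = bo->first_pfn + vmf->pgoff;

	/* Writes a pfn pte into a VM_PFNMAP vma, no struct page refcounting. */
	return vmf_insert_pfn(vmf->vma, vmf->address, pfn);
}
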
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-24  8:45     ` Daniel Vetter
@ 2021-02-24 18:46       ` Jason Gunthorpe
  -1 siblings, 0 replies; 110+ messages in thread
From: Jason Gunthorpe @ 2021-02-24 18:46 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Thomas Hellström (Intel),
	DRI Development, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, John Stultz,
	Daniel Vetter, Suren Baghdasaryan, Christian König,
	open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Feb 24, 2021 at 09:45:51AM +0100, Daniel Vetter wrote:

> Hm I figured everyone just uses MAP_SHARED for buffer objects since
> COW really makes absolutely no sense. How would we enforce this?

In RDMA we test

drivers/infiniband/core/ib_core_uverbs.c:       if (!(vma->vm_flags & VM_SHARED))

during mmap to reject the use of MAP_PRIVATE on BAR pages.
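
(The dma-buf side could do the same thing; a hypothetical helper just to show the shape of the check. Note that testing VM_SHARED rejects every MAP_PRIVATE mapping, including read-only ones, which is a bit stricter than an is_cow_mapping() check.)

#include <linux/mm.h>

/* Hypothetical check mirroring the RDMA test, e.g. from a dma-buf mmap path. */
static int reject_private_mappings(const struct vm_area_struct *vma)
{
	if (!(vma->vm_flags & VM_SHARED))
		return -EINVAL;		/* no MAP_PRIVATE of device memory */
	return 0;
}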

Jason

^ permalink raw reply	[flat|nested] 110+ messages in thread

* [PATCH] drm/vgem: use shmem helpers
  2021-02-23 10:59   ` [Intel-gfx] " Daniel Vetter
@ 2021-02-25 10:23     ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-25 10:23 UTC (permalink / raw)
  To: DRI Development
  Cc: Daniel Vetter, Intel Graphics Development, Christian König,
	Melissa Wen, Thomas Zimmermann, Daniel Vetter, Chris Wilson

Aside from deleting lots of code, the real motivation here is to switch
the mmap over to VM_PFNMAP, to be more consistent with what real gpu
drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
work, and even if you try and there's a struct page behind that,
touching it and mucking around with its refcount can upset drivers
real bad.

v2: Review from Thomas:
- sort #include
- drop more dead code that I didn't spot somehow

v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)

Cc: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/Kconfig         |   1 +
 drivers/gpu/drm/vgem/vgem_drv.c | 340 +-------------------------------
 2 files changed, 4 insertions(+), 337 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 8e73311de583..94e4ac830283 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig"
 config DRM_VGEM
 	tristate "Virtual GEM provider"
 	depends on DRM
+	select DRM_GEM_SHMEM_HELPER
 	help
 	  Choose this option to get a virtual graphics memory manager,
 	  as used by Mesa's software renderer for enhanced performance.
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index a0e75f1d5d01..b1b3a5ffc542 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -38,6 +38,7 @@
 
 #include <drm/drm_drv.h>
 #include <drm/drm_file.h>
+#include <drm/drm_gem_shmem_helper.h>
 #include <drm/drm_ioctl.h>
 #include <drm/drm_managed.h>
 #include <drm/drm_prime.h>
@@ -50,87 +51,11 @@
 #define DRIVER_MAJOR	1
 #define DRIVER_MINOR	0
 
-static const struct drm_gem_object_funcs vgem_gem_object_funcs;
-
 static struct vgem_device {
 	struct drm_device drm;
 	struct platform_device *platform;
 } *vgem_device;
 
-static void vgem_gem_free_object(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
-
-	kvfree(vgem_obj->pages);
-	mutex_destroy(&vgem_obj->pages_lock);
-
-	if (obj->import_attach)
-		drm_prime_gem_destroy(obj, vgem_obj->table);
-
-	drm_gem_object_release(obj);
-	kfree(vgem_obj);
-}
-
-static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
-{
-	struct vm_area_struct *vma = vmf->vma;
-	struct drm_vgem_gem_object *obj = vma->vm_private_data;
-	/* We don't use vmf->pgoff since that has the fake offset */
-	unsigned long vaddr = vmf->address;
-	vm_fault_t ret = VM_FAULT_SIGBUS;
-	loff_t num_pages;
-	pgoff_t page_offset;
-	page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
-
-	num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
-
-	if (page_offset >= num_pages)
-		return VM_FAULT_SIGBUS;
-
-	mutex_lock(&obj->pages_lock);
-	if (obj->pages) {
-		get_page(obj->pages[page_offset]);
-		vmf->page = obj->pages[page_offset];
-		ret = 0;
-	}
-	mutex_unlock(&obj->pages_lock);
-	if (ret) {
-		struct page *page;
-
-		page = shmem_read_mapping_page(
-					file_inode(obj->base.filp)->i_mapping,
-					page_offset);
-		if (!IS_ERR(page)) {
-			vmf->page = page;
-			ret = 0;
-		} else switch (PTR_ERR(page)) {
-			case -ENOSPC:
-			case -ENOMEM:
-				ret = VM_FAULT_OOM;
-				break;
-			case -EBUSY:
-				ret = VM_FAULT_RETRY;
-				break;
-			case -EFAULT:
-			case -EINVAL:
-				ret = VM_FAULT_SIGBUS;
-				break;
-			default:
-				WARN_ON(PTR_ERR(page));
-				ret = VM_FAULT_SIGBUS;
-				break;
-		}
-
-	}
-	return ret;
-}
-
-static const struct vm_operations_struct vgem_gem_vm_ops = {
-	.fault = vgem_gem_fault,
-	.open = drm_gem_vm_open,
-	.close = drm_gem_vm_close,
-};
-
 static int vgem_open(struct drm_device *dev, struct drm_file *file)
 {
 	struct vgem_file *vfile;
@@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
 	kfree(vfile);
 }
 
-static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
-						unsigned long size)
-{
-	struct drm_vgem_gem_object *obj;
-	int ret;
-
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
-	if (!obj)
-		return ERR_PTR(-ENOMEM);
-
-	obj->base.funcs = &vgem_gem_object_funcs;
-
-	ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
-	if (ret) {
-		kfree(obj);
-		return ERR_PTR(ret);
-	}
-
-	mutex_init(&obj->pages_lock);
-
-	return obj;
-}
-
-static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
-{
-	drm_gem_object_release(&obj->base);
-	kfree(obj);
-}
-
-static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
-					      struct drm_file *file,
-					      unsigned int *handle,
-					      unsigned long size)
-{
-	struct drm_vgem_gem_object *obj;
-	int ret;
-
-	obj = __vgem_gem_create(dev, size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	ret = drm_gem_handle_create(file, &obj->base, handle);
-	if (ret) {
-		drm_gem_object_put(&obj->base);
-		return ERR_PTR(ret);
-	}
-
-	return &obj->base;
-}
-
-static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
-				struct drm_mode_create_dumb *args)
-{
-	struct drm_gem_object *gem_object;
-	u64 pitch, size;
-
-	pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
-	size = args->height * pitch;
-	if (size == 0)
-		return -EINVAL;
-
-	gem_object = vgem_gem_create(dev, file, &args->handle, size);
-	if (IS_ERR(gem_object))
-		return PTR_ERR(gem_object);
-
-	args->size = gem_object->size;
-	args->pitch = pitch;
-
-	drm_gem_object_put(gem_object);
-
-	DRM_DEBUG("Created object of size %llu\n", args->size);
-
-	return 0;
-}
-
 static struct drm_ioctl_desc vgem_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
 };
 
-static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
-{
-	unsigned long flags = vma->vm_flags;
-	int ret;
-
-	ret = drm_gem_mmap(filp, vma);
-	if (ret)
-		return ret;
-
-	/* Keep the WC mmaping set by drm_gem_mmap() but our pages
-	 * are ordinary and not special.
-	 */
-	vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
-	return 0;
-}
-
-static const struct file_operations vgem_driver_fops = {
-	.owner		= THIS_MODULE,
-	.open		= drm_open,
-	.mmap		= vgem_mmap,
-	.poll		= drm_poll,
-	.read		= drm_read,
-	.unlocked_ioctl = drm_ioctl,
-	.compat_ioctl	= drm_compat_ioctl,
-	.release	= drm_release,
-};
-
-static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
-{
-	mutex_lock(&bo->pages_lock);
-	if (bo->pages_pin_count++ == 0) {
-		struct page **pages;
-
-		pages = drm_gem_get_pages(&bo->base);
-		if (IS_ERR(pages)) {
-			bo->pages_pin_count--;
-			mutex_unlock(&bo->pages_lock);
-			return pages;
-		}
-
-		bo->pages = pages;
-	}
-	mutex_unlock(&bo->pages_lock);
-
-	return bo->pages;
-}
-
-static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
-{
-	mutex_lock(&bo->pages_lock);
-	if (--bo->pages_pin_count == 0) {
-		drm_gem_put_pages(&bo->base, bo->pages, true, true);
-		bo->pages = NULL;
-	}
-	mutex_unlock(&bo->pages_lock);
-}
-
-static int vgem_prime_pin(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-	long n_pages = obj->size >> PAGE_SHIFT;
-	struct page **pages;
-
-	pages = vgem_pin_pages(bo);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
-	/* Flush the object from the CPU cache so that importers can rely
-	 * on coherent indirect access via the exported dma-address.
-	 */
-	drm_clflush_pages(pages, n_pages);
-
-	return 0;
-}
-
-static void vgem_prime_unpin(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	vgem_unpin_pages(bo);
-}
-
-static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
-}
-
-static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
-						struct dma_buf *dma_buf)
-{
-	struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
-
-	return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
-}
-
-static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
-			struct dma_buf_attachment *attach, struct sg_table *sg)
-{
-	struct drm_vgem_gem_object *obj;
-	int npages;
-
-	obj = __vgem_gem_create(dev, attach->dmabuf->size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
-
-	obj->table = sg;
-	obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
-	if (!obj->pages) {
-		__vgem_gem_destroy(obj);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	obj->pages_pin_count++; /* perma-pinned */
-	drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
-	return &obj->base;
-}
-
-static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-	long n_pages = obj->size >> PAGE_SHIFT;
-	struct page **pages;
-	void *vaddr;
-
-	pages = vgem_pin_pages(bo);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
-	vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
-	if (!vaddr)
-		return -ENOMEM;
-	dma_buf_map_set_vaddr(map, vaddr);
-
-	return 0;
-}
-
-static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	vunmap(map->vaddr);
-	vgem_unpin_pages(bo);
-}
-
-static int vgem_prime_mmap(struct drm_gem_object *obj,
-			   struct vm_area_struct *vma)
-{
-	int ret;
-
-	if (obj->size < vma->vm_end - vma->vm_start)
-		return -EINVAL;
-
-	if (!obj->filp)
-		return -ENODEV;
-
-	ret = call_mmap(obj->filp, vma);
-	if (ret)
-		return ret;
-
-	vma_set_file(vma, obj->filp);
-	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
-	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
-
-	return 0;
-}
-
-static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
-	.free = vgem_gem_free_object,
-	.pin = vgem_prime_pin,
-	.unpin = vgem_prime_unpin,
-	.get_sg_table = vgem_prime_get_sg_table,
-	.vmap = vgem_prime_vmap,
-	.vunmap = vgem_prime_vunmap,
-	.vm_ops = &vgem_gem_vm_ops,
-};
+DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
 
 static const struct drm_driver vgem_driver = {
 	.driver_features		= DRIVER_GEM | DRIVER_RENDER,
@@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = {
 	.num_ioctls 			= ARRAY_SIZE(vgem_ioctls),
 	.fops				= &vgem_driver_fops,
 
-	.dumb_create			= vgem_gem_dumb_create,
-
-	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
-	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
-	.gem_prime_import = vgem_prime_import,
-	.gem_prime_import_sg_table = vgem_prime_import_sg_table,
-	.gem_prime_mmap = vgem_prime_mmap,
+	DRM_GEM_SHMEM_DRIVER_OPS,
 
 	.name	= DRIVER_NAME,
 	.desc	= DRIVER_DESC,
-- 
2.30.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-24  9:31         ` Daniel Vetter
  (?)
@ 2021-02-25 10:28           ` Christian König
  -1 siblings, 0 replies; 110+ messages in thread
From: Christian König @ 2021-02-25 10:28 UTC (permalink / raw)
  To: Daniel Vetter, Thomas Hellström (Intel)
  Cc: Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

Am 24.02.21 um 10:31 schrieb Daniel Vetter:
> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>>
>> On 2/24/21 9:45 AM, Daniel Vetter wrote:
>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
>>> <thomas_os@shipmail.org> wrote:
>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
>>>>> tldr; DMA buffers aren't normal memory, expecting that you can use
>>>>> them like that (like calling get_user_pages works, or that they're
>>>>> accounting like any other normal memory) cannot be guaranteed.
>>>>>
>>>>> Since some userspace only runs on integrated devices, where all
>>>>> buffers are actually all resident system memory, there's a huge
>>>>> temptation to assume that a struct page is always present and useable
>>>>> like for any more pagecache backed mmap. This has the potential to
>>>>> result in a uapi nightmare.
>>>>>
>>>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
>>>>> blocks get_user_pages and all the other struct page based
>>>>> infrastructure for everyone. In spirit this is the uapi counterpart to
>>>>> the kernel-internal CONFIG_DMABUF_DEBUG.
>>>>>
>>>>> Motivated by a recent patch which wanted to swich the system dma-buf
>>>>> heap to vm_insert_page instead of vm_insert_pfn.
>>>>>
>>>>> v2:
>>>>>
>>>>> Jason brought up that we also want to guarantee that all ptes have the
>>>>> pte_special flag set, to catch fast get_user_pages (on architectures
>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>>>>>
>>>>>    From auditing the various functions to insert pfn pte entires
>>>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
>>>>> this should be the correct flag to check for.
>>>>>
>>>> If we require VM_PFNMAP, for ordinary page mappings, we also need to
>>>> disallow COW mappings, since it will not work on architectures that
>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
>>> Hm I figured everyone just uses MAP_SHARED for buffer objects since
>>> COW really makes absolutely no sense. How would we enforce this?
>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
>> or allowing MIXEDMAP.
>>
>>>> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
>>>> pages. That's a very old comment, though, and might not be valid anymore.
>>> I think that's why ttm has a page cache for these, because it indeed
>>> sucks. The PAT changes on pages are rather expensive.
>> IIRC the page cache was implemented because of the slowness of the
>> caching mode transition itself, more specifically the wbinvd() call +
>> global TLB flush.

Yes, exactly that. The global TLB flush is what really breaks our neck 
here from a performance perspective.

>>> There is still an issue for iomem mappings, because the PAT validation
>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
>>> But for i915 at least this is fixed by using the io_mapping
>>> infrastructure, which does the PAT reservation only once when you set
>>> up the mapping area at driver load.
>> Yes, I guess that was the issue that the comment describes, but the
>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
>>
>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
>>> problem that hurts much :-)
>> Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP?
>>
>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
> Uh that's bad, because mixed maps pointing at struct page wont stop
> gup. At least afaik.

Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have 
already seen tons of problems with the page cache.

Regards,
Christian.

> Christian, do we need to patch this up, and maybe fix up ttm fault
> handler to use io_mapping so the vm_insert_pfn stuff is fast?
> -Daniel


^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
@ 2021-02-25 10:28           ` Christian König
  0 siblings, 0 replies; 110+ messages in thread
From: Christian König @ 2021-02-25 10:28 UTC (permalink / raw)
  To: Daniel Vetter, Thomas Hellström (Intel)
  Cc: Intel Graphics Development, DRI Development,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	Matthew Wilcox, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

Am 24.02.21 um 10:31 schrieb Daniel Vetter:
> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>>
>> On 2/24/21 9:45 AM, Daniel Vetter wrote:
>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
>>> <thomas_os@shipmail.org> wrote:
>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
>>>>> tldr; DMA buffers aren't normal memory, expecting that you can use
>>>>> them like that (like calling get_user_pages works, or that they're
>>>>> accounting like any other normal memory) cannot be guaranteed.
>>>>>
>>>>> Since some userspace only runs on integrated devices, where all
>>>>> buffers are actually all resident system memory, there's a huge
>>>>> temptation to assume that a struct page is always present and useable
>>>>> like for any more pagecache backed mmap. This has the potential to
>>>>> result in a uapi nightmare.
>>>>>
>>>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
>>>>> blocks get_user_pages and all the other struct page based
>>>>> infrastructure for everyone. In spirit this is the uapi counterpart to
>>>>> the kernel-internal CONFIG_DMABUF_DEBUG.
>>>>>
>>>>> Motivated by a recent patch which wanted to swich the system dma-buf
>>>>> heap to vm_insert_page instead of vm_insert_pfn.
>>>>>
>>>>> v2:
>>>>>
>>>>> Jason brought up that we also want to guarantee that all ptes have the
>>>>> pte_special flag set, to catch fast get_user_pages (on architectures
>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>>>>>
>>>>>    From auditing the various functions to insert pfn pte entires
>>>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
>>>>> this should be the correct flag to check for.
>>>>>
>>>> If we require VM_PFNMAP, for ordinary page mappings, we also need to
>>>> disallow COW mappings, since it will not work on architectures that
>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
>>> Hm I figured everyone just uses MAP_SHARED for buffer objects since
>>> COW really makes absolutely no sense. How would we enforce this?
>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
>> or allowing MIXEDMAP.
>>
>>>> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
>>>> pages. That's a very old comment, though, and might not be valid anymore.
>>> I think that's why ttm has a page cache for these, because it indeed
>>> sucks. The PAT changes on pages are rather expensive.
>> IIRC the page cache was implemented because of the slowness of the
>> caching mode transition itself, more specifically the wbinvd() call +
>> global TLB flush.

Yes, exactly that. The global TLB flush is what really breaks our neck 
here from a performance perspective.

>>> There is still an issue for iomem mappings, because the PAT validation
>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
>>> But for i915 at least this is fixed by using the io_mapping
>>> infrastructure, which does the PAT reservation only once when you set
>>> up the mapping area at driver load.
>> Yes, I guess that was the issue that the comment describes, but the
>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
>>
>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
>>> problem that hurts much :-)
>> Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP?
>>
>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
> Uh that's bad, because mixed maps pointing at struct page wont stop
> gup. At least afaik.

Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have 
already seen tons of problems with the page cache.

Regards,
Christian.

> Christian, do we need to patch this up, and maybe fix up ttm fault
> handler to use io_mapping so the vm_insert_pfn stuff is fast?
> -Daniel


^ permalink raw reply	[flat|nested] 110+ messages in thread
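For reference, whether gup can get at these pages on the slow path is decided
purely by the vma flags. A simplified paraphrase of check_vma_flags() in
mm/gup.c from around this time (a sketch only, not the verbatim kernel code):

/*
 * The gup slow path refuses any vma with VM_IO or VM_PFNMAP set.
 * VM_MIXEDMAP is not in this list, so it only helps against the gup
 * fast path, and only via pte_special() on architectures that have
 * CONFIG_ARCH_HAS_PTE_SPECIAL.
 */
static int check_vma_flags(struct vm_area_struct *vma, unsigned long gup_flags)
{
	vm_flags_t vm_flags = vma->vm_flags;

	if (vm_flags & (VM_IO | VM_PFNMAP))
		return -EFAULT;

	/* ... write/foreign permission checks elided ... */
	return 0;
}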

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-24 18:46       ` Jason Gunthorpe
  (?)
@ 2021-02-25 10:30         ` Christian König
  -1 siblings, 0 replies; 110+ messages in thread
From: Christian König @ 2021-02-25 10:30 UTC (permalink / raw)
  To: Jason Gunthorpe, Daniel Vetter
  Cc: Thomas Hellström (Intel),
	DRI Development, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, John Stultz,
	Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK



Am 24.02.21 um 19:46 schrieb Jason Gunthorpe:
> On Wed, Feb 24, 2021 at 09:45:51AM +0100, Daniel Vetter wrote:
>
>> Hm I figured everyone just uses MAP_SHARED for buffer objects since
>> COW really makes absolutely no sense. How would we enforce this?
> In RDMA we test
>
> drivers/infiniband/core/ib_core_uverbs.c:       if (!(vma->vm_flags & VM_SHARED))
>
> During mmap to reject use of MAP_PRIVATE on BAR pages.

That's a really good idea. MAP_PRIVATE doesn't really work at all with
driver mappings.

Christian.

>
> Jason


^ permalink raw reply	[flat|nested] 110+ messages in thread
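A minimal sketch of what such a rejection could look like at dma-buf mmap
time, assuming the check goes into dma_buf_mmap_internal() (the helper name
dma_buf_vma_is_cow is made up here; the condition is the same one that
is_cow_mapping() in mm/internal.h tests, and the same check could equally
live in the common drm gem mmap path):

/* Hypothetical helper: a writable mapping that is not MAP_SHARED is COW. */
static bool dma_buf_vma_is_cow(const struct vm_area_struct *vma)
{
	return (vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE;
}

/* ... in dma_buf_mmap_internal(), before calling dmabuf->ops->mmap(): */
	if (dma_buf_vma_is_cow(vma))
		return -EINVAL;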

* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3)
  2021-02-23 10:59 ` Daniel Vetter
                   ` (5 preceding siblings ...)
  (?)
@ 2021-02-25 10:38 ` Patchwork
  -1 siblings, 0 replies; 110+ messages in thread
From: Patchwork @ 2021-02-25 10:38 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3)
URL   : https://patchwork.freedesktop.org/series/87313/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
b71cc38b23b9 dma-buf: Require VM_PFNMAP vma for mmap
-:34: WARNING:TYPO_SPELLING: 'entires' may be misspelled - perhaps 'entries'?
#34: 
From auditing the various functions to insert pfn pte entires
                                                      ^^^^^^^

-:39: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#39: 
References: https://lore.kernel.org/lkml/CAKMK7uHi+mG0z0HUmNt13QCCvutuRVjpcR0NjRL12k-WbWzkRg@mail.gmail.com/

-:97: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 3 warnings, 0 checks, 39 lines checked
93fc58ee63d1 drm/vgem: use shmem helpers
-:424: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>'

total: 0 errors, 1 warnings, 0 checks, 381 lines checked



^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-25 10:28           ` Christian König
  (?)
@ 2021-02-25 10:44             ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-25 10:44 UTC (permalink / raw)
  To: Christian König
  Cc: Daniel Vetter, Thomas Hellström (Intel),
	Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote:
> Am 24.02.21 um 10:31 schrieb Daniel Vetter:
> > On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
> > <thomas_os@shipmail.org> wrote:
> > > 
> > > On 2/24/21 9:45 AM, Daniel Vetter wrote:
> > > > On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
> > > > <thomas_os@shipmail.org> wrote:
> > > > > On 2/23/21 11:59 AM, Daniel Vetter wrote:
> > > > > > tldr; DMA buffers aren't normal memory, expecting that you can use
> > > > > > them like that (like calling get_user_pages works, or that they're
> > > > > > accounting like any other normal memory) cannot be guaranteed.
> > > > > > 
> > > > > > Since some userspace only runs on integrated devices, where all
> > > > > > buffers are actually all resident system memory, there's a huge
> > > > > > temptation to assume that a struct page is always present and useable
> > > > > > like for any more pagecache backed mmap. This has the potential to
> > > > > > result in a uapi nightmare.
> > > > > > 
> > > > > > To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> > > > > > blocks get_user_pages and all the other struct page based
> > > > > > infrastructure for everyone. In spirit this is the uapi counterpart to
> > > > > > the kernel-internal CONFIG_DMABUF_DEBUG.
> > > > > > 
> > > > > > Motivated by a recent patch which wanted to swich the system dma-buf
> > > > > > heap to vm_insert_page instead of vm_insert_pfn.
> > > > > > 
> > > > > > v2:
> > > > > > 
> > > > > > Jason brought up that we also want to guarantee that all ptes have the
> > > > > > pte_special flag set, to catch fast get_user_pages (on architectures
> > > > > > that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> > > > > > still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> > > > > > 
> > > > > >    From auditing the various functions to insert pfn pte entires
> > > > > > (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> > > > > > dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> > > > > > this should be the correct flag to check for.
> > > > > > 
> > > > > If we require VM_PFNMAP, for ordinary page mappings, we also need to
> > > > > disallow COW mappings, since it will not work on architectures that
> > > > > don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
> > > > Hm I figured everyone just uses MAP_SHARED for buffer objects since
> > > > COW really makes absolutely no sense. How would we enforce this?
> > > Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
> > > or allowing MIXEDMAP.
> > > 
> > > > > Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
> > > > > possible performance implications with x86 + PAT + VM_PFNMAP + normal
> > > > > pages. That's a very old comment, though, and might not be valid anymore.
> > > > I think that's why ttm has a page cache for these, because it indeed
> > > > sucks. The PAT changes on pages are rather expensive.
> > > IIRC the page cache was implemented because of the slowness of the
> > > caching mode transition itself, more specifically the wbinvd() call +
> > > global TLB flush.
> 
> Yes, exactly that. The global TLB flush is what really breaks our neck here
> from a performance perspective.
> 
> > > > There is still an issue for iomem mappings, because the PAT validation
> > > > does a linear walk of the resource tree (lol) for every vm_insert_pfn.
> > > > But for i915 at least this is fixed by using the io_mapping
> > > > infrastructure, which does the PAT reservation only once when you set
> > > > up the mapping area at driver load.
> > > Yes, I guess that was the issue that the comment describes, but the
> > > issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
> > > 
> > > > Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> > > > problem that hurts much :-)
> > > Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP?
> > > 
> > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
> > Uh that's bad, because mixed maps pointing at struct page wont stop
> > gup. At least afaik.
> 
> Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have
> already seen tons of problems with the page cache.

On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL, vm_insert_mixed
boils down to vm_insert_pfn as far as gup is concerned, and the special pte
stops the gup fast path.

But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how
you're stopping the gup slow path. See check_vma_flags() in mm/gup.c.

Also if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL then I don't think
vm_insert_mixed even works on iomem pfns. There's the devmap exception,
but we're not devmap. Worse, ttm abuses some accidental codepath to smuggle
in hugepte support by intentionally not being devmap.

So I'm really not sure this works as we think it should. It might be good
to write a quick test program on amdgpu with a buffer in system memory only
and try to do direct I/O into it. If it works, you have a problem, and a
bad one.
-Daniel

> 
> Regards,
> Christian.
> 
> > Christian, do we need to patch this up, and maybe fix up ttm fault
> > handler to use io_mapping so the vm_insert_pfn stuff is fast?
> > -Daniel
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread
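Something along these lines could serve as that quick test from userspace
(a sketch only; dmabuf_fd is assumed to be a dma-buf exported by the driver
under test, e.g. an amdgpu BO placed in system memory, path should point at
a file on a filesystem that supports O_DIRECT, and size should be a multiple
of the filesystem block size):

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * O_DIRECT reads pin the destination pages with get_user_pages(), so on a
 * correctly VM_PFNMAP'd dma-buf mapping this read should fail with EFAULT.
 * If it succeeds, gup got through and you have the problem described above.
 */
int test_gup_into_dmabuf(int dmabuf_fd, size_t size, const char *path)
{
	void *map;
	int file_fd;
	ssize_t ret;

	map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
		   dmabuf_fd, 0);
	if (map == MAP_FAILED)
		return -errno;

	file_fd = open(path, O_RDONLY | O_DIRECT);
	if (file_fd < 0) {
		munmap(map, size);
		return -errno;
	}

	ret = read(file_fd, map, size);
	printf("direct read into dma-buf mmap: %zd (%s)\n", ret,
	       ret < 0 ? strerror(errno) : "gup succeeded - bad");

	close(file_fd);
	munmap(map, size);
	return ret < 0 ? 0 : -EINVAL;
}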

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-25 10:30         ` Christian König
  (?)
@ 2021-02-25 10:45           ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-25 10:45 UTC (permalink / raw)
  To: Christian König
  Cc: Jason Gunthorpe, Daniel Vetter, Thomas Hellström (Intel),
	DRI Development, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, John Stultz,
	Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Feb 25, 2021 at 11:30:23AM +0100, Christian König wrote:
> 
> 
> Am 24.02.21 um 19:46 schrieb Jason Gunthorpe:
> > On Wed, Feb 24, 2021 at 09:45:51AM +0100, Daniel Vetter wrote:
> > 
> > > Hm I figured everyone just uses MAP_SHARED for buffer objects since
> > > COW really makes absolutely no sense. How would we enforce this?
> > In RDMA we test
> > 
> > drivers/infiniband/core/ib_core_uverbs.c:       if (!(vma->vm_flags & VM_SHARED))
> > 
> > During mmap to reject use of MAP_PRIVATE on BAR pages.
> 
> That's a really good idea. MAP_PRIVATE doesn't really work at all with
> driver mappings.

Yeah I feel like this is the next patch we need to add to this little
series of locking down dma-buf mmap semantics. We should probably also push
these checks into the drm gem mmap code (and maybe ttm can switch over to
that, it's really the same).

One at a time.
-Daniel
> 
> Christian.
> 
> > 
> > Jason
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3)
  2021-02-23 10:59 ` Daniel Vetter
                   ` (6 preceding siblings ...)
  (?)
@ 2021-02-25 11:19 ` Patchwork
  -1 siblings, 0 replies; 110+ messages in thread
From: Patchwork @ 2021-02-25 11:19 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx



== Series Details ==

Series: series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3)
URL   : https://patchwork.freedesktop.org/series/87313/
State : failure

== Summary ==

CI Bug Log - changes from CI_DRM_9804 -> Patchwork_19728
====================================================

Summary
-------

  **FAILURE**

  Serious unknown changes coming with Patchwork_19728 absolutely need to be
  verified manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_19728, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/index.html

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_19728:

### IGT changes ###

#### Possible regressions ####

  * igt@prime_vgem@basic-fence-mmap:
    - fi-byt-j1900:       [PASS][1] -> [FAIL][2] +3 similar issues
   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-byt-j1900/igt@prime_vgem@basic-fence-mmap.html
   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-byt-j1900/igt@prime_vgem@basic-fence-mmap.html

  * igt@prime_vgem@basic-fence-read:
    - fi-bsw-kefka:       [PASS][3] -> [INCOMPLETE][4] +1 similar issue
   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bsw-kefka/igt@prime_vgem@basic-fence-read.html
   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-kefka/igt@prime_vgem@basic-fence-read.html
    - fi-ilk-650:         [PASS][5] -> [INCOMPLETE][6] +1 similar issue
   [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ilk-650/igt@prime_vgem@basic-fence-read.html
   [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ilk-650/igt@prime_vgem@basic-fence-read.html
    - fi-byt-j1900:       [PASS][7] -> [INCOMPLETE][8]
   [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-byt-j1900/igt@prime_vgem@basic-fence-read.html
   [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-byt-j1900/igt@prime_vgem@basic-fence-read.html

  * igt@prime_vgem@basic-gtt:
    - fi-ilk-650:         [PASS][9] -> [FAIL][10] +3 similar issues
   [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ilk-650/igt@prime_vgem@basic-gtt.html
   [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ilk-650/igt@prime_vgem@basic-gtt.html
    - fi-elk-e7500:       [PASS][11] -> [FAIL][12] +5 similar issues
   [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-elk-e7500/igt@prime_vgem@basic-gtt.html
   [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-elk-e7500/igt@prime_vgem@basic-gtt.html

  * igt@prime_vgem@basic-read:
    - fi-bsw-nick:        [PASS][13] -> [FAIL][14] +4 similar issues
   [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bsw-nick/igt@prime_vgem@basic-read.html
   [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-nick/igt@prime_vgem@basic-read.html

  * igt@prime_vgem@basic-write:
    - fi-pnv-d510:        [PASS][15] -> [FAIL][16] +2 similar issues
   [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-pnv-d510/igt@prime_vgem@basic-write.html
   [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-pnv-d510/igt@prime_vgem@basic-write.html

  * igt@runner@aborted:
    - fi-ilk-650:         NOTRUN -> [FAIL][17]
   [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ilk-650/igt@runner@aborted.html
    - fi-kbl-x1275:       NOTRUN -> [FAIL][18]
   [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-x1275/igt@runner@aborted.html
    - fi-bsw-kefka:       NOTRUN -> [FAIL][19]
   [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-kefka/igt@runner@aborted.html
    - fi-cfl-8700k:       NOTRUN -> [FAIL][20]
   [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8700k/igt@runner@aborted.html
    - fi-tgl-y:           NOTRUN -> [FAIL][21]
   [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-y/igt@runner@aborted.html
    - fi-skl-6600u:       NOTRUN -> [FAIL][22]
   [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6600u/igt@runner@aborted.html
    - fi-cfl-8109u:       NOTRUN -> [FAIL][23]
   [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8109u/igt@runner@aborted.html
    - fi-bsw-nick:        NOTRUN -> [FAIL][24]
   [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-nick/igt@runner@aborted.html
    - fi-snb-2520m:       NOTRUN -> [FAIL][25]
   [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2520m/igt@runner@aborted.html
    - fi-kbl-soraka:      NOTRUN -> [FAIL][26]
   [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-soraka/igt@runner@aborted.html
    - fi-kbl-7500u:       NOTRUN -> [FAIL][27]
   [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-7500u/igt@runner@aborted.html
    - fi-kbl-guc:         NOTRUN -> [FAIL][28]
   [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-guc/igt@runner@aborted.html
    - fi-cml-u2:          NOTRUN -> [FAIL][29]
   [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-u2/igt@runner@aborted.html
    - fi-ivb-3770:        NOTRUN -> [FAIL][30]
   [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ivb-3770/igt@runner@aborted.html
    - fi-bxt-dsi:         NOTRUN -> [FAIL][31]
   [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bxt-dsi/igt@runner@aborted.html
    - fi-elk-e7500:       NOTRUN -> [FAIL][32]
   [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-elk-e7500/igt@runner@aborted.html
    - fi-cml-s:           NOTRUN -> [FAIL][33]
   [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-s/igt@runner@aborted.html
    - fi-cfl-guc:         NOTRUN -> [FAIL][34]
   [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-guc/igt@runner@aborted.html
    - fi-skl-guc:         NOTRUN -> [FAIL][35]
   [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-guc/igt@runner@aborted.html
    - fi-skl-6700k2:      NOTRUN -> [FAIL][36]
   [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6700k2/igt@runner@aborted.html
    - fi-tgl-u2:          NOTRUN -> [FAIL][37]
   [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-u2/igt@runner@aborted.html

  * igt@vgem_basic@create:
    - fi-skl-6700k2:      [PASS][38] -> [FAIL][39]
   [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-6700k2/igt@vgem_basic@create.html
   [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6700k2/igt@vgem_basic@create.html
    - fi-glk-dsi:         [PASS][40] -> [FAIL][41]
   [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-glk-dsi/igt@vgem_basic@create.html
   [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-glk-dsi/igt@vgem_basic@create.html
    - fi-kbl-x1275:       [PASS][42] -> [FAIL][43]
   [42]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-x1275/igt@vgem_basic@create.html
   [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-x1275/igt@vgem_basic@create.html
    - fi-bsw-kefka:       [PASS][44] -> [FAIL][45] +3 similar issues
   [44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bsw-kefka/igt@vgem_basic@create.html
   [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-kefka/igt@vgem_basic@create.html
    - fi-snb-2600:        [PASS][46] -> [FAIL][47]
   [46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-snb-2600/igt@vgem_basic@create.html
   [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2600/igt@vgem_basic@create.html
    - fi-bdw-5557u:       [PASS][48] -> [FAIL][49]
   [48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bdw-5557u/igt@vgem_basic@create.html
   [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bdw-5557u/igt@vgem_basic@create.html
    - fi-tgl-y:           [PASS][50] -> [FAIL][51]
   [50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-y/igt@vgem_basic@create.html
   [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-y/igt@vgem_basic@create.html
    - fi-skl-guc:         [PASS][52] -> [FAIL][53]
   [52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-guc/igt@vgem_basic@create.html
   [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-guc/igt@vgem_basic@create.html
    - fi-cfl-8109u:       [PASS][54] -> [FAIL][55]
   [54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-8109u/igt@vgem_basic@create.html
   [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8109u/igt@vgem_basic@create.html
    - fi-kbl-7500u:       [PASS][56] -> [FAIL][57]
   [56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-7500u/igt@vgem_basic@create.html
   [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-7500u/igt@vgem_basic@create.html
    - fi-kbl-guc:         [PASS][58] -> [FAIL][59]
   [58]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-guc/igt@vgem_basic@create.html
   [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-guc/igt@vgem_basic@create.html
    - fi-cml-u2:          [PASS][60] -> [FAIL][61]
   [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cml-u2/igt@vgem_basic@create.html
   [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-u2/igt@vgem_basic@create.html
    - fi-cfl-8700k:       [PASS][62] -> [FAIL][63]
   [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-8700k/igt@vgem_basic@create.html
   [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8700k/igt@vgem_basic@create.html
    - fi-bxt-dsi:         [PASS][64] -> [FAIL][65]
   [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bxt-dsi/igt@vgem_basic@create.html
   [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bxt-dsi/igt@vgem_basic@create.html
    - fi-hsw-4770:        [PASS][66] -> [FAIL][67]
   [66]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-hsw-4770/igt@vgem_basic@create.html
   [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-hsw-4770/igt@vgem_basic@create.html
    - fi-snb-2520m:       [PASS][68] -> [FAIL][69]
   [68]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-snb-2520m/igt@vgem_basic@create.html
   [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2520m/igt@vgem_basic@create.html
    - fi-cml-s:           [PASS][70] -> [FAIL][71]
   [70]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cml-s/igt@vgem_basic@create.html
   [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-s/igt@vgem_basic@create.html
    - fi-cfl-guc:         [PASS][72] -> [FAIL][73]
   [72]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-guc/igt@vgem_basic@create.html
   [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-guc/igt@vgem_basic@create.html
    - fi-kbl-soraka:      [PASS][74] -> [FAIL][75]
   [74]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-soraka/igt@vgem_basic@create.html
   [75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-soraka/igt@vgem_basic@create.html
    - fi-tgl-u2:          [PASS][76] -> [FAIL][77]
   [76]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-u2/igt@vgem_basic@create.html
   [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-u2/igt@vgem_basic@create.html
    - fi-skl-6600u:       [PASS][78] -> [FAIL][79]
   [78]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-6600u/igt@vgem_basic@create.html
   [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6600u/igt@vgem_basic@create.html
    - fi-ivb-3770:        [PASS][80] -> [FAIL][81]
   [80]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ivb-3770/igt@vgem_basic@create.html
   [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ivb-3770/igt@vgem_basic@create.html

  * igt@vgem_basic@dmabuf-mmap:
    - fi-ivb-3770:        [PASS][82] -> [DMESG-WARN][83]
   [82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ivb-3770/igt@vgem_basic@dmabuf-mmap.html
   [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ivb-3770/igt@vgem_basic@dmabuf-mmap.html
    - fi-glk-dsi:         [PASS][84] -> [DMESG-WARN][85]
   [84]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-glk-dsi/igt@vgem_basic@dmabuf-mmap.html
   [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-glk-dsi/igt@vgem_basic@dmabuf-mmap.html
    - fi-kbl-soraka:      [PASS][86] -> [DMESG-WARN][87]
   [86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-soraka/igt@vgem_basic@dmabuf-mmap.html
   [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-soraka/igt@vgem_basic@dmabuf-mmap.html
    - fi-elk-e7500:       [PASS][88] -> [DMESG-WARN][89]
   [88]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-elk-e7500/igt@vgem_basic@dmabuf-mmap.html
   [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-elk-e7500/igt@vgem_basic@dmabuf-mmap.html
    - fi-skl-6700k2:      [PASS][90] -> [DMESG-WARN][91]
   [90]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-6700k2/igt@vgem_basic@dmabuf-mmap.html
   [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6700k2/igt@vgem_basic@dmabuf-mmap.html
    - fi-cml-s:           [PASS][92] -> [DMESG-WARN][93]
   [92]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cml-s/igt@vgem_basic@dmabuf-mmap.html
   [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-s/igt@vgem_basic@dmabuf-mmap.html
    - fi-cfl-guc:         [PASS][94] -> [DMESG-WARN][95]
   [94]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-guc/igt@vgem_basic@dmabuf-mmap.html
   [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-guc/igt@vgem_basic@dmabuf-mmap.html
    - fi-hsw-4770:        [PASS][96] -> [DMESG-WARN][97]
   [96]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-hsw-4770/igt@vgem_basic@dmabuf-mmap.html
   [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-hsw-4770/igt@vgem_basic@dmabuf-mmap.html
    - fi-ilk-650:         [PASS][98] -> [DMESG-WARN][99]
   [98]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ilk-650/igt@vgem_basic@dmabuf-mmap.html
   [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ilk-650/igt@vgem_basic@dmabuf-mmap.html
    - fi-tgl-u2:          [PASS][100] -> [DMESG-WARN][101]
   [100]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-u2/igt@vgem_basic@dmabuf-mmap.html
   [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-u2/igt@vgem_basic@dmabuf-mmap.html
    - fi-byt-j1900:       [PASS][102] -> [DMESG-WARN][103]
   [102]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-byt-j1900/igt@vgem_basic@dmabuf-mmap.html
   [103]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-byt-j1900/igt@vgem_basic@dmabuf-mmap.html
    - fi-pnv-d510:        [PASS][104] -> [DMESG-WARN][105]
   [104]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-pnv-d510/igt@vgem_basic@dmabuf-mmap.html
   [105]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-pnv-d510/igt@vgem_basic@dmabuf-mmap.html
    - fi-cml-u2:          [PASS][106] -> [DMESG-WARN][107]
   [106]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cml-u2/igt@vgem_basic@dmabuf-mmap.html
   [107]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-u2/igt@vgem_basic@dmabuf-mmap.html
    - fi-skl-6600u:       [PASS][108] -> [DMESG-WARN][109]
   [108]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-6600u/igt@vgem_basic@dmabuf-mmap.html
   [109]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6600u/igt@vgem_basic@dmabuf-mmap.html
    - fi-bxt-dsi:         [PASS][110] -> [DMESG-WARN][111]
   [110]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bxt-dsi/igt@vgem_basic@dmabuf-mmap.html
   [111]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bxt-dsi/igt@vgem_basic@dmabuf-mmap.html
    - fi-cfl-8700k:       [PASS][112] -> [DMESG-WARN][113]
   [112]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-8700k/igt@vgem_basic@dmabuf-mmap.html
   [113]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8700k/igt@vgem_basic@dmabuf-mmap.html
    - fi-snb-2520m:       [PASS][114] -> [DMESG-WARN][115]
   [114]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-snb-2520m/igt@vgem_basic@dmabuf-mmap.html
   [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2520m/igt@vgem_basic@dmabuf-mmap.html
    - fi-cfl-8109u:       [PASS][116] -> [DMESG-WARN][117]
   [116]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-8109u/igt@vgem_basic@dmabuf-mmap.html
   [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8109u/igt@vgem_basic@dmabuf-mmap.html
    - fi-bdw-5557u:       [PASS][118] -> [DMESG-WARN][119]
   [118]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bdw-5557u/igt@vgem_basic@dmabuf-mmap.html
   [119]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bdw-5557u/igt@vgem_basic@dmabuf-mmap.html
    - fi-bsw-nick:        [PASS][120] -> [DMESG-WARN][121]
   [120]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bsw-nick/igt@vgem_basic@dmabuf-mmap.html
   [121]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-nick/igt@vgem_basic@dmabuf-mmap.html
    - fi-skl-guc:         [PASS][122] -> [DMESG-WARN][123]
   [122]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-guc/igt@vgem_basic@dmabuf-mmap.html
   [123]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-guc/igt@vgem_basic@dmabuf-mmap.html
    - fi-bsw-kefka:       [PASS][124] -> [DMESG-WARN][125]
   [124]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bsw-kefka/igt@vgem_basic@dmabuf-mmap.html
   [125]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-kefka/igt@vgem_basic@dmabuf-mmap.html
    - fi-kbl-guc:         [PASS][126] -> [DMESG-WARN][127]
   [126]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-guc/igt@vgem_basic@dmabuf-mmap.html
   [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-guc/igt@vgem_basic@dmabuf-mmap.html
    - fi-kbl-7500u:       [PASS][128] -> [DMESG-WARN][129]
   [128]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-7500u/igt@vgem_basic@dmabuf-mmap.html
   [129]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-7500u/igt@vgem_basic@dmabuf-mmap.html
    - fi-tgl-y:           [PASS][130] -> [DMESG-WARN][131]
   [130]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-y/igt@vgem_basic@dmabuf-mmap.html
   [131]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-y/igt@vgem_basic@dmabuf-mmap.html
    - fi-snb-2600:        [PASS][132] -> [DMESG-WARN][133]
   [132]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-snb-2600/igt@vgem_basic@dmabuf-mmap.html
   [133]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2600/igt@vgem_basic@dmabuf-mmap.html

  
#### Suppressed ####

  The following results come from untrusted machines, tests, or statuses.
  They do not affect the overall result.

  * igt@runner@aborted:
    - {fi-rkl-11500t}:    NOTRUN -> [FAIL][134]
   [134]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-rkl-11500t/igt@runner@aborted.html
    - {fi-tgl-dsi}:       NOTRUN -> [FAIL][135]
   [135]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-dsi/igt@runner@aborted.html
    - {fi-jsl-1}:         NOTRUN -> [FAIL][136]
   [136]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-jsl-1/igt@runner@aborted.html

  * igt@vgem_basic@create:
    - {fi-rkl-11500t}:    [PASS][137] -> [FAIL][138]
   [137]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-rkl-11500t/igt@vgem_basic@create.html
   [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-rkl-11500t/igt@vgem_basic@create.html
    - {fi-ehl-2}:         NOTRUN -> [FAIL][139] +1 similar issue
   [139]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ehl-2/igt@vgem_basic@create.html
    - {fi-jsl-1}:         [PASS][140] -> [FAIL][141]
   [140]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-jsl-1/igt@vgem_basic@create.html
   [141]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-jsl-1/igt@vgem_basic@create.html
    - {fi-tgl-dsi}:       [PASS][142] -> [FAIL][143]
   [142]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-dsi/igt@vgem_basic@create.html
   [143]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-dsi/igt@vgem_basic@create.html
    - {fi-hsw-gt1}:       [PASS][144] -> [FAIL][145]
   [144]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-hsw-gt1/igt@vgem_basic@create.html
   [145]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-hsw-gt1/igt@vgem_basic@create.html
    - {fi-ehl-1}:         [PASS][146] -> [FAIL][147]
   [146]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ehl-1/igt@vgem_basic@create.html
   [147]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ehl-1/igt@vgem_basic@create.html

  * igt@vgem_basic@dmabuf-mmap:
    - {fi-ehl-1}:         [PASS][148] -> [DMESG-WARN][149]
   [148]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ehl-1/igt@vgem_basic@dmabuf-mmap.html
   [149]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ehl-1/igt@vgem_basic@dmabuf-mmap.html
    - {fi-jsl-1}:         [PASS][150] -> [DMESG-WARN][151]
   [150]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-jsl-1/igt@vgem_basic@dmabuf-mmap.html
   [151]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-jsl-1/igt@vgem_basic@dmabuf-mmap.html
    - {fi-hsw-gt1}:       [PASS][152] -> [DMESG-WARN][153]
   [152]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-hsw-gt1/igt@vgem_basic@dmabuf-mmap.html
   [153]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-hsw-gt1/igt@vgem_basic@dmabuf-mmap.html
    - {fi-tgl-dsi}:       [PASS][154] -> [DMESG-WARN][155]
   [154]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-dsi/igt@vgem_basic@dmabuf-mmap.html
   [155]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-dsi/igt@vgem_basic@dmabuf-mmap.html
    - {fi-ehl-2}:         NOTRUN -> [DMESG-WARN][156]
   [156]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ehl-2/igt@vgem_basic@dmabuf-mmap.html
    - {fi-rkl-11500t}:    [PASS][157] -> [DMESG-WARN][158]
   [157]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-rkl-11500t/igt@vgem_basic@dmabuf-mmap.html
   [158]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-rkl-11500t/igt@vgem_basic@dmabuf-mmap.html

  
Known issues
------------

  Here are the changes found in Patchwork_19728 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@prime_self_import@basic-with_one_bo_two_files:
    - fi-tgl-y:           [PASS][159] -> [DMESG-WARN][160] ([i915#402])
   [159]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-y/igt@prime_self_import@basic-with_one_bo_two_files.html
   [160]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-y/igt@prime_self_import@basic-with_one_bo_two_files.html

  * igt@runner@aborted:
    - fi-pnv-d510:        NOTRUN -> [FAIL][161] ([i915#2403] / [i915#2505])
   [161]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-pnv-d510/igt@runner@aborted.html
    - fi-glk-dsi:         NOTRUN -> [FAIL][162] ([k.org#202321])
   [162]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-glk-dsi/igt@runner@aborted.html
    - fi-bdw-5557u:       NOTRUN -> [FAIL][163] ([i915#2369])
   [163]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bdw-5557u/igt@runner@aborted.html
    - fi-hsw-4770:        NOTRUN -> [FAIL][164] ([i915#2505])
   [164]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-hsw-4770/igt@runner@aborted.html
    - fi-snb-2600:        NOTRUN -> [FAIL][165] ([i915#698])
   [165]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2600/igt@runner@aborted.html
    - fi-byt-j1900:       NOTRUN -> [FAIL][166] ([i915#2505])
   [166]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-byt-j1900/igt@runner@aborted.html

  
#### Possible fixes ####

  * igt@prime_vgem@basic-fence-flip:
    - fi-tgl-y:           [DMESG-WARN][167] ([i915#402]) -> [PASS][168]
   [167]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-y/igt@prime_vgem@basic-fence-flip.html
   [168]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-y/igt@prime_vgem@basic-fence-flip.html

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285
  [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827
  [i915#1222]: https://gitlab.freedesktop.org/drm/intel/issues/1222
  [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190
  [i915#2369]: https://gitlab.freedesktop.org/drm/intel/issues/2369
  [i915#2403]: https://gitlab.freedesktop.org/drm/intel/issues/2403
  [i915#2505]: https://gitlab.freedesktop.org/drm/intel/issues/2505
  [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402
  [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533
  [i915#698]: https://gitlab.freedesktop.org/drm/intel/issues/698
  [k.org#202321]: https://bugzilla.kernel.org/show_bug.cgi?id=202321


Participating hosts (42 -> 38)
------------------------------

  Additional (1): fi-ehl-2 
  Missing    (5): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 fi-bdw-samus 


Build changes
-------------

  * Linux: CI_DRM_9804 -> Patchwork_19728

  CI-20190529: 20190529
  CI_DRM_9804: 0ed1d18cdc37ecf5e07f009a9788ea9ad74677a8 @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_6015: aa44cddf4ef689f8a3726fcbeedc03f08b12bd82 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_19728: 93fc58ee63d1e8a1289b265f4d6b75a18b222945 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

93fc58ee63d1 drm/vgem: use shmem helpers
b71cc38b23b9 dma-buf: Require VM_PFNMAP vma for mmap

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/index.html

[-- Attachment #1.2: Type: text/html, Size: 28700 bytes --]

[-- Attachment #2: Type: text/plain, Size: 160 bytes --]

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-25 10:44             ` Daniel Vetter
  (?)
@ 2021-02-25 15:49               ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-25 15:49 UTC (permalink / raw)
  To: Christian König
  Cc: Thomas Hellström (Intel),
	Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>
> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote:
> > Am 24.02.21 um 10:31 schrieb Daniel Vetter:
> > > On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
> > > <thomas_os@shipmail.org> wrote:
> > > >
> > > > On 2/24/21 9:45 AM, Daniel Vetter wrote:
> > > > > On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
> > > > > <thomas_os@shipmail.org> wrote:
> > > > > > On 2/23/21 11:59 AM, Daniel Vetter wrote:
> > > > > > > tldr; DMA buffers aren't normal memory, expecting that you can use
> > > > > > > them like that (like calling get_user_pages works, or that they're
> > > > > > > accounting like any other normal memory) cannot be guaranteed.
> > > > > > >
> > > > > > > Since some userspace only runs on integrated devices, where all
> > > > > > > buffers are actually all resident system memory, there's a huge
> > > > > > > temptation to assume that a struct page is always present and useable
> > > > > > > like for any more pagecache backed mmap. This has the potential to
> > > > > > > result in a uapi nightmare.
> > > > > > >
> > > > > > > To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> > > > > > > blocks get_user_pages and all the other struct page based
> > > > > > > infrastructure for everyone. In spirit this is the uapi counterpart to
> > > > > > > the kernel-internal CONFIG_DMABUF_DEBUG.
> > > > > > >
> > > > > > > Motivated by a recent patch which wanted to swich the system dma-buf
> > > > > > > heap to vm_insert_page instead of vm_insert_pfn.
> > > > > > >
> > > > > > > v2:
> > > > > > >
> > > > > > > Jason brought up that we also want to guarantee that all ptes have the
> > > > > > > pte_special flag set, to catch fast get_user_pages (on architectures
> > > > > > > that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> > > > > > > still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> > > > > > >
> > > > > > >    From auditing the various functions to insert pfn pte entires
> > > > > > > (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> > > > > > > dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> > > > > > > this should be the correct flag to check for.
> > > > > > >
> > > > > > If we require VM_PFNMAP, for ordinary page mappings, we also need to
> > > > > > disallow COW mappings, since it will not work on architectures that
> > > > > > don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
> > > > > Hm I figured everyone just uses MAP_SHARED for buffer objects since
> > > > > COW really makes absolutely no sense. How would we enforce this?
> > > > Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
> > > > or allowing MIXEDMAP.
> > > >
> > > > > > Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
> > > > > > possible performance implications with x86 + PAT + VM_PFNMAP + normal
> > > > > > pages. That's a very old comment, though, and might not be valid anymore.
> > > > > I think that's why ttm has a page cache for these, because it indeed
> > > > > sucks. The PAT changes on pages are rather expensive.
> > > > IIRC the page cache was implemented because of the slowness of the
> > > > caching mode transition itself, more specifically the wbinvd() call +
> > > > global TLB flush.
> >
> > Yes, exactly that. The global TLB flush is what really breaks our neck here
> > from a performance perspective.
> >
> > > > > There is still an issue for iomem mappings, because the PAT validation
> > > > > does a linear walk of the resource tree (lol) for every vm_insert_pfn.
> > > > > But for i915 at least this is fixed by using the io_mapping
> > > > > infrastructure, which does the PAT reservation only once when you set
> > > > > up the mapping area at driver load.
> > > > Yes, I guess that was the issue that the comment describes, but the
> > > > issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
> > > >
> > > > > Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> > > > > problem that hurts much :-)
> > > > Hmm, both 5.11 and drm-tip appear to still use MIXEDMAP?
> > > >
> > > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
> > > Uh that's bad, because mixed maps pointing at struct page won't stop
> > > gup. At least afaik.
> >
> > Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have
> > already seen tons of problems with the page cache.
>
> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL vm_insert_mixed
> boils down to vm_insert_pfn wrt gup. And special pte stops gup fast path.
>
> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how
> you're stopping gup slow path. See check_vma_flags() in mm/gup.c.
>
> Also if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL then I don't think
> vm_insert_mixed even works on iomem pfns. There's the devmap exception,
> but we're not devmap. Worse ttm abuses some accidental codepath to smuggle
> in hugepte support by intentionally not being devmap.
>
> So I'm really not sure this works as we think it should. Maybe good to do
> a quick test program on amdgpu with a buffer in system memory only and try
> to do direct io into it. If it works, you have a problem, and a bad one.

That's probably impossible, since a quick git grep shows that pretty
much anything reasonable has special ptes: arc, arm, arm64, powerpc,
riscv, s390, sh, sparc, x86. I don't think you'll have a platform
where you can plug an amdgpu in and actually exercise the bug :-)

So maybe we should just switch over to VM_PFNMAP for ttm for more clarity?
-Daniel


>
> >
> > Regards,
> > Christian.
> >
> > > Christian, do we need to patch this up, and maybe fix up ttm fault
> > > handler to use io_mapping so the vm_insert_pfn stuff is fast?
> > > -Daniel
> >
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread
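
The kind of quick probe suggested above could look roughly like the sketch
below: allocate from the system dma-buf heap, mmap the buffer, then point an
O_DIRECT read at the mapping so that direct I/O has to pin it with
get_user_pages. The heap path, file handling, sizes and the expected errno
are assumptions for illustration only; nothing in the thread pins them down.

/*
 * Minimal sketch (assumptions: the system dma-buf heap is available and the
 * file passed as argv[1] lives on a filesystem that supports O_DIRECT).
 * Direct I/O pins the destination pages via get_user_pages(), so a read()
 * into a VM_PFNMAP dma-buf mapping is expected to fail with EFAULT; if it
 * succeeds, gup was not blocked.
 */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/dma-heap.h>

int main(int argc, char **argv)
{
	struct dma_heap_allocation_data alloc = {
		.len = 1 << 20,
		.fd_flags = O_RDWR | O_CLOEXEC,
	};
	int heap, file;
	void *map;

	if (argc < 2)
		return 1;

	heap = open("/dev/dma_heap/system", O_RDWR);
	file = open(argv[1], O_RDONLY | O_DIRECT);
	if (heap < 0 || file < 0 ||
	    ioctl(heap, DMA_HEAP_IOCTL_ALLOC, &alloc) < 0)
		return 1;

	map = mmap(NULL, alloc.len, PROT_READ | PROT_WRITE, MAP_SHARED,
		   alloc.fd, 0);
	if (map == MAP_FAILED)
		return 1;

	/* The direct-I/O read into the dma-buf mapping is what exercises gup. */
	if (read(file, map, 4096) < 0)
		perror("read");		/* EFAULT expected if gup is blocked */
	else
		printf("direct I/O into the dma-buf mapping succeeded\n");

	return 0;
}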

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-25 15:49               ` Daniel Vetter
  (?)
@ 2021-02-25 16:53                 ` Christian König
  -1 siblings, 0 replies; 110+ messages in thread
From: Christian König @ 2021-02-25 16:53 UTC (permalink / raw)
  To: Daniel Vetter, Christian König
  Cc: Thomas Hellström (Intel),
	Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK



Am 25.02.21 um 16:49 schrieb Daniel Vetter:
> On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote:
>>> Am 24.02.21 um 10:31 schrieb Daniel Vetter:
>>>> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
>>>> <thomas_os@shipmail.org> wrote:
>>>>> On 2/24/21 9:45 AM, Daniel Vetter wrote:
>>>>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
>>>>>>>> tldr; DMA buffers aren't normal memory, expecting that you can use
>>>>>>>> them like that (like calling get_user_pages works, or that they're
>>>>>>>> accounting like any other normal memory) cannot be guaranteed.
>>>>>>>>
>>>>>>>> Since some userspace only runs on integrated devices, where all
>>>>>>>> buffers are actually all resident system memory, there's a huge
>>>>>>>> temptation to assume that a struct page is always present and useable
>>>>>>>> like for any more pagecache backed mmap. This has the potential to
>>>>>>>> result in a uapi nightmare.
>>>>>>>>
>>>>>>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
>>>>>>>> blocks get_user_pages and all the other struct page based
>>>>>>>> infrastructure for everyone. In spirit this is the uapi counterpart to
>>>>>>>> the kernel-internal CONFIG_DMABUF_DEBUG.
>>>>>>>>
>>>>>>>> Motivated by a recent patch which wanted to swich the system dma-buf
>>>>>>>> heap to vm_insert_page instead of vm_insert_pfn.
>>>>>>>>
>>>>>>>> v2:
>>>>>>>>
>>>>>>>> Jason brought up that we also want to guarantee that all ptes have the
>>>>>>>> pte_special flag set, to catch fast get_user_pages (on architectures
>>>>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
>>>>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>>>>>>>>
>>>>>>>>     From auditing the various functions to insert pfn pte entires
>>>>>>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
>>>>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
>>>>>>>> this should be the correct flag to check for.
>>>>>>>>
>>>>>>> If we require VM_PFNMAP, for ordinary page mappings, we also need to
>>>>>>> disallow COW mappings, since it will not work on architectures that
>>>>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
>>>>>> Hm I figured everyone just uses MAP_SHARED for buffer objects since
>>>>>> COW really makes absolutely no sense. How would we enforce this?
>>>>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
>>>>> or allowing MIXEDMAP.
>>>>>
>>>>>>> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
>>>>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
>>>>>>> pages. That's a very old comment, though, and might not be valid anymore.
>>>>>> I think that's why ttm has a page cache for these, because it indeed
>>>>>> sucks. The PAT changes on pages are rather expensive.
>>>>> IIRC the page cache was implemented because of the slowness of the
>>>>> caching mode transition itself, more specifically the wbinvd() call +
>>>>> global TLB flush.
>>> Yes, exactly that. The global TLB flush is what really breaks our neck here
>>> from a performance perspective.
>>>
>>>>>> There is still an issue for iomem mappings, because the PAT validation
>>>>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
>>>>>> But for i915 at least this is fixed by using the io_mapping
>>>>>> infrastructure, which does the PAT reservation only once when you set
>>>>>> up the mapping area at driver load.
>>>>> Yes, I guess that was the issue that the comment describes, but the
>>>>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
>>>>>
>>>>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
>>>>>> problem that hurts much :-)
>>>>> Hmm, both 5.11 and drm-tip appear to still use MIXEDMAP?
>>>>>
>>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
>>>> Uh that's bad, because mixed maps pointing at struct page won't stop
>>>> gup. At least afaik.
>>> Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have
>>> already seen tons of problems with the page cache.
>> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL vm_insert_mixed
>> boils down to vm_insert_pfn wrt gup. And special pte stops gup fast path.
>>
>> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how
>> you're stopping gup slow path. See check_vma_flags() in mm/gup.c.
>>
>> Also if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL then I don't think
>> vm_insert_mixed even works on iomem pfns. There's the devmap exception,
>> but we're not devmap. Worse ttm abuses some accidental codepath to smuggle
>> in hugepte support by intentionally not being devmap.
>>
>> So I'm really not sure this works as we think it should. Maybe good to do
>> a quick test program on amdgpu with a buffer in system memory only and try
>> to do direct io into it. If it works, you have a problem, and a bad one.
> That's probably impossible, since a quick git grep shows that pretty
> much anything reasonable has special ptes: arc, arm, arm64, powerpc,
> riscv, s390, sh, sparc, x86. I don't think you'll have a platform
> where you can plug an amdgpu in and actually exercise the bug :-)
>
> So maybe we should just switch over to VM_PFNMAP for ttm for more clarity?

Maybe yes, but I'm not sure.

I once had a request to do this from some Google guys, but rejected
it because I wasn't sure of the consequences.

Christian.

> -Daniel
>
>
>>> Regards,
>>> Christian.
>>>
>>>> Christian, do we need to patch this up, and maybe fix up ttm fault
>>>> handler to use io_mapping so the vm_insert_pfn stuff is fast?
>>>> -Daniel
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch
>
>


^ permalink raw reply	[flat|nested] 110+ messages in thread
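
For illustration only, here is a schematic of what "switching over to
VM_PFNMAP" means at the fault-handler level. It is not TTM's actual code:
the lookup helper, the ops structure and the mmap function are invented for
the sketch, and only vmf_insert_pfn(), the .fault hook and the VM_* flags
are the real kernel interfaces being discussed.

#include <linux/fs.h>
#include <linux/mm.h>

/* Hypothetical: translate the faulting address into a backing pfn. */
static unsigned long sketch_lookup_pfn(struct vm_area_struct *vma,
				       unsigned long addr)
{
	return 0;
}

static vm_fault_t sketch_fault(struct vm_fault *vmf)
{
	struct vm_area_struct *vma = vmf->vma;
	unsigned long pfn = sketch_lookup_pfn(vma, vmf->address);

	/*
	 * With VM_MIXEDMAP this would be vmf_insert_mixed(); vmf_insert_pfn()
	 * installs a special pte (on architectures that support pte_special),
	 * which keeps the gup fast path away, while VM_PFNMAP on the vma
	 * blocks the gup slow path.
	 */
	return vmf_insert_pfn(vma, vmf->address, pfn);
}

static const struct vm_operations_struct sketch_vm_ops = {
	.fault = sketch_fault,
};

static int sketch_mmap(struct file *file, struct vm_area_struct *vma)
{
	/* The flag has to be set up front, at mmap time. */
	vma->vm_flags |= VM_PFNMAP | VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
	vma->vm_ops = &sketch_vm_ops;
	return 0;
}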

* Re: [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-23 10:59 ` Daniel Vetter
  (?)
@ 2021-02-26  3:57   ` John Stultz
  -1 siblings, 0 replies; 110+ messages in thread
From: John Stultz @ 2021-02-26  3:57 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: DRI Development, Intel Graphics Development,
	Christian König, Jason Gunthorpe, Suren Baghdasaryan,
	Matthew Wilcox, Daniel Vetter, Sumit Semwal, linux-media,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Hridya Valsaraju

On Tue, Feb 23, 2021 at 3:00 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>
> tldr; DMA buffers aren't normal memory, expecting that you can use
> them like that (like calling get_user_pages works, or that they're
> accounting like any other normal memory) cannot be guaranteed.
>
> Since some userspace only runs on integrated devices, where all
> buffers are actually all resident system memory, there's a huge
> temptation to assume that a struct page is always present and useable
> like for any more pagecache backed mmap. This has the potential to
> result in a uapi nightmare.
>
> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> blocks get_user_pages and all the other struct page based
> infrastructure for everyone. In spirit this is the uapi counterpart to
> the kernel-internal CONFIG_DMABUF_DEBUG.
>
> Motivated by a recent patch which wanted to swich the system dma-buf
> heap to vm_insert_page instead of vm_insert_pfn.
>
> v2:
>
> Jason brought up that we also want to guarantee that all ptes have the
> pte_special flag set, to catch fast get_user_pages (on architectures
> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>
> From auditing the various functions to insert pfn pte entires
> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> this should be the correct flag to check for.
>
> References: https://lore.kernel.org/lkml/CAKMK7uHi+mG0z0HUmNt13QCCvutuRVjpcR0NjRL12k-WbWzkRg@mail.gmail.com/
> Acked-by: Christian König <christian.koenig@amd.com>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: John Stultz <john.stultz@linaro.org>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Cc: linux-media@vger.kernel.org
> Cc: linaro-mm-sig@lists.linaro.org
> ---
>  drivers/dma-buf/dma-buf.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)


So I gave this a spin in a few of my environments, and with the
current dmabuf heaps it spews a lot of warnings.

I'm testing some simple fixes to add:
    vma->vm_flags |= VM_PFNMAP;

to the dmabuf heap mmap ops, which we might want to queue alongside this.

So assuming those can land together.
Acked-by: John Stultz <john.stultz@linaro.org>

thanks
-john

^ permalink raw reply	[flat|nested] 110+ messages in thread
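
A rough sketch of the kind of heap-side fix described above. The callback
name, the buffer layout and the per-page remap loop are placeholders rather
than the actual dma-buf heap implementation; only the
vma->vm_flags |= VM_PFNMAP line corresponds to the quoted change.

#include <linux/dma-buf.h>
#include <linux/mm.h>

/* Hypothetical buffer layout, standing in for whatever the heap really uses. */
struct sketch_heap_buffer {
	struct page **pages;
	unsigned long pagecount;
};

static int sketch_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
{
	struct sketch_heap_buffer *buffer = dmabuf->priv;
	unsigned long addr = vma->vm_start;
	unsigned long i;
	int ret;

	/* The one-line fix: mark the mapping VM_PFNMAP before populating it. */
	vma->vm_flags |= VM_PFNMAP;

	for (i = 0; i < buffer->pagecount && addr < vma->vm_end; i++) {
		ret = remap_pfn_range(vma, addr, page_to_pfn(buffer->pages[i]),
				      PAGE_SIZE, vma->vm_page_prot);
		if (ret)
			return ret;
		addr += PAGE_SIZE;
	}

	return 0;
}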

* Re: [PATCH] drm/vgem: use shmem helpers
  2021-02-25 10:23     ` [Intel-gfx] " Daniel Vetter
@ 2021-02-26  9:19       ` Thomas Zimmermann
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Zimmermann @ 2021-02-26  9:19 UTC (permalink / raw)
  To: Daniel Vetter, DRI Development
  Cc: Intel Graphics Development, Christian König, Melissa Wen,
	Daniel Vetter, Chris Wilson


Hi

Am 25.02.21 um 11:23 schrieb Daniel Vetter:
> Aside from deleting lots of code the real motivation here is to switch
> the mmap over to VM_PFNMAP, to be more consistent with what real gpu
> drivers do. They're all VM_PFNMP, which means get_user_pages doesn't
> work, and even if you try and there's a struct page behind that,
> touching it and mucking around with its refcount can upset drivers
> real bad.
> 
> v2: Review from Thomas:
> - sort #include
> - drop more dead code that I didn't spot somehow
> 
> v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)

Since you're working on it, could you move the config item into a 
Kconfig file under vgem?

Best regards
Thomas

> 
> Cc: Thomas Zimmermann <tzimmermann@suse.de>
> Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: "Christian König" <christian.koenig@amd.com>
> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> Cc: Melissa Wen <melissa.srw@gmail.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/Kconfig         |   1 +
>   drivers/gpu/drm/vgem/vgem_drv.c | 340 +-------------------------------
>   2 files changed, 4 insertions(+), 337 deletions(-)
> 
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 8e73311de583..94e4ac830283 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig"
>   config DRM_VGEM
>   	tristate "Virtual GEM provider"
>   	depends on DRM
> +	select DRM_GEM_SHMEM_HELPER
>   	help
>   	  Choose this option to get a virtual graphics memory manager,
>   	  as used by Mesa's software renderer for enhanced performance.
> diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
> index a0e75f1d5d01..b1b3a5ffc542 100644
> --- a/drivers/gpu/drm/vgem/vgem_drv.c
> +++ b/drivers/gpu/drm/vgem/vgem_drv.c
> @@ -38,6 +38,7 @@
>   
>   #include <drm/drm_drv.h>
>   #include <drm/drm_file.h>
> +#include <drm/drm_gem_shmem_helper.h>
>   #include <drm/drm_ioctl.h>
>   #include <drm/drm_managed.h>
>   #include <drm/drm_prime.h>
> @@ -50,87 +51,11 @@
>   #define DRIVER_MAJOR	1
>   #define DRIVER_MINOR	0
>   
> -static const struct drm_gem_object_funcs vgem_gem_object_funcs;
> -
>   static struct vgem_device {
>   	struct drm_device drm;
>   	struct platform_device *platform;
>   } *vgem_device;
>   
> -static void vgem_gem_free_object(struct drm_gem_object *obj)
> -{
> -	struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
> -
> -	kvfree(vgem_obj->pages);
> -	mutex_destroy(&vgem_obj->pages_lock);
> -
> -	if (obj->import_attach)
> -		drm_prime_gem_destroy(obj, vgem_obj->table);
> -
> -	drm_gem_object_release(obj);
> -	kfree(vgem_obj);
> -}
> -
> -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
> -{
> -	struct vm_area_struct *vma = vmf->vma;
> -	struct drm_vgem_gem_object *obj = vma->vm_private_data;
> -	/* We don't use vmf->pgoff since that has the fake offset */
> -	unsigned long vaddr = vmf->address;
> -	vm_fault_t ret = VM_FAULT_SIGBUS;
> -	loff_t num_pages;
> -	pgoff_t page_offset;
> -	page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
> -
> -	num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
> -
> -	if (page_offset >= num_pages)
> -		return VM_FAULT_SIGBUS;
> -
> -	mutex_lock(&obj->pages_lock);
> -	if (obj->pages) {
> -		get_page(obj->pages[page_offset]);
> -		vmf->page = obj->pages[page_offset];
> -		ret = 0;
> -	}
> -	mutex_unlock(&obj->pages_lock);
> -	if (ret) {
> -		struct page *page;
> -
> -		page = shmem_read_mapping_page(
> -					file_inode(obj->base.filp)->i_mapping,
> -					page_offset);
> -		if (!IS_ERR(page)) {
> -			vmf->page = page;
> -			ret = 0;
> -		} else switch (PTR_ERR(page)) {
> -			case -ENOSPC:
> -			case -ENOMEM:
> -				ret = VM_FAULT_OOM;
> -				break;
> -			case -EBUSY:
> -				ret = VM_FAULT_RETRY;
> -				break;
> -			case -EFAULT:
> -			case -EINVAL:
> -				ret = VM_FAULT_SIGBUS;
> -				break;
> -			default:
> -				WARN_ON(PTR_ERR(page));
> -				ret = VM_FAULT_SIGBUS;
> -				break;
> -		}
> -
> -	}
> -	return ret;
> -}
> -
> -static const struct vm_operations_struct vgem_gem_vm_ops = {
> -	.fault = vgem_gem_fault,
> -	.open = drm_gem_vm_open,
> -	.close = drm_gem_vm_close,
> -};
> -
>   static int vgem_open(struct drm_device *dev, struct drm_file *file)
>   {
>   	struct vgem_file *vfile;
> @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
>   	kfree(vfile);
>   }
>   
> -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
> -						unsigned long size)
> -{
> -	struct drm_vgem_gem_object *obj;
> -	int ret;
> -
> -	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> -	if (!obj)
> -		return ERR_PTR(-ENOMEM);
> -
> -	obj->base.funcs = &vgem_gem_object_funcs;
> -
> -	ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
> -	if (ret) {
> -		kfree(obj);
> -		return ERR_PTR(ret);
> -	}
> -
> -	mutex_init(&obj->pages_lock);
> -
> -	return obj;
> -}
> -
> -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
> -{
> -	drm_gem_object_release(&obj->base);
> -	kfree(obj);
> -}
> -
> -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> -					      struct drm_file *file,
> -					      unsigned int *handle,
> -					      unsigned long size)
> -{
> -	struct drm_vgem_gem_object *obj;
> -	int ret;
> -
> -	obj = __vgem_gem_create(dev, size);
> -	if (IS_ERR(obj))
> -		return ERR_CAST(obj);
> -
> -	ret = drm_gem_handle_create(file, &obj->base, handle);
> -	if (ret) {
> -		drm_gem_object_put(&obj->base);
> -		return ERR_PTR(ret);
> -	}
> -
> -	return &obj->base;
> -}
> -
> -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
> -				struct drm_mode_create_dumb *args)
> -{
> -	struct drm_gem_object *gem_object;
> -	u64 pitch, size;
> -
> -	pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
> -	size = args->height * pitch;
> -	if (size == 0)
> -		return -EINVAL;
> -
> -	gem_object = vgem_gem_create(dev, file, &args->handle, size);
> -	if (IS_ERR(gem_object))
> -		return PTR_ERR(gem_object);
> -
> -	args->size = gem_object->size;
> -	args->pitch = pitch;
> -
> -	drm_gem_object_put(gem_object);
> -
> -	DRM_DEBUG("Created object of size %llu\n", args->size);
> -
> -	return 0;
> -}
> -
>   static struct drm_ioctl_desc vgem_ioctls[] = {
>   	DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
>   	DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
>   };
>   
> -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
> -{
> -	unsigned long flags = vma->vm_flags;
> -	int ret;
> -
> -	ret = drm_gem_mmap(filp, vma);
> -	if (ret)
> -		return ret;
> -
> -	/* Keep the WC mmaping set by drm_gem_mmap() but our pages
> -	 * are ordinary and not special.
> -	 */
> -	vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
> -	return 0;
> -}
> -
> -static const struct file_operations vgem_driver_fops = {
> -	.owner		= THIS_MODULE,
> -	.open		= drm_open,
> -	.mmap		= vgem_mmap,
> -	.poll		= drm_poll,
> -	.read		= drm_read,
> -	.unlocked_ioctl = drm_ioctl,
> -	.compat_ioctl	= drm_compat_ioctl,
> -	.release	= drm_release,
> -};
> -
> -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
> -{
> -	mutex_lock(&bo->pages_lock);
> -	if (bo->pages_pin_count++ == 0) {
> -		struct page **pages;
> -
> -		pages = drm_gem_get_pages(&bo->base);
> -		if (IS_ERR(pages)) {
> -			bo->pages_pin_count--;
> -			mutex_unlock(&bo->pages_lock);
> -			return pages;
> -		}
> -
> -		bo->pages = pages;
> -	}
> -	mutex_unlock(&bo->pages_lock);
> -
> -	return bo->pages;
> -}
> -
> -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
> -{
> -	mutex_lock(&bo->pages_lock);
> -	if (--bo->pages_pin_count == 0) {
> -		drm_gem_put_pages(&bo->base, bo->pages, true, true);
> -		bo->pages = NULL;
> -	}
> -	mutex_unlock(&bo->pages_lock);
> -}
> -
> -static int vgem_prime_pin(struct drm_gem_object *obj)
> -{
> -	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> -	long n_pages = obj->size >> PAGE_SHIFT;
> -	struct page **pages;
> -
> -	pages = vgem_pin_pages(bo);
> -	if (IS_ERR(pages))
> -		return PTR_ERR(pages);
> -
> -	/* Flush the object from the CPU cache so that importers can rely
> -	 * on coherent indirect access via the exported dma-address.
> -	 */
> -	drm_clflush_pages(pages, n_pages);
> -
> -	return 0;
> -}
> -
> -static void vgem_prime_unpin(struct drm_gem_object *obj)
> -{
> -	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> -
> -	vgem_unpin_pages(bo);
> -}
> -
> -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
> -{
> -	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> -
> -	return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
> -}
> -
> -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
> -						struct dma_buf *dma_buf)
> -{
> -	struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
> -
> -	return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
> -}
> -
> -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
> -			struct dma_buf_attachment *attach, struct sg_table *sg)
> -{
> -	struct drm_vgem_gem_object *obj;
> -	int npages;
> -
> -	obj = __vgem_gem_create(dev, attach->dmabuf->size);
> -	if (IS_ERR(obj))
> -		return ERR_CAST(obj);
> -
> -	npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
> -
> -	obj->table = sg;
> -	obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
> -	if (!obj->pages) {
> -		__vgem_gem_destroy(obj);
> -		return ERR_PTR(-ENOMEM);
> -	}
> -
> -	obj->pages_pin_count++; /* perma-pinned */
> -	drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
> -	return &obj->base;
> -}
> -
> -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
> -{
> -	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> -	long n_pages = obj->size >> PAGE_SHIFT;
> -	struct page **pages;
> -	void *vaddr;
> -
> -	pages = vgem_pin_pages(bo);
> -	if (IS_ERR(pages))
> -		return PTR_ERR(pages);
> -
> -	vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
> -	if (!vaddr)
> -		return -ENOMEM;
> -	dma_buf_map_set_vaddr(map, vaddr);
> -
> -	return 0;
> -}
> -
> -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
> -{
> -	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> -
> -	vunmap(map->vaddr);
> -	vgem_unpin_pages(bo);
> -}
> -
> -static int vgem_prime_mmap(struct drm_gem_object *obj,
> -			   struct vm_area_struct *vma)
> -{
> -	int ret;
> -
> -	if (obj->size < vma->vm_end - vma->vm_start)
> -		return -EINVAL;
> -
> -	if (!obj->filp)
> -		return -ENODEV;
> -
> -	ret = call_mmap(obj->filp, vma);
> -	if (ret)
> -		return ret;
> -
> -	vma_set_file(vma, obj->filp);
> -	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> -	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
> -
> -	return 0;
> -}
> -
> -static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
> -	.free = vgem_gem_free_object,
> -	.pin = vgem_prime_pin,
> -	.unpin = vgem_prime_unpin,
> -	.get_sg_table = vgem_prime_get_sg_table,
> -	.vmap = vgem_prime_vmap,
> -	.vunmap = vgem_prime_vunmap,
> -	.vm_ops = &vgem_gem_vm_ops,
> -};
> +DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
>   
>   static const struct drm_driver vgem_driver = {
>   	.driver_features		= DRIVER_GEM | DRIVER_RENDER,
> @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = {
>   	.num_ioctls 			= ARRAY_SIZE(vgem_ioctls),
>   	.fops				= &vgem_driver_fops,
>   
> -	.dumb_create			= vgem_gem_dumb_create,
> -
> -	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
> -	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
> -	.gem_prime_import = vgem_prime_import,
> -	.gem_prime_import_sg_table = vgem_prime_import_sg_table,
> -	.gem_prime_mmap = vgem_prime_mmap,
> +	DRM_GEM_SHMEM_DRIVER_OPS,
>   
>   	.name	= DRIVER_NAME,
>   	.desc	= DRIVER_DESC,
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer


_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-25 15:49               ` Daniel Vetter
  (?)
@ 2021-02-26  9:41                 ` Thomas Hellström (Intel)
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-02-26  9:41 UTC (permalink / raw)
  To: Daniel Vetter, Christian König
  Cc: Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK


On 2/25/21 4:49 PM, Daniel Vetter wrote:
> On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote:
>>> Am 24.02.21 um 10:31 schrieb Daniel Vetter:
>>>> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
>>>> <thomas_os@shipmail.org> wrote:
>>>>> On 2/24/21 9:45 AM, Daniel Vetter wrote:
>>>>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
>>>>>>>> tldr; DMA buffers aren't normal memory, expecting that you can use
>>>>>>>> them like that (like calling get_user_pages works, or that they're
>>>>>>>> accounting like any other normal memory) cannot be guaranteed.
>>>>>>>>
>>>>>>>> Since some userspace only runs on integrated devices, where all
>>>>>>>> buffers are actually all resident system memory, there's a huge
>>>>>>>> temptation to assume that a struct page is always present and useable
>>>>>>>> like for any more pagecache backed mmap. This has the potential to
>>>>>>>> result in a uapi nightmare.
>>>>>>>>
>>>>>>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
>>>>>>>> blocks get_user_pages and all the other struct page based
>>>>>>>> infrastructure for everyone. In spirit this is the uapi counterpart to
>>>>>>>> the kernel-internal CONFIG_DMABUF_DEBUG.
>>>>>>>>
>>>>>>>> Motivated by a recent patch which wanted to swich the system dma-buf
>>>>>>>> heap to vm_insert_page instead of vm_insert_pfn.
>>>>>>>>
>>>>>>>> v2:
>>>>>>>>
>>>>>>>> Jason brought up that we also want to guarantee that all ptes have the
>>>>>>>> pte_special flag set, to catch fast get_user_pages (on architectures
>>>>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
>>>>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>>>>>>>>
>>>>>>>>     From auditing the various functions to insert pfn pte entires
>>>>>>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
>>>>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
>>>>>>>> this should be the correct flag to check for.
>>>>>>>>
>>>>>>> If we require VM_PFNMAP, for ordinary page mappings, we also need to
>>>>>>> disallow COW mappings, since it will not work on architectures that
>>>>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
>>>>>> Hm I figured everyone just uses MAP_SHARED for buffer objects since
>>>>>> COW really makes absolutely no sense. How would we enforce this?
>>>>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
>>>>> or allowing MIXEDMAP.
>>>>>
>>>>>>> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
>>>>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
>>>>>>> pages. That's a very old comment, though, and might not be valid anymore.
>>>>>> I think that's why ttm has a page cache for these, because it indeed
>>>>>> sucks. The PAT changes on pages are rather expensive.
>>>>> IIRC the page cache was implemented because of the slowness of the
>>>>> caching mode transition itself, more specifically the wbinvd() call +
>>>>> global TLB flush.
>>> Yes, exactly that. The global TLB flush is what really breaks our neck here
>>> from a performance perspective.
>>>
>>>>>> There is still an issue for iomem mappings, because the PAT validation
>>>>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
>>>>>> But for i915 at least this is fixed by using the io_mapping
>>>>>> infrastructure, which does the PAT reservation only once when you set
>>>>>> up the mapping area at driver load.
>>>>> Yes, I guess that was the issue that the comment describes, but the
>>>>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
>>>>>
>>>>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
>>>>>> problem that hurts much :-)
>>>>> Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP?
>>>>>
>>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
>>>> Uh that's bad, because mixed maps pointing at struct page wont stop
>>>> gup. At least afaik.
>>> Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have
>>> already seen tons of problems with the page cache.
>> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL vm_insert_mixed
>> boils down to vm_insert_pfn wrt gup. And special pte stops gup fast path.
>>
>> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how
>> you're stopping gup slow path. See check_vma_flags() in mm/gup.c.
>>
>> Also if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL then I don't think
>> vm_insert_mixed even works on iomem pfns. There's the devmap exception,
>> but we're not devmap. Worse ttm abuses some accidental codepath to smuggle
>> in hugepte support by intentionally not being devmap.
>>
>> So I'm really not sure this works as we think it should. Maybe good to do
>> a quick test program on amdgpu with a buffer in system memory only and try
>> to do direct io into it. If it works, you have a problem, and a bad one.
> That's probably impossible, since a quick git grep shows that pretty
> much anything reasonable has special ptes: arc, arm, arm64, powerpc,
> riscv, s390, sh, sparc, x86. I don't think you'll have a platform
> where you can plug an amdgpu in and actually exercise the bug :-)

Hm. AFAIK vm_insert_mixed() doesn't set PTE_SPECIAL on system pages, so I
don't see what should be stopping gup on those?

/Thomas



>
> So maybe we should just switch over to VM_PFNMAP for ttm for more clarity?
> -Daniel
>
>
>>> Regards,
>>> Christian.
>>>
>>>> Christian, do we need to patch this up, and maybe fix up ttm fault
>>>> handler to use io_mapping so the vm_insert_pfn stuff is fast?
>>>> -Daniel
>> --
>> Daniel Vetter
>> Software Engineer, Intel Corporation
>> http://blog.ffwll.ch
>
>

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-26  9:41                 ` Thomas Hellström (Intel)
  (?)
@ 2021-02-26 13:28                   ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-26 13:28 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Fri, Feb 26, 2021 at 10:41 AM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
>
>
> On 2/25/21 4:49 PM, Daniel Vetter wrote:
> > On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote:
> >>> Am 24.02.21 um 10:31 schrieb Daniel Vetter:
> >>>> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
> >>>> <thomas_os@shipmail.org> wrote:
> >>>>> On 2/24/21 9:45 AM, Daniel Vetter wrote:
> >>>>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
> >>>>>> <thomas_os@shipmail.org> wrote:
> >>>>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
> >>>>>>>> tldr; DMA buffers aren't normal memory, expecting that you can use
> >>>>>>>> them like that (like calling get_user_pages works, or that they're
> >>>>>>>> accounting like any other normal memory) cannot be guaranteed.
> >>>>>>>>
> >>>>>>>> Since some userspace only runs on integrated devices, where all
> >>>>>>>> buffers are actually all resident system memory, there's a huge
> >>>>>>>> temptation to assume that a struct page is always present and useable
> >>>>>>>> like for any more pagecache backed mmap. This has the potential to
> >>>>>>>> result in a uapi nightmare.
> >>>>>>>>
> >>>>>>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> >>>>>>>> blocks get_user_pages and all the other struct page based
> >>>>>>>> infrastructure for everyone. In spirit this is the uapi counterpart to
> >>>>>>>> the kernel-internal CONFIG_DMABUF_DEBUG.
> >>>>>>>>
> >>>>>>>> Motivated by a recent patch which wanted to swich the system dma-buf
> >>>>>>>> heap to vm_insert_page instead of vm_insert_pfn.
> >>>>>>>>
> >>>>>>>> v2:
> >>>>>>>>
> >>>>>>>> Jason brought up that we also want to guarantee that all ptes have the
> >>>>>>>> pte_special flag set, to catch fast get_user_pages (on architectures
> >>>>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> >>>>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> >>>>>>>>
> >>>>>>>>     From auditing the various functions to insert pfn pte entires
> >>>>>>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> >>>>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> >>>>>>>> this should be the correct flag to check for.
> >>>>>>>>
> >>>>>>> If we require VM_PFNMAP, for ordinary page mappings, we also need to
> >>>>>>> disallow COW mappings, since it will not work on architectures that
> >>>>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
> >>>>>> Hm I figured everyone just uses MAP_SHARED for buffer objects since
> >>>>>> COW really makes absolutely no sense. How would we enforce this?
> >>>>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
> >>>>> or allowing MIXEDMAP.
> >>>>>
> >>>>>>> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
> >>>>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
> >>>>>>> pages. That's a very old comment, though, and might not be valid anymore.
> >>>>>> I think that's why ttm has a page cache for these, because it indeed
> >>>>>> sucks. The PAT changes on pages are rather expensive.
> >>>>> IIRC the page cache was implemented because of the slowness of the
> >>>>> caching mode transition itself, more specifically the wbinvd() call +
> >>>>> global TLB flush.
> >>> Yes, exactly that. The global TLB flush is what really breaks our neck here
> >>> from a performance perspective.
> >>>
> >>>>>> There is still an issue for iomem mappings, because the PAT validation
> >>>>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
> >>>>>> But for i915 at least this is fixed by using the io_mapping
> >>>>>> infrastructure, which does the PAT reservation only once when you set
> >>>>>> up the mapping area at driver load.
> >>>>> Yes, I guess that was the issue that the comment describes, but the
> >>>>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
> >>>>>
> >>>>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> >>>>>> problem that hurts much :-)
> >>>>> Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP?
> >>>>>
> >>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
> >>>> Uh that's bad, because mixed maps pointing at struct page wont stop
> >>>> gup. At least afaik.
> >>> Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have
> >>> already seen tons of problems with the page cache.
> >> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL vm_insert_mixed
> >> boils down to vm_insert_pfn wrt gup. And special pte stops gup fast path.
> >>
> >> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how
> >> you're stopping gup slow path. See check_vma_flags() in mm/gup.c.
> >>
> >> Also if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL then I don't think
> >> vm_insert_mixed even works on iomem pfns. There's the devmap exception,
> >> but we're not devmap. Worse ttm abuses some accidental codepath to smuggle
> >> in hugepte support by intentionally not being devmap.
> >>
> >> So I'm really not sure this works as we think it should. Maybe good to do
> >> a quick test program on amdgpu with a buffer in system memory only and try
> >> to do direct io into it. If it works, you have a problem, and a bad one.
> > That's probably impossible, since a quick git grep shows that pretty
> > much anything reasonable has special ptes: arc, arm, arm64, powerpc,
> > riscv, s390, sh, sparc, x86. I don't think you'll have a platform
> > where you can plug an amdgpu in and actually exercise the bug :-)
>
> Hm. AFAIK _insert_mixed() doesn't set PTE_SPECIAL on system pages, so I
> don't see what should be stopping gup to those?

If you have an arch with pte_special, we use insert_pfn(), which afaict
will use pte_mkspecial for the !devmap case. And ttm isn't devmap
(otherwise our hugepte abuse of devmap hugeptes would go rather
wrong).
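
Roughly, the dispatch being described here is (paraphrased sketch only,
with illustrative names, not a verbatim copy of mm/memory.c):

    #include <linux/pfn_t.h>

    /* Does a mixed insert of this pfn end up as a special pte? */
    static bool sketch_mixed_insert_pte_is_special(pfn_t pfn)
    {
            if (IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL))
                    /* insert_pfn() path: pte_mkspecial() unless devmap */
                    return !pfn_t_devmap(pfn);

            /* without pte_special a valid system page goes in as a
             * normal refcounted pte, which gup can see
             */
            return false;
    }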

So I think it stops gup. But I haven't verified at all. Would be good
if Christian can check this with some direct io to a buffer in system
memory.
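
For reference, a rough userspace sketch of that check, using the system
dma-buf heap as an easy source of a system-memory buffer (the amdgpu
case would mmap a GEM BO instead); "some_file" is a placeholder for any
file of at least 4 KiB. If the mapping really is VM_PFNMAP with special
ptes, the O_DIRECT read should fail (typically -EFAULT) instead of
pinning the buffer pages:

    #define _GNU_SOURCE
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <linux/dma-heap.h>

    int main(void)
    {
            struct dma_heap_allocation_data alloc = {
                    .len = 4096,
                    .fd_flags = O_RDWR | O_CLOEXEC,
            };
            int heap = open("/dev/dma_heap/system", O_RDWR);
            int file = open("some_file", O_RDONLY | O_DIRECT);
            void *buf;
            ssize_t ret;

            if (heap < 0 || file < 0 ||
                ioctl(heap, DMA_HEAP_IOCTL_ALLOC, &alloc) < 0)
                    return 1;

            buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED,
                       alloc.fd, 0);
            if (buf == MAP_FAILED)
                    return 1;

            /* O_DIRECT makes the block layer gup the destination pages */
            ret = read(file, buf, 4096);
            printf("direct read into dma-buf mmap: %zd (%s)\n",
                   ret, ret < 0 ? strerror(errno) : "gup worked");
            return 0;
    }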
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
@ 2021-02-26 13:28                   ` Daniel Vetter
  0 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-26 13:28 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Fri, Feb 26, 2021 at 10:41 AM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
>
>
> On 2/25/21 4:49 PM, Daniel Vetter wrote:
> > On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> >> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote:
> >>> Am 24.02.21 um 10:31 schrieb Daniel Vetter:
> >>>> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
> >>>> <thomas_os@shipmail.org> wrote:
> >>>>> On 2/24/21 9:45 AM, Daniel Vetter wrote:
> >>>>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
> >>>>>> <thomas_os@shipmail.org> wrote:
> >>>>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
> >>>>>>>> tldr; DMA buffers aren't normal memory, expecting that you can use
> >>>>>>>> them like that (like calling get_user_pages works, or that they're
> >>>>>>>> accounting like any other normal memory) cannot be guaranteed.
> >>>>>>>>
> >>>>>>>> Since some userspace only runs on integrated devices, where all
> >>>>>>>> buffers are actually all resident system memory, there's a huge
> >>>>>>>> temptation to assume that a struct page is always present and useable
> >>>>>>>> like for any more pagecache backed mmap. This has the potential to
> >>>>>>>> result in a uapi nightmare.
> >>>>>>>>
> >>>>>>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
> >>>>>>>> blocks get_user_pages and all the other struct page based
> >>>>>>>> infrastructure for everyone. In spirit this is the uapi counterpart to
> >>>>>>>> the kernel-internal CONFIG_DMABUF_DEBUG.
> >>>>>>>>
> >>>>>>>> Motivated by a recent patch which wanted to swich the system dma-buf
> >>>>>>>> heap to vm_insert_page instead of vm_insert_pfn.
> >>>>>>>>
> >>>>>>>> v2:
> >>>>>>>>
> >>>>>>>> Jason brought up that we also want to guarantee that all ptes have the
> >>>>>>>> pte_special flag set, to catch fast get_user_pages (on architectures
> >>>>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> >>>>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> >>>>>>>>
> >>>>>>>>     From auditing the various functions to insert pfn pte entires
> >>>>>>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
> >>>>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
> >>>>>>>> this should be the correct flag to check for.
> >>>>>>>>
> >>>>>>> If we require VM_PFNMAP, for ordinary page mappings, we also need to
> >>>>>>> disallow COW mappings, since it will not work on architectures that
> >>>>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
> >>>>>> Hm I figured everyone just uses MAP_SHARED for buffer objects since
> >>>>>> COW really makes absolutely no sense. How would we enforce this?
> >>>>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
> >>>>> or allowing MIXEDMAP.
> >>>>>
> >>>>>>> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
> >>>>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
> >>>>>>> pages. That's a very old comment, though, and might not be valid anymore.
> >>>>>> I think that's why ttm has a page cache for these, because it indeed
> >>>>>> sucks. The PAT changes on pages are rather expensive.
> >>>>> IIRC the page cache was implemented because of the slowness of the
> >>>>> caching mode transition itself, more specifically the wbinvd() call +
> >>>>> global TLB flush.
> >>> Yes, exactly that. The global TLB flush is what really breaks our neck here
> >>> from a performance perspective.
> >>>
> >>>>>> There is still an issue for iomem mappings, because the PAT validation
> >>>>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
> >>>>>> But for i915 at least this is fixed by using the io_mapping
> >>>>>> infrastructure, which does the PAT reservation only once when you set
> >>>>>> up the mapping area at driver load.
> >>>>> Yes, I guess that was the issue that the comment describes, but the
> >>>>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
> >>>>>
> >>>>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> >>>>>> problem that hurts much :-)
> >>>>> Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP?
> >>>>>
> >>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
> >>>> Uh that's bad, because mixed maps pointing at struct page wont stop
> >>>> gup. At least afaik.
> >>> Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have
> >>> already seen tons of problems with the page cache.
> >> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL vm_insert_mixed
> >> boils down to vm_insert_pfn wrt gup. And special pte stops gup fast path.
> >>
> >> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how
> >> you're stopping gup slow path. See check_vma_flags() in mm/gup.c.
> >>
> >> Also if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL then I don't think
> >> vm_insert_mixed even works on iomem pfns. There's the devmap exception,
> >> but we're not devmap. Worse ttm abuses some accidental codepath to smuggle
> >> in hugepte support by intentionally not being devmap.
> >>
> >> So I'm really not sure this works as we think it should. Maybe good to do
> >> a quick test program on amdgpu with a buffer in system memory only and try
> >> to do direct io into it. If it works, you have a problem, and a bad one.
> > That's probably impossible, since a quick git grep shows that pretty
> > much anything reasonable has special ptes: arc, arm, arm64, powerpc,
> > riscv, s390, sh, sparc, x86. I don't think you'll have a platform
> > where you can plug an amdgpu in and actually exercise the bug :-)
>
> Hm. AFAIK _insert_mixed() doesn't set PTE_SPECIAL on system pages, so I
> don't see what should be stopping gup to those?

If you have an arch with pte_special we use insert_pfn(), which afaict
will use pte_mkspecial for the !devmap case. And ttm isn't devmap
(otherwise our hugepte abuse of devmap hugeptes would go rather
wrong).

So I think it stops gup. But I haven't verified at all. It would be good
if Christian could check this with some direct I/O to a buffer in system
memory.
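
For reference, here's the rough shape of the check I mean (completely
untested; the dma-buf fd, the buffer size and the scratch file are all
placeholders, since getting at an amdgpu buffer that's resident in system
memory is driver-specific). An O_DIRECT read pins the destination pages
with get_user_pages(), so on a VM_PFNMAP mapping with special ptes the
read should fail (typically with -EFAULT); if it succeeds, gup isn't
blocked:

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define BUF_SIZE (1 << 20)	/* placeholder: size of the exported buffer */

static int check_gup_blocked(int dmabuf_fd)
{
	void *map;
	int file_fd;
	ssize_t ret;

	/* mmap the exported buffer; with this patch the vma is VM_PFNMAP */
	map = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,
		   dmabuf_fd, 0);
	if (map == MAP_FAILED) {
		perror("mmap");
		return -1;
	}

	/* "scratch" is a placeholder: any file of at least one page on a
	 * filesystem that supports O_DIRECT.
	 */
	file_fd = open("scratch", O_RDONLY | O_DIRECT);
	if (file_fd < 0) {
		perror("open");
		munmap(map, BUF_SIZE);
		return -1;
	}

	/* the O_DIRECT read pins the destination via get_user_pages() */
	ret = read(file_fd, map, 4096);
	if (ret < 0)
		printf("direct I/O rejected (%s), gup looks blocked\n",
		       strerror(errno));
	else
		printf("direct I/O succeeded, gup is NOT blocked\n");

	close(file_fd);
	munmap(map, BUF_SIZE);
	return ret < 0 ? 0 : 1;
}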
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH] drm/vgem: use shmem helpers
  2021-02-26  9:19       ` [Intel-gfx] " Thomas Zimmermann
@ 2021-02-26 13:30         ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-26 13:30 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: Intel Graphics Development, DRI Development,
	Christian König, Melissa Wen, Daniel Vetter, Chris Wilson

On Fri, Feb 26, 2021 at 10:19 AM Thomas Zimmermann <tzimmermann@suse.de> wrote:
>
> Hi
>
> Am 25.02.21 um 11:23 schrieb Daniel Vetter:
> > Aside from deleting lots of code the real motivation here is to switch
> > the mmap over to VM_PFNMAP, to be more consistent with what real gpu
> > drivers do. They're all VM_PFNMP, which means get_user_pages doesn't
> > work, and even if you try and there's a struct page behind that,
> > touching it and mucking around with its refcount can upset drivers
> > real bad.
> >
> > v2: Review from Thomas:
> > - sort #include
> > - drop more dead code that I didn't spot somehow
> >
> > v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)
>
> Since you're working on it, could you move the config item into a
> Kconfig file under vgem?

We still have a lot of drivers without their own Kconfig. I thought
we were only doing that for drivers which have multiple options, or
which would otherwise clutter up the main drm/Kconfig file?

Not opposed to this, it just feels like if we do this, we should do it
for all of them.
-Daniel


>
> Best regards
> Thomas
>
> >
> > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
> > Cc: John Stultz <john.stultz@linaro.org>
> > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > Cc: "Christian König" <christian.koenig@amd.com>
> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > Cc: Melissa Wen <melissa.srw@gmail.com>
> > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >   drivers/gpu/drm/Kconfig         |   1 +
> >   drivers/gpu/drm/vgem/vgem_drv.c | 340 +-------------------------------
> >   2 files changed, 4 insertions(+), 337 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > index 8e73311de583..94e4ac830283 100644
> > --- a/drivers/gpu/drm/Kconfig
> > +++ b/drivers/gpu/drm/Kconfig
> > @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig"
> >   config DRM_VGEM
> >       tristate "Virtual GEM provider"
> >       depends on DRM
> > +     select DRM_GEM_SHMEM_HELPER
> >       help
> >         Choose this option to get a virtual graphics memory manager,
> >         as used by Mesa's software renderer for enhanced performance.
> > diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
> > index a0e75f1d5d01..b1b3a5ffc542 100644
> > --- a/drivers/gpu/drm/vgem/vgem_drv.c
> > +++ b/drivers/gpu/drm/vgem/vgem_drv.c
> > @@ -38,6 +38,7 @@
> >
> >   #include <drm/drm_drv.h>
> >   #include <drm/drm_file.h>
> > +#include <drm/drm_gem_shmem_helper.h>
> >   #include <drm/drm_ioctl.h>
> >   #include <drm/drm_managed.h>
> >   #include <drm/drm_prime.h>
> > @@ -50,87 +51,11 @@
> >   #define DRIVER_MAJOR        1
> >   #define DRIVER_MINOR        0
> >
> > -static const struct drm_gem_object_funcs vgem_gem_object_funcs;
> > -
> >   static struct vgem_device {
> >       struct drm_device drm;
> >       struct platform_device *platform;
> >   } *vgem_device;
> >
> > -static void vgem_gem_free_object(struct drm_gem_object *obj)
> > -{
> > -     struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
> > -
> > -     kvfree(vgem_obj->pages);
> > -     mutex_destroy(&vgem_obj->pages_lock);
> > -
> > -     if (obj->import_attach)
> > -             drm_prime_gem_destroy(obj, vgem_obj->table);
> > -
> > -     drm_gem_object_release(obj);
> > -     kfree(vgem_obj);
> > -}
> > -
> > -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
> > -{
> > -     struct vm_area_struct *vma = vmf->vma;
> > -     struct drm_vgem_gem_object *obj = vma->vm_private_data;
> > -     /* We don't use vmf->pgoff since that has the fake offset */
> > -     unsigned long vaddr = vmf->address;
> > -     vm_fault_t ret = VM_FAULT_SIGBUS;
> > -     loff_t num_pages;
> > -     pgoff_t page_offset;
> > -     page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
> > -
> > -     num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
> > -
> > -     if (page_offset >= num_pages)
> > -             return VM_FAULT_SIGBUS;
> > -
> > -     mutex_lock(&obj->pages_lock);
> > -     if (obj->pages) {
> > -             get_page(obj->pages[page_offset]);
> > -             vmf->page = obj->pages[page_offset];
> > -             ret = 0;
> > -     }
> > -     mutex_unlock(&obj->pages_lock);
> > -     if (ret) {
> > -             struct page *page;
> > -
> > -             page = shmem_read_mapping_page(
> > -                                     file_inode(obj->base.filp)->i_mapping,
> > -                                     page_offset);
> > -             if (!IS_ERR(page)) {
> > -                     vmf->page = page;
> > -                     ret = 0;
> > -             } else switch (PTR_ERR(page)) {
> > -                     case -ENOSPC:
> > -                     case -ENOMEM:
> > -                             ret = VM_FAULT_OOM;
> > -                             break;
> > -                     case -EBUSY:
> > -                             ret = VM_FAULT_RETRY;
> > -                             break;
> > -                     case -EFAULT:
> > -                     case -EINVAL:
> > -                             ret = VM_FAULT_SIGBUS;
> > -                             break;
> > -                     default:
> > -                             WARN_ON(PTR_ERR(page));
> > -                             ret = VM_FAULT_SIGBUS;
> > -                             break;
> > -             }
> > -
> > -     }
> > -     return ret;
> > -}
> > -
> > -static const struct vm_operations_struct vgem_gem_vm_ops = {
> > -     .fault = vgem_gem_fault,
> > -     .open = drm_gem_vm_open,
> > -     .close = drm_gem_vm_close,
> > -};
> > -
> >   static int vgem_open(struct drm_device *dev, struct drm_file *file)
> >   {
> >       struct vgem_file *vfile;
> > @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
> >       kfree(vfile);
> >   }
> >
> > -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
> > -                                             unsigned long size)
> > -{
> > -     struct drm_vgem_gem_object *obj;
> > -     int ret;
> > -
> > -     obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> > -     if (!obj)
> > -             return ERR_PTR(-ENOMEM);
> > -
> > -     obj->base.funcs = &vgem_gem_object_funcs;
> > -
> > -     ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
> > -     if (ret) {
> > -             kfree(obj);
> > -             return ERR_PTR(ret);
> > -     }
> > -
> > -     mutex_init(&obj->pages_lock);
> > -
> > -     return obj;
> > -}
> > -
> > -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
> > -{
> > -     drm_gem_object_release(&obj->base);
> > -     kfree(obj);
> > -}
> > -
> > -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> > -                                           struct drm_file *file,
> > -                                           unsigned int *handle,
> > -                                           unsigned long size)
> > -{
> > -     struct drm_vgem_gem_object *obj;
> > -     int ret;
> > -
> > -     obj = __vgem_gem_create(dev, size);
> > -     if (IS_ERR(obj))
> > -             return ERR_CAST(obj);
> > -
> > -     ret = drm_gem_handle_create(file, &obj->base, handle);
> > -     if (ret) {
> > -             drm_gem_object_put(&obj->base);
> > -             return ERR_PTR(ret);
> > -     }
> > -
> > -     return &obj->base;
> > -}
> > -
> > -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
> > -                             struct drm_mode_create_dumb *args)
> > -{
> > -     struct drm_gem_object *gem_object;
> > -     u64 pitch, size;
> > -
> > -     pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
> > -     size = args->height * pitch;
> > -     if (size == 0)
> > -             return -EINVAL;
> > -
> > -     gem_object = vgem_gem_create(dev, file, &args->handle, size);
> > -     if (IS_ERR(gem_object))
> > -             return PTR_ERR(gem_object);
> > -
> > -     args->size = gem_object->size;
> > -     args->pitch = pitch;
> > -
> > -     drm_gem_object_put(gem_object);
> > -
> > -     DRM_DEBUG("Created object of size %llu\n", args->size);
> > -
> > -     return 0;
> > -}
> > -
> >   static struct drm_ioctl_desc vgem_ioctls[] = {
> >       DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
> >       DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
> >   };
> >
> > -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
> > -{
> > -     unsigned long flags = vma->vm_flags;
> > -     int ret;
> > -
> > -     ret = drm_gem_mmap(filp, vma);
> > -     if (ret)
> > -             return ret;
> > -
> > -     /* Keep the WC mmaping set by drm_gem_mmap() but our pages
> > -      * are ordinary and not special.
> > -      */
> > -     vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
> > -     return 0;
> > -}
> > -
> > -static const struct file_operations vgem_driver_fops = {
> > -     .owner          = THIS_MODULE,
> > -     .open           = drm_open,
> > -     .mmap           = vgem_mmap,
> > -     .poll           = drm_poll,
> > -     .read           = drm_read,
> > -     .unlocked_ioctl = drm_ioctl,
> > -     .compat_ioctl   = drm_compat_ioctl,
> > -     .release        = drm_release,
> > -};
> > -
> > -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
> > -{
> > -     mutex_lock(&bo->pages_lock);
> > -     if (bo->pages_pin_count++ == 0) {
> > -             struct page **pages;
> > -
> > -             pages = drm_gem_get_pages(&bo->base);
> > -             if (IS_ERR(pages)) {
> > -                     bo->pages_pin_count--;
> > -                     mutex_unlock(&bo->pages_lock);
> > -                     return pages;
> > -             }
> > -
> > -             bo->pages = pages;
> > -     }
> > -     mutex_unlock(&bo->pages_lock);
> > -
> > -     return bo->pages;
> > -}
> > -
> > -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
> > -{
> > -     mutex_lock(&bo->pages_lock);
> > -     if (--bo->pages_pin_count == 0) {
> > -             drm_gem_put_pages(&bo->base, bo->pages, true, true);
> > -             bo->pages = NULL;
> > -     }
> > -     mutex_unlock(&bo->pages_lock);
> > -}
> > -
> > -static int vgem_prime_pin(struct drm_gem_object *obj)
> > -{
> > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > -     long n_pages = obj->size >> PAGE_SHIFT;
> > -     struct page **pages;
> > -
> > -     pages = vgem_pin_pages(bo);
> > -     if (IS_ERR(pages))
> > -             return PTR_ERR(pages);
> > -
> > -     /* Flush the object from the CPU cache so that importers can rely
> > -      * on coherent indirect access via the exported dma-address.
> > -      */
> > -     drm_clflush_pages(pages, n_pages);
> > -
> > -     return 0;
> > -}
> > -
> > -static void vgem_prime_unpin(struct drm_gem_object *obj)
> > -{
> > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > -
> > -     vgem_unpin_pages(bo);
> > -}
> > -
> > -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
> > -{
> > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > -
> > -     return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
> > -}
> > -
> > -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
> > -                                             struct dma_buf *dma_buf)
> > -{
> > -     struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
> > -
> > -     return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
> > -}
> > -
> > -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
> > -                     struct dma_buf_attachment *attach, struct sg_table *sg)
> > -{
> > -     struct drm_vgem_gem_object *obj;
> > -     int npages;
> > -
> > -     obj = __vgem_gem_create(dev, attach->dmabuf->size);
> > -     if (IS_ERR(obj))
> > -             return ERR_CAST(obj);
> > -
> > -     npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
> > -
> > -     obj->table = sg;
> > -     obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
> > -     if (!obj->pages) {
> > -             __vgem_gem_destroy(obj);
> > -             return ERR_PTR(-ENOMEM);
> > -     }
> > -
> > -     obj->pages_pin_count++; /* perma-pinned */
> > -     drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
> > -     return &obj->base;
> > -}
> > -
> > -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
> > -{
> > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > -     long n_pages = obj->size >> PAGE_SHIFT;
> > -     struct page **pages;
> > -     void *vaddr;
> > -
> > -     pages = vgem_pin_pages(bo);
> > -     if (IS_ERR(pages))
> > -             return PTR_ERR(pages);
> > -
> > -     vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
> > -     if (!vaddr)
> > -             return -ENOMEM;
> > -     dma_buf_map_set_vaddr(map, vaddr);
> > -
> > -     return 0;
> > -}
> > -
> > -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
> > -{
> > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > -
> > -     vunmap(map->vaddr);
> > -     vgem_unpin_pages(bo);
> > -}
> > -
> > -static int vgem_prime_mmap(struct drm_gem_object *obj,
> > -                        struct vm_area_struct *vma)
> > -{
> > -     int ret;
> > -
> > -     if (obj->size < vma->vm_end - vma->vm_start)
> > -             return -EINVAL;
> > -
> > -     if (!obj->filp)
> > -             return -ENODEV;
> > -
> > -     ret = call_mmap(obj->filp, vma);
> > -     if (ret)
> > -             return ret;
> > -
> > -     vma_set_file(vma, obj->filp);
> > -     vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> > -     vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
> > -
> > -     return 0;
> > -}
> > -
> > -static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
> > -     .free = vgem_gem_free_object,
> > -     .pin = vgem_prime_pin,
> > -     .unpin = vgem_prime_unpin,
> > -     .get_sg_table = vgem_prime_get_sg_table,
> > -     .vmap = vgem_prime_vmap,
> > -     .vunmap = vgem_prime_vunmap,
> > -     .vm_ops = &vgem_gem_vm_ops,
> > -};
> > +DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
> >
> >   static const struct drm_driver vgem_driver = {
> >       .driver_features                = DRIVER_GEM | DRIVER_RENDER,
> > @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = {
> >       .num_ioctls                     = ARRAY_SIZE(vgem_ioctls),
> >       .fops                           = &vgem_driver_fops,
> >
> > -     .dumb_create                    = vgem_gem_dumb_create,
> > -
> > -     .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
> > -     .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
> > -     .gem_prime_import = vgem_prime_import,
> > -     .gem_prime_import_sg_table = vgem_prime_import_sg_table,
> > -     .gem_prime_mmap = vgem_prime_mmap,
> > +     DRM_GEM_SHMEM_DRIVER_OPS,
> >
> >       .name   = DRIVER_NAME,
> >       .desc   = DRIVER_DESC,
> >
>
> --
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
>


-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [PATCH] drm/vgem: use shmem helpers
  2021-02-26 13:30         ` [Intel-gfx] " Daniel Vetter
@ 2021-02-26 13:51           ` Thomas Zimmermann
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Zimmermann @ 2021-02-26 13:51 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, DRI Development, Chris Wilson,
	Melissa Wen, Daniel Vetter, Christian König


[-- Attachment #1.1.1: Type: text/plain, Size: 16362 bytes --]

Hi

Am 26.02.21 um 14:30 schrieb Daniel Vetter:
> On Fri, Feb 26, 2021 at 10:19 AM Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>
>> Hi
>>
>> Am 25.02.21 um 11:23 schrieb Daniel Vetter:
>>> Aside from deleting lots of code the real motivation here is to switch
>>> the mmap over to VM_PFNMAP, to be more consistent with what real gpu
>>> drivers do. They're all VM_PFNMP, which means get_user_pages doesn't
>>> work, and even if you try and there's a struct page behind that,
>>> touching it and mucking around with its refcount can upset drivers
>>> real bad.
>>>
>>> v2: Review from Thomas:
>>> - sort #include
>>> - drop more dead code that I didn't spot somehow
>>>
>>> v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)
>>
>> Since you're working on it, could you move the config item into a
>> Kconfig file under vgem?
> 
> We have a lot of drivers still without their own Kconfig. I thought
> we're only doing that for drivers which have multiple options, or
> otherwise would clutter up the main drm/Kconfig file?
> 
> Not opposed to this, just feels like if we do this, should do it for
> all of them.

I didn't know that there was a rule for how to handle this. I just
didn't like having driver config rules in the main Kconfig file.

But yeah, maybe let's change this consistently in a separate patchset.

Best regards
Thomas

> -Daniel
> 
> 
>>
>> Best regards
>> Thomas
>>
>>>
>>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>> Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
>>> Cc: John Stultz <john.stultz@linaro.org>
>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Melissa Wen <melissa.srw@gmail.com>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/Kconfig         |   1 +
>>>    drivers/gpu/drm/vgem/vgem_drv.c | 340 +-------------------------------
>>>    2 files changed, 4 insertions(+), 337 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>>> index 8e73311de583..94e4ac830283 100644
>>> --- a/drivers/gpu/drm/Kconfig
>>> +++ b/drivers/gpu/drm/Kconfig
>>> @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig"
>>>    config DRM_VGEM
>>>        tristate "Virtual GEM provider"
>>>        depends on DRM
>>> +     select DRM_GEM_SHMEM_HELPER
>>>        help
>>>          Choose this option to get a virtual graphics memory manager,
>>>          as used by Mesa's software renderer for enhanced performance.
>>> diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
>>> index a0e75f1d5d01..b1b3a5ffc542 100644
>>> --- a/drivers/gpu/drm/vgem/vgem_drv.c
>>> +++ b/drivers/gpu/drm/vgem/vgem_drv.c
>>> @@ -38,6 +38,7 @@
>>>
>>>    #include <drm/drm_drv.h>
>>>    #include <drm/drm_file.h>
>>> +#include <drm/drm_gem_shmem_helper.h>
>>>    #include <drm/drm_ioctl.h>
>>>    #include <drm/drm_managed.h>
>>>    #include <drm/drm_prime.h>
>>> @@ -50,87 +51,11 @@
>>>    #define DRIVER_MAJOR        1
>>>    #define DRIVER_MINOR        0
>>>
>>> -static const struct drm_gem_object_funcs vgem_gem_object_funcs;
>>> -
>>>    static struct vgem_device {
>>>        struct drm_device drm;
>>>        struct platform_device *platform;
>>>    } *vgem_device;
>>>
>>> -static void vgem_gem_free_object(struct drm_gem_object *obj)
>>> -{
>>> -     struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
>>> -
>>> -     kvfree(vgem_obj->pages);
>>> -     mutex_destroy(&vgem_obj->pages_lock);
>>> -
>>> -     if (obj->import_attach)
>>> -             drm_prime_gem_destroy(obj, vgem_obj->table);
>>> -
>>> -     drm_gem_object_release(obj);
>>> -     kfree(vgem_obj);
>>> -}
>>> -
>>> -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
>>> -{
>>> -     struct vm_area_struct *vma = vmf->vma;
>>> -     struct drm_vgem_gem_object *obj = vma->vm_private_data;
>>> -     /* We don't use vmf->pgoff since that has the fake offset */
>>> -     unsigned long vaddr = vmf->address;
>>> -     vm_fault_t ret = VM_FAULT_SIGBUS;
>>> -     loff_t num_pages;
>>> -     pgoff_t page_offset;
>>> -     page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
>>> -
>>> -     num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
>>> -
>>> -     if (page_offset >= num_pages)
>>> -             return VM_FAULT_SIGBUS;
>>> -
>>> -     mutex_lock(&obj->pages_lock);
>>> -     if (obj->pages) {
>>> -             get_page(obj->pages[page_offset]);
>>> -             vmf->page = obj->pages[page_offset];
>>> -             ret = 0;
>>> -     }
>>> -     mutex_unlock(&obj->pages_lock);
>>> -     if (ret) {
>>> -             struct page *page;
>>> -
>>> -             page = shmem_read_mapping_page(
>>> -                                     file_inode(obj->base.filp)->i_mapping,
>>> -                                     page_offset);
>>> -             if (!IS_ERR(page)) {
>>> -                     vmf->page = page;
>>> -                     ret = 0;
>>> -             } else switch (PTR_ERR(page)) {
>>> -                     case -ENOSPC:
>>> -                     case -ENOMEM:
>>> -                             ret = VM_FAULT_OOM;
>>> -                             break;
>>> -                     case -EBUSY:
>>> -                             ret = VM_FAULT_RETRY;
>>> -                             break;
>>> -                     case -EFAULT:
>>> -                     case -EINVAL:
>>> -                             ret = VM_FAULT_SIGBUS;
>>> -                             break;
>>> -                     default:
>>> -                             WARN_ON(PTR_ERR(page));
>>> -                             ret = VM_FAULT_SIGBUS;
>>> -                             break;
>>> -             }
>>> -
>>> -     }
>>> -     return ret;
>>> -}
>>> -
>>> -static const struct vm_operations_struct vgem_gem_vm_ops = {
>>> -     .fault = vgem_gem_fault,
>>> -     .open = drm_gem_vm_open,
>>> -     .close = drm_gem_vm_close,
>>> -};
>>> -
>>>    static int vgem_open(struct drm_device *dev, struct drm_file *file)
>>>    {
>>>        struct vgem_file *vfile;
>>> @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
>>>        kfree(vfile);
>>>    }
>>>
>>> -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
>>> -                                             unsigned long size)
>>> -{
>>> -     struct drm_vgem_gem_object *obj;
>>> -     int ret;
>>> -
>>> -     obj = kzalloc(sizeof(*obj), GFP_KERNEL);
>>> -     if (!obj)
>>> -             return ERR_PTR(-ENOMEM);
>>> -
>>> -     obj->base.funcs = &vgem_gem_object_funcs;
>>> -
>>> -     ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
>>> -     if (ret) {
>>> -             kfree(obj);
>>> -             return ERR_PTR(ret);
>>> -     }
>>> -
>>> -     mutex_init(&obj->pages_lock);
>>> -
>>> -     return obj;
>>> -}
>>> -
>>> -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
>>> -{
>>> -     drm_gem_object_release(&obj->base);
>>> -     kfree(obj);
>>> -}
>>> -
>>> -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
>>> -                                           struct drm_file *file,
>>> -                                           unsigned int *handle,
>>> -                                           unsigned long size)
>>> -{
>>> -     struct drm_vgem_gem_object *obj;
>>> -     int ret;
>>> -
>>> -     obj = __vgem_gem_create(dev, size);
>>> -     if (IS_ERR(obj))
>>> -             return ERR_CAST(obj);
>>> -
>>> -     ret = drm_gem_handle_create(file, &obj->base, handle);
>>> -     if (ret) {
>>> -             drm_gem_object_put(&obj->base);
>>> -             return ERR_PTR(ret);
>>> -     }
>>> -
>>> -     return &obj->base;
>>> -}
>>> -
>>> -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
>>> -                             struct drm_mode_create_dumb *args)
>>> -{
>>> -     struct drm_gem_object *gem_object;
>>> -     u64 pitch, size;
>>> -
>>> -     pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
>>> -     size = args->height * pitch;
>>> -     if (size == 0)
>>> -             return -EINVAL;
>>> -
>>> -     gem_object = vgem_gem_create(dev, file, &args->handle, size);
>>> -     if (IS_ERR(gem_object))
>>> -             return PTR_ERR(gem_object);
>>> -
>>> -     args->size = gem_object->size;
>>> -     args->pitch = pitch;
>>> -
>>> -     drm_gem_object_put(gem_object);
>>> -
>>> -     DRM_DEBUG("Created object of size %llu\n", args->size);
>>> -
>>> -     return 0;
>>> -}
>>> -
>>>    static struct drm_ioctl_desc vgem_ioctls[] = {
>>>        DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
>>>        DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
>>>    };
>>>
>>> -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
>>> -{
>>> -     unsigned long flags = vma->vm_flags;
>>> -     int ret;
>>> -
>>> -     ret = drm_gem_mmap(filp, vma);
>>> -     if (ret)
>>> -             return ret;
>>> -
>>> -     /* Keep the WC mmaping set by drm_gem_mmap() but our pages
>>> -      * are ordinary and not special.
>>> -      */
>>> -     vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
>>> -     return 0;
>>> -}
>>> -
>>> -static const struct file_operations vgem_driver_fops = {
>>> -     .owner          = THIS_MODULE,
>>> -     .open           = drm_open,
>>> -     .mmap           = vgem_mmap,
>>> -     .poll           = drm_poll,
>>> -     .read           = drm_read,
>>> -     .unlocked_ioctl = drm_ioctl,
>>> -     .compat_ioctl   = drm_compat_ioctl,
>>> -     .release        = drm_release,
>>> -};
>>> -
>>> -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
>>> -{
>>> -     mutex_lock(&bo->pages_lock);
>>> -     if (bo->pages_pin_count++ == 0) {
>>> -             struct page **pages;
>>> -
>>> -             pages = drm_gem_get_pages(&bo->base);
>>> -             if (IS_ERR(pages)) {
>>> -                     bo->pages_pin_count--;
>>> -                     mutex_unlock(&bo->pages_lock);
>>> -                     return pages;
>>> -             }
>>> -
>>> -             bo->pages = pages;
>>> -     }
>>> -     mutex_unlock(&bo->pages_lock);
>>> -
>>> -     return bo->pages;
>>> -}
>>> -
>>> -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
>>> -{
>>> -     mutex_lock(&bo->pages_lock);
>>> -     if (--bo->pages_pin_count == 0) {
>>> -             drm_gem_put_pages(&bo->base, bo->pages, true, true);
>>> -             bo->pages = NULL;
>>> -     }
>>> -     mutex_unlock(&bo->pages_lock);
>>> -}
>>> -
>>> -static int vgem_prime_pin(struct drm_gem_object *obj)
>>> -{
>>> -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
>>> -     long n_pages = obj->size >> PAGE_SHIFT;
>>> -     struct page **pages;
>>> -
>>> -     pages = vgem_pin_pages(bo);
>>> -     if (IS_ERR(pages))
>>> -             return PTR_ERR(pages);
>>> -
>>> -     /* Flush the object from the CPU cache so that importers can rely
>>> -      * on coherent indirect access via the exported dma-address.
>>> -      */
>>> -     drm_clflush_pages(pages, n_pages);
>>> -
>>> -     return 0;
>>> -}
>>> -
>>> -static void vgem_prime_unpin(struct drm_gem_object *obj)
>>> -{
>>> -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
>>> -
>>> -     vgem_unpin_pages(bo);
>>> -}
>>> -
>>> -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
>>> -{
>>> -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
>>> -
>>> -     return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
>>> -}
>>> -
>>> -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
>>> -                                             struct dma_buf *dma_buf)
>>> -{
>>> -     struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
>>> -
>>> -     return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
>>> -}
>>> -
>>> -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
>>> -                     struct dma_buf_attachment *attach, struct sg_table *sg)
>>> -{
>>> -     struct drm_vgem_gem_object *obj;
>>> -     int npages;
>>> -
>>> -     obj = __vgem_gem_create(dev, attach->dmabuf->size);
>>> -     if (IS_ERR(obj))
>>> -             return ERR_CAST(obj);
>>> -
>>> -     npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
>>> -
>>> -     obj->table = sg;
>>> -     obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
>>> -     if (!obj->pages) {
>>> -             __vgem_gem_destroy(obj);
>>> -             return ERR_PTR(-ENOMEM);
>>> -     }
>>> -
>>> -     obj->pages_pin_count++; /* perma-pinned */
>>> -     drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
>>> -     return &obj->base;
>>> -}
>>> -
>>> -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
>>> -{
>>> -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
>>> -     long n_pages = obj->size >> PAGE_SHIFT;
>>> -     struct page **pages;
>>> -     void *vaddr;
>>> -
>>> -     pages = vgem_pin_pages(bo);
>>> -     if (IS_ERR(pages))
>>> -             return PTR_ERR(pages);
>>> -
>>> -     vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
>>> -     if (!vaddr)
>>> -             return -ENOMEM;
>>> -     dma_buf_map_set_vaddr(map, vaddr);
>>> -
>>> -     return 0;
>>> -}
>>> -
>>> -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
>>> -{
>>> -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
>>> -
>>> -     vunmap(map->vaddr);
>>> -     vgem_unpin_pages(bo);
>>> -}
>>> -
>>> -static int vgem_prime_mmap(struct drm_gem_object *obj,
>>> -                        struct vm_area_struct *vma)
>>> -{
>>> -     int ret;
>>> -
>>> -     if (obj->size < vma->vm_end - vma->vm_start)
>>> -             return -EINVAL;
>>> -
>>> -     if (!obj->filp)
>>> -             return -ENODEV;
>>> -
>>> -     ret = call_mmap(obj->filp, vma);
>>> -     if (ret)
>>> -             return ret;
>>> -
>>> -     vma_set_file(vma, obj->filp);
>>> -     vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
>>> -     vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
>>> -
>>> -     return 0;
>>> -}
>>> -
>>> -static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
>>> -     .free = vgem_gem_free_object,
>>> -     .pin = vgem_prime_pin,
>>> -     .unpin = vgem_prime_unpin,
>>> -     .get_sg_table = vgem_prime_get_sg_table,
>>> -     .vmap = vgem_prime_vmap,
>>> -     .vunmap = vgem_prime_vunmap,
>>> -     .vm_ops = &vgem_gem_vm_ops,
>>> -};
>>> +DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
>>>
>>>    static const struct drm_driver vgem_driver = {
>>>        .driver_features                = DRIVER_GEM | DRIVER_RENDER,
>>> @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = {
>>>        .num_ioctls                     = ARRAY_SIZE(vgem_ioctls),
>>>        .fops                           = &vgem_driver_fops,
>>>
>>> -     .dumb_create                    = vgem_gem_dumb_create,
>>> -
>>> -     .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
>>> -     .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
>>> -     .gem_prime_import = vgem_prime_import,
>>> -     .gem_prime_import_sg_table = vgem_prime_import_sg_table,
>>> -     .gem_prime_mmap = vgem_prime_mmap,
>>> +     DRM_GEM_SHMEM_DRIVER_OPS,
>>>
>>>        .name   = DRIVER_NAME,
>>>        .desc   = DRIVER_DESC,
>>>
>>
>> --
>> Thomas Zimmermann
>> Graphics Driver Developer
>> SUSE Software Solutions Germany GmbH
>> Maxfeldstr. 5, 90409 Nürnberg, Germany
>> (HRB 36809, AG Nürnberg)
>> Geschäftsführer: Felix Imendörffer
>>
> 
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



* Re: [Intel-gfx] [PATCH] drm/vgem: use shmem helpers
@ 2021-02-26 13:51           ` Thomas Zimmermann
  0 siblings, 0 replies; 110+ messages in thread
From: Thomas Zimmermann @ 2021-02-26 13:51 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Intel Graphics Development, DRI Development, Chris Wilson,
	Melissa Wen, Daniel Vetter, Christian König



Hi

On 26.02.21 at 14:30, Daniel Vetter wrote:
> On Fri, Feb 26, 2021 at 10:19 AM Thomas Zimmermann <tzimmermann@suse.de> wrote:
>>
>> Hi
>>
>> On 25.02.21 at 11:23, Daniel Vetter wrote:
>>> Aside from deleting lots of code the real motivation here is to switch
>>> the mmap over to VM_PFNMAP, to be more consistent with what real gpu
>>> drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
>>> work, and even if you try and there's a struct page behind that,
>>> touching it and mucking around with its refcount can upset drivers
>>> real bad.
>>>
>>> v2: Review from Thomas:
>>> - sort #include
>>> - drop more dead code that I didn't spot somehow
>>>
>>> v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)
>>
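[Editor's note, not part of the quoted mail] For readers who don't have the mm
side paged in: the "get_user_pages doesn't work" point above is enforced by the
core gup code itself, which refuses I/O and pfn-mapped vmas outright. A rough
sketch of that check (paraphrased from check_vma_flags() in mm/gup.c; the exact
shape varies between kernel versions):

	/* gup bails out early on I/O and pfn-mapped vmas */
	if (vm_flags & (VM_IO | VM_PFNMAP))
		return -EFAULT;

So anything that pins user pages through this path, O_DIRECT reads, RDMA memory
registration and so on, fails cleanly with -EFAULT on a VM_PFNMAP mmap instead
of grabbing struct-page references behind the driver's back.
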
>> Since you're working on it, could you move the config item into a
>> Kconfig file under vgem?
> 
> We have a lot of drivers still without their own Kconfig. I thought
> we're only doing that for drivers which have multiple options, or
> otherwise would clutter up the main drm/Kconfig file?
> 
> Not opposed to this, just feels like if we do this, should do it for
> all of them.

I didn't know that there was a rule for how to handle this. I just 
didn't like to have driver config rules in the main Kconfig file.

But yeah, maybe let's change this consistently in a separate patchset.

Best regards
Thomas

> -Daniel
> 
> 
>>
>> Best regards
>> Thomas
>>
>>>
>>> Cc: Thomas Zimmermann <tzimmermann@suse.de>
>>> Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
>>> Cc: John Stultz <john.stultz@linaro.org>
>>> Cc: Sumit Semwal <sumit.semwal@linaro.org>
>>> Cc: "Christian König" <christian.koenig@amd.com>
>>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
>>> Cc: Melissa Wen <melissa.srw@gmail.com>
>>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>>> ---
>>>    drivers/gpu/drm/Kconfig         |   1 +
>>>    drivers/gpu/drm/vgem/vgem_drv.c | 340 +-------------------------------
>>>    2 files changed, 4 insertions(+), 337 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
>>> index 8e73311de583..94e4ac830283 100644
>>> --- a/drivers/gpu/drm/Kconfig
>>> +++ b/drivers/gpu/drm/Kconfig
>>> @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig"
>>>    config DRM_VGEM
>>>        tristate "Virtual GEM provider"
>>>        depends on DRM
>>> +     select DRM_GEM_SHMEM_HELPER
>>>        help
>>>          Choose this option to get a virtual graphics memory manager,
>>>          as used by Mesa's software renderer for enhanced performance.
>>> diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
>>> index a0e75f1d5d01..b1b3a5ffc542 100644
>>> --- a/drivers/gpu/drm/vgem/vgem_drv.c
>>> +++ b/drivers/gpu/drm/vgem/vgem_drv.c
>>> @@ -38,6 +38,7 @@
>>>
>>>    #include <drm/drm_drv.h>
>>>    #include <drm/drm_file.h>
>>> +#include <drm/drm_gem_shmem_helper.h>
>>>    #include <drm/drm_ioctl.h>
>>>    #include <drm/drm_managed.h>
>>>    #include <drm/drm_prime.h>
>>> @@ -50,87 +51,11 @@
>>>    #define DRIVER_MAJOR        1
>>>    #define DRIVER_MINOR        0
>>>
>>> -static const struct drm_gem_object_funcs vgem_gem_object_funcs;
>>> -
>>>    static struct vgem_device {
>>>        struct drm_device drm;
>>>        struct platform_device *platform;
>>>    } *vgem_device;
>>>
>>> -static void vgem_gem_free_object(struct drm_gem_object *obj)
>>> -{
>>> -     struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
>>> -
>>> -     kvfree(vgem_obj->pages);
>>> -     mutex_destroy(&vgem_obj->pages_lock);
>>> -
>>> -     if (obj->import_attach)
>>> -             drm_prime_gem_destroy(obj, vgem_obj->table);
>>> -
>>> -     drm_gem_object_release(obj);
>>> -     kfree(vgem_obj);
>>> -}
>>> -
>>> -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
>>> -{
>>> -     struct vm_area_struct *vma = vmf->vma;
>>> -     struct drm_vgem_gem_object *obj = vma->vm_private_data;
>>> -     /* We don't use vmf->pgoff since that has the fake offset */
>>> -     unsigned long vaddr = vmf->address;
>>> -     vm_fault_t ret = VM_FAULT_SIGBUS;
>>> -     loff_t num_pages;
>>> -     pgoff_t page_offset;
>>> -     page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
>>> -
>>> -     num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
>>> -
>>> -     if (page_offset >= num_pages)
>>> -             return VM_FAULT_SIGBUS;
>>> -
>>> -     mutex_lock(&obj->pages_lock);
>>> -     if (obj->pages) {
>>> -             get_page(obj->pages[page_offset]);
>>> -             vmf->page = obj->pages[page_offset];
>>> -             ret = 0;
>>> -     }
>>> -     mutex_unlock(&obj->pages_lock);
>>> -     if (ret) {
>>> -             struct page *page;
>>> -
>>> -             page = shmem_read_mapping_page(
>>> -                                     file_inode(obj->base.filp)->i_mapping,
>>> -                                     page_offset);
>>> -             if (!IS_ERR(page)) {
>>> -                     vmf->page = page;
>>> -                     ret = 0;
>>> -             } else switch (PTR_ERR(page)) {
>>> -                     case -ENOSPC:
>>> -                     case -ENOMEM:
>>> -                             ret = VM_FAULT_OOM;
>>> -                             break;
>>> -                     case -EBUSY:
>>> -                             ret = VM_FAULT_RETRY;
>>> -                             break;
>>> -                     case -EFAULT:
>>> -                     case -EINVAL:
>>> -                             ret = VM_FAULT_SIGBUS;
>>> -                             break;
>>> -                     default:
>>> -                             WARN_ON(PTR_ERR(page));
>>> -                             ret = VM_FAULT_SIGBUS;
>>> -                             break;
>>> -             }
>>> -
>>> -     }
>>> -     return ret;
>>> -}
>>> -
>>> -static const struct vm_operations_struct vgem_gem_vm_ops = {
>>> -     .fault = vgem_gem_fault,
>>> -     .open = drm_gem_vm_open,
>>> -     .close = drm_gem_vm_close,
>>> -};
>>> -
>>>    static int vgem_open(struct drm_device *dev, struct drm_file *file)
>>>    {
>>>        struct vgem_file *vfile;
>>> @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
>>>        kfree(vfile);
>>>    }
>>>
>>> -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
>>> -                                             unsigned long size)
>>> -{
>>> -     struct drm_vgem_gem_object *obj;
>>> -     int ret;
>>> -
>>> -     obj = kzalloc(sizeof(*obj), GFP_KERNEL);
>>> -     if (!obj)
>>> -             return ERR_PTR(-ENOMEM);
>>> -
>>> -     obj->base.funcs = &vgem_gem_object_funcs;
>>> -
>>> -     ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
>>> -     if (ret) {
>>> -             kfree(obj);
>>> -             return ERR_PTR(ret);
>>> -     }
>>> -
>>> -     mutex_init(&obj->pages_lock);
>>> -
>>> -     return obj;
>>> -}
>>> -
>>> -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
>>> -{
>>> -     drm_gem_object_release(&obj->base);
>>> -     kfree(obj);
>>> -}
>>> -
>>> -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
>>> -                                           struct drm_file *file,
>>> -                                           unsigned int *handle,
>>> -                                           unsigned long size)
>>> -{
>>> -     struct drm_vgem_gem_object *obj;
>>> -     int ret;
>>> -
>>> -     obj = __vgem_gem_create(dev, size);
>>> -     if (IS_ERR(obj))
>>> -             return ERR_CAST(obj);
>>> -
>>> -     ret = drm_gem_handle_create(file, &obj->base, handle);
>>> -     if (ret) {
>>> -             drm_gem_object_put(&obj->base);
>>> -             return ERR_PTR(ret);
>>> -     }
>>> -
>>> -     return &obj->base;
>>> -}
>>> -
>>> -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
>>> -                             struct drm_mode_create_dumb *args)
>>> -{
>>> -     struct drm_gem_object *gem_object;
>>> -     u64 pitch, size;
>>> -
>>> -     pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
>>> -     size = args->height * pitch;
>>> -     if (size == 0)
>>> -             return -EINVAL;
>>> -
>>> -     gem_object = vgem_gem_create(dev, file, &args->handle, size);
>>> -     if (IS_ERR(gem_object))
>>> -             return PTR_ERR(gem_object);
>>> -
>>> -     args->size = gem_object->size;
>>> -     args->pitch = pitch;
>>> -
>>> -     drm_gem_object_put(gem_object);
>>> -
>>> -     DRM_DEBUG("Created object of size %llu\n", args->size);
>>> -
>>> -     return 0;
>>> -}
>>> -
>>>    static struct drm_ioctl_desc vgem_ioctls[] = {
>>>        DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
>>>        DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
>>>    };
>>>
>>> -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
>>> -{
>>> -     unsigned long flags = vma->vm_flags;
>>> -     int ret;
>>> -
>>> -     ret = drm_gem_mmap(filp, vma);
>>> -     if (ret)
>>> -             return ret;
>>> -
>>> -     /* Keep the WC mmaping set by drm_gem_mmap() but our pages
>>> -      * are ordinary and not special.
>>> -      */
>>> -     vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
>>> -     return 0;
>>> -}
>>> -
>>> -static const struct file_operations vgem_driver_fops = {
>>> -     .owner          = THIS_MODULE,
>>> -     .open           = drm_open,
>>> -     .mmap           = vgem_mmap,
>>> -     .poll           = drm_poll,
>>> -     .read           = drm_read,
>>> -     .unlocked_ioctl = drm_ioctl,
>>> -     .compat_ioctl   = drm_compat_ioctl,
>>> -     .release        = drm_release,
>>> -};
>>> -
>>> -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
>>> -{
>>> -     mutex_lock(&bo->pages_lock);
>>> -     if (bo->pages_pin_count++ == 0) {
>>> -             struct page **pages;
>>> -
>>> -             pages = drm_gem_get_pages(&bo->base);
>>> -             if (IS_ERR(pages)) {
>>> -                     bo->pages_pin_count--;
>>> -                     mutex_unlock(&bo->pages_lock);
>>> -                     return pages;
>>> -             }
>>> -
>>> -             bo->pages = pages;
>>> -     }
>>> -     mutex_unlock(&bo->pages_lock);
>>> -
>>> -     return bo->pages;
>>> -}
>>> -
>>> -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
>>> -{
>>> -     mutex_lock(&bo->pages_lock);
>>> -     if (--bo->pages_pin_count == 0) {
>>> -             drm_gem_put_pages(&bo->base, bo->pages, true, true);
>>> -             bo->pages = NULL;
>>> -     }
>>> -     mutex_unlock(&bo->pages_lock);
>>> -}
>>> -
>>> -static int vgem_prime_pin(struct drm_gem_object *obj)
>>> -{
>>> -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
>>> -     long n_pages = obj->size >> PAGE_SHIFT;
>>> -     struct page **pages;
>>> -
>>> -     pages = vgem_pin_pages(bo);
>>> -     if (IS_ERR(pages))
>>> -             return PTR_ERR(pages);
>>> -
>>> -     /* Flush the object from the CPU cache so that importers can rely
>>> -      * on coherent indirect access via the exported dma-address.
>>> -      */
>>> -     drm_clflush_pages(pages, n_pages);
>>> -
>>> -     return 0;
>>> -}
>>> -
>>> -static void vgem_prime_unpin(struct drm_gem_object *obj)
>>> -{
>>> -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
>>> -
>>> -     vgem_unpin_pages(bo);
>>> -}
>>> -
>>> -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
>>> -{
>>> -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
>>> -
>>> -     return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
>>> -}
>>> -
>>> -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
>>> -                                             struct dma_buf *dma_buf)
>>> -{
>>> -     struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
>>> -
>>> -     return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
>>> -}
>>> -
>>> -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
>>> -                     struct dma_buf_attachment *attach, struct sg_table *sg)
>>> -{
>>> -     struct drm_vgem_gem_object *obj;
>>> -     int npages;
>>> -
>>> -     obj = __vgem_gem_create(dev, attach->dmabuf->size);
>>> -     if (IS_ERR(obj))
>>> -             return ERR_CAST(obj);
>>> -
>>> -     npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
>>> -
>>> -     obj->table = sg;
>>> -     obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
>>> -     if (!obj->pages) {
>>> -             __vgem_gem_destroy(obj);
>>> -             return ERR_PTR(-ENOMEM);
>>> -     }
>>> -
>>> -     obj->pages_pin_count++; /* perma-pinned */
>>> -     drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
>>> -     return &obj->base;
>>> -}
>>> -
>>> -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
>>> -{
>>> -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
>>> -     long n_pages = obj->size >> PAGE_SHIFT;
>>> -     struct page **pages;
>>> -     void *vaddr;
>>> -
>>> -     pages = vgem_pin_pages(bo);
>>> -     if (IS_ERR(pages))
>>> -             return PTR_ERR(pages);
>>> -
>>> -     vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
>>> -     if (!vaddr)
>>> -             return -ENOMEM;
>>> -     dma_buf_map_set_vaddr(map, vaddr);
>>> -
>>> -     return 0;
>>> -}
>>> -
>>> -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
>>> -{
>>> -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
>>> -
>>> -     vunmap(map->vaddr);
>>> -     vgem_unpin_pages(bo);
>>> -}
>>> -
>>> -static int vgem_prime_mmap(struct drm_gem_object *obj,
>>> -                        struct vm_area_struct *vma)
>>> -{
>>> -     int ret;
>>> -
>>> -     if (obj->size < vma->vm_end - vma->vm_start)
>>> -             return -EINVAL;
>>> -
>>> -     if (!obj->filp)
>>> -             return -ENODEV;
>>> -
>>> -     ret = call_mmap(obj->filp, vma);
>>> -     if (ret)
>>> -             return ret;
>>> -
>>> -     vma_set_file(vma, obj->filp);
>>> -     vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
>>> -     vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
>>> -
>>> -     return 0;
>>> -}
>>> -
>>> -static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
>>> -     .free = vgem_gem_free_object,
>>> -     .pin = vgem_prime_pin,
>>> -     .unpin = vgem_prime_unpin,
>>> -     .get_sg_table = vgem_prime_get_sg_table,
>>> -     .vmap = vgem_prime_vmap,
>>> -     .vunmap = vgem_prime_vunmap,
>>> -     .vm_ops = &vgem_gem_vm_ops,
>>> -};
>>> +DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
>>>
>>>    static const struct drm_driver vgem_driver = {
>>>        .driver_features                = DRIVER_GEM | DRIVER_RENDER,
>>> @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = {
>>>        .num_ioctls                     = ARRAY_SIZE(vgem_ioctls),
>>>        .fops                           = &vgem_driver_fops,
>>>
>>> -     .dumb_create                    = vgem_gem_dumb_create,
>>> -
>>> -     .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
>>> -     .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
>>> -     .gem_prime_import = vgem_prime_import,
>>> -     .gem_prime_import_sg_table = vgem_prime_import_sg_table,
>>> -     .gem_prime_mmap = vgem_prime_mmap,
>>> +     DRM_GEM_SHMEM_DRIVER_OPS,
>>>
>>>        .name   = DRIVER_NAME,
>>>        .desc   = DRIVER_DESC,
>>>
>>
>> --
>> Thomas Zimmermann
>> Graphics Driver Developer
>> SUSE Software Solutions Germany GmbH
>> Maxfeldstr. 5, 90409 Nürnberg, Germany
>> (HRB 36809, AG Nürnberg)
>> Geschäftsführer: Felix Imendörffer
>>
> 
> 

-- 
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Felix Imendörffer



* Re: [PATCH] drm/vgem: use shmem helpers
  2021-02-26 13:51           ` [Intel-gfx] " Thomas Zimmermann
@ 2021-02-26 14:04             ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-26 14:04 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: Daniel Vetter, Intel Graphics Development, DRI Development,
	Chris Wilson, Melissa Wen, Daniel Vetter, Christian König

On Fri, Feb 26, 2021 at 02:51:58PM +0100, Thomas Zimmermann wrote:
> Hi
> 
> On 26.02.21 at 14:30, Daniel Vetter wrote:
> > On Fri, Feb 26, 2021 at 10:19 AM Thomas Zimmermann <tzimmermann@suse.de> wrote:
> > > 
> > > Hi
> > > 
> > > On 25.02.21 at 11:23, Daniel Vetter wrote:
> > > > Aside from deleting lots of code the real motivation here is to switch
> > > > the mmap over to VM_PFNMAP, to be more consistent with what real gpu
> > > > drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
> > > > work, and even if you try and there's a struct page behind that,
> > > > touching it and mucking around with its refcount can upset drivers
> > > > real bad.
> > > > 
> > > > v2: Review from Thomas:
> > > > - sort #include
> > > > - drop more dead code that I didn't spot somehow
> > > > 
> > > > v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)
> > > 
> > > Since you're working on it, could you move the config item into a
> > > Kconfig file under vgem?
> > 
> > We have a lot of drivers still without their own Kconfig. I thought
> > we're only doing that for drivers which have multiple options, or
> > otherwise would clutter up the main drm/Kconfig file?
> > 
> > Not opposed to this, just feels like if we do this, should do it for
> > all of them.
> 
> I didn't know that there was a rule for how to handle this. I just didn't
> like to have driver config rules in the main Kconfig file.

I don't think it is an actual rule, just how the driver Kconfig files
started out.

> But yeah, maybe let's change this consistently in a separate patchset.

Yeah I looked, we should also put all the driver files at the bottom, and
maybe sort them alphabetically or something like that. It's a bit of a mess
right now.
-Daniel
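
[Editor's sketch, not part of the thread] For illustration, the per-driver
Kconfig split discussed above could look roughly like this. It reuses the
DRM_VGEM entry text from the patch; the file path drivers/gpu/drm/vgem/Kconfig
is an assumption, nothing like this was actually posted in the series:

	# in drivers/gpu/drm/Kconfig
	source "drivers/gpu/drm/vgem/Kconfig"

	# drivers/gpu/drm/vgem/Kconfig (hypothetical new file)
	config DRM_VGEM
		tristate "Virtual GEM provider"
		depends on DRM
		select DRM_GEM_SHMEM_HELPER
		help
		  Choose this option to get a virtual graphics memory manager,
		  as used by Mesa's software renderer for enhanced performance.

The upside is the one argued above: the driver's config description lives next
to its code, and the main drm/Kconfig shrinks to a list of source lines.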

> 
> Best regards
> Thomas
> 
> > -Daniel
> > 
> > 
> > > 
> > > Best regards
> > > Thomas
> > > 
> > > > 
> > > > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > > > Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
> > > > Cc: John Stultz <john.stultz@linaro.org>
> > > > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > Cc: Melissa Wen <melissa.srw@gmail.com>
> > > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > > ---
> > > >    drivers/gpu/drm/Kconfig         |   1 +
> > > >    drivers/gpu/drm/vgem/vgem_drv.c | 340 +-------------------------------
> > > >    2 files changed, 4 insertions(+), 337 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > > > index 8e73311de583..94e4ac830283 100644
> > > > --- a/drivers/gpu/drm/Kconfig
> > > > +++ b/drivers/gpu/drm/Kconfig
> > > > @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig"
> > > >    config DRM_VGEM
> > > >        tristate "Virtual GEM provider"
> > > >        depends on DRM
> > > > +     select DRM_GEM_SHMEM_HELPER
> > > >        help
> > > >          Choose this option to get a virtual graphics memory manager,
> > > >          as used by Mesa's software renderer for enhanced performance.
> > > > diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
> > > > index a0e75f1d5d01..b1b3a5ffc542 100644
> > > > --- a/drivers/gpu/drm/vgem/vgem_drv.c
> > > > +++ b/drivers/gpu/drm/vgem/vgem_drv.c
> > > > @@ -38,6 +38,7 @@
> > > > 
> > > >    #include <drm/drm_drv.h>
> > > >    #include <drm/drm_file.h>
> > > > +#include <drm/drm_gem_shmem_helper.h>
> > > >    #include <drm/drm_ioctl.h>
> > > >    #include <drm/drm_managed.h>
> > > >    #include <drm/drm_prime.h>
> > > > @@ -50,87 +51,11 @@
> > > >    #define DRIVER_MAJOR        1
> > > >    #define DRIVER_MINOR        0
> > > > 
> > > > -static const struct drm_gem_object_funcs vgem_gem_object_funcs;
> > > > -
> > > >    static struct vgem_device {
> > > >        struct drm_device drm;
> > > >        struct platform_device *platform;
> > > >    } *vgem_device;
> > > > 
> > > > -static void vgem_gem_free_object(struct drm_gem_object *obj)
> > > > -{
> > > > -     struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
> > > > -
> > > > -     kvfree(vgem_obj->pages);
> > > > -     mutex_destroy(&vgem_obj->pages_lock);
> > > > -
> > > > -     if (obj->import_attach)
> > > > -             drm_prime_gem_destroy(obj, vgem_obj->table);
> > > > -
> > > > -     drm_gem_object_release(obj);
> > > > -     kfree(vgem_obj);
> > > > -}
> > > > -
> > > > -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
> > > > -{
> > > > -     struct vm_area_struct *vma = vmf->vma;
> > > > -     struct drm_vgem_gem_object *obj = vma->vm_private_data;
> > > > -     /* We don't use vmf->pgoff since that has the fake offset */
> > > > -     unsigned long vaddr = vmf->address;
> > > > -     vm_fault_t ret = VM_FAULT_SIGBUS;
> > > > -     loff_t num_pages;
> > > > -     pgoff_t page_offset;
> > > > -     page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
> > > > -
> > > > -     num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
> > > > -
> > > > -     if (page_offset >= num_pages)
> > > > -             return VM_FAULT_SIGBUS;
> > > > -
> > > > -     mutex_lock(&obj->pages_lock);
> > > > -     if (obj->pages) {
> > > > -             get_page(obj->pages[page_offset]);
> > > > -             vmf->page = obj->pages[page_offset];
> > > > -             ret = 0;
> > > > -     }
> > > > -     mutex_unlock(&obj->pages_lock);
> > > > -     if (ret) {
> > > > -             struct page *page;
> > > > -
> > > > -             page = shmem_read_mapping_page(
> > > > -                                     file_inode(obj->base.filp)->i_mapping,
> > > > -                                     page_offset);
> > > > -             if (!IS_ERR(page)) {
> > > > -                     vmf->page = page;
> > > > -                     ret = 0;
> > > > -             } else switch (PTR_ERR(page)) {
> > > > -                     case -ENOSPC:
> > > > -                     case -ENOMEM:
> > > > -                             ret = VM_FAULT_OOM;
> > > > -                             break;
> > > > -                     case -EBUSY:
> > > > -                             ret = VM_FAULT_RETRY;
> > > > -                             break;
> > > > -                     case -EFAULT:
> > > > -                     case -EINVAL:
> > > > -                             ret = VM_FAULT_SIGBUS;
> > > > -                             break;
> > > > -                     default:
> > > > -                             WARN_ON(PTR_ERR(page));
> > > > -                             ret = VM_FAULT_SIGBUS;
> > > > -                             break;
> > > > -             }
> > > > -
> > > > -     }
> > > > -     return ret;
> > > > -}
> > > > -
> > > > -static const struct vm_operations_struct vgem_gem_vm_ops = {
> > > > -     .fault = vgem_gem_fault,
> > > > -     .open = drm_gem_vm_open,
> > > > -     .close = drm_gem_vm_close,
> > > > -};
> > > > -
> > > >    static int vgem_open(struct drm_device *dev, struct drm_file *file)
> > > >    {
> > > >        struct vgem_file *vfile;
> > > > @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
> > > >        kfree(vfile);
> > > >    }
> > > > 
> > > > -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
> > > > -                                             unsigned long size)
> > > > -{
> > > > -     struct drm_vgem_gem_object *obj;
> > > > -     int ret;
> > > > -
> > > > -     obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> > > > -     if (!obj)
> > > > -             return ERR_PTR(-ENOMEM);
> > > > -
> > > > -     obj->base.funcs = &vgem_gem_object_funcs;
> > > > -
> > > > -     ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
> > > > -     if (ret) {
> > > > -             kfree(obj);
> > > > -             return ERR_PTR(ret);
> > > > -     }
> > > > -
> > > > -     mutex_init(&obj->pages_lock);
> > > > -
> > > > -     return obj;
> > > > -}
> > > > -
> > > > -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
> > > > -{
> > > > -     drm_gem_object_release(&obj->base);
> > > > -     kfree(obj);
> > > > -}
> > > > -
> > > > -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> > > > -                                           struct drm_file *file,
> > > > -                                           unsigned int *handle,
> > > > -                                           unsigned long size)
> > > > -{
> > > > -     struct drm_vgem_gem_object *obj;
> > > > -     int ret;
> > > > -
> > > > -     obj = __vgem_gem_create(dev, size);
> > > > -     if (IS_ERR(obj))
> > > > -             return ERR_CAST(obj);
> > > > -
> > > > -     ret = drm_gem_handle_create(file, &obj->base, handle);
> > > > -     if (ret) {
> > > > -             drm_gem_object_put(&obj->base);
> > > > -             return ERR_PTR(ret);
> > > > -     }
> > > > -
> > > > -     return &obj->base;
> > > > -}
> > > > -
> > > > -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
> > > > -                             struct drm_mode_create_dumb *args)
> > > > -{
> > > > -     struct drm_gem_object *gem_object;
> > > > -     u64 pitch, size;
> > > > -
> > > > -     pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
> > > > -     size = args->height * pitch;
> > > > -     if (size == 0)
> > > > -             return -EINVAL;
> > > > -
> > > > -     gem_object = vgem_gem_create(dev, file, &args->handle, size);
> > > > -     if (IS_ERR(gem_object))
> > > > -             return PTR_ERR(gem_object);
> > > > -
> > > > -     args->size = gem_object->size;
> > > > -     args->pitch = pitch;
> > > > -
> > > > -     drm_gem_object_put(gem_object);
> > > > -
> > > > -     DRM_DEBUG("Created object of size %llu\n", args->size);
> > > > -
> > > > -     return 0;
> > > > -}
> > > > -
> > > >    static struct drm_ioctl_desc vgem_ioctls[] = {
> > > >        DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
> > > >        DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
> > > >    };
> > > > 
> > > > -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
> > > > -{
> > > > -     unsigned long flags = vma->vm_flags;
> > > > -     int ret;
> > > > -
> > > > -     ret = drm_gem_mmap(filp, vma);
> > > > -     if (ret)
> > > > -             return ret;
> > > > -
> > > > -     /* Keep the WC mmaping set by drm_gem_mmap() but our pages
> > > > -      * are ordinary and not special.
> > > > -      */
> > > > -     vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
> > > > -     return 0;
> > > > -}
> > > > -
> > > > -static const struct file_operations vgem_driver_fops = {
> > > > -     .owner          = THIS_MODULE,
> > > > -     .open           = drm_open,
> > > > -     .mmap           = vgem_mmap,
> > > > -     .poll           = drm_poll,
> > > > -     .read           = drm_read,
> > > > -     .unlocked_ioctl = drm_ioctl,
> > > > -     .compat_ioctl   = drm_compat_ioctl,
> > > > -     .release        = drm_release,
> > > > -};
> > > > -
> > > > -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
> > > > -{
> > > > -     mutex_lock(&bo->pages_lock);
> > > > -     if (bo->pages_pin_count++ == 0) {
> > > > -             struct page **pages;
> > > > -
> > > > -             pages = drm_gem_get_pages(&bo->base);
> > > > -             if (IS_ERR(pages)) {
> > > > -                     bo->pages_pin_count--;
> > > > -                     mutex_unlock(&bo->pages_lock);
> > > > -                     return pages;
> > > > -             }
> > > > -
> > > > -             bo->pages = pages;
> > > > -     }
> > > > -     mutex_unlock(&bo->pages_lock);
> > > > -
> > > > -     return bo->pages;
> > > > -}
> > > > -
> > > > -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
> > > > -{
> > > > -     mutex_lock(&bo->pages_lock);
> > > > -     if (--bo->pages_pin_count == 0) {
> > > > -             drm_gem_put_pages(&bo->base, bo->pages, true, true);
> > > > -             bo->pages = NULL;
> > > > -     }
> > > > -     mutex_unlock(&bo->pages_lock);
> > > > -}
> > > > -
> > > > -static int vgem_prime_pin(struct drm_gem_object *obj)
> > > > -{
> > > > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > > > -     long n_pages = obj->size >> PAGE_SHIFT;
> > > > -     struct page **pages;
> > > > -
> > > > -     pages = vgem_pin_pages(bo);
> > > > -     if (IS_ERR(pages))
> > > > -             return PTR_ERR(pages);
> > > > -
> > > > -     /* Flush the object from the CPU cache so that importers can rely
> > > > -      * on coherent indirect access via the exported dma-address.
> > > > -      */
> > > > -     drm_clflush_pages(pages, n_pages);
> > > > -
> > > > -     return 0;
> > > > -}
> > > > -
> > > > -static void vgem_prime_unpin(struct drm_gem_object *obj)
> > > > -{
> > > > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > > > -
> > > > -     vgem_unpin_pages(bo);
> > > > -}
> > > > -
> > > > -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
> > > > -{
> > > > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > > > -
> > > > -     return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
> > > > -}
> > > > -
> > > > -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
> > > > -                                             struct dma_buf *dma_buf)
> > > > -{
> > > > -     struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
> > > > -
> > > > -     return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
> > > > -}
> > > > -
> > > > -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
> > > > -                     struct dma_buf_attachment *attach, struct sg_table *sg)
> > > > -{
> > > > -     struct drm_vgem_gem_object *obj;
> > > > -     int npages;
> > > > -
> > > > -     obj = __vgem_gem_create(dev, attach->dmabuf->size);
> > > > -     if (IS_ERR(obj))
> > > > -             return ERR_CAST(obj);
> > > > -
> > > > -     npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
> > > > -
> > > > -     obj->table = sg;
> > > > -     obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
> > > > -     if (!obj->pages) {
> > > > -             __vgem_gem_destroy(obj);
> > > > -             return ERR_PTR(-ENOMEM);
> > > > -     }
> > > > -
> > > > -     obj->pages_pin_count++; /* perma-pinned */
> > > > -     drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
> > > > -     return &obj->base;
> > > > -}
> > > > -
> > > > -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
> > > > -{
> > > > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > > > -     long n_pages = obj->size >> PAGE_SHIFT;
> > > > -     struct page **pages;
> > > > -     void *vaddr;
> > > > -
> > > > -     pages = vgem_pin_pages(bo);
> > > > -     if (IS_ERR(pages))
> > > > -             return PTR_ERR(pages);
> > > > -
> > > > -     vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
> > > > -     if (!vaddr)
> > > > -             return -ENOMEM;
> > > > -     dma_buf_map_set_vaddr(map, vaddr);
> > > > -
> > > > -     return 0;
> > > > -}
> > > > -
> > > > -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
> > > > -{
> > > > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > > > -
> > > > -     vunmap(map->vaddr);
> > > > -     vgem_unpin_pages(bo);
> > > > -}
> > > > -
> > > > -static int vgem_prime_mmap(struct drm_gem_object *obj,
> > > > -                        struct vm_area_struct *vma)
> > > > -{
> > > > -     int ret;
> > > > -
> > > > -     if (obj->size < vma->vm_end - vma->vm_start)
> > > > -             return -EINVAL;
> > > > -
> > > > -     if (!obj->filp)
> > > > -             return -ENODEV;
> > > > -
> > > > -     ret = call_mmap(obj->filp, vma);
> > > > -     if (ret)
> > > > -             return ret;
> > > > -
> > > > -     vma_set_file(vma, obj->filp);
> > > > -     vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> > > > -     vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
> > > > -
> > > > -     return 0;
> > > > -}
> > > > -
> > > > -static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
> > > > -     .free = vgem_gem_free_object,
> > > > -     .pin = vgem_prime_pin,
> > > > -     .unpin = vgem_prime_unpin,
> > > > -     .get_sg_table = vgem_prime_get_sg_table,
> > > > -     .vmap = vgem_prime_vmap,
> > > > -     .vunmap = vgem_prime_vunmap,
> > > > -     .vm_ops = &vgem_gem_vm_ops,
> > > > -};
> > > > +DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
> > > > 
> > > >    static const struct drm_driver vgem_driver = {
> > > >        .driver_features                = DRIVER_GEM | DRIVER_RENDER,
> > > > @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = {
> > > >        .num_ioctls                     = ARRAY_SIZE(vgem_ioctls),
> > > >        .fops                           = &vgem_driver_fops,
> > > > 
> > > > -     .dumb_create                    = vgem_gem_dumb_create,
> > > > -
> > > > -     .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
> > > > -     .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
> > > > -     .gem_prime_import = vgem_prime_import,
> > > > -     .gem_prime_import_sg_table = vgem_prime_import_sg_table,
> > > > -     .gem_prime_mmap = vgem_prime_mmap,
> > > > +     DRM_GEM_SHMEM_DRIVER_OPS,
> > > > 
> > > >        .name   = DRIVER_NAME,
> > > >        .desc   = DRIVER_DESC,
> > > > 
> > > 
> > > --
> > > Thomas Zimmermann
> > > Graphics Driver Developer
> > > SUSE Software Solutions Germany GmbH
> > > Maxfeldstr. 5, 90409 Nürnberg, Germany
> > > (HRB 36809, AG Nürnberg)
> > > Geschäftsführer: Felix Imendörffer
> > > 
> > 
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
> 




-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [Intel-gfx] [PATCH] drm/vgem: use shmem helpers
@ 2021-02-26 14:04             ` Daniel Vetter
  0 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-26 14:04 UTC (permalink / raw)
  To: Thomas Zimmermann
  Cc: Daniel Vetter, Intel Graphics Development, DRI Development,
	Chris Wilson, Melissa Wen, Daniel Vetter, Christian König

On Fri, Feb 26, 2021 at 02:51:58PM +0100, Thomas Zimmermann wrote:
> Hi
> 
> On 26.02.21 at 14:30, Daniel Vetter wrote:
> > On Fri, Feb 26, 2021 at 10:19 AM Thomas Zimmermann <tzimmermann@suse.de> wrote:
> > > 
> > > Hi
> > > 
> > > On 25.02.21 at 11:23, Daniel Vetter wrote:
> > > > Aside from deleting lots of code the real motivation here is to switch
> > > > the mmap over to VM_PFNMAP, to be more consistent with what real gpu
> > > > drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
> > > > work, and even if you try and there's a struct page behind that,
> > > > touching it and mucking around with its refcount can upset drivers
> > > > real bad.
> > > > 
> > > > v2: Review from Thomas:
> > > > - sort #include
> > > > - drop more dead code that I didn't spot somehow
> > > > 
> > > > v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)
> > > 
> > > Since you're working on it, could you move the config item into a
> > > Kconfig file under vgem?
> > 
> > We have a lot of drivers still without their own Kconfig. I thought
> > we're only doing that for drivers which have multiple options, or
> > otherwise would clutter up the main drm/Kconfig file?
> > 
> > Not opposed to this, just feels like if we do this, should do it for
> > all of them.
> 
> I didn't know that there was a rule for how to handle this. I just didn't
> like to have driver config rules in the main Kconfig file.

I don't think it is an actual rule, just how the driver Kconfig files
started out.

> But yeah, maybe let's change this consistently in a separate patchset.

Yeah I looked, we should also put all the driver files at the bottom, and
maybe sort them alphabetically or something like that. It's a bit of a mess
right now.
-Daniel

> 
> Best regards
> Thomas
> 
> > -Daniel
> > 
> > 
> > > 
> > > Best regards
> > > Thomas
> > > 
> > > > 
> > > > Cc: Thomas Zimmermann <tzimmermann@suse.de>
> > > > Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
> > > > Cc: John Stultz <john.stultz@linaro.org>
> > > > Cc: Sumit Semwal <sumit.semwal@linaro.org>
> > > > Cc: "Christian König" <christian.koenig@amd.com>
> > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
> > > > Cc: Melissa Wen <melissa.srw@gmail.com>
> > > > Cc: Chris Wilson <chris@chris-wilson.co.uk>
> > > > ---
> > > >    drivers/gpu/drm/Kconfig         |   1 +
> > > >    drivers/gpu/drm/vgem/vgem_drv.c | 340 +-------------------------------
> > > >    2 files changed, 4 insertions(+), 337 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> > > > index 8e73311de583..94e4ac830283 100644
> > > > --- a/drivers/gpu/drm/Kconfig
> > > > +++ b/drivers/gpu/drm/Kconfig
> > > > @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig"
> > > >    config DRM_VGEM
> > > >        tristate "Virtual GEM provider"
> > > >        depends on DRM
> > > > +     select DRM_GEM_SHMEM_HELPER
> > > >        help
> > > >          Choose this option to get a virtual graphics memory manager,
> > > >          as used by Mesa's software renderer for enhanced performance.
> > > > diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
> > > > index a0e75f1d5d01..b1b3a5ffc542 100644
> > > > --- a/drivers/gpu/drm/vgem/vgem_drv.c
> > > > +++ b/drivers/gpu/drm/vgem/vgem_drv.c
> > > > @@ -38,6 +38,7 @@
> > > > 
> > > >    #include <drm/drm_drv.h>
> > > >    #include <drm/drm_file.h>
> > > > +#include <drm/drm_gem_shmem_helper.h>
> > > >    #include <drm/drm_ioctl.h>
> > > >    #include <drm/drm_managed.h>
> > > >    #include <drm/drm_prime.h>
> > > > @@ -50,87 +51,11 @@
> > > >    #define DRIVER_MAJOR        1
> > > >    #define DRIVER_MINOR        0
> > > > 
> > > > -static const struct drm_gem_object_funcs vgem_gem_object_funcs;
> > > > -
> > > >    static struct vgem_device {
> > > >        struct drm_device drm;
> > > >        struct platform_device *platform;
> > > >    } *vgem_device;
> > > > 
> > > > -static void vgem_gem_free_object(struct drm_gem_object *obj)
> > > > -{
> > > > -     struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
> > > > -
> > > > -     kvfree(vgem_obj->pages);
> > > > -     mutex_destroy(&vgem_obj->pages_lock);
> > > > -
> > > > -     if (obj->import_attach)
> > > > -             drm_prime_gem_destroy(obj, vgem_obj->table);
> > > > -
> > > > -     drm_gem_object_release(obj);
> > > > -     kfree(vgem_obj);
> > > > -}
> > > > -
> > > > -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
> > > > -{
> > > > -     struct vm_area_struct *vma = vmf->vma;
> > > > -     struct drm_vgem_gem_object *obj = vma->vm_private_data;
> > > > -     /* We don't use vmf->pgoff since that has the fake offset */
> > > > -     unsigned long vaddr = vmf->address;
> > > > -     vm_fault_t ret = VM_FAULT_SIGBUS;
> > > > -     loff_t num_pages;
> > > > -     pgoff_t page_offset;
> > > > -     page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT;
> > > > -
> > > > -     num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE);
> > > > -
> > > > -     if (page_offset >= num_pages)
> > > > -             return VM_FAULT_SIGBUS;
> > > > -
> > > > -     mutex_lock(&obj->pages_lock);
> > > > -     if (obj->pages) {
> > > > -             get_page(obj->pages[page_offset]);
> > > > -             vmf->page = obj->pages[page_offset];
> > > > -             ret = 0;
> > > > -     }
> > > > -     mutex_unlock(&obj->pages_lock);
> > > > -     if (ret) {
> > > > -             struct page *page;
> > > > -
> > > > -             page = shmem_read_mapping_page(
> > > > -                                     file_inode(obj->base.filp)->i_mapping,
> > > > -                                     page_offset);
> > > > -             if (!IS_ERR(page)) {
> > > > -                     vmf->page = page;
> > > > -                     ret = 0;
> > > > -             } else switch (PTR_ERR(page)) {
> > > > -                     case -ENOSPC:
> > > > -                     case -ENOMEM:
> > > > -                             ret = VM_FAULT_OOM;
> > > > -                             break;
> > > > -                     case -EBUSY:
> > > > -                             ret = VM_FAULT_RETRY;
> > > > -                             break;
> > > > -                     case -EFAULT:
> > > > -                     case -EINVAL:
> > > > -                             ret = VM_FAULT_SIGBUS;
> > > > -                             break;
> > > > -                     default:
> > > > -                             WARN_ON(PTR_ERR(page));
> > > > -                             ret = VM_FAULT_SIGBUS;
> > > > -                             break;
> > > > -             }
> > > > -
> > > > -     }
> > > > -     return ret;
> > > > -}
> > > > -
> > > > -static const struct vm_operations_struct vgem_gem_vm_ops = {
> > > > -     .fault = vgem_gem_fault,
> > > > -     .open = drm_gem_vm_open,
> > > > -     .close = drm_gem_vm_close,
> > > > -};
> > > > -
> > > >    static int vgem_open(struct drm_device *dev, struct drm_file *file)
> > > >    {
> > > >        struct vgem_file *vfile;
> > > > @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
> > > >        kfree(vfile);
> > > >    }
> > > > 
> > > > -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
> > > > -                                             unsigned long size)
> > > > -{
> > > > -     struct drm_vgem_gem_object *obj;
> > > > -     int ret;
> > > > -
> > > > -     obj = kzalloc(sizeof(*obj), GFP_KERNEL);
> > > > -     if (!obj)
> > > > -             return ERR_PTR(-ENOMEM);
> > > > -
> > > > -     obj->base.funcs = &vgem_gem_object_funcs;
> > > > -
> > > > -     ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
> > > > -     if (ret) {
> > > > -             kfree(obj);
> > > > -             return ERR_PTR(ret);
> > > > -     }
> > > > -
> > > > -     mutex_init(&obj->pages_lock);
> > > > -
> > > > -     return obj;
> > > > -}
> > > > -
> > > > -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
> > > > -{
> > > > -     drm_gem_object_release(&obj->base);
> > > > -     kfree(obj);
> > > > -}
> > > > -
> > > > -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
> > > > -                                           struct drm_file *file,
> > > > -                                           unsigned int *handle,
> > > > -                                           unsigned long size)
> > > > -{
> > > > -     struct drm_vgem_gem_object *obj;
> > > > -     int ret;
> > > > -
> > > > -     obj = __vgem_gem_create(dev, size);
> > > > -     if (IS_ERR(obj))
> > > > -             return ERR_CAST(obj);
> > > > -
> > > > -     ret = drm_gem_handle_create(file, &obj->base, handle);
> > > > -     if (ret) {
> > > > -             drm_gem_object_put(&obj->base);
> > > > -             return ERR_PTR(ret);
> > > > -     }
> > > > -
> > > > -     return &obj->base;
> > > > -}
> > > > -
> > > > -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
> > > > -                             struct drm_mode_create_dumb *args)
> > > > -{
> > > > -     struct drm_gem_object *gem_object;
> > > > -     u64 pitch, size;
> > > > -
> > > > -     pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
> > > > -     size = args->height * pitch;
> > > > -     if (size == 0)
> > > > -             return -EINVAL;
> > > > -
> > > > -     gem_object = vgem_gem_create(dev, file, &args->handle, size);
> > > > -     if (IS_ERR(gem_object))
> > > > -             return PTR_ERR(gem_object);
> > > > -
> > > > -     args->size = gem_object->size;
> > > > -     args->pitch = pitch;
> > > > -
> > > > -     drm_gem_object_put(gem_object);
> > > > -
> > > > -     DRM_DEBUG("Created object of size %llu\n", args->size);
> > > > -
> > > > -     return 0;
> > > > -}
> > > > -
> > > >    static struct drm_ioctl_desc vgem_ioctls[] = {
> > > >        DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW),
> > > >        DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW),
> > > >    };
> > > > 
> > > > -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
> > > > -{
> > > > -     unsigned long flags = vma->vm_flags;
> > > > -     int ret;
> > > > -
> > > > -     ret = drm_gem_mmap(filp, vma);
> > > > -     if (ret)
> > > > -             return ret;
> > > > -
> > > > -     /* Keep the WC mmaping set by drm_gem_mmap() but our pages
> > > > -      * are ordinary and not special.
> > > > -      */
> > > > -     vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
> > > > -     return 0;
> > > > -}
> > > > -
> > > > -static const struct file_operations vgem_driver_fops = {
> > > > -     .owner          = THIS_MODULE,
> > > > -     .open           = drm_open,
> > > > -     .mmap           = vgem_mmap,
> > > > -     .poll           = drm_poll,
> > > > -     .read           = drm_read,
> > > > -     .unlocked_ioctl = drm_ioctl,
> > > > -     .compat_ioctl   = drm_compat_ioctl,
> > > > -     .release        = drm_release,
> > > > -};
> > > > -
> > > > -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
> > > > -{
> > > > -     mutex_lock(&bo->pages_lock);
> > > > -     if (bo->pages_pin_count++ == 0) {
> > > > -             struct page **pages;
> > > > -
> > > > -             pages = drm_gem_get_pages(&bo->base);
> > > > -             if (IS_ERR(pages)) {
> > > > -                     bo->pages_pin_count--;
> > > > -                     mutex_unlock(&bo->pages_lock);
> > > > -                     return pages;
> > > > -             }
> > > > -
> > > > -             bo->pages = pages;
> > > > -     }
> > > > -     mutex_unlock(&bo->pages_lock);
> > > > -
> > > > -     return bo->pages;
> > > > -}
> > > > -
> > > > -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
> > > > -{
> > > > -     mutex_lock(&bo->pages_lock);
> > > > -     if (--bo->pages_pin_count == 0) {
> > > > -             drm_gem_put_pages(&bo->base, bo->pages, true, true);
> > > > -             bo->pages = NULL;
> > > > -     }
> > > > -     mutex_unlock(&bo->pages_lock);
> > > > -}
> > > > -
> > > > -static int vgem_prime_pin(struct drm_gem_object *obj)
> > > > -{
> > > > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > > > -     long n_pages = obj->size >> PAGE_SHIFT;
> > > > -     struct page **pages;
> > > > -
> > > > -     pages = vgem_pin_pages(bo);
> > > > -     if (IS_ERR(pages))
> > > > -             return PTR_ERR(pages);
> > > > -
> > > > -     /* Flush the object from the CPU cache so that importers can rely
> > > > -      * on coherent indirect access via the exported dma-address.
> > > > -      */
> > > > -     drm_clflush_pages(pages, n_pages);
> > > > -
> > > > -     return 0;
> > > > -}
> > > > -
> > > > -static void vgem_prime_unpin(struct drm_gem_object *obj)
> > > > -{
> > > > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > > > -
> > > > -     vgem_unpin_pages(bo);
> > > > -}
> > > > -
> > > > -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
> > > > -{
> > > > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > > > -
> > > > -     return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
> > > > -}
> > > > -
> > > > -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
> > > > -                                             struct dma_buf *dma_buf)
> > > > -{
> > > > -     struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
> > > > -
> > > > -     return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
> > > > -}
> > > > -
> > > > -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
> > > > -                     struct dma_buf_attachment *attach, struct sg_table *sg)
> > > > -{
> > > > -     struct drm_vgem_gem_object *obj;
> > > > -     int npages;
> > > > -
> > > > -     obj = __vgem_gem_create(dev, attach->dmabuf->size);
> > > > -     if (IS_ERR(obj))
> > > > -             return ERR_CAST(obj);
> > > > -
> > > > -     npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
> > > > -
> > > > -     obj->table = sg;
> > > > -     obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
> > > > -     if (!obj->pages) {
> > > > -             __vgem_gem_destroy(obj);
> > > > -             return ERR_PTR(-ENOMEM);
> > > > -     }
> > > > -
> > > > -     obj->pages_pin_count++; /* perma-pinned */
> > > > -     drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
> > > > -     return &obj->base;
> > > > -}
> > > > -
> > > > -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
> > > > -{
> > > > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > > > -     long n_pages = obj->size >> PAGE_SHIFT;
> > > > -     struct page **pages;
> > > > -     void *vaddr;
> > > > -
> > > > -     pages = vgem_pin_pages(bo);
> > > > -     if (IS_ERR(pages))
> > > > -             return PTR_ERR(pages);
> > > > -
> > > > -     vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
> > > > -     if (!vaddr)
> > > > -             return -ENOMEM;
> > > > -     dma_buf_map_set_vaddr(map, vaddr);
> > > > -
> > > > -     return 0;
> > > > -}
> > > > -
> > > > -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
> > > > -{
> > > > -     struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
> > > > -
> > > > -     vunmap(map->vaddr);
> > > > -     vgem_unpin_pages(bo);
> > > > -}
> > > > -
> > > > -static int vgem_prime_mmap(struct drm_gem_object *obj,
> > > > -                        struct vm_area_struct *vma)
> > > > -{
> > > > -     int ret;
> > > > -
> > > > -     if (obj->size < vma->vm_end - vma->vm_start)
> > > > -             return -EINVAL;
> > > > -
> > > > -     if (!obj->filp)
> > > > -             return -ENODEV;
> > > > -
> > > > -     ret = call_mmap(obj->filp, vma);
> > > > -     if (ret)
> > > > -             return ret;
> > > > -
> > > > -     vma_set_file(vma, obj->filp);
> > > > -     vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
> > > > -     vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
> > > > -
> > > > -     return 0;
> > > > -}
> > > > -
> > > > -static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
> > > > -     .free = vgem_gem_free_object,
> > > > -     .pin = vgem_prime_pin,
> > > > -     .unpin = vgem_prime_unpin,
> > > > -     .get_sg_table = vgem_prime_get_sg_table,
> > > > -     .vmap = vgem_prime_vmap,
> > > > -     .vunmap = vgem_prime_vunmap,
> > > > -     .vm_ops = &vgem_gem_vm_ops,
> > > > -};
> > > > +DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
> > > > 
> > > >    static const struct drm_driver vgem_driver = {
> > > >        .driver_features                = DRIVER_GEM | DRIVER_RENDER,
> > > > @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = {
> > > >        .num_ioctls                     = ARRAY_SIZE(vgem_ioctls),
> > > >        .fops                           = &vgem_driver_fops,
> > > > 
> > > > -     .dumb_create                    = vgem_gem_dumb_create,
> > > > -
> > > > -     .prime_handle_to_fd = drm_gem_prime_handle_to_fd,
> > > > -     .prime_fd_to_handle = drm_gem_prime_fd_to_handle,
> > > > -     .gem_prime_import = vgem_prime_import,
> > > > -     .gem_prime_import_sg_table = vgem_prime_import_sg_table,
> > > > -     .gem_prime_mmap = vgem_prime_mmap,
> > > > +     DRM_GEM_SHMEM_DRIVER_OPS,
> > > > 
> > > >        .name   = DRIVER_NAME,
> > > >        .desc   = DRIVER_DESC,
> > > > 
> > > 
> > > --
> > > Thomas Zimmermann
> > > Graphics Driver Developer
> > > SUSE Software Solutions Germany GmbH
> > > Maxfeldstr. 5, 90409 Nürnberg, Germany
> > > (HRB 36809, AG Nürnberg)
> > > Geschäftsführer: Felix Imendörffer
> > > 
> > 
> > 
> 
> -- 
> Thomas Zimmermann
> Graphics Driver Developer
> SUSE Software Solutions Germany GmbH
> Maxfeldstr. 5, 90409 Nürnberg, Germany
> (HRB 36809, AG Nürnberg)
> Geschäftsführer: Felix Imendörffer
> 




-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-26 13:28                   ` Daniel Vetter
@ 2021-02-27  8:06                     ` Thomas Hellström (Intel)
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-02-27  8:06 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK


On 2/26/21 2:28 PM, Daniel Vetter wrote:
> On Fri, Feb 26, 2021 at 10:41 AM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>>
>> On 2/25/21 4:49 PM, Daniel Vetter wrote:
>>> On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote:
>>>> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote:
>>>>> Am 24.02.21 um 10:31 schrieb Daniel Vetter:
>>>>>> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>> On 2/24/21 9:45 AM, Daniel Vetter wrote:
>>>>>>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
>>>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
>>>>>>>>>> tldr; DMA buffers aren't normal memory, expecting that you can use
>>>>>>>>>> them like that (like calling get_user_pages works, or that they're
>>>>>>>>>> accounting like any other normal memory) cannot be guaranteed.
>>>>>>>>>>
>>>>>>>>>> Since some userspace only runs on integrated devices, where all
>>>>>>>>>> buffers are actually all resident system memory, there's a huge
>>>>>>>>>> temptation to assume that a struct page is always present and useable
>>>>>>>>>> like for any more pagecache backed mmap. This has the potential to
>>>>>>>>>> result in a uapi nightmare.
>>>>>>>>>>
>>>>>>>>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which
>>>>>>>>>> blocks get_user_pages and all the other struct page based
>>>>>>>>>> infrastructure for everyone. In spirit this is the uapi counterpart to
>>>>>>>>>> the kernel-internal CONFIG_DMABUF_DEBUG.
>>>>>>>>>>
>>>>>>>>>> Motivated by a recent patch which wanted to swich the system dma-buf
>>>>>>>>>> heap to vm_insert_page instead of vm_insert_pfn.
>>>>>>>>>>
>>>>>>>>>> v2:
>>>>>>>>>>
>>>>>>>>>> Jason brought up that we also want to guarantee that all ptes have the
>>>>>>>>>> pte_special flag set, to catch fast get_user_pages (on architectures
>>>>>>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
>>>>>>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>>>>>>>>>>
>>>>>>>>>>      From auditing the various functions to insert pfn pte entires
>>>>>>>>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like
>>>>>>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so
>>>>>>>>>> this should be the correct flag to check for.
>>>>>>>>>>
>>>>>>>>> If we require VM_PFNMAP, for ordinary page mappings, we also need to
>>>>>>>>> disallow COW mappings, since it will not work on architectures that
>>>>>>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()).
>>>>>>>> Hm I figured everyone just uses MAP_SHARED for buffer objects since
>>>>>>>> COW really makes absolutely no sense. How would we enforce this?
>>>>>>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
>>>>>>> or allowing MIXEDMAP.
>>>>>>>
>>>>>>>>> Also worth noting is the comment in  ttm_bo_mmap_vma_setup() with
>>>>>>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
>>>>>>>>> pages. That's a very old comment, though, and might not be valid anymore.
>>>>>>>> I think that's why ttm has a page cache for these, because it indeed
>>>>>>>> sucks. The PAT changes on pages are rather expensive.
>>>>>>> IIRC the page cache was implemented because of the slowness of the
>>>>>>> caching mode transition itself, more specifically the wbinvd() call +
>>>>>>> global TLB flush.
>>>>> Yes, exactly that. The global TLB flush is what really breaks our neck here
>>>>> from a performance perspective.
>>>>>
>>>>>>>> There is still an issue for iomem mappings, because the PAT validation
>>>>>>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
>>>>>>>> But for i915 at least this is fixed by using the io_mapping
>>>>>>>> infrastructure, which does the PAT reservation only once when you set
>>>>>>>> up the mapping area at driver load.
>>>>>>> Yes, I guess that was the issue that the comment describes, but the
>>>>>>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
>>>>>>>
>>>>>>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
>>>>>>>> problem that hurts much :-)
>>>>>>> Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP?
>>>>>>>
>>>>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
>>>>>> Uh that's bad, because mixed maps pointing at struct page wont stop
>>>>>> gup. At least afaik.
>>>>> Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have
>>>>> already seen tons of problems with the page cache.
>>>> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL vm_insert_mixed
>>>> boils down to vm_insert_pfn wrt gup. And special pte stops gup fast path.
>>>>
>>>> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how
>>>> you're stopping gup slow path. See check_vma_flags() in mm/gup.c.
>>>>
>>>> Also if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL then I don't think
>>>> vm_insert_mixed even works on iomem pfns. There's the devmap exception,
>>>> but we're not devmap. Worse ttm abuses some accidental codepath to smuggle
>>>> in hugepte support by intentionally not being devmap.
>>>>
>>>> So I'm really not sure this works as we think it should. Maybe good to do
>>>> a quick test program on amdgpu with a buffer in system memory only and try
>>>> to do direct io into it. If it works, you have a problem, and a bad one.
>>> That's probably impossible, since a quick git grep shows that pretty
>>> much anything reasonable has special ptes: arc, arm, arm64, powerpc,
>>> riscv, s390, sh, sparc, x86. I don't think you'll have a platform
>>> where you can plug an amdgpu in and actually exercise the bug :-)
>> Hm. AFAIK _insert_mixed() doesn't set PTE_SPECIAL on system pages, so I
>> don't see what should be stopping gup to those?
> If you have an arch with pte special we use insert_pfn(), which afaict
> will use pte_mkspecial for the !devmap case. And ttm isn't devmap
> (otherwise our hugepte abuse of devmap hugeptes would go rather
> wrong).
>
> So I think it stops gup. But I haven't verified at all. Would be good
> if Christian can check this with some direct io to a buffer in system
> memory.

Hmm,

The docs (again, for vm_normal_page()) say:

  * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
  * page" backing, however the difference is that _all_ pages with a struct
  * page (that is, those where pfn_valid is true) are refcounted and considered
  * normal pages by the VM. The disadvantage is that pages are refcounted
  * (which can be slower and simply not an option for some PFNMAP users). The
  * advantage is that we don't have to follow the strict linearity rule of
  * PFNMAP mappings in order to support COWable mappings.

but it's true that __vm_insert_mixed() ends up in the insert_pfn() path, so 
the above isn't really accurate any more, which makes me wonder whether, and 
if so why, there could still be a significant performance difference 
between MIXEDMAP and PFNMAP.
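
Roughly, paraphrasing mm/memory.c from memory (so a sketch, not a verbatim 
excerpt), the mixed insert boils down to:

/*
 * Paraphrased sketch of __vm_insert_mixed(): without
 * ARCH_HAS_PTE_SPECIAL a pfn_valid() pfn gets a refcounted
 * struct page mapping, otherwise everything funnels into
 * insert_pfn() and the resulting pte is special, just like
 * for PFNMAP.
 */
if (!IS_ENABLED(CONFIG_ARCH_HAS_PTE_SPECIAL) &&
    !pfn_t_devmap(pfn) && pfn_t_valid(pfn)) {
	struct page *page = pfn_to_page(pfn_t_to_pfn(pfn));

	err = insert_page(vma, addr, page, pgprot);
} else {
	return insert_pfn(vma, addr, pfn, pgprot, mkwrite);
}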

BTW regarding the TTM hugeptes, I don't think we ever landed that devmap 
hack, so they are (for the non-gup case) relying on 
vma_is_special_huge(). For the gup case, I think the bug is still there.
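
For reference, vma_is_special_huge() is roughly this (again from memory, so 
a paraphrase rather than a verbatim excerpt):

/* Paraphrased: a file-backed PFNMAP/MIXEDMAP vma (or DAX) counts as
 * "special" for huge entries, which is what the non-gup fault path
 * relies on today. */
static inline bool vma_is_special_huge(const struct vm_area_struct *vma)
{
	return vma_is_dax(vma) || (vma->vm_file &&
			(vma->vm_flags & (VM_PFNMAP | VM_MIXEDMAP)));
}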

/Thomas

> -Daniel

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-27  8:06                     ` Thomas Hellström (Intel)
  (?)
@ 2021-03-01  8:28                       ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-03-01  8:28 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
> On 2/26/21 2:28 PM, Daniel Vetter wrote:
> > So I think it stops gup. But I haven't verified at all. Would be good
> > if Christian can check this with some direct io to a buffer in system
> > memory.
>
> Hmm,
>
> Docs (again vm_normal_page() say)
>
>   * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
>   * page" backing, however the difference is that _all_ pages with a struct
>   * page (that is, those where pfn_valid is true) are refcounted and
> considered
>   * normal pages by the VM. The disadvantage is that pages are refcounted
>   * (which can be slower and simply not an option for some PFNMAP
> users). The
>   * advantage is that we don't have to follow the strict linearity rule of
>   * PFNMAP mappings in order to support COWable mappings.
>
> but it's true __vm_insert_mixed() ends up in the insert_pfn() path, so
> the above isn't really true, which makes me wonder if and in that case
> why there could any longer ever be a significant performance difference
> between MIXEDMAP and PFNMAP.

Yeah it's definitely confusing. I guess I'll hack up a patch and see
what sticks.

> BTW regarding the TTM hugeptes, I don't think we ever landed that devmap
> hack, so they are (for the non-gup case) relying on
> vma_is_special_huge(). For the gup case, I think the bug is still there.

Maybe there's another devmap hack, but the ttm_vm_insert functions do
use PFN_DEV and all that. And I think that stops gup_fast from trying
to find the underlying page.
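
Roughly this, paraphrasing the ttm_bo_vm huge-fault path from memory (a
sketch, not a verbatim excerpt):

/* The pfn is tagged PFN_DEV but not PFN_MAP, which is the detail
 * under discussion here. */
pfn_t pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);

ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
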
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-01  8:28                       ` Daniel Vetter
  (?)
@ 2021-03-01  8:39                         ` Thomas Hellström (Intel)
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-01  8:39 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

Hi,

On 3/1/21 9:28 AM, Daniel Vetter wrote:
> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>> So I think it stops gup. But I haven't verified at all. Would be good
>>> if Christian can check this with some direct io to a buffer in system
>>> memory.
>> Hmm,
>>
>> Docs (again vm_normal_page() say)
>>
>>    * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
>>    * page" backing, however the difference is that _all_ pages with a struct
>>    * page (that is, those where pfn_valid is true) are refcounted and
>> considered
>>    * normal pages by the VM. The disadvantage is that pages are refcounted
>>    * (which can be slower and simply not an option for some PFNMAP
>> users). The
>>    * advantage is that we don't have to follow the strict linearity rule of
>>    * PFNMAP mappings in order to support COWable mappings.
>>
>> but it's true __vm_insert_mixed() ends up in the insert_pfn() path, so
>> the above isn't really true, which makes me wonder if and in that case
>> why there could any longer ever be a significant performance difference
>> between MIXEDMAP and PFNMAP.
> Yeah it's definitely confusing. I guess I'll hack up a patch and see
> what sticks.
>
>> BTW regarding the TTM hugeptes, I don't think we ever landed that devmap
>> hack, so they are (for the non-gup case) relying on
>> vma_is_special_huge(). For the gup case, I think the bug is still there.
> Maybe there's another devmap hack, but the ttm_vm_insert functions do
> use PFN_DEV and all that. And I think that stops gup_fast from trying
> to find the underlying page.
> -Daniel

Hmm, perhaps it might, but I don't think so. The fix I tried out was to set 
PFN_DEV | PFN_MAP for huge PTEs, which causes pfn_devmap() to be true, and 
then follow_devmap_pmd()->get_dev_pagemap() returns NULL and gup_fast() 
backs off.

In the end that would mean setting in stone that "if there is a huge 
devmap page table entry for which we haven't registered any devmap 
struct pages (get_dev_pagemap() returns NULL), we should treat that as a 
"special" huge page table entry".

From what I can tell, all code calling get_dev_pagemap() already does 
that; it's just a question of getting it accepted and formalizing it.
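
The gup-fast side of that, roughly (paraphrasing mm/gup.c from memory, so a
sketch rather than a verbatim excerpt):

if (pmd_devmap(orig)) {
	/* Huge devmap entry: look up the registered devmap pages. */
	pgmap = get_dev_pagemap(pfn, pgmap);
	if (!pgmap)
		return 0;	/* no devmap pages registered: back off */
	/* otherwise the pages are refcounted and gup-fast proceeds */
}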

/Thomas




^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-01  8:39                         ` Thomas Hellström (Intel)
  (?)
@ 2021-03-01  9:05                           ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-03-01  9:05 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: Daniel Vetter, Christian König, Intel Graphics Development,
	Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter,
	Suren Baghdasaryan, Christian König,
	open list:DMA BUFFER SHARING FRAMEWORK

On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) wrote:
> Hi,
> 
> On 3/1/21 9:28 AM, Daniel Vetter wrote:
> > On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
> > <thomas_os@shipmail.org> wrote:
> > > On 2/26/21 2:28 PM, Daniel Vetter wrote:
> > > > So I think it stops gup. But I haven't verified at all. Would be good
> > > > if Christian can check this with some direct io to a buffer in system
> > > > memory.
> > > Hmm,
> > > 
> > > Docs (again vm_normal_page() say)
> > > 
> > >    * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
> > >    * page" backing, however the difference is that _all_ pages with a struct
> > >    * page (that is, those where pfn_valid is true) are refcounted and
> > > considered
> > >    * normal pages by the VM. The disadvantage is that pages are refcounted
> > >    * (which can be slower and simply not an option for some PFNMAP
> > > users). The
> > >    * advantage is that we don't have to follow the strict linearity rule of
> > >    * PFNMAP mappings in order to support COWable mappings.
> > > 
> > > but it's true __vm_insert_mixed() ends up in the insert_pfn() path, so
> > > the above isn't really true, which makes me wonder if and in that case
> > > why there could any longer ever be a significant performance difference
> > > between MIXEDMAP and PFNMAP.
> > Yeah it's definitely confusing. I guess I'll hack up a patch and see
> > what sticks.
> > 
> > > BTW regarding the TTM hugeptes, I don't think we ever landed that devmap
> > > hack, so they are (for the non-gup case) relying on
> > > vma_is_special_huge(). For the gup case, I think the bug is still there.
> > Maybe there's another devmap hack, but the ttm_vm_insert functions do
> > use PFN_DEV and all that. And I think that stops gup_fast from trying
> > to find the underlying page.
> > -Daniel
> 
> Hmm perhaps it might, but I don't think so. The fix I tried out was to set
> 
> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be true, and
> then
> 
> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and gup_fast()
> backs off,
> 
> in the end that would mean setting in stone that "if there is a huge devmap
> page table entry for which we haven't registered any devmap struct pages
> (get_dev_pagemap returns NULL), we should treat that as a "special" huge
> page table entry".
> 
> From what I can tell, all code calling get_dev_pagemap() already does that,
> it's just a question of getting it accepted and formalizing it.

Oh, I thought that's already how it works, since I didn't spot anything
else that would block gup_fast from falling over. I guess we really would
need some testcases to make sure direct I/O (that's the easiest to test)
fails like we expect.
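
Something like this minimal userspace probe is what I have in mind; the heap
path, file name and sizes are just assumptions for illustration, not a
finished testcase:

/*
 * Sketch of the direct-I/O probe: mmap a dma-buf from the system heap
 * and use it as the source of an O_DIRECT write, which pins the user
 * pages via get_user_pages. If the mapping is properly special/PFNMAP,
 * the write should fail with EFAULT instead of succeeding.
 */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/dma-heap.h>

int main(void)
{
	struct dma_heap_allocation_data alloc = {
		.len = 1 << 20,
		.fd_flags = O_RDWR | O_CLOEXEC,
	};
	int heap = open("/dev/dma_heap/system", O_RDWR);

	if (heap < 0 || ioctl(heap, DMA_HEAP_IOCTL_ALLOC, &alloc) < 0)
		return 1;

	void *buf = mmap(NULL, alloc.len, PROT_READ | PROT_WRITE,
			 MAP_SHARED, alloc.fd, 0);
	int file = open("scratch.bin", O_CREAT | O_WRONLY | O_DIRECT, 0600);

	if (buf == MAP_FAILED || file < 0)
		return 1;

	/* O_DIRECT wants block-aligned length; one page is enough here. */
	ssize_t n = write(file, buf, 4096);

	printf("direct write: %zd (%s)\n", n,
	       n < 0 ? strerror(errno) : "unexpectedly succeeded");
	return 0;
}
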
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-01  9:05                           ` Daniel Vetter
  (?)
@ 2021-03-01  9:21                             ` Thomas Hellström (Intel)
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-01  9:21 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Daniel Vetter, Christian König, Intel Graphics Development,
	Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter,
	Suren Baghdasaryan, Christian König,
	open list:DMA BUFFER SHARING FRAMEWORK


On 3/1/21 10:05 AM, Daniel Vetter wrote:
> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) wrote:
>> Hi,
>>
>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>> <thomas_os@shipmail.org> wrote:
>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>> So I think it stops gup. But I haven't verified at all. Would be good
>>>>> if Christian can check this with some direct io to a buffer in system
>>>>> memory.
>>>> Hmm,
>>>>
>>>> Docs (again vm_normal_page() say)
>>>>
>>>>     * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
>>>>     * page" backing, however the difference is that _all_ pages with a struct
>>>>     * page (that is, those where pfn_valid is true) are refcounted and
>>>> considered
>>>>     * normal pages by the VM. The disadvantage is that pages are refcounted
>>>>     * (which can be slower and simply not an option for some PFNMAP
>>>> users). The
>>>>     * advantage is that we don't have to follow the strict linearity rule of
>>>>     * PFNMAP mappings in order to support COWable mappings.
>>>>
>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() path, so
>>>> the above isn't really true, which makes me wonder if and in that case
>>>> why there could any longer ever be a significant performance difference
>>>> between MIXEDMAP and PFNMAP.
>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>> what sticks.
>>>
>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that devmap
>>>> hack, so they are (for the non-gup case) relying on
>>>> vma_is_special_huge(). For the gup case, I think the bug is still there.
>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>> to find the underlying page.
>>> -Daniel
>> Hmm perhaps it might, but I don't think so. The fix I tried out was to set
>>
>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be true, and
>> then
>>
>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and gup_fast()
>> backs off,
>>
>> in the end that would mean setting in stone that "if there is a huge devmap
>> page table entry for which we haven't registered any devmap struct pages
>> (get_dev_pagemap returns NULL), we should treat that as a "special" huge
>> page table entry".
>>
>>  From what I can tell, all code calling get_dev_pagemap() already does that,
>> it's just a question of getting it accepted and formalizing it.
> Oh I thought that's already how it works, since I didn't spot anything
> else that would block gup_fast from falling over. I guess really would
> need some testcases to make sure direct i/o (that's the easiest to test)
> fails like we expect.

Yeah, IIRC the "| PFN_MAP" is the missing piece for the TTM huge ptes. 
Otherwise pmd_devmap() will not return true, and since there is no 
pmd_special(), things break.
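
I.e. roughly this in the huge-fault path (a sketch of what I tried, not a
merged patch):

/* Tag the huge entry PFN_DEV | PFN_MAP so pmd_devmap() becomes true;
 * gup-fast then backs off when get_dev_pagemap() finds no registered
 * devmap pages for it. */
pfn_t pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);

ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);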

/Thomas



> -Daniel

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Intel-gfx] [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
@ 2021-03-01  9:21                             ` Thomas Hellström (Intel)
  0 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-01  9:21 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Daniel Vetter, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Christian König,
	Daniel Vetter, Suren Baghdasaryan, Christian König,
	open list:DMA BUFFER SHARING FRAMEWORK


On 3/1/21 10:05 AM, Daniel Vetter wrote:
> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) wrote:
>> Hi,
>>
>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>> <thomas_os@shipmail.org> wrote:
>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>> So I think it stops gup. But I haven't verified at all. Would be good
>>>>> if Christian can check this with some direct io to a buffer in system
>>>>> memory.
>>>> Hmm,
>>>>
>>>> Docs (again vm_normal_page() say)
>>>>
>>>>     * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
>>>>     * page" backing, however the difference is that _all_ pages with a struct
>>>>     * page (that is, those where pfn_valid is true) are refcounted and
>>>> considered
>>>>     * normal pages by the VM. The disadvantage is that pages are refcounted
>>>>     * (which can be slower and simply not an option for some PFNMAP
>>>> users). The
>>>>     * advantage is that we don't have to follow the strict linearity rule of
>>>>     * PFNMAP mappings in order to support COWable mappings.
>>>>
>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() path, so
>>>> the above isn't really true, which makes me wonder if and in that case
>>>> why there could any longer ever be a significant performance difference
>>>> between MIXEDMAP and PFNMAP.
>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>> what sticks.
>>>
>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that devmap
>>>> hack, so they are (for the non-gup case) relying on
>>>> vma_is_special_huge(). For the gup case, I think the bug is still there.
>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>> to find the underlying page.
>>> -Daniel
>> Hmm perhaps it might, but I don't think so. The fix I tried out was to set
>>
>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be true, and
>> then
>>
>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and gup_fast()
>> backs off,
>>
>> in the end that would mean setting in stone that "if there is a huge devmap
>> page table entry for which we haven't registered any devmap struct pages
>> (get_dev_pagemap returns NULL), we should treat that as a "special" huge
>> page table entry".
>>
>>  From what I can tell, all code calling get_dev_pagemap() already does that,
>> it's just a question of getting it accepted and formalizing it.
> Oh I thought that's already how it works, since I didn't spot anything
> else that would block gup_fast from falling over. I guess really would
> need some testcases to make sure direct i/o (that's the easiest to test)
> fails like we expect.

Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes. 
Otherwise pmd_devmap() will not return true and since there is no 
pmd_special() things break.

/Thomas



> -Daniel
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-01  9:21                             ` Thomas Hellström (Intel)
  (?)
@ 2021-03-01 10:17                               ` Christian König
  -1 siblings, 0 replies; 110+ messages in thread
From: Christian König @ 2021-03-01 10:17 UTC (permalink / raw)
  To: Thomas Hellström (Intel), Daniel Vetter
  Cc: Daniel Vetter, Christian König, Intel Graphics Development,
	Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter,
	Suren Baghdasaryan, open list:DMA BUFFER SHARING FRAMEWORK



Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
>
> On 3/1/21 10:05 AM, Daniel Vetter wrote:
>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) 
>> wrote:
>>> Hi,
>>>
>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>>> <thomas_os@shipmail.org> wrote:
>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>>> So I think it stops gup. But I haven't verified at all. Would be 
>>>>>> good
>>>>>> if Christian can check this with some direct io to a buffer in 
>>>>>> system
>>>>>> memory.
>>>>> Hmm,
>>>>>
>>>>> Docs (again vm_normal_page() say)
>>>>>
>>>>>     * VM_MIXEDMAP mappings can likewise contain memory with or 
>>>>> without "struct
>>>>>     * page" backing, however the difference is that _all_ pages 
>>>>> with a struct
>>>>>     * page (that is, those where pfn_valid is true) are refcounted 
>>>>> and
>>>>> considered
>>>>>     * normal pages by the VM. The disadvantage is that pages are 
>>>>> refcounted
>>>>>     * (which can be slower and simply not an option for some PFNMAP
>>>>> users). The
>>>>>     * advantage is that we don't have to follow the strict 
>>>>> linearity rule of
>>>>>     * PFNMAP mappings in order to support COWable mappings.
>>>>>
>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() 
>>>>> path, so
>>>>> the above isn't really true, which makes me wonder if and in that 
>>>>> case
>>>>> why there could any longer ever be a significant performance 
>>>>> difference
>>>>> between MIXEDMAP and PFNMAP.
>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>>> what sticks.
>>>>
>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that 
>>>>> devmap
>>>>> hack, so they are (for the non-gup case) relying on
>>>>> vma_is_special_huge(). For the gup case, I think the bug is still 
>>>>> there.
>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>>> to find the underlying page.
>>>> -Daniel
>>> Hmm perhaps it might, but I don't think so. The fix I tried out was 
>>> to set
>>>
>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be 
>>> true, and
>>> then
>>>
>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and 
>>> gup_fast()
>>> backs off,
>>>
>>> in the end that would mean setting in stone that "if there is a huge 
>>> devmap
>>> page table entry for which we haven't registered any devmap struct 
>>> pages
>>> (get_dev_pagemap returns NULL), we should treat that as a "special" 
>>> huge
>>> page table entry".
>>>
>>>  From what I can tell, all code calling get_dev_pagemap() already 
>>> does that,
>>> it's just a question of getting it accepted and formalizing it.
>> Oh I thought that's already how it works, since I didn't spot anything
>> else that would block gup_fast from falling over. I guess really would
>> need some testcases to make sure direct i/o (that's the easiest to test)
>> fails like we expect.
>
> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes. 
> Otherwise pmd_devmap() will not return true and since there is no 
> pmd_special() things break.

Is that maybe the issue we have seen with amdgpu and huge pages?

Apart from that I'm lost, guys; the devmap and gup stuff is not
something I have good knowledge of beyond a one-mile-high view.

Christian.

>
> /Thomas
>
>
>
>> -Daniel


^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-01 10:17                               ` Christian König
  (?)
@ 2021-03-01 14:09                                 ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-03-01 14:09 UTC (permalink / raw)
  To: Christian König
  Cc: Thomas Hellström (Intel),
	Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK

On Mon, Mar 1, 2021 at 11:17 AM Christian König
<christian.koenig@amd.com> wrote:
>
>
>
> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
> >
> > On 3/1/21 10:05 AM, Daniel Vetter wrote:
> >> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
> >> wrote:
> >>> Hi,
> >>>
> >>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
> >>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
> >>>> <thomas_os@shipmail.org> wrote:
> >>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
> >>>>>> So I think it stops gup. But I haven't verified at all. Would be
> >>>>>> good
> >>>>>> if Christian can check this with some direct io to a buffer in
> >>>>>> system
> >>>>>> memory.
> >>>>> Hmm,
> >>>>>
> >>>>> Docs (again vm_normal_page() say)
> >>>>>
> >>>>>     * VM_MIXEDMAP mappings can likewise contain memory with or
> >>>>> without "struct
> >>>>>     * page" backing, however the difference is that _all_ pages
> >>>>> with a struct
> >>>>>     * page (that is, those where pfn_valid is true) are refcounted
> >>>>> and
> >>>>> considered
> >>>>>     * normal pages by the VM. The disadvantage is that pages are
> >>>>> refcounted
> >>>>>     * (which can be slower and simply not an option for some PFNMAP
> >>>>> users). The
> >>>>>     * advantage is that we don't have to follow the strict
> >>>>> linearity rule of
> >>>>>     * PFNMAP mappings in order to support COWable mappings.
> >>>>>
> >>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
> >>>>> path, so
> >>>>> the above isn't really true, which makes me wonder if and in that
> >>>>> case
> >>>>> why there could any longer ever be a significant performance
> >>>>> difference
> >>>>> between MIXEDMAP and PFNMAP.
> >>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
> >>>> what sticks.
> >>>>
> >>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
> >>>>> devmap
> >>>>> hack, so they are (for the non-gup case) relying on
> >>>>> vma_is_special_huge(). For the gup case, I think the bug is still
> >>>>> there.
> >>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
> >>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
> >>>> to find the underlying page.
> >>>> -Daniel
> >>> Hmm perhaps it might, but I don't think so. The fix I tried out was
> >>> to set
> >>>
> >>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
> >>> true, and
> >>> then
> >>>
> >>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
> >>> gup_fast()
> >>> backs off,
> >>>
> >>> in the end that would mean setting in stone that "if there is a huge
> >>> devmap
> >>> page table entry for which we haven't registered any devmap struct
> >>> pages
> >>> (get_dev_pagemap returns NULL), we should treat that as a "special"
> >>> huge
> >>> page table entry".
> >>>
> >>>  From what I can tell, all code calling get_dev_pagemap() already
> >>> does that,
> >>> it's just a question of getting it accepted and formalizing it.
> >> Oh I thought that's already how it works, since I didn't spot anything
> >> else that would block gup_fast from falling over. I guess really would
> >> need some testcases to make sure direct i/o (that's the easiest to test)
> >> fails like we expect.
> >
> > Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
> > Otherwise pmd_devmap() will not return true and since there is no
> > pmd_special() things break.
>
> Is that maybe the issue we have seen with amdgpu and huge pages?

Yeah, essentially when you have a hugepte inserted by ttm and it
happens to point at system memory, gup will work on that and
create all kinds of havoc.

> Apart from that I'm lost guys, that devmap and gup stuff is not
> something I have a good knowledge of apart from a one mile high view.

I'm not really any better informed, hence it would be good to do a testcase and see.
This should provoke it:
- allocate nicely aligned bo in system memory
- mmap, again nicely aligned to 2M
- do some direct io from a filesystem into that mmap, that should trigger gup
- before the gup completes, free the mmap and bo so that ttm recycles
the pages, which should trip up on the elevated refcount (rough
userspace sketch below). If you wait until the direct io has
completed, then I think nothing bad can be observed.
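
A rough userspace sketch of that sequence (the file name and the bo
allocation are placeholders for the driver-specific bits, and a real
reproducer would submit the read asynchronously, e.g. via aio or io_uring,
so that the munmap actually races with the in-flight i/o):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define SZ_2M (2UL << 20)

int main(void)
{
	/* placeholder: allocate a bo in system memory via the driver's
	 * ioctl and get an mmap'able (e.g. dma-buf) fd for it */
	int bo_fd = -1;
	int data_fd = open("some-big-file", O_RDONLY | O_DIRECT);
	void *map;

	if (data_fd < 0 || bo_fd < 0)
		return 1;

	/* map the bo; a real test would also ensure the mapping is 2M
	 * aligned so the driver can insert a huge entry */
	map = mmap(NULL, SZ_2M, PROT_READ | PROT_WRITE, MAP_SHARED, bo_fd, 0);
	if (map == MAP_FAILED)
		return 1;

	/* direct i/o into the mapping forces get_user_pages on it */
	if (read(data_fd, map, SZ_2M) < 0)
		perror("read");

	/* tear down the mapping and bo while references may still be held */
	munmap(map, SZ_2M);
	close(bo_fd);
	close(data_fd);
	return 0;
}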

Ofc if your amdgpu+hugepte issue is something else, then maybe we have
another issue.

Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-01 14:09                                 ` Daniel Vetter
  (?)
@ 2021-03-11 10:22                                   ` Thomas Hellström (Intel)
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-11 10:22 UTC (permalink / raw)
  To: Daniel Vetter, Christian König
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK


On 3/1/21 3:09 PM, Daniel Vetter wrote:
> On Mon, Mar 1, 2021 at 11:17 AM Christian König
> <christian.koenig@amd.com> wrote:
>>
>>
>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
>>>>>>>> good
>>>>>>>> if Christian can check this with some direct io to a buffer in
>>>>>>>> system
>>>>>>>> memory.
>>>>>>> Hmm,
>>>>>>>
>>>>>>> Docs (again vm_normal_page() say)
>>>>>>>
>>>>>>>      * VM_MIXEDMAP mappings can likewise contain memory with or
>>>>>>> without "struct
>>>>>>>      * page" backing, however the difference is that _all_ pages
>>>>>>> with a struct
>>>>>>>      * page (that is, those where pfn_valid is true) are refcounted
>>>>>>> and
>>>>>>> considered
>>>>>>>      * normal pages by the VM. The disadvantage is that pages are
>>>>>>> refcounted
>>>>>>>      * (which can be slower and simply not an option for some PFNMAP
>>>>>>> users). The
>>>>>>>      * advantage is that we don't have to follow the strict
>>>>>>> linearity rule of
>>>>>>>      * PFNMAP mappings in order to support COWable mappings.
>>>>>>>
>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
>>>>>>> path, so
>>>>>>> the above isn't really true, which makes me wonder if and in that
>>>>>>> case
>>>>>>> why there could any longer ever be a significant performance
>>>>>>> difference
>>>>>>> between MIXEDMAP and PFNMAP.
>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>>>>> what sticks.
>>>>>>
>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
>>>>>>> devmap
>>>>>>> hack, so they are (for the non-gup case) relying on
>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
>>>>>>> there.
>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>>>>> to find the underlying page.
>>>>>> -Daniel
>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
>>>>> to set
>>>>>
>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
>>>>> true, and
>>>>> then
>>>>>
>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
>>>>> gup_fast()
>>>>> backs off,
>>>>>
>>>>> in the end that would mean setting in stone that "if there is a huge
>>>>> devmap
>>>>> page table entry for which we haven't registered any devmap struct
>>>>> pages
>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special"
>>>>> huge
>>>>> page table entry".
>>>>>
>>>>>   From what I can tell, all code calling get_dev_pagemap() already
>>>>> does that,
>>>>> it's just a question of getting it accepted and formalizing it.
>>>> Oh I thought that's already how it works, since I didn't spot anything
>>>> else that would block gup_fast from falling over. I guess really would
>>>> need some testcases to make sure direct i/o (that's the easiest to test)
>>>> fails like we expect.
>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
>>> Otherwise pmd_devmap() will not return true and since there is no
>>> pmd_special() things break.
>> Is that maybe the issue we have seen with amdgpu and huge pages?
> Yeah, essentially when you have a hugepte inserted by ttm, and it
> happens to point at system memory, then gup will work on that. And
> create all kinds of havoc.
>
>> Apart from that I'm lost guys, that devmap and gup stuff is not
>> something I have a good knowledge of apart from a one mile high view.
> I'm not really better, hence would be good to do a testcase and see.
> This should provoke it:
> - allocate nicely aligned bo in system memory
> - mmap, again nicely aligned to 2M
> - do some direct io from a filesystem into that mmap, that should trigger gup
> - before the gup completes free the mmap and bo so that ttm recycles
> the pages, which should trip up on the elevated refcount. If you wait
> until the direct io is completely, then I think nothing bad can be
> observed.
>
> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
> another issue.
>
> Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
> -Daniel

So I did the following quick experiment on vmwgfx, and it turns out that
with it, fast gup never succeeds. Without the "| PFN_MAP", it typically
succeeds.

I should probably craft an RFC formalizing this.

/Thomas

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 6dc96cf66744..72b6fb17c984 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
         pfn_t pfnt;
         struct ttm_tt *ttm = bo->ttm;
         bool write = vmf->flags & FAULT_FLAG_WRITE;
+       struct dev_pagemap *pagemap;

         /* Fault should not cross bo boundary. */
         page_offset &= ~(fault_page_size - 1);
@@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
         if ((pfn & (fault_page_size - 1)) != 0)
                 goto out_fallback;

+       /*
+        * Huge entries must be special, that is marking them as devmap
+        * with no backing device map range. If there is a backing
+        * range, don't insert a huge entry.
+        */
+       pagemap = get_dev_pagemap(pfn, NULL);
+       if (pagemap) {
+               put_dev_pagemap(pagemap);
+               goto out_fallback;
+       }
+
         /* Check that memory is contiguous. */
         if (!bo->mem.bus.is_iomem) {
                 for (i = 1; i < fault_page_size; ++i) {
@@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
                 }
         }

-       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
+       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
         if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
                 ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
  #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
@@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
         if (ret != VM_FAULT_NOPAGE)
                 goto out_fallback;

+#if 1
+       {
+               int npages;
+               struct page *page;
+
+               npages = get_user_pages_fast_only(vmf->address, 1, 0, &page);
+               if (npages == 1) {
+                       DRM_WARN("Fast gup succeeded. Bad.\n");
+                       put_page(page);
+               } else {
+                       DRM_INFO("Fast gup failed. Good.\n");
+               }
+       }
+#endif
+
         return VM_FAULT_NOPAGE;
  out_fallback:
         count_vm_event(THP_FAULT_FALLBACK);






^ permalink raw reply related	[flat|nested] 110+ messages in thread

* [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [Linaro-mm-sig,1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev4)
  2021-02-23 10:59 ` Daniel Vetter
                   ` (8 preceding siblings ...)
  (?)
@ 2021-03-11 10:58 ` Patchwork
  -1 siblings, 0 replies; 110+ messages in thread
From: Patchwork @ 2021-03-11 10:58 UTC (permalink / raw)
  To: Thomas Hellström (Intel); +Cc: intel-gfx

== Series Details ==

Series: series starting with [Linaro-mm-sig,1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev4)
URL   : https://patchwork.freedesktop.org/series/87313/
State : failure

== Summary ==

Applying: dma-buf: Require VM_PFNMAP vma for mmap
error: git diff header lacks filename information when removing 1 leading pathname component (line 2)
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 dma-buf: Require VM_PFNMAP vma for mmap
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-11 10:22                                   ` Thomas Hellström (Intel)
  (?)
@ 2021-03-11 13:00                                     ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-03-11 13:00 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: Daniel Vetter, Christian König, Christian König,
	Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
> 
> On 3/1/21 3:09 PM, Daniel Vetter wrote:
> > On Mon, Mar 1, 2021 at 11:17 AM Christian König
> > <christian.koenig@amd.com> wrote:
> > > 
> > > 
> > > Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
> > > > On 3/1/21 10:05 AM, Daniel Vetter wrote:
> > > > > On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
> > > > > wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > On 3/1/21 9:28 AM, Daniel Vetter wrote:
> > > > > > > On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
> > > > > > > <thomas_os@shipmail.org> wrote:
> > > > > > > > On 2/26/21 2:28 PM, Daniel Vetter wrote:
> > > > > > > > > So I think it stops gup. But I haven't verified at all. Would be
> > > > > > > > > good
> > > > > > > > > if Christian can check this with some direct io to a buffer in
> > > > > > > > > system
> > > > > > > > > memory.
> > > > > > > > Hmm,
> > > > > > > > 
> > > > > > > > Docs (again vm_normal_page() say)
> > > > > > > > 
> > > > > > > >      * VM_MIXEDMAP mappings can likewise contain memory with or
> > > > > > > > without "struct
> > > > > > > >      * page" backing, however the difference is that _all_ pages
> > > > > > > > with a struct
> > > > > > > >      * page (that is, those where pfn_valid is true) are refcounted
> > > > > > > > and
> > > > > > > > considered
> > > > > > > >      * normal pages by the VM. The disadvantage is that pages are
> > > > > > > > refcounted
> > > > > > > >      * (which can be slower and simply not an option for some PFNMAP
> > > > > > > > users). The
> > > > > > > >      * advantage is that we don't have to follow the strict
> > > > > > > > linearity rule of
> > > > > > > >      * PFNMAP mappings in order to support COWable mappings.
> > > > > > > > 
> > > > > > > > but it's true __vm_insert_mixed() ends up in the insert_pfn()
> > > > > > > > path, so
> > > > > > > > the above isn't really true, which makes me wonder if and in that
> > > > > > > > case
> > > > > > > > why there could any longer ever be a significant performance
> > > > > > > > difference
> > > > > > > > between MIXEDMAP and PFNMAP.
> > > > > > > Yeah it's definitely confusing. I guess I'll hack up a patch and see
> > > > > > > what sticks.
> > > > > > > 
> > > > > > > > BTW regarding the TTM hugeptes, I don't think we ever landed that
> > > > > > > > devmap
> > > > > > > > hack, so they are (for the non-gup case) relying on
> > > > > > > > vma_is_special_huge(). For the gup case, I think the bug is still
> > > > > > > > there.
> > > > > > > Maybe there's another devmap hack, but the ttm_vm_insert functions do
> > > > > > > use PFN_DEV and all that. And I think that stops gup_fast from trying
> > > > > > > to find the underlying page.
> > > > > > > -Daniel
> > > > > > Hmm perhaps it might, but I don't think so. The fix I tried out was
> > > > > > to set
> > > > > > 
> > > > > > PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
> > > > > > true, and
> > > > > > then
> > > > > > 
> > > > > > follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
> > > > > > gup_fast()
> > > > > > backs off,
> > > > > > 
> > > > > > in the end that would mean setting in stone that "if there is a huge
> > > > > > devmap
> > > > > > page table entry for which we haven't registered any devmap struct
> > > > > > pages
> > > > > > (get_dev_pagemap returns NULL), we should treat that as a "special"
> > > > > > huge
> > > > > > page table entry".
> > > > > > 
> > > > > >   From what I can tell, all code calling get_dev_pagemap() already
> > > > > > does that,
> > > > > > it's just a question of getting it accepted and formalizing it.
> > > > > Oh I thought that's already how it works, since I didn't spot anything
> > > > > else that would block gup_fast from falling over. I guess really would
> > > > > need some testcases to make sure direct i/o (that's the easiest to test)
> > > > > fails like we expect.
> > > > Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
> > > > Otherwise pmd_devmap() will not return true and since there is no
> > > > pmd_special() things break.
> > > Is that maybe the issue we have seen with amdgpu and huge pages?
> > Yeah, essentially when you have a hugepte inserted by ttm, and it
> > happens to point at system memory, then gup will work on that. And
> > create all kinds of havoc.
> > 
> > > Apart from that I'm lost guys, that devmap and gup stuff is not
> > > something I have a good knowledge of apart from a one mile high view.
> > I'm not really better, hence would be good to do a testcase and see.
> > This should provoke it:
> > - allocate nicely aligned bo in system memory
> > - mmap, again nicely aligned to 2M
> > - do some direct io from a filesystem into that mmap, that should trigger gup
> > - before the gup completes free the mmap and bo so that ttm recycles
> > the pages, which should trip up on the elevated refcount. If you wait
> > until the direct io is completely, then I think nothing bad can be
> > observed.
> > 
> > Ofc if your amdgpu+hugepte issue is something else, then maybe we have
> > another issue.
> > 
> > Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
> > -Daniel
> 
> So I did the following quick experiment on vmwgfx, and it turns out that
> with it,
> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds
> 
> I should probably craft an RFC formalizing this.

Yeah I think that would be good. Maybe even more formalized if we also
switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
fast gup path. And slow gup can still peek through VM_MIXEDMAP. Or
something like that.

Otoh your description of when it only sometimes succeeds would indicate my
understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.

Christian, what's your take?
-Daniel
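
For readers following along, a sketch of the rule being formalized here, as
described earlier in the thread: a devmap entry whose pfn has no registered
dev_pagemap is treated as special, so fast gup backs off. This is
illustrative only; get_dev_pagemap()/put_dev_pagemap() are real kernel
helpers, but the wrapper function below is made up and is not the actual
mm/gup.c code.

#include <linux/memremap.h>

/* Illustrative only: the back-off rule for (huge) devmap entries. */
static bool sketch_fast_gup_may_pin(unsigned long pfn, bool entry_is_devmap)
{
	struct dev_pagemap *pgmap;

	if (!entry_is_devmap)
		return true;		/* ordinary refcounted page */

	pgmap = get_dev_pagemap(pfn, NULL);
	if (!pgmap)
		return false;		/* no backing range: treat as special */

	put_dev_pagemap(pgmap);
	return true;
}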

> 
> /Thomas
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> index 6dc96cf66744..72b6fb17c984 100644
> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> *vmf,
>         pfn_t pfnt;
>         struct ttm_tt *ttm = bo->ttm;
>         bool write = vmf->flags & FAULT_FLAG_WRITE;
> +       struct dev_pagemap *pagemap;
> 
>         /* Fault should not cross bo boundary. */
>         page_offset &= ~(fault_page_size - 1);
> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> *vmf,
>         if ((pfn & (fault_page_size - 1)) != 0)
>                 goto out_fallback;
> 
> +       /*
> +        * Huge entries must be special, that is marking them as devmap
> +        * with no backing device map range. If there is a backing
> +        * range, Don't insert a huge entry.
> +        */
> +       pagemap = get_dev_pagemap(pfn, NULL);
> +       if (pagemap) {
> +               put_dev_pagemap(pagemap);
> +               goto out_fallback;
> +       }
> +
>         /* Check that memory is contiguous. */
>         if (!bo->mem.bus.is_iomem) {
>                 for (i = 1; i < fault_page_size; ++i) {
> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> *vmf,
>                 }
>         }
> 
> -       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
> +       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
>         if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
>                 ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
>  #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> *vmf,
>         if (ret != VM_FAULT_NOPAGE)
>                 goto out_fallback;
> 
> +#if 1
> +       {
> +               int npages;
> +               struct page *page;
> +
> +               npages = get_user_pages_fast_only(vmf->address, 1, 0,
> &page);
> +               if (npages == 1) {
> +                       DRM_WARN("Fast gup succeeded. Bad.\n");
> +                       put_page(page);
> +               } else {
> +                       DRM_INFO("Fast gup failed. Good.\n");
> +               }
> +       }
> +#endif
> +
>         return VM_FAULT_NOPAGE;
>  out_fallback:
>         count_vm_event(THP_FAULT_FALLBACK);
> 
> 
> 
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-11 13:00                                     ` Daniel Vetter
  (?)
@ 2021-03-11 13:12                                       ` Thomas Hellström (Intel)
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-11 13:12 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, Christian König,
	Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK

Hi!

On 3/11/21 2:00 PM, Daniel Vetter wrote:
> On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
>> On 3/1/21 3:09 PM, Daniel Vetter wrote:
>>> On Mon, Mar 1, 2021 at 11:17 AM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>>
>>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
>>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
>>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
>>>>>> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
>>>>>>>>>> good
>>>>>>>>>> if Christian can check this with some direct io to a buffer in
>>>>>>>>>> system
>>>>>>>>>> memory.
>>>>>>>>> Hmm,
>>>>>>>>>
>>>>>>>>> Docs (again vm_normal_page() say)
>>>>>>>>>
>>>>>>>>>       * VM_MIXEDMAP mappings can likewise contain memory with or
>>>>>>>>> without "struct
>>>>>>>>>       * page" backing, however the difference is that _all_ pages
>>>>>>>>> with a struct
>>>>>>>>>       * page (that is, those where pfn_valid is true) are refcounted
>>>>>>>>> and
>>>>>>>>> considered
>>>>>>>>>       * normal pages by the VM. The disadvantage is that pages are
>>>>>>>>> refcounted
>>>>>>>>>       * (which can be slower and simply not an option for some PFNMAP
>>>>>>>>> users). The
>>>>>>>>>       * advantage is that we don't have to follow the strict
>>>>>>>>> linearity rule of
>>>>>>>>>       * PFNMAP mappings in order to support COWable mappings.
>>>>>>>>>
>>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
>>>>>>>>> path, so
>>>>>>>>> the above isn't really true, which makes me wonder if and in that
>>>>>>>>> case
>>>>>>>>> why there could any longer ever be a significant performance
>>>>>>>>> difference
>>>>>>>>> between MIXEDMAP and PFNMAP.
>>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>>>>>>> what sticks.
>>>>>>>>
>>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
>>>>>>>>> devmap
>>>>>>>>> hack, so they are (for the non-gup case) relying on
>>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
>>>>>>>>> there.
>>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>>>>>>> to find the underlying page.
>>>>>>>> -Daniel
>>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
>>>>>>> to set
>>>>>>>
>>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
>>>>>>> true, and
>>>>>>> then
>>>>>>>
>>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
>>>>>>> gup_fast()
>>>>>>> backs off,
>>>>>>>
>>>>>>> in the end that would mean setting in stone that "if there is a huge
>>>>>>> devmap
>>>>>>> page table entry for which we haven't registered any devmap struct
>>>>>>> pages
>>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special"
>>>>>>> huge
>>>>>>> page table entry".
>>>>>>>
>>>>>>>    From what I can tell, all code calling get_dev_pagemap() already
>>>>>>> does that,
>>>>>>> it's just a question of getting it accepted and formalizing it.
>>>>>> Oh I thought that's already how it works, since I didn't spot anything
>>>>>> else that would block gup_fast from falling over. I guess really would
>>>>>> need some testcases to make sure direct i/o (that's the easiest to test)
>>>>>> fails like we expect.
>>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
>>>>> Otherwise pmd_devmap() will not return true and since there is no
>>>>> pmd_special() things break.
>>>> Is that maybe the issue we have seen with amdgpu and huge pages?
>>> Yeah, essentially when you have a hugepte inserted by ttm, and it
>>> happens to point at system memory, then gup will work on that. And
>>> create all kinds of havoc.
>>>
>>>> Apart from that I'm lost guys, that devmap and gup stuff is not
>>>> something I have a good knowledge of apart from a one mile high view.
>>> I'm not really better, hence would be good to do a testcase and see.
>>> This should provoke it:
>>> - allocate nicely aligned bo in system memory
>>> - mmap, again nicely aligned to 2M
>>> - do some direct io from a filesystem into that mmap, that should trigger gup
>>> - before the gup completes free the mmap and bo so that ttm recycles
>>> the pages, which should trip up on the elevated refcount. If you wait
>>> until the direct io is completely, then I think nothing bad can be
>>> observed.
>>>
>>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
>>> another issue.
>>>
>>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
>>> -Daniel
>> So I did the following quick experiment on vmwgfx, and it turns out that
>> with it,
>> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds
>>
>> I should probably craft an RFC formalizing this.
> Yeah I think that would be good. Maybe even more formalized if we also
> switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
> fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or
> something like that.
>
> Otoh your description of when it only sometimes succeeds would indicate my
> understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.

My understanding from reading the vmf_insert_mixed() code is that iff 
the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's 
not consistent with the vm_normal_page() doc. For architectures without 
pte_special, VM_PFNMAP must be used, and then we must also block COW 
mappings.

If we can get someone to commit to verifying that the potential PAT WC 
performance issue is gone with PFNMAP, I can put together a series with 
that included.

As for existing userspace using COW TTM mappings, I once had a couple of 
test cases to verify that it actually worked, in particular together 
with huge PMDs and PUDs where breaking COW would imply splitting those, 
but I can't think of anything else actually wanting to do that other 
than by mistake.

/Thomas
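
A hedged sketch of what "use VM_PFNMAP and block COW mappings" could look
like in a driver mmap callback. This is an extrapolation for illustration,
not TTM code and not from this thread: my_drv_mmap is a made-up name and
the open-coded COW test is an assumption about how such a check would be
written.

static int my_drv_mmap(struct file *file, struct vm_area_struct *vma)
{
	/* Refuse writable private (COW) mappings up front. */
	if ((vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE)
		return -EINVAL;

	/*
	 * Force a pure PFN mapping: no struct page semantics, so both the
	 * fast and slow gup paths must back off regardless of pte_special().
	 */
	vma->vm_flags |= VM_PFNMAP | VM_IO | VM_DONTEXPAND | VM_DONTDUMP;

	/* ... pfns are then inserted via vmf_insert_pfn() at fault time ... */
	return 0;
}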


>
> Christian, what's your take?
> -Daniel
>
>> /Thomas
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> index 6dc96cf66744..72b6fb17c984 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>          pfn_t pfnt;
>>          struct ttm_tt *ttm = bo->ttm;
>>          bool write = vmf->flags & FAULT_FLAG_WRITE;
>> +       struct dev_pagemap *pagemap;
>>
>>          /* Fault should not cross bo boundary. */
>>          page_offset &= ~(fault_page_size - 1);
>> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>          if ((pfn & (fault_page_size - 1)) != 0)
>>                  goto out_fallback;
>>
>> +       /*
>> +        * Huge entries must be special, that is marking them as devmap
>> +        * with no backing device map range. If there is a backing
>> +        * range, Don't insert a huge entry.
>> +        */
>> +       pagemap = get_dev_pagemap(pfn, NULL);
>> +       if (pagemap) {
>> +               put_dev_pagemap(pagemap);
>> +               goto out_fallback;
>> +       }
>> +
>>          /* Check that memory is contiguous. */
>>          if (!bo->mem.bus.is_iomem) {
>>                  for (i = 1; i < fault_page_size; ++i) {
>> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>                  }
>>          }
>>
>> -       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
>> +       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
>>          if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
>>                  ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
>>   #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>          if (ret != VM_FAULT_NOPAGE)
>>                  goto out_fallback;
>>
>> +#if 1
>> +       {
>> +               int npages;
>> +               struct page *page;
>> +
>> +               npages = get_user_pages_fast_only(vmf->address, 1, 0,
>> &page);
>> +               if (npages == 1) {
>> +                       DRM_WARN("Fast gup succeeded. Bad.\n");
>> +                       put_page(page);
>> +               } else {
>> +                       DRM_INFO("Fast gup failed. Good.\n");
>> +               }
>> +       }
>> +#endif
>> +
>>          return VM_FAULT_NOPAGE;
>>   out_fallback:
>>          count_vm_event(THP_FAULT_FALLBACK);
>>
>>
>>
>>
>>

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
@ 2021-03-11 13:12                                       ` Thomas Hellström (Intel)
  0 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-11 13:12 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

Hi!

On 3/11/21 2:00 PM, Daniel Vetter wrote:
> On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
>> On 3/1/21 3:09 PM, Daniel Vetter wrote:
>>> On Mon, Mar 1, 2021 at 11:17 AM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>>
>>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
>>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
>>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
>>>>>> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
>>>>>>>>>> good
>>>>>>>>>> if Christian can check this with some direct io to a buffer in
>>>>>>>>>> system
>>>>>>>>>> memory.
>>>>>>>>> Hmm,
>>>>>>>>>
>>>>>>>>> Docs (again vm_normal_page() say)
>>>>>>>>>
>>>>>>>>>       * VM_MIXEDMAP mappings can likewise contain memory with or
>>>>>>>>> without "struct
>>>>>>>>>       * page" backing, however the difference is that _all_ pages
>>>>>>>>> with a struct
>>>>>>>>>       * page (that is, those where pfn_valid is true) are refcounted
>>>>>>>>> and
>>>>>>>>> considered
>>>>>>>>>       * normal pages by the VM. The disadvantage is that pages are
>>>>>>>>> refcounted
>>>>>>>>>       * (which can be slower and simply not an option for some PFNMAP
>>>>>>>>> users). The
>>>>>>>>>       * advantage is that we don't have to follow the strict
>>>>>>>>> linearity rule of
>>>>>>>>>       * PFNMAP mappings in order to support COWable mappings.
>>>>>>>>>
>>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
>>>>>>>>> path, so
>>>>>>>>> the above isn't really true, which makes me wonder if and in that
>>>>>>>>> case
>>>>>>>>> why there could any longer ever be a significant performance
>>>>>>>>> difference
>>>>>>>>> between MIXEDMAP and PFNMAP.
>>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>>>>>>> what sticks.
>>>>>>>>
>>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
>>>>>>>>> devmap
>>>>>>>>> hack, so they are (for the non-gup case) relying on
>>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
>>>>>>>>> there.
>>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>>>>>>> to find the underlying page.
>>>>>>>> -Daniel
>>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
>>>>>>> to set
>>>>>>>
>>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
>>>>>>> true, and
>>>>>>> then
>>>>>>>
>>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
>>>>>>> gup_fast()
>>>>>>> backs off,
>>>>>>>
>>>>>>> in the end that would mean setting in stone that "if there is a huge
>>>>>>> devmap
>>>>>>> page table entry for which we haven't registered any devmap struct
>>>>>>> pages
>>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special"
>>>>>>> huge
>>>>>>> page table entry".
>>>>>>>
>>>>>>>    From what I can tell, all code calling get_dev_pagemap() already
>>>>>>> does that,
>>>>>>> it's just a question of getting it accepted and formalizing it.
>>>>>> Oh I thought that's already how it works, since I didn't spot anything
>>>>>> else that would block gup_fast from falling over. I guess really would
>>>>>> need some testcases to make sure direct i/o (that's the easiest to test)
>>>>>> fails like we expect.
>>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
>>>>> Otherwise pmd_devmap() will not return true and since there is no
>>>>> pmd_special() things break.
>>>> Is that maybe the issue we have seen with amdgpu and huge pages?
>>> Yeah, essentially when you have a hugepte inserted by ttm, and it
>>> happens to point at system memory, then gup will work on that. And
>>> create all kinds of havoc.
>>>
>>>> Apart from that I'm lost guys, that devmap and gup stuff is not
>>>> something I have a good knowledge of apart from a one mile high view.
>>> I'm not really better, hence would be good to do a testcase and see.
>>> This should provoke it:
>>> - allocate nicely aligned bo in system memory
>>> - mmap, again nicely aligned to 2M
>>> - do some direct io from a filesystem into that mmap, that should trigger gup
>>> - before the gup completes free the mmap and bo so that ttm recycles
>>> the pages, which should trip up on the elevated refcount. If you wait
>>> until the direct io is completely, then I think nothing bad can be
>>> observed.
>>>
>>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
>>> another issue.
>>>
>>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
>>> -Daniel
>> So I did the following quick experiment on vmwgfx, and it turns out that
>> with it,
>> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds
>>
>> I should probably craft an RFC formalizing this.
> Yeah I think that would be good. Maybe even more formalized if we also
> switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
> fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or
> something like that.
>
> Otoh your description of when it only sometimes succeeds would indicate my
> understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.

My understanding from reading the vmf_insert_mixed() code is that iff 
the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's 
not consistent with the vm_normal_page() doc. For architectures without 
pte_special, VM_PFNMAP must be used, and then we must also block COW 
mappings.

If we can get someone can commit to verify that the potential PAT WC 
performance issue is gone with PFNMAP, I can put together a series with 
that included.

As for existing userspace using COW TTM mappings, I once had a couple of 
test cases to verify that it actually worked, in particular together 
with huge PMDs and PUDs where breaking COW would imply splitting those, 
but I can't think of anything else actually wanting to do that other 
than by mistake.

/Thomas


>
> Christian, what's your take?
> -Daniel
>
>> /Thomas
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> index 6dc96cf66744..72b6fb17c984 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>          pfn_t pfnt;
>>          struct ttm_tt *ttm = bo->ttm;
>>          bool write = vmf->flags & FAULT_FLAG_WRITE;
>> +       struct dev_pagemap *pagemap;
>>
>>          /* Fault should not cross bo boundary. */
>>          page_offset &= ~(fault_page_size - 1);
>> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>          if ((pfn & (fault_page_size - 1)) != 0)
>>                  goto out_fallback;
>>
>> +       /*
>> +        * Huge entries must be special, that is marking them as devmap
>> +        * with no backing device map range. If there is a backing
>> +        * range, Don't insert a huge entry.
>> +        */
>> +       pagemap = get_dev_pagemap(pfn, NULL);
>> +       if (pagemap) {
>> +               put_dev_pagemap(pagemap);
>> +               goto out_fallback;
>> +       }
>> +
>>          /* Check that memory is contiguous. */
>>          if (!bo->mem.bus.is_iomem) {
>>                  for (i = 1; i < fault_page_size; ++i) {
>> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>                  }
>>          }
>>
>> -       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
>> +       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
>>          if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
>>                  ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
>>   #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>          if (ret != VM_FAULT_NOPAGE)
>>                  goto out_fallback;
>>
>> +#if 1
>> +       {
>> +               int npages;
>> +               struct page *page;
>> +
>> +               npages = get_user_pages_fast_only(vmf->address, 1, 0,
>> &page);
>> +               if (npages == 1) {
>> +                       DRM_WARN("Fast gup succeeded. Bad.\n");
>> +                       put_page(page);
>> +               } else {
>> +                       DRM_INFO("Fast gup failed. Good.\n");
>> +               }
>> +       }
>> +#endif
>> +
>>          return VM_FAULT_NOPAGE;
>>   out_fallback:
>>          count_vm_event(THP_FAULT_FALLBACK);
>>
>>
>>
>>
>>
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Intel-gfx] [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
@ 2021-03-11 13:12                                       ` Thomas Hellström (Intel)
  0 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-11 13:12 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

Hi!

On 3/11/21 2:00 PM, Daniel Vetter wrote:
> On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
>> On 3/1/21 3:09 PM, Daniel Vetter wrote:
>>> On Mon, Mar 1, 2021 at 11:17 AM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>>
>>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
>>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
>>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
>>>>>> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
>>>>>>>>>> good
>>>>>>>>>> if Christian can check this with some direct io to a buffer in
>>>>>>>>>> system
>>>>>>>>>> memory.
>>>>>>>>> Hmm,
>>>>>>>>>
>>>>>>>>> Docs (again vm_normal_page() say)
>>>>>>>>>
>>>>>>>>>       * VM_MIXEDMAP mappings can likewise contain memory with or
>>>>>>>>> without "struct
>>>>>>>>>       * page" backing, however the difference is that _all_ pages
>>>>>>>>> with a struct
>>>>>>>>>       * page (that is, those where pfn_valid is true) are refcounted
>>>>>>>>> and
>>>>>>>>> considered
>>>>>>>>>       * normal pages by the VM. The disadvantage is that pages are
>>>>>>>>> refcounted
>>>>>>>>>       * (which can be slower and simply not an option for some PFNMAP
>>>>>>>>> users). The
>>>>>>>>>       * advantage is that we don't have to follow the strict
>>>>>>>>> linearity rule of
>>>>>>>>>       * PFNMAP mappings in order to support COWable mappings.
>>>>>>>>>
>>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
>>>>>>>>> path, so
>>>>>>>>> the above isn't really true, which makes me wonder if and in that
>>>>>>>>> case
>>>>>>>>> why there could any longer ever be a significant performance
>>>>>>>>> difference
>>>>>>>>> between MIXEDMAP and PFNMAP.
>>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>>>>>>> what sticks.
>>>>>>>>
>>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
>>>>>>>>> devmap
>>>>>>>>> hack, so they are (for the non-gup case) relying on
>>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
>>>>>>>>> there.
>>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>>>>>>> to find the underlying page.
>>>>>>>> -Daniel
>>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
>>>>>>> to set
>>>>>>>
>>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
>>>>>>> true, and
>>>>>>> then
>>>>>>>
>>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
>>>>>>> gup_fast()
>>>>>>> backs off,
>>>>>>>
>>>>>>> in the end that would mean setting in stone that "if there is a huge
>>>>>>> devmap
>>>>>>> page table entry for which we haven't registered any devmap struct
>>>>>>> pages
>>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special"
>>>>>>> huge
>>>>>>> page table entry".
>>>>>>>
>>>>>>>    From what I can tell, all code calling get_dev_pagemap() already
>>>>>>> does that,
>>>>>>> it's just a question of getting it accepted and formalizing it.
>>>>>> Oh I thought that's already how it works, since I didn't spot anything
>>>>>> else that would block gup_fast from falling over. I guess really would
>>>>>> need some testcases to make sure direct i/o (that's the easiest to test)
>>>>>> fails like we expect.
>>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
>>>>> Otherwise pmd_devmap() will not return true and since there is no
>>>>> pmd_special() things break.
>>>> Is that maybe the issue we have seen with amdgpu and huge pages?
>>> Yeah, essentially when you have a hugepte inserted by ttm, and it
>>> happens to point at system memory, then gup will work on that. And
>>> create all kinds of havoc.
>>>
>>>> Apart from that I'm lost guys, that devmap and gup stuff is not
>>>> something I have a good knowledge of apart from a one mile high view.
>>> I'm not really better, hence would be good to do a testcase and see.
>>> This should provoke it:
>>> - allocate nicely aligned bo in system memory
>>> - mmap, again nicely aligned to 2M
>>> - do some direct io from a filesystem into that mmap, that should trigger gup
>>> - before the gup completes free the mmap and bo so that ttm recycles
>>> the pages, which should trip up on the elevated refcount. If you wait
>>> until the direct io is completely, then I think nothing bad can be
>>> observed.
>>>
>>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
>>> another issue.
>>>
>>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
>>> -Daniel
>> So I did the following quick experiment on vmwgfx, and it turns out that
>> with it,
>> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds
>>
>> I should probably craft an RFC formalizing this.
> Yeah I think that would be good. Maybe even more formalized if we also
> switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
> fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or
> something like that.
>
> Otoh your description of when it only sometimes succeeds would indicate my
> understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.

My understanding from reading the vmf_insert_mixed() code is that iff 
the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's 
not consistent with the vm_normal_page() doc. For architectures without 
pte_special, VM_PFNMAP must be used, and then we must also block COW 
mappings.

If we can get someone can commit to verify that the potential PAT WC 
performance issue is gone with PFNMAP, I can put together a series with 
that included.

As for existing userspace using COW TTM mappings, I once had a couple of 
test cases to verify that it actually worked, in particular together 
with huge PMDs and PUDs where breaking COW would imply splitting those, 
but I can't think of anything else actually wanting to do that other 
than by mistake.

/Thomas


>
> Christian, what's your take?
> -Daniel
>
>> /Thomas
>>
>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> index 6dc96cf66744..72b6fb17c984 100644
>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>          pfn_t pfnt;
>>          struct ttm_tt *ttm = bo->ttm;
>>          bool write = vmf->flags & FAULT_FLAG_WRITE;
>> +       struct dev_pagemap *pagemap;
>>
>>          /* Fault should not cross bo boundary. */
>>          page_offset &= ~(fault_page_size - 1);
>> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>          if ((pfn & (fault_page_size - 1)) != 0)
>>                  goto out_fallback;
>>
>> +       /*
>> +        * Huge entries must be special, that is marking them as devmap
>> +        * with no backing device map range. If there is a backing
>> +        * range, Don't insert a huge entry.
>> +        */
>> +       pagemap = get_dev_pagemap(pfn, NULL);
>> +       if (pagemap) {
>> +               put_dev_pagemap(pagemap);
>> +               goto out_fallback;
>> +       }
>> +
>>          /* Check that memory is contiguous. */
>>          if (!bo->mem.bus.is_iomem) {
>>                  for (i = 1; i < fault_page_size; ++i) {
>> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>                  }
>>          }
>>
>> -       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
>> +       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
>>          if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
>>                  ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
>>   #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>> *vmf,
>>          if (ret != VM_FAULT_NOPAGE)
>>                  goto out_fallback;
>>
>> +#if 1
>> +       {
>> +               int npages;
>> +               struct page *page;
>> +
>> +               npages = get_user_pages_fast_only(vmf->address, 1, 0,
>> &page);
>> +               if (npages == 1) {
>> +                       DRM_WARN("Fast gup succeeded. Bad.\n");
>> +                       put_page(page);
>> +               } else {
>> +                       DRM_INFO("Fast gup failed. Good.\n");
>> +               }
>> +       }
>> +#endif
>> +
>>          return VM_FAULT_NOPAGE;
>>   out_fallback:
>>          count_vm_event(THP_FAULT_FALLBACK);
>>
>>
>>
>>
>>

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-11 13:12                                       ` Thomas Hellström (Intel)
  (?)
@ 2021-03-11 13:17                                         ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-03-11 13:17 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: Christian König, Christian König,
	Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
>
> Hi!
>
> On 3/11/21 2:00 PM, Daniel Vetter wrote:
> > On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
> >> On 3/1/21 3:09 PM, Daniel Vetter wrote:
> >>> On Mon, Mar 1, 2021 at 11:17 AM Christian König
> >>> <christian.koenig@amd.com> wrote:
> >>>>
> >>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
> >>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
> >>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
> >>>>>> wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
> >>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
> >>>>>>>> <thomas_os@shipmail.org> wrote:
> >>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
> >>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
> >>>>>>>>>> good
> >>>>>>>>>> if Christian can check this with some direct io to a buffer in
> >>>>>>>>>> system
> >>>>>>>>>> memory.
> >>>>>>>>> Hmm,
> >>>>>>>>>
> >>>>>>>>> Docs (again vm_normal_page() say)
> >>>>>>>>>
> >>>>>>>>>       * VM_MIXEDMAP mappings can likewise contain memory with or
> >>>>>>>>> without "struct
> >>>>>>>>>       * page" backing, however the difference is that _all_ pages
> >>>>>>>>> with a struct
> >>>>>>>>>       * page (that is, those where pfn_valid is true) are refcounted
> >>>>>>>>> and
> >>>>>>>>> considered
> >>>>>>>>>       * normal pages by the VM. The disadvantage is that pages are
> >>>>>>>>> refcounted
> >>>>>>>>>       * (which can be slower and simply not an option for some PFNMAP
> >>>>>>>>> users). The
> >>>>>>>>>       * advantage is that we don't have to follow the strict
> >>>>>>>>> linearity rule of
> >>>>>>>>>       * PFNMAP mappings in order to support COWable mappings.
> >>>>>>>>>
> >>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
> >>>>>>>>> path, so
> >>>>>>>>> the above isn't really true, which makes me wonder if and in that
> >>>>>>>>> case
> >>>>>>>>> why there could any longer ever be a significant performance
> >>>>>>>>> difference
> >>>>>>>>> between MIXEDMAP and PFNMAP.
> >>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
> >>>>>>>> what sticks.
> >>>>>>>>
> >>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
> >>>>>>>>> devmap
> >>>>>>>>> hack, so they are (for the non-gup case) relying on
> >>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
> >>>>>>>>> there.
> >>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
> >>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
> >>>>>>>> to find the underlying page.
> >>>>>>>> -Daniel
> >>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
> >>>>>>> to set
> >>>>>>>
> >>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
> >>>>>>> true, and
> >>>>>>> then
> >>>>>>>
> >>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
> >>>>>>> gup_fast()
> >>>>>>> backs off,
> >>>>>>>
> >>>>>>> in the end that would mean setting in stone that "if there is a huge
> >>>>>>> devmap
> >>>>>>> page table entry for which we haven't registered any devmap struct
> >>>>>>> pages
> >>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special"
> >>>>>>> huge
> >>>>>>> page table entry".
> >>>>>>>
> >>>>>>>    From what I can tell, all code calling get_dev_pagemap() already
> >>>>>>> does that,
> >>>>>>> it's just a question of getting it accepted and formalizing it.
> >>>>>> Oh I thought that's already how it works, since I didn't spot anything
> >>>>>> else that would block gup_fast from falling over. I guess really would
> >>>>>> need some testcases to make sure direct i/o (that's the easiest to test)
> >>>>>> fails like we expect.
> >>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
> >>>>> Otherwise pmd_devmap() will not return true and since there is no
> >>>>> pmd_special() things break.
> >>>> Is that maybe the issue we have seen with amdgpu and huge pages?
> >>> Yeah, essentially when you have a hugepte inserted by ttm, and it
> >>> happens to point at system memory, then gup will work on that. And
> >>> create all kinds of havoc.
> >>>
> >>>> Apart from that I'm lost guys, that devmap and gup stuff is not
> >>>> something I have a good knowledge of apart from a one mile high view.
> >>> I'm not really better, hence would be good to do a testcase and see.
> >>> This should provoke it:
> >>> - allocate nicely aligned bo in system memory
> >>> - mmap, again nicely aligned to 2M
> >>> - do some direct io from a filesystem into that mmap, that should trigger gup
> >>> - before the gup completes free the mmap and bo so that ttm recycles
> >>> the pages, which should trip up on the elevated refcount. If you wait
> >>> until the direct io is completely, then I think nothing bad can be
> >>> observed.
> >>>
> >>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
> >>> another issue.
> >>>
> >>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
> >>> -Daniel
> >> So I did the following quick experiment on vmwgfx, and it turns out that
> >> with it,
> >> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds
> >>
> >> I should probably craft an RFC formalizing this.
> > Yeah I think that would be good. Maybe even more formalized if we also
> > switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
> > fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or
> > something like that.
> >
> > Otoh your description of when it only sometimes succeeds would indicate my
> > understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.
>
> My understanding from reading the vmf_insert_mixed() code is that iff
> the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's
> not consistent with the vm_normal_page() doc. For architectures without
> pte_special, VM_PFNMAP must be used, and then we must also block COW
> mappings.
>
> If we can get someone can commit to verify that the potential PAT WC
> performance issue is gone with PFNMAP, I can put together a series with
> that included.

Iirc when I checked there aren't many archs without pte_special, so I
guess that's why we luck out. Hopefully.

> As for existing userspace using COW TTM mappings, I once had a couple of
> test cases to verify that it actually worked, in particular together
> with huge PMDs and PUDs where breaking COW would imply splitting those,
> but I can't think of anything else actually wanting to do that other
> than by mistake.

Yeah disallowing MAP_PRIVATE mappings would be another good thing to
lock down. Really doesn't make much sense.
-Daniel
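
A lockdown like that would presumably amount to little more than the check
below somewhere in the dma-buf mmap paths (sketch only, function name made
up, not a patch from this thread; note it goes further than the pure COW
check by also refusing read-only private mappings):

#include <linux/mm.h>

/* Sketch: MAP_PRIVATE mappings never have VM_SHARED set; refuse them. */
static int example_reject_map_private(struct vm_area_struct *vma)
{
	if (!(vma->vm_flags & VM_SHARED))
		return -EINVAL;

	return 0;
}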

> /Thomas
>
>
> >
> > Christian, what's your take?
> > -Daniel
> >
> >> /Thomas
> >>
> >> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> index 6dc96cf66744..72b6fb17c984 100644
> >> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>          pfn_t pfnt;
> >>          struct ttm_tt *ttm = bo->ttm;
> >>          bool write = vmf->flags & FAULT_FLAG_WRITE;
> >> +       struct dev_pagemap *pagemap;
> >>
> >>          /* Fault should not cross bo boundary. */
> >>          page_offset &= ~(fault_page_size - 1);
> >> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>          if ((pfn & (fault_page_size - 1)) != 0)
> >>                  goto out_fallback;
> >>
> >> +       /*
> >> +        * Huge entries must be special, that is marking them as devmap
> >> +        * with no backing device map range. If there is a backing
> >> +        * range, Don't insert a huge entry.
> >> +        */
> >> +       pagemap = get_dev_pagemap(pfn, NULL);
> >> +       if (pagemap) {
> >> +               put_dev_pagemap(pagemap);
> >> +               goto out_fallback;
> >> +       }
> >> +
> >>          /* Check that memory is contiguous. */
> >>          if (!bo->mem.bus.is_iomem) {
> >>                  for (i = 1; i < fault_page_size; ++i) {
> >> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>                  }
> >>          }
> >>
> >> -       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
> >> +       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
> >>          if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
> >>                  ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
> >>   #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> >> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>          if (ret != VM_FAULT_NOPAGE)
> >>                  goto out_fallback;
> >>
> >> +#if 1
> >> +       {
> >> +               int npages;
> >> +               struct page *page;
> >> +
> >> +               npages = get_user_pages_fast_only(vmf->address, 1, 0,
> >> &page);
> >> +               if (npages == 1) {
> >> +                       DRM_WARN("Fast gup succeeded. Bad.\n");
> >> +                       put_page(page);
> >> +               } else {
> >> +                       DRM_INFO("Fast gup failed. Good.\n");
> >> +               }
> >> +       }
> >> +#endif
> >> +
> >>          return VM_FAULT_NOPAGE;
> >>   out_fallback:
> >>          count_vm_event(THP_FAULT_FALLBACK);
> >>
> >>
> >>
> >>
> >>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
@ 2021-03-11 13:17                                         ` Daniel Vetter
  0 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-03-11 13:17 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
>
> Hi!
>
> On 3/11/21 2:00 PM, Daniel Vetter wrote:
> > On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
> >> On 3/1/21 3:09 PM, Daniel Vetter wrote:
> >>> On Mon, Mar 1, 2021 at 11:17 AM Christian König
> >>> <christian.koenig@amd.com> wrote:
> >>>>
> >>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
> >>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
> >>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
> >>>>>> wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
> >>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
> >>>>>>>> <thomas_os@shipmail.org> wrote:
> >>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
> >>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
> >>>>>>>>>> good
> >>>>>>>>>> if Christian can check this with some direct io to a buffer in
> >>>>>>>>>> system
> >>>>>>>>>> memory.
> >>>>>>>>> Hmm,
> >>>>>>>>>
> >>>>>>>>> Docs (again vm_normal_page() say)
> >>>>>>>>>
> >>>>>>>>>       * VM_MIXEDMAP mappings can likewise contain memory with or
> >>>>>>>>> without "struct
> >>>>>>>>>       * page" backing, however the difference is that _all_ pages
> >>>>>>>>> with a struct
> >>>>>>>>>       * page (that is, those where pfn_valid is true) are refcounted
> >>>>>>>>> and
> >>>>>>>>> considered
> >>>>>>>>>       * normal pages by the VM. The disadvantage is that pages are
> >>>>>>>>> refcounted
> >>>>>>>>>       * (which can be slower and simply not an option for some PFNMAP
> >>>>>>>>> users). The
> >>>>>>>>>       * advantage is that we don't have to follow the strict
> >>>>>>>>> linearity rule of
> >>>>>>>>>       * PFNMAP mappings in order to support COWable mappings.
> >>>>>>>>>
> >>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
> >>>>>>>>> path, so
> >>>>>>>>> the above isn't really true, which makes me wonder if and in that
> >>>>>>>>> case
> >>>>>>>>> why there could any longer ever be a significant performance
> >>>>>>>>> difference
> >>>>>>>>> between MIXEDMAP and PFNMAP.
> >>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
> >>>>>>>> what sticks.
> >>>>>>>>
> >>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
> >>>>>>>>> devmap
> >>>>>>>>> hack, so they are (for the non-gup case) relying on
> >>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
> >>>>>>>>> there.
> >>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
> >>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
> >>>>>>>> to find the underlying page.
> >>>>>>>> -Daniel
> >>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
> >>>>>>> to set
> >>>>>>>
> >>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
> >>>>>>> true, and
> >>>>>>> then
> >>>>>>>
> >>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
> >>>>>>> gup_fast()
> >>>>>>> backs off,
> >>>>>>>
> >>>>>>> in the end that would mean setting in stone that "if there is a huge
> >>>>>>> devmap
> >>>>>>> page table entry for which we haven't registered any devmap struct
> >>>>>>> pages
> >>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special"
> >>>>>>> huge
> >>>>>>> page table entry".
> >>>>>>>
> >>>>>>>    From what I can tell, all code calling get_dev_pagemap() already
> >>>>>>> does that,
> >>>>>>> it's just a question of getting it accepted and formalizing it.
> >>>>>> Oh I thought that's already how it works, since I didn't spot anything
> >>>>>> else that would block gup_fast from falling over. I guess really would
> >>>>>> need some testcases to make sure direct i/o (that's the easiest to test)
> >>>>>> fails like we expect.
> >>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
> >>>>> Otherwise pmd_devmap() will not return true and since there is no
> >>>>> pmd_special() things break.
> >>>> Is that maybe the issue we have seen with amdgpu and huge pages?
> >>> Yeah, essentially when you have a hugepte inserted by ttm, and it
> >>> happens to point at system memory, then gup will work on that. And
> >>> create all kinds of havoc.
> >>>
> >>>> Apart from that I'm lost guys, that devmap and gup stuff is not
> >>>> something I have a good knowledge of apart from a one mile high view.
> >>> I'm not really better, hence would be good to do a testcase and see.
> >>> This should provoke it:
> >>> - allocate nicely aligned bo in system memory
> >>> - mmap, again nicely aligned to 2M
> >>> - do some direct io from a filesystem into that mmap, that should trigger gup
> >>> - before the gup completes free the mmap and bo so that ttm recycles
> >>> the pages, which should trip up on the elevated refcount. If you wait
> >>> until the direct io is completely, then I think nothing bad can be
> >>> observed.
> >>>
> >>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
> >>> another issue.
> >>>
> >>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
> >>> -Daniel
> >> So I did the following quick experiment on vmwgfx, and it turns out that
> >> with it,
> >> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds
> >>
> >> I should probably craft an RFC formalizing this.
> > Yeah I think that would be good. Maybe even more formalized if we also
> > switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
> > fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or
> > something like that.
> >
> > Otoh your description of when it only sometimes succeeds would indicate my
> > understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.
>
> My understanding from reading the vmf_insert_mixed() code is that iff
> the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's
> not consistent with the vm_normal_page() doc. For architectures without
> pte_special, VM_PFNMAP must be used, and then we must also block COW
> mappings.
>
> If we can get someone can commit to verify that the potential PAT WC
> performance issue is gone with PFNMAP, I can put together a series with
> that included.

Iirc when I checked there aren't many archs without pte_special, so I
guess that's why we luck out. Hopefully.

> As for existing userspace using COW TTM mappings, I once had a couple of
> test cases to verify that it actually worked, in particular together
> with huge PMDs and PUDs where breaking COW would imply splitting those,
> but I can't think of anything else actually wanting to do that other
> than by mistake.

Yeah disallowing MAP_PRIVATE mappings would be another good thing to
lock down. Really doesn't make much sense.
-Daniel

> /Thomas
>
>
> >
> > Christian, what's your take?
> > -Daniel
> >
> >> /Thomas
> >>
> >> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> index 6dc96cf66744..72b6fb17c984 100644
> >> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>          pfn_t pfnt;
> >>          struct ttm_tt *ttm = bo->ttm;
> >>          bool write = vmf->flags & FAULT_FLAG_WRITE;
> >> +       struct dev_pagemap *pagemap;
> >>
> >>          /* Fault should not cross bo boundary. */
> >>          page_offset &= ~(fault_page_size - 1);
> >> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>          if ((pfn & (fault_page_size - 1)) != 0)
> >>                  goto out_fallback;
> >>
> >> +       /*
> >> +        * Huge entries must be special, that is marking them as devmap
> >> +        * with no backing device map range. If there is a backing
> >> +        * range, Don't insert a huge entry.
> >> +        */
> >> +       pagemap = get_dev_pagemap(pfn, NULL);
> >> +       if (pagemap) {
> >> +               put_dev_pagemap(pagemap);
> >> +               goto out_fallback;
> >> +       }
> >> +
> >>          /* Check that memory is contiguous. */
> >>          if (!bo->mem.bus.is_iomem) {
> >>                  for (i = 1; i < fault_page_size; ++i) {
> >> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>                  }
> >>          }
> >>
> >> -       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
> >> +       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
> >>          if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
> >>                  ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
> >>   #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> >> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>          if (ret != VM_FAULT_NOPAGE)
> >>                  goto out_fallback;
> >>
> >> +#if 1
> >> +       {
> >> +               int npages;
> >> +               struct page *page;
> >> +
> >> +               npages = get_user_pages_fast_only(vmf->address, 1, 0,
> >> &page);
> >> +               if (npages == 1) {
> >> +                       DRM_WARN("Fast gup succeeded. Bad.\n");
> >> +                       put_page(page);
> >> +               } else {
> >> +                       DRM_INFO("Fast gup failed. Good.\n");
> >> +               }
> >> +       }
> >> +#endif
> >> +
> >>          return VM_FAULT_NOPAGE;
> >>   out_fallback:
> >>          count_vm_event(THP_FAULT_FALLBACK);
> >>
> >>
> >>
> >>
> >>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Intel-gfx] [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
@ 2021-03-11 13:17                                         ` Daniel Vetter
  0 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-03-11 13:17 UTC (permalink / raw)
  To: Thomas Hellström (Intel)
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel)
<thomas_os@shipmail.org> wrote:
>
> Hi!
>
> On 3/11/21 2:00 PM, Daniel Vetter wrote:
> > On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
> >> On 3/1/21 3:09 PM, Daniel Vetter wrote:
> >>> On Mon, Mar 1, 2021 at 11:17 AM Christian König
> >>> <christian.koenig@amd.com> wrote:
> >>>>
> >>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
> >>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
> >>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
> >>>>>> wrote:
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
> >>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
> >>>>>>>> <thomas_os@shipmail.org> wrote:
> >>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
> >>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
> >>>>>>>>>> good
> >>>>>>>>>> if Christian can check this with some direct io to a buffer in
> >>>>>>>>>> system
> >>>>>>>>>> memory.
> >>>>>>>>> Hmm,
> >>>>>>>>>
> >>>>>>>>> Docs (again vm_normal_page() say)
> >>>>>>>>>
> >>>>>>>>>       * VM_MIXEDMAP mappings can likewise contain memory with or
> >>>>>>>>> without "struct
> >>>>>>>>>       * page" backing, however the difference is that _all_ pages
> >>>>>>>>> with a struct
> >>>>>>>>>       * page (that is, those where pfn_valid is true) are refcounted
> >>>>>>>>> and
> >>>>>>>>> considered
> >>>>>>>>>       * normal pages by the VM. The disadvantage is that pages are
> >>>>>>>>> refcounted
> >>>>>>>>>       * (which can be slower and simply not an option for some PFNMAP
> >>>>>>>>> users). The
> >>>>>>>>>       * advantage is that we don't have to follow the strict
> >>>>>>>>> linearity rule of
> >>>>>>>>>       * PFNMAP mappings in order to support COWable mappings.
> >>>>>>>>>
> >>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
> >>>>>>>>> path, so
> >>>>>>>>> the above isn't really true, which makes me wonder if and in that
> >>>>>>>>> case
> >>>>>>>>> why there could any longer ever be a significant performance
> >>>>>>>>> difference
> >>>>>>>>> between MIXEDMAP and PFNMAP.
> >>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
> >>>>>>>> what sticks.
> >>>>>>>>
> >>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
> >>>>>>>>> devmap
> >>>>>>>>> hack, so they are (for the non-gup case) relying on
> >>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
> >>>>>>>>> there.
> >>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
> >>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
> >>>>>>>> to find the underlying page.
> >>>>>>>> -Daniel
> >>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
> >>>>>>> to set
> >>>>>>>
> >>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
> >>>>>>> true, and
> >>>>>>> then
> >>>>>>>
> >>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
> >>>>>>> gup_fast()
> >>>>>>> backs off,
> >>>>>>>
> >>>>>>> in the end that would mean setting in stone that "if there is a huge
> >>>>>>> devmap
> >>>>>>> page table entry for which we haven't registered any devmap struct
> >>>>>>> pages
> >>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special"
> >>>>>>> huge
> >>>>>>> page table entry".
> >>>>>>>
> >>>>>>>    From what I can tell, all code calling get_dev_pagemap() already
> >>>>>>> does that,
> >>>>>>> it's just a question of getting it accepted and formalizing it.
> >>>>>> Oh I thought that's already how it works, since I didn't spot anything
> >>>>>> else that would block gup_fast from falling over. I guess really would
> >>>>>> need some testcases to make sure direct i/o (that's the easiest to test)
> >>>>>> fails like we expect.
> >>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
> >>>>> Otherwise pmd_devmap() will not return true and since there is no
> >>>>> pmd_special() things break.
> >>>> Is that maybe the issue we have seen with amdgpu and huge pages?
> >>> Yeah, essentially when you have a hugepte inserted by ttm, and it
> >>> happens to point at system memory, then gup will work on that. And
> >>> create all kinds of havoc.
> >>>
> >>>> Apart from that I'm lost guys, that devmap and gup stuff is not
> >>>> something I have a good knowledge of apart from a one mile high view.
> >>> I'm not really better, hence would be good to do a testcase and see.
> >>> This should provoke it:
> >>> - allocate nicely aligned bo in system memory
> >>> - mmap, again nicely aligned to 2M
> >>> - do some direct io from a filesystem into that mmap, that should trigger gup
> >>> - before the gup completes free the mmap and bo so that ttm recycles
> >>> the pages, which should trip up on the elevated refcount. If you wait
> >>> until the direct io is completely, then I think nothing bad can be
> >>> observed.
> >>>
> >>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
> >>> another issue.
> >>>
> >>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
> >>> -Daniel
> >> So I did the following quick experiment on vmwgfx, and it turns out that
> >> with it,
> >> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds
> >>
> >> I should probably craft an RFC formalizing this.
> > Yeah I think that would be good. Maybe even more formalized if we also
> > switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
> > fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or
> > something like that.
> >
> > Otoh your description of when it only sometimes succeeds would indicate my
> > understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.
>
> My understanding from reading the vmf_insert_mixed() code is that iff
> the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's
> not consistent with the vm_normal_page() doc. For architectures without
> pte_special, VM_PFNMAP must be used, and then we must also block COW
> mappings.
>
> If we can get someone can commit to verify that the potential PAT WC
> performance issue is gone with PFNMAP, I can put together a series with
> that included.

Iirc when I checked there aren't many archs without pte_special, so I
guess that's why we luck out. Hopefully.

> As for existing userspace using COW TTM mappings, I once had a couple of
> test cases to verify that it actually worked, in particular together
> with huge PMDs and PUDs where breaking COW would imply splitting those,
> but I can't think of anything else actually wanting to do that other
> than by mistake.

Yeah disallowing MAP_PRIVATE mappings would be another good thing to
lock down. Really doesn't make much sense.
-Daniel

> /Thomas
>
>
> >
> > Christian, what's your take?
> > -Daniel
> >
> >> /Thomas
> >>
> >> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> index 6dc96cf66744..72b6fb17c984 100644
> >> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
> >> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>          pfn_t pfnt;
> >>          struct ttm_tt *ttm = bo->ttm;
> >>          bool write = vmf->flags & FAULT_FLAG_WRITE;
> >> +       struct dev_pagemap *pagemap;
> >>
> >>          /* Fault should not cross bo boundary. */
> >>          page_offset &= ~(fault_page_size - 1);
> >> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>          if ((pfn & (fault_page_size - 1)) != 0)
> >>                  goto out_fallback;
> >>
> >> +       /*
> >> +        * Huge entries must be special, that is marking them as devmap
> >> +        * with no backing device map range. If there is a backing
> >> +        * range, Don't insert a huge entry.
> >> +        */
> >> +       pagemap = get_dev_pagemap(pfn, NULL);
> >> +       if (pagemap) {
> >> +               put_dev_pagemap(pagemap);
> >> +               goto out_fallback;
> >> +       }
> >> +
> >>          /* Check that memory is contiguous. */
> >>          if (!bo->mem.bus.is_iomem) {
> >>                  for (i = 1; i < fault_page_size; ++i) {
> >> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>                  }
> >>          }
> >>
> >> -       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
> >> +       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
> >>          if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
> >>                  ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
> >>   #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
> >> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
> >> *vmf,
> >>          if (ret != VM_FAULT_NOPAGE)
> >>                  goto out_fallback;
> >>
> >> +#if 1
> >> +       {
> >> +               int npages;
> >> +               struct page *page;
> >> +
> >> +               npages = get_user_pages_fast_only(vmf->address, 1, 0,
> >> &page);
> >> +               if (npages == 1) {
> >> +                       DRM_WARN("Fast gup succeeded. Bad.\n");
> >> +                       put_page(page);
> >> +               } else {
> >> +                       DRM_INFO("Fast gup failed. Good.\n");
> >> +               }
> >> +       }
> >> +#endif
> >> +
> >>          return VM_FAULT_NOPAGE;
> >>   out_fallback:
> >>          count_vm_event(THP_FAULT_FALLBACK);
> >>
> >>
> >>
> >>
> >>



-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-11 13:17                                         ` Daniel Vetter
  (?)
@ 2021-03-11 15:37                                           ` Thomas Hellström (Intel)
  -1 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-11 15:37 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, Christian König,
	Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK


On 3/11/21 2:17 PM, Daniel Vetter wrote:
> On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>> Hi!
>>
>> On 3/11/21 2:00 PM, Daniel Vetter wrote:
>>> On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
>>>> On 3/1/21 3:09 PM, Daniel Vetter wrote:
>>>>> On Mon, Mar 1, 2021 at 11:17 AM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
>>>>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
>>>>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
>>>>>>>> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>>>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>>>>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
>>>>>>>>>>>> good
>>>>>>>>>>>> if Christian can check this with some direct io to a buffer in
>>>>>>>>>>>> system
>>>>>>>>>>>> memory.
>>>>>>>>>>> Hmm,
>>>>>>>>>>>
>>>>>>>>>>> Docs (again vm_normal_page() say)
>>>>>>>>>>>
>>>>>>>>>>>        * VM_MIXEDMAP mappings can likewise contain memory with or
>>>>>>>>>>> without "struct
>>>>>>>>>>>        * page" backing, however the difference is that _all_ pages
>>>>>>>>>>> with a struct
>>>>>>>>>>>        * page (that is, those where pfn_valid is true) are refcounted
>>>>>>>>>>> and
>>>>>>>>>>> considered
>>>>>>>>>>>        * normal pages by the VM. The disadvantage is that pages are
>>>>>>>>>>> refcounted
>>>>>>>>>>>        * (which can be slower and simply not an option for some PFNMAP
>>>>>>>>>>> users). The
>>>>>>>>>>>        * advantage is that we don't have to follow the strict
>>>>>>>>>>> linearity rule of
>>>>>>>>>>>        * PFNMAP mappings in order to support COWable mappings.
>>>>>>>>>>>
>>>>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
>>>>>>>>>>> path, so
>>>>>>>>>>> the above isn't really true, which makes me wonder if and in that
>>>>>>>>>>> case
>>>>>>>>>>> why there could any longer ever be a significant performance
>>>>>>>>>>> difference
>>>>>>>>>>> between MIXEDMAP and PFNMAP.
>>>>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>>>>>>>>> what sticks.
>>>>>>>>>>
>>>>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
>>>>>>>>>>> devmap
>>>>>>>>>>> hack, so they are (for the non-gup case) relying on
>>>>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
>>>>>>>>>>> there.
>>>>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>>>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>>>>>>>>> to find the underlying page.
>>>>>>>>>> -Daniel
>>>>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
>>>>>>>>> to set
>>>>>>>>>
>>>>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
>>>>>>>>> true, and
>>>>>>>>> then
>>>>>>>>>
>>>>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
>>>>>>>>> gup_fast()
>>>>>>>>> backs off,
>>>>>>>>>
>>>>>>>>> in the end that would mean setting in stone that "if there is a huge
>>>>>>>>> devmap
>>>>>>>>> page table entry for which we haven't registered any devmap struct
>>>>>>>>> pages
>>>>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special"
>>>>>>>>> huge
>>>>>>>>> page table entry".
>>>>>>>>>
>>>>>>>>>     From what I can tell, all code calling get_dev_pagemap() already
>>>>>>>>> does that,
>>>>>>>>> it's just a question of getting it accepted and formalizing it.
>>>>>>>> Oh I thought that's already how it works, since I didn't spot anything
>>>>>>>> else that would block gup_fast from falling over. I guess really would
>>>>>>>> need some testcases to make sure direct i/o (that's the easiest to test)
>>>>>>>> fails like we expect.
>>>>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
>>>>>>> Otherwise pmd_devmap() will not return true and since there is no
>>>>>>> pmd_special() things break.
>>>>>> Is that maybe the issue we have seen with amdgpu and huge pages?
>>>>> Yeah, essentially when you have a hugepte inserted by ttm, and it
>>>>> happens to point at system memory, then gup will work on that. And
>>>>> create all kinds of havoc.
>>>>>
>>>>>> Apart from that I'm lost guys, that devmap and gup stuff is not
>>>>>> something I have a good knowledge of apart from a one mile high view.
>>>>> I'm not really better, hence would be good to do a testcase and see.
>>>>> This should provoke it:
>>>>> - allocate nicely aligned bo in system memory
>>>>> - mmap, again nicely aligned to 2M
>>>>> - do some direct io from a filesystem into that mmap, that should trigger gup
>>>>> - before the gup completes free the mmap and bo so that ttm recycles
>>>>> the pages, which should trip up on the elevated refcount. If you wait
>>>>> until the direct io is completely, then I think nothing bad can be
>>>>> observed.
>>>>>
>>>>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
>>>>> another issue.
>>>>>
>>>>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
>>>>> -Daniel
>>>> So I did the following quick experiment on vmwgfx, and it turns out that
>>>> with it,
>>>> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds
>>>>
>>>> I should probably craft an RFC formalizing this.
>>> Yeah I think that would be good. Maybe even more formalized if we also
>>> switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
>>> fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or
>>> something like that.
>>>
>>> Otoh your description of when it only sometimes succeeds would indicate my
>>> understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.
>> My understanding from reading the vmf_insert_mixed() code is that iff
>> the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's
>> not consistent with the vm_normal_page() doc. For architectures without
>> pte_special, VM_PFNMAP must be used, and then we must also block COW
>> mappings.
>>
>> If we can get someone can commit to verify that the potential PAT WC
>> performance issue is gone with PFNMAP, I can put together a series with
>> that included.
> Iirc when I checked there's not much archs without pte_special, so I
> guess that's why we luck out. Hopefully.
>
>> As for existing userspace using COW TTM mappings, I once had a couple of
>> test cases to verify that it actually worked, in particular together
>> with huge PMDs and PUDs where breaking COW would imply splitting those,
>> but I can't think of anything else actually wanting to do that other
>> than by mistake.
> Yeah disallowing MAP_PRIVATE mappings would be another good thing to
> lock down. Really doesn't make much sense.
> -Daniel

Yes, we can't allow them with PFNMAP + a non-linear address space...

/Thomas
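
For context, the linearity assumption in question: without pte_special(),
vm_normal_page() can only tell the originally remapped pfn apart from a
COW'ed anon page by assuming a linear, remap_pfn_range()-style layout.
Roughly, a simplified paraphrase of that VM_PFNMAP branch (addr and pfn
being the faulting address and the pte's pfn; not a literal copy):

	if (vma->vm_flags & VM_PFNMAP) {
		unsigned long off = (addr - vma->vm_start) >> PAGE_SHIFT;

		if (pfn == vma->vm_pgoff + off)
			return NULL;	/* the remapped pfn itself, not a "normal" page */
		if (!is_cow_mapping(vma->vm_flags))
			return NULL;
		/* anything else must be a COW'ed, refcounted anon page */
	}

TTM mappings aren't linear in that sense, so on such an arch VM_PFNMAP plus
COW simply can't be made to work, which is why MAP_PRIVATE has to go.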


>> /Thomas
>>
>>
>>> Christian, what's your take?
>>> -Daniel
>>>
>>>> /Thomas
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> index 6dc96cf66744..72b6fb17c984 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>>>> *vmf,
>>>>           pfn_t pfnt;
>>>>           struct ttm_tt *ttm = bo->ttm;
>>>>           bool write = vmf->flags & FAULT_FLAG_WRITE;
>>>> +       struct dev_pagemap *pagemap;
>>>>
>>>>           /* Fault should not cross bo boundary. */
>>>>           page_offset &= ~(fault_page_size - 1);
>>>> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>>>> *vmf,
>>>>           if ((pfn & (fault_page_size - 1)) != 0)
>>>>                   goto out_fallback;
>>>>
>>>> +       /*
>>>> +        * Huge entries must be special, that is marking them as devmap
>>>> +        * with no backing device map range. If there is a backing
>>>> +        * range, Don't insert a huge entry.
>>>> +        */
>>>> +       pagemap = get_dev_pagemap(pfn, NULL);
>>>> +       if (pagemap) {
>>>> +               put_dev_pagemap(pagemap);
>>>> +               goto out_fallback;
>>>> +       }
>>>> +
>>>>           /* Check that memory is contiguous. */
>>>>           if (!bo->mem.bus.is_iomem) {
>>>>                   for (i = 1; i < fault_page_size; ++i) {
>>>> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>>>> *vmf,
>>>>                   }
>>>>           }
>>>>
>>>> -       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
>>>> +       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
>>>>           if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
>>>>                   ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
>>>>    #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>>>> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>>>> *vmf,
>>>>           if (ret != VM_FAULT_NOPAGE)
>>>>                   goto out_fallback;
>>>>
>>>> +#if 1
>>>> +       {
>>>> +               int npages;
>>>> +               struct page *page;
>>>> +
>>>> +               npages = get_user_pages_fast_only(vmf->address, 1, 0,
>>>> &page);
>>>> +               if (npages == 1) {
>>>> +                       DRM_WARN("Fast gup succeeded. Bad.\n");
>>>> +                       put_page(page);
>>>> +               } else {
>>>> +                       DRM_INFO("Fast gup failed. Good.\n");
>>>> +               }
>>>> +       }
>>>> +#endif
>>>> +
>>>>           return VM_FAULT_NOPAGE;
>>>>    out_fallback:
>>>>           count_vm_event(THP_FAULT_FALLBACK);
>>>>
>>>>
>>>>
>>>>
>>>>
>
>

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
@ 2021-03-11 15:37                                           ` Thomas Hellström (Intel)
  0 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-11 15:37 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK


On 3/11/21 2:17 PM, Daniel Vetter wrote:
> On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>> Hi!
>>
>> On 3/11/21 2:00 PM, Daniel Vetter wrote:
>>> On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
>>>> On 3/1/21 3:09 PM, Daniel Vetter wrote:
>>>>> On Mon, Mar 1, 2021 at 11:17 AM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
>>>>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
>>>>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
>>>>>>>> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>>>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>>>>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
>>>>>>>>>>>> good
>>>>>>>>>>>> if Christian can check this with some direct io to a buffer in
>>>>>>>>>>>> system
>>>>>>>>>>>> memory.
>>>>>>>>>>> Hmm,
>>>>>>>>>>>
>>>>>>>>>>> Docs (again vm_normal_page() say)
>>>>>>>>>>>
>>>>>>>>>>>        * VM_MIXEDMAP mappings can likewise contain memory with or
>>>>>>>>>>> without "struct
>>>>>>>>>>>        * page" backing, however the difference is that _all_ pages
>>>>>>>>>>> with a struct
>>>>>>>>>>>        * page (that is, those where pfn_valid is true) are refcounted
>>>>>>>>>>> and
>>>>>>>>>>> considered
>>>>>>>>>>>        * normal pages by the VM. The disadvantage is that pages are
>>>>>>>>>>> refcounted
>>>>>>>>>>>        * (which can be slower and simply not an option for some PFNMAP
>>>>>>>>>>> users). The
>>>>>>>>>>>        * advantage is that we don't have to follow the strict
>>>>>>>>>>> linearity rule of
>>>>>>>>>>>        * PFNMAP mappings in order to support COWable mappings.
>>>>>>>>>>>
>>>>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
>>>>>>>>>>> path, so
>>>>>>>>>>> the above isn't really true, which makes me wonder if and in that
>>>>>>>>>>> case
>>>>>>>>>>> why there could any longer ever be a significant performance
>>>>>>>>>>> difference
>>>>>>>>>>> between MIXEDMAP and PFNMAP.
>>>>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>>>>>>>>> what sticks.
>>>>>>>>>>
>>>>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
>>>>>>>>>>> devmap
>>>>>>>>>>> hack, so they are (for the non-gup case) relying on
>>>>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
>>>>>>>>>>> there.
>>>>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>>>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>>>>>>>>> to find the underlying page.
>>>>>>>>>> -Daniel
>>>>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
>>>>>>>>> to set
>>>>>>>>>
>>>>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
>>>>>>>>> true, and
>>>>>>>>> then
>>>>>>>>>
>>>>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
>>>>>>>>> gup_fast()
>>>>>>>>> backs off,
>>>>>>>>>
>>>>>>>>> in the end that would mean setting in stone that "if there is a huge
>>>>>>>>> devmap
>>>>>>>>> page table entry for which we haven't registered any devmap struct
>>>>>>>>> pages
>>>>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special"
>>>>>>>>> huge
>>>>>>>>> page table entry".
>>>>>>>>>
>>>>>>>>>     From what I can tell, all code calling get_dev_pagemap() already
>>>>>>>>> does that,
>>>>>>>>> it's just a question of getting it accepted and formalizing it.
>>>>>>>> Oh I thought that's already how it works, since I didn't spot anything
>>>>>>>> else that would block gup_fast from falling over. I guess really would
>>>>>>>> need some testcases to make sure direct i/o (that's the easiest to test)
>>>>>>>> fails like we expect.
>>>>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
>>>>>>> Otherwise pmd_devmap() will not return true and since there is no
>>>>>>> pmd_special() things break.
>>>>>> Is that maybe the issue we have seen with amdgpu and huge pages?
>>>>> Yeah, essentially when you have a hugepte inserted by ttm, and it
>>>>> happens to point at system memory, then gup will work on that. And
>>>>> create all kinds of havoc.
>>>>>
>>>>>> Apart from that I'm lost guys, that devmap and gup stuff is not
>>>>>> something I have a good knowledge of apart from a one mile high view.
>>>>> I'm not really better, hence would be good to do a testcase and see.
>>>>> This should provoke it:
>>>>> - allocate nicely aligned bo in system memory
>>>>> - mmap, again nicely aligned to 2M
>>>>> - do some direct io from a filesystem into that mmap, that should trigger gup
>>>>> - before the gup completes free the mmap and bo so that ttm recycles
>>>>> the pages, which should trip up on the elevated refcount. If you wait
>>>>> until the direct io is completely, then I think nothing bad can be
>>>>> observed.
>>>>>
>>>>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
>>>>> another issue.
>>>>>
>>>>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
>>>>> -Daniel
>>>> So I did the following quick experiment on vmwgfx, and it turns out that
>>>> with it,
>>>> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds
>>>>
>>>> I should probably craft an RFC formalizing this.
>>> Yeah I think that would be good. Maybe even more formalized if we also
>>> switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
>>> fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or
>>> something like that.
>>>
>>> Otoh your description of when it only sometimes succeeds would indicate my
>>> understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.
>> My understanding from reading the vmf_insert_mixed() code is that iff
>> the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's
>> not consistent with the vm_normal_page() doc. For architectures without
>> pte_special, VM_PFNMAP must be used, and then we must also block COW
>> mappings.
>>
>> If we can get someone can commit to verify that the potential PAT WC
>> performance issue is gone with PFNMAP, I can put together a series with
>> that included.
> Iirc when I checked there's not much archs without pte_special, so I
> guess that's why we luck out. Hopefully.
>
>> As for existing userspace using COW TTM mappings, I once had a couple of
>> test cases to verify that it actually worked, in particular together
>> with huge PMDs and PUDs where breaking COW would imply splitting those,
>> but I can't think of anything else actually wanting to do that other
>> than by mistake.
> Yeah disallowing MAP_PRIVATE mappings would be another good thing to
> lock down. Really doesn't make much sense.
> -Daniel

Yes, we can't allow them with PFNMAP + a non-linear address space...

/Thomas


>> /Thomas
>>
>>
>>> Christian, what's your take?
>>> -Daniel
>>>
>>>> /Thomas
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> index 6dc96cf66744..72b6fb17c984 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>>>> *vmf,
>>>>           pfn_t pfnt;
>>>>           struct ttm_tt *ttm = bo->ttm;
>>>>           bool write = vmf->flags & FAULT_FLAG_WRITE;
>>>> +       struct dev_pagemap *pagemap;
>>>>
>>>>           /* Fault should not cross bo boundary. */
>>>>           page_offset &= ~(fault_page_size - 1);
>>>> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>>>> *vmf,
>>>>           if ((pfn & (fault_page_size - 1)) != 0)
>>>>                   goto out_fallback;
>>>>
>>>> +       /*
>>>> +        * Huge entries must be special, that is marking them as devmap
>>>> +        * with no backing device map range. If there is a backing
>>>> +        * range, Don't insert a huge entry.
>>>> +        */
>>>> +       pagemap = get_dev_pagemap(pfn, NULL);
>>>> +       if (pagemap) {
>>>> +               put_dev_pagemap(pagemap);
>>>> +               goto out_fallback;
>>>> +       }
>>>> +
>>>>           /* Check that memory is contiguous. */
>>>>           if (!bo->mem.bus.is_iomem) {
>>>>                   for (i = 1; i < fault_page_size; ++i) {
>>>> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>>>> *vmf,
>>>>                   }
>>>>           }
>>>>
>>>> -       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
>>>> +       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
>>>>           if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
>>>>                   ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
>>>>    #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>>>> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault
>>>> *vmf,
>>>>           if (ret != VM_FAULT_NOPAGE)
>>>>                   goto out_fallback;
>>>>
>>>> +#if 1
>>>> +       {
>>>> +               int npages;
>>>> +               struct page *page;
>>>> +
>>>> +               npages = get_user_pages_fast_only(vmf->address, 1, 0,
>>>> &page);
>>>> +               if (npages == 1) {
>>>> +                       DRM_WARN("Fast gup succeeded. Bad.\n");
>>>> +                       put_page(page);
>>>> +               } else {
>>>> +                       DRM_INFO("Fast gup failed. Good.\n");
>>>> +               }
>>>> +       }
>>>> +#endif
>>>> +
>>>>           return VM_FAULT_NOPAGE;
>>>>    out_fallback:
>>>>           count_vm_event(THP_FAULT_FALLBACK);
>>>>
>>>>
>>>>
>>>>
>>>>
>
>

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Intel-gfx] [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
@ 2021-03-11 15:37                                           ` Thomas Hellström (Intel)
  0 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-11 15:37 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	Christian König, open list:DMA BUFFER SHARING FRAMEWORK


On 3/11/21 2:17 PM, Daniel Vetter wrote:
> On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>> Hi!
>>
>> On 3/11/21 2:00 PM, Daniel Vetter wrote:
>>> On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote:
>>>> On 3/1/21 3:09 PM, Daniel Vetter wrote:
>>>>> On Mon, Mar 1, 2021 at 11:17 AM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
>>>>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
>>>>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
>>>>>>>> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>>>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>>>>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
>>>>>>>>>>>> good
>>>>>>>>>>>> if Christian can check this with some direct io to a buffer in
>>>>>>>>>>>> system
>>>>>>>>>>>> memory.
>>>>>>>>>>> Hmm,
>>>>>>>>>>>
>>>>>>>>>>> Docs (again vm_normal_page() say)
>>>>>>>>>>>
>>>>>>>>>>>        * VM_MIXEDMAP mappings can likewise contain memory with or
>>>>>>>>>>> without "struct
>>>>>>>>>>>        * page" backing, however the difference is that _all_ pages
>>>>>>>>>>> with a struct
>>>>>>>>>>>        * page (that is, those where pfn_valid is true) are refcounted
>>>>>>>>>>> and
>>>>>>>>>>> considered
>>>>>>>>>>>        * normal pages by the VM. The disadvantage is that pages are
>>>>>>>>>>> refcounted
>>>>>>>>>>>        * (which can be slower and simply not an option for some PFNMAP
>>>>>>>>>>> users). The
>>>>>>>>>>>        * advantage is that we don't have to follow the strict
>>>>>>>>>>> linearity rule of
>>>>>>>>>>>        * PFNMAP mappings in order to support COWable mappings.
>>>>>>>>>>>
>>>>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn()
>>>>>>>>>>> path, so
>>>>>>>>>>> the above isn't really true, which makes me wonder if and in that
>>>>>>>>>>> case
>>>>>>>>>>> why there could any longer ever be a significant performance
>>>>>>>>>>> difference
>>>>>>>>>>> between MIXEDMAP and PFNMAP.
>>>>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>>>>>>>>> what sticks.
>>>>>>>>>>
>>>>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
>>>>>>>>>>> devmap
>>>>>>>>>>> hack, so they are (for the non-gup case) relying on
>>>>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
>>>>>>>>>>> there.
>>>>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>>>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>>>>>>>>> to find the underlying page.
>>>>>>>>>> -Daniel
>>>>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
>>>>>>>>> to set
>>>>>>>>>
>>>>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be
>>>>>>>>> true, and
>>>>>>>>> then
>>>>>>>>>
>>>>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
>>>>>>>>> gup_fast()
>>>>>>>>> backs off,
>>>>>>>>>
>>>>>>>>> in the end that would mean setting in stone that "if there is a huge
>>>>>>>>> devmap
>>>>>>>>> page table entry for which we haven't registered any devmap struct
>>>>>>>>> pages
>>>>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special"
>>>>>>>>> huge
>>>>>>>>> page table entry".
>>>>>>>>>
>>>>>>>>>     From what I can tell, all code calling get_dev_pagemap() already
>>>>>>>>> does that,
>>>>>>>>> it's just a question of getting it accepted and formalizing it.
>>>>>>>> Oh I thought that's already how it works, since I didn't spot anything
>>>>>>>> else that would block gup_fast from falling over. I guess really would
>>>>>>>> need some testcases to make sure direct i/o (that's the easiest to test)
>>>>>>>> fails like we expect.
>>>>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
>>>>>>> Otherwise pmd_devmap() will not return true and, since there is no
>>>>>>> pmd_special(), things break.
>>>>>> Is that maybe the issue we have seen with amdgpu and huge pages?
>>>>> Yeah, essentially when you have a hugepte inserted by ttm, and it
>>>>> happens to point at system memory, then gup will work on that. And
>>>>> create all kinds of havoc.
>>>>>
>>>>>> Apart from that I'm lost guys, that devmap and gup stuff is not
>>>>>> something I have a good knowledge of apart from a one mile high view.
>>>>> I'm not much better, hence it would be good to do a testcase and see.
>>>>> This should provoke it (a userspace sketch follows this quoted message):
>>>>> - allocate a nicely aligned bo in system memory
>>>>> - mmap it, again nicely aligned to 2M
>>>>> - do some direct io from a filesystem into that mmap, that should trigger gup
>>>>> - before the gup completes, free the mmap and bo so that ttm recycles
>>>>> the pages, which should trip up on the elevated refcount. If you wait
>>>>> until the direct io is complete, then I think nothing bad can be
>>>>> observed.
>>>>>
>>>>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
>>>>> another issue.
>>>>>
>>>>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong.
>>>>> -Daniel
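A minimal userspace sketch of the direct-I/O part of that recipe, using the
system dma-heap uapi as the allocation path for illustration; the file name is
a placeholder, and the free-while-gup-is-in-flight race from the last step
would need an extra thread plus driver-specific BO handling:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/dma-heap.h>

int main(void)
{
	struct dma_heap_allocation_data alloc = {
		.len = 2UL << 20,			/* 2M, THP sized */
		.fd_flags = O_RDWR | O_CLOEXEC,
	};
	int heap = open("/dev/dma_heap/system", O_RDWR);
	int file = open("testfile", O_RDONLY | O_DIRECT);	/* placeholder */
	void *map;

	if (heap < 0 || file < 0 || ioctl(heap, DMA_HEAP_IOCTL_ALLOC, &alloc))
		return 1;

	map = mmap(NULL, alloc.len, PROT_READ | PROT_WRITE, MAP_SHARED,
		   alloc.fd, 0);
	if (map == MAP_FAILED)
		return 1;

	/* O_DIRECT read into the dma-buf mapping makes the block layer
	 * call get_user_pages() on these ptes.  With vm_insert_page()
	 * style mappings this succeeds and pins the pages; with
	 * VM_PFNMAP/special ptes it is expected to fail with EFAULT. */
	if (read(file, map, alloc.len) < 0)
		perror("direct read into dma-buf mapping");

	munmap(map, alloc.len);
	return 0;
}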
>>>> So I did the following quick experiment on vmwgfx, and it turns out that
>>>> with it, fast gup never succeeds. Without the "| PFN_MAP", it typically
>>>> succeeds.
>>>>
>>>> I should probably craft an RFC formalizing this.
>>> Yeah I think that would be good. Maybe even more formalized if we also
>>> switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
>>> fast gup path. And slow gup can still peek through VM_MIXEDMAP. Or
>>> something like that.
>>>
>>> Otoh your description of when it only sometimes succeeds would indicate my
>>> understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.
>> My understanding from reading the vmf_insert_mixed() code is that iff
>> the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's
>> not consistent with the vm_normal_page() doc. For architectures without
>> pte_special, VM_PFNMAP must be used, and then we must also block COW
>> mappings.
>>
>> If we can get someone to commit to verify that the potential PAT WC
>> performance issue is gone with PFNMAP, I can put together a series with
>> that included.
> Iirc when I checked there's not many archs without pte_special, so I
> guess that's why we luck out. Hopefully.
>
>> As for existing userspace using COW TTM mappings, I once had a couple of
>> test cases to verify that it actually worked, in particular together
>> with huge PMDs and PUDs where breaking COW would imply splitting those,
>> but I can't think of anything else actually wanting to do that other
>> than by mistake.
> Yeah disallowing MAP_PRIVATE mappings would be another good thing to
> lock down. Really doesn't make much sense.
> -Daniel

Yes, we can't allow them with PFNMAP + a non-linear address space...

/Thomas
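To make that rule concrete, an exporter-side mmap callback enforcing it could
look roughly like the sketch below; this is only an illustration, and
example_dmabuf_vm_ops is assumed to exist and to insert pfns with
vmf_insert_pfn() from its fault handler:

static int example_dmabuf_mmap(struct dma_buf *dmabuf,
			       struct vm_area_struct *vma)
{
	/* Refuse COW: MAP_PRIVATE plus writable makes no sense for
	 * device memory and conflicts with PFNMAP's linearity rules. */
	if ((vma->vm_flags & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE)
		return -EINVAL;

	/* Pure pfn mapping: no struct page use, and both slow and fast
	 * gup refuse VM_IO | VM_PFNMAP vmas. */
	vma->vm_flags |= VM_PFNMAP | VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
	vma->vm_ops = &example_dmabuf_vm_ops;
	return 0;
}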


>> /Thomas
>>
>>
>>> Christian, what's your take?
>>> -Daniel
>>>
>>>> /Thomas
>>>>
>>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> index 6dc96cf66744..72b6fb17c984 100644
>>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
>>>> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
>>>>           pfn_t pfnt;
>>>>           struct ttm_tt *ttm = bo->ttm;
>>>>           bool write = vmf->flags & FAULT_FLAG_WRITE;
>>>> +       struct dev_pagemap *pagemap;
>>>>
>>>>           /* Fault should not cross bo boundary. */
>>>>           page_offset &= ~(fault_page_size - 1);
>>>> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
>>>>           if ((pfn & (fault_page_size - 1)) != 0)
>>>>                   goto out_fallback;
>>>>
>>>> +       /*
>>>> +        * Huge entries must be special, that is, marked as devmap
>>>> +        * with no backing device map range. If there is a backing
>>>> +        * range, don't insert a huge entry.
>>>> +        */
>>>> +       pagemap = get_dev_pagemap(pfn, NULL);
>>>> +       if (pagemap) {
>>>> +               put_dev_pagemap(pagemap);
>>>> +               goto out_fallback;
>>>> +       }
>>>> +
>>>>           /* Check that memory is contiguous. */
>>>>           if (!bo->mem.bus.is_iomem) {
>>>>                   for (i = 1; i < fault_page_size; ++i) {
>>>> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
>>>>                   }
>>>>           }
>>>>
>>>> -       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
>>>> +       pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
>>>>           if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
>>>>                   ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
>>>>    #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
>>>> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
>>>>           if (ret != VM_FAULT_NOPAGE)
>>>>                   goto out_fallback;
>>>>
>>>> +#if 1
>>>> +       {
>>>> +               int npages;
>>>> +               struct page *page;
>>>> +
>>>> +               npages = get_user_pages_fast_only(vmf->address, 1, 0, &page);
>>>> +               if (npages == 1) {
>>>> +                       DRM_WARN("Fast gup succeeded. Bad.\n");
>>>> +                       put_page(page);
>>>> +               } else {
>>>> +                       DRM_INFO("Fast gup failed. Good.\n");
>>>> +               }
>>>> +       }
>>>> +#endif
>>>> +
>>>>           return VM_FAULT_NOPAGE;
>>>>    out_fallback:
>>>>           count_vm_event(THP_FAULT_FALLBACK);
>>>>
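If the self-check in this experiment turns out to be worth keeping, the
throwaway #if 1 block could be gated behind a debug option instead; a rough
sketch, with a made-up Kconfig symbol:

#ifdef CONFIG_TTM_GUP_SELFTEST		/* hypothetical Kconfig symbol */
	{
		struct page *page;

		/* The huge entry just inserted must not be visible to the
		 * lockless gup walk; warn loudly if it is. */
		if (get_user_pages_fast_only(vmf->address, 1, 0, &page) == 1) {
			WARN_ONCE(1, "fast gup succeeded on a TTM huge entry\n");
			put_page(page);
		}
	}
#endif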
>>>>
>>>>
>>>>
>>>>
>
>

^ permalink raw reply	[flat|nested] 110+ messages in thread

* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-11 13:17                                         ` Daniel Vetter
@ 2021-03-12  7:51                                           ` Christian König
  -1 siblings, 0 replies; 110+ messages in thread
From: Christian König @ 2021-03-12  7:51 UTC (permalink / raw)
  To: Daniel Vetter, Thomas Hellström (Intel)
  Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK



On 11.03.21 at 14:17, Daniel Vetter wrote:
> [SNIP]
>>>> So I did the following quick experiment on vmwgfx, and it turns out that
>>>> with it,
>>>> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds
>>>>
>>>> I should probably craft an RFC formalizing this.
>>> Yeah I think that would be good. Maybe even more formalized if we also
>>> switch over to VM_PFNMAP, since afaiui these pte flags here only stop the
>>> fast gup path. And slow gup can still peek through VM_MIXEDMAP. Or
>>> something like that.
>>>
>>> Otoh your description of when it only sometimes succeeds would indicate my
>>> understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here.
>> My understanding from reading the vmf_insert_mixed() code is that iff
>> the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's
>> not consistent with the vm_normal_page() doc. For architectures without
>> pte_special, VM_PFNMAP must be used, and then we must also block COW
>> mappings.
>>
>> If we can get someone to commit to verify that the potential PAT WC
>> performance issue is gone with PFNMAP, I can put together a series with
>> that included.
> Iirc when I checked there's not many archs without pte_special, so I
> guess that's why we luck out. Hopefully.

I still need to read up a bit on what you guys are discussing here, but
it's starting to form a picture. In particular, my understanding of what
VM_MIXEDMAP means seems to have been slightly off.

I would say just go ahead and provide patches to always use VM_PFNMAP in
TTM, and we can test it and see if there are still some issues.

>> As for existing userspace using COW TTM mappings, I once had a couple of
>> test cases to verify that it actually worked, in particular together
>> with huge PMDs and PUDs where breaking COW would imply splitting those,
>> but I can't think of anything else actually wanting to do that other
>> than by mistake.
> Yeah disallowing MAP_PRIVATE mappings would be another good thing to
> lock down. Really doesn't make much sense.

Completely agree. That sounds like something we should try to avoid.

Regards,
Christian.
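For reference, the change being suggested is small on the mmap setup side; a
rough, untested sketch against the v5.12-era ttm_bo_mmap_vma_setup(), which
currently sets VM_MIXEDMAP (the fault path would also have to insert pfns
exclusively, e.g. via vmf_insert_pfn*()):

static void ttm_bo_mmap_vma_setup(struct ttm_buffer_object *bo,
				  struct vm_area_struct *vma)
{
	vma->vm_private_data = bo;

	/* VM_PFNMAP instead of VM_MIXEDMAP: ptes are treated as raw
	 * pfns, so neither slow nor fast gup will grab the pages. */
	vma->vm_flags |= VM_PFNMAP;
	vma->vm_flags |= VM_IO | VM_DONTEXPAND | VM_DONTDUMP;
}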

> -Daniel
>


^ permalink raw reply	[flat|nested] 110+ messages in thread

end of thread

Thread overview: 110+ messages
2021-02-23 10:59 [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap Daniel Vetter
2021-02-23 10:59 ` [Intel-gfx] " Daniel Vetter
2021-02-23 10:59 ` Daniel Vetter
2021-02-23 10:59 ` [PATCH 2/2] drm/vgem: use shmem helpers Daniel Vetter
2021-02-23 10:59   ` [Intel-gfx] " Daniel Vetter
2021-02-23 11:19   ` Thomas Zimmermann
2021-02-23 11:19     ` [Intel-gfx] " Thomas Zimmermann
2021-02-23 11:51   ` [PATCH] " Daniel Vetter
2021-02-23 11:51     ` [Intel-gfx] " Daniel Vetter
2021-02-23 14:21   ` [PATCH 2/2] " kernel test robot
2021-02-23 14:21     ` kernel test robot
2021-02-23 14:21     ` [Intel-gfx] " kernel test robot
2021-02-23 15:07   ` kernel test robot
2021-02-23 15:07     ` kernel test robot
2021-02-23 15:07     ` [Intel-gfx] " kernel test robot
2021-02-25 10:23   ` [PATCH] " Daniel Vetter
2021-02-25 10:23     ` [Intel-gfx] " Daniel Vetter
2021-02-26  9:19     ` Thomas Zimmermann
2021-02-26  9:19       ` [Intel-gfx] " Thomas Zimmermann
2021-02-26 13:30       ` Daniel Vetter
2021-02-26 13:30         ` [Intel-gfx] " Daniel Vetter
2021-02-26 13:51         ` Thomas Zimmermann
2021-02-26 13:51           ` [Intel-gfx] " Thomas Zimmermann
2021-02-26 14:04           ` Daniel Vetter
2021-02-26 14:04             ` [Intel-gfx] " Daniel Vetter
2021-02-23 11:19 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap Patchwork
2021-02-23 13:11 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev2) Patchwork
2021-02-24  7:46 ` [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap Thomas Hellström (Intel)
2021-02-24  7:46   ` [Intel-gfx] " Thomas Hellström (Intel)
2021-02-24  7:46   ` Thomas Hellström (Intel)
2021-02-24  8:45   ` Daniel Vetter
2021-02-24  8:45     ` [Intel-gfx] " Daniel Vetter
2021-02-24  8:45     ` Daniel Vetter
2021-02-24  9:15     ` Thomas Hellström (Intel)
2021-02-24  9:15       ` [Intel-gfx] " Thomas Hellström (Intel)
2021-02-24  9:15       ` Thomas Hellström (Intel)
2021-02-24  9:31       ` Daniel Vetter
2021-02-24  9:31         ` [Intel-gfx] " Daniel Vetter
2021-02-24  9:31         ` Daniel Vetter
2021-02-25 10:28         ` Christian König
2021-02-25 10:28           ` [Intel-gfx] " Christian König
2021-02-25 10:28           ` Christian König
2021-02-25 10:44           ` Daniel Vetter
2021-02-25 10:44             ` [Intel-gfx] " Daniel Vetter
2021-02-25 10:44             ` Daniel Vetter
2021-02-25 15:49             ` Daniel Vetter
2021-02-25 15:49               ` [Intel-gfx] " Daniel Vetter
2021-02-25 15:49               ` Daniel Vetter
2021-02-25 16:53               ` Christian König
2021-02-25 16:53                 ` [Intel-gfx] " Christian König
2021-02-25 16:53                 ` Christian König
2021-02-26  9:41               ` Thomas Hellström (Intel)
2021-02-26  9:41                 ` [Intel-gfx] " Thomas Hellström (Intel)
2021-02-26  9:41                 ` Thomas Hellström (Intel)
2021-02-26 13:28                 ` Daniel Vetter
2021-02-26 13:28                   ` [Intel-gfx] " Daniel Vetter
2021-02-26 13:28                   ` Daniel Vetter
2021-02-27  8:06                   ` Thomas Hellström (Intel)
2021-02-27  8:06                     ` [Intel-gfx] " Thomas Hellström (Intel)
2021-02-27  8:06                     ` Thomas Hellström (Intel)
2021-03-01  8:28                     ` Daniel Vetter
2021-03-01  8:28                       ` [Intel-gfx] " Daniel Vetter
2021-03-01  8:28                       ` Daniel Vetter
2021-03-01  8:39                       ` Thomas Hellström (Intel)
2021-03-01  8:39                         ` [Intel-gfx] " Thomas Hellström (Intel)
2021-03-01  8:39                         ` Thomas Hellström (Intel)
2021-03-01  9:05                         ` Daniel Vetter
2021-03-01  9:05                           ` [Intel-gfx] " Daniel Vetter
2021-03-01  9:05                           ` Daniel Vetter
2021-03-01  9:21                           ` Thomas Hellström (Intel)
2021-03-01  9:21                             ` [Intel-gfx] " Thomas Hellström (Intel)
2021-03-01  9:21                             ` Thomas Hellström (Intel)
2021-03-01 10:17                             ` Christian König
2021-03-01 10:17                               ` [Intel-gfx] " Christian König
2021-03-01 10:17                               ` Christian König
2021-03-01 14:09                               ` Daniel Vetter
2021-03-01 14:09                                 ` [Intel-gfx] " Daniel Vetter
2021-03-01 14:09                                 ` Daniel Vetter
2021-03-11 10:22                                 ` Thomas Hellström (Intel)
2021-03-11 10:22                                   ` [Intel-gfx] " Thomas Hellström (Intel)
2021-03-11 10:22                                   ` Thomas Hellström (Intel)
2021-03-11 13:00                                   ` Daniel Vetter
2021-03-11 13:00                                     ` [Intel-gfx] " Daniel Vetter
2021-03-11 13:00                                     ` Daniel Vetter
2021-03-11 13:12                                     ` Thomas Hellström (Intel)
2021-03-11 13:12                                       ` [Intel-gfx] " Thomas Hellström (Intel)
2021-03-11 13:12                                       ` Thomas Hellström (Intel)
2021-03-11 13:17                                       ` Daniel Vetter
2021-03-11 13:17                                         ` [Intel-gfx] " Daniel Vetter
2021-03-11 13:17                                         ` Daniel Vetter
2021-03-11 15:37                                         ` Thomas Hellström (Intel)
2021-03-11 15:37                                           ` [Intel-gfx] " Thomas Hellström (Intel)
2021-03-11 15:37                                           ` Thomas Hellström (Intel)
2021-03-12  7:51                                         ` Christian König
2021-03-12  7:51                                           ` [Intel-gfx] " Christian König
2021-03-12  7:51                                           ` Christian König
2021-02-24 18:46     ` Jason Gunthorpe
2021-02-24 18:46       ` Jason Gunthorpe
2021-02-25 10:30       ` Christian König
2021-02-25 10:30         ` [Intel-gfx] " Christian König
2021-02-25 10:30         ` Christian König
2021-02-25 10:45         ` Daniel Vetter
2021-02-25 10:45           ` [Intel-gfx] " Daniel Vetter
2021-02-25 10:45           ` Daniel Vetter
2021-02-25 10:38 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3) Patchwork
2021-02-25 11:19 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2021-02-26  3:57 ` [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap John Stultz
2021-02-26  3:57   ` [Intel-gfx] " John Stultz
2021-02-26  3:57   ` John Stultz
2021-03-11 10:58 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [Linaro-mm-sig,1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev4) Patchwork
