* [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
From: Daniel Vetter @ 2021-02-23 10:59 UTC (permalink / raw)
To: DRI Development
Cc: Intel Graphics Development, Daniel Vetter, Christian König,
    Jason Gunthorpe, Suren Baghdasaryan, Matthew Wilcox, John Stultz,
    Sumit Semwal, linux-media, linaro-mm-sig

tldr; DMA buffers aren't normal memory, so expecting that you can use
them like normal memory (e.g. that get_user_pages works, or that they're
accounted like any other normal memory) cannot be guaranteed.

Since some userspace only runs on integrated devices, where all buffers
are actually resident system memory, there's a huge temptation to assume
that a struct page is always present and usable, like for any other
pagecache-backed mmap. This has the potential to result in a uapi
nightmare.

To close this gap, require that DMA buffer mmaps are VM_PFNMAP, which
blocks get_user_pages and all the other struct page based infrastructure
for everyone. In spirit this is the uapi counterpart to the
kernel-internal CONFIG_DMABUF_DEBUG.

Motivated by a recent patch which wanted to switch the system dma-buf
heap to vm_insert_page instead of vm_insert_pfn.

v2: Jason brought up that we also want to guarantee that all ptes have
the pte_special flag set, to catch fast get_user_pages (on architectures
that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
From auditing the various functions that insert pfn pte entries
(vm_insert_pfn_prot, remap_pfn_range and all its callers like
dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so this
should be the correct flag to check for.

References: https://lore.kernel.org/lkml/CAKMK7uHi+mG0z0HUmNt13QCCvutuRVjpcR0NjRL12k-WbWzkRg@mail.gmail.com/
Acked-by: Christian König <christian.koenig@amd.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: John Stultz <john.stultz@linaro.org>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
 drivers/dma-buf/dma-buf.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index f264b70c383e..06cb1d2e9fdc 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -127,6 +127,7 @@ static struct file_system_type dma_buf_fs_type = {
 static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
 {
 	struct dma_buf *dmabuf;
+	int ret;
 
 	if (!is_dma_buf_file(file))
 		return -EINVAL;
@@ -142,7 +143,11 @@ static int dma_buf_mmap_internal(struct file *file, struct vm_area_struct *vma)
 	    dmabuf->size >> PAGE_SHIFT)
 		return -EINVAL;
 
-	return dmabuf->ops->mmap(dmabuf, vma);
+	ret = dmabuf->ops->mmap(dmabuf, vma);
+
+	WARN_ON(!(vma->vm_flags & VM_PFNMAP));
+
+	return ret;
 }
 
 static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence)
@@ -1244,6 +1249,8 @@ EXPORT_SYMBOL_GPL(dma_buf_end_cpu_access);
 int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
 		 unsigned long pgoff)
 {
+	int ret;
+
 	if (WARN_ON(!dmabuf || !vma))
 		return -EINVAL;
 
@@ -1264,7 +1271,11 @@ int dma_buf_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma,
 	vma_set_file(vma, dmabuf->file);
 	vma->vm_pgoff = pgoff;
 
-	return dmabuf->ops->mmap(dmabuf, vma);
+	ret = dmabuf->ops->mmap(dmabuf, vma);
+
+	WARN_ON(!(vma->vm_flags & VM_PFNMAP));
+
+	return ret;
 }
 EXPORT_SYMBOL_GPL(dma_buf_mmap);
-- 
2.30.0
* [PATCH 2/2] drm/vgem: use shmem helpers
From: Daniel Vetter @ 2021-02-23 10:59 UTC (permalink / raw)
To: DRI Development
Cc: Intel Graphics Development, Christian König, Melissa Wen,
    Chris Wilson

Aside from deleting lots of code, the real motivation here is to switch
the mmap over to VM_PFNMAP, to be more consistent with what real gpu
drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
work, and even if you try and there's a struct page behind the mapping,
touching it and mucking around with its refcount can upset drivers real
bad.

Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/vgem/vgem_drv.c | 280 +------------------------------
 1 file changed, 3 insertions(+), 277 deletions(-)

diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index a0e75f1d5d01..88b3d125a610 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -40,6 +40,7 @@
 #include <drm/drm_file.h>
 #include <drm/drm_ioctl.h>
 #include <drm/drm_managed.h>
+#include <drm/drm_gem_shmem_helper.h>
 #include <drm/drm_prime.h>
 
 #include "vgem_drv.h"
@@ -50,27 +51,11 @@
 #define DRIVER_MAJOR	1
 #define DRIVER_MINOR	0
 
-static const struct drm_gem_object_funcs vgem_gem_object_funcs;
-
 static struct vgem_device {
 	struct drm_device drm;
 	struct platform_device *platform;
 } *vgem_device;
 
-static void vgem_gem_free_object(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
-
-	kvfree(vgem_obj->pages);
-	mutex_destroy(&vgem_obj->pages_lock);
-
-	if (obj->import_attach)
-		drm_prime_gem_destroy(obj, vgem_obj->table);
-
-	drm_gem_object_release(obj);
-	kfree(vgem_obj);
-}
-
 static vm_fault_t vgem_gem_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
@@ -159,265 +144,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file)
 	kfree(vfile);
 }
 
-static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
-						     unsigned long size)
-{
-	struct drm_vgem_gem_object *obj;
-	int ret;
-
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
-	if (!obj)
-		return ERR_PTR(-ENOMEM);
-
-	obj->base.funcs = &vgem_gem_object_funcs;
-
-	ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE));
-	if (ret) {
-		kfree(obj);
-		return ERR_PTR(ret);
-	}
-
-	mutex_init(&obj->pages_lock);
-
-	return obj;
-}
-
-static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj)
-{
-	drm_gem_object_release(&obj->base);
-	kfree(obj);
-}
-
-static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
-					      struct drm_file *file,
-					      unsigned int *handle,
-					      unsigned long size)
-{
-	struct drm_vgem_gem_object *obj;
-	int ret;
-
-	obj = __vgem_gem_create(dev, size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	ret = drm_gem_handle_create(file, &obj->base, handle);
-	if (ret) {
-		drm_gem_object_put(&obj->base);
-		return ERR_PTR(ret);
-	}
-
-	return &obj->base;
-}
-
-static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
-				struct drm_mode_create_dumb *args)
-{
-	struct drm_gem_object *gem_object;
-	u64 pitch, size;
-
-	pitch = args->width * DIV_ROUND_UP(args->bpp, 8);
-	size = args->height * pitch;
-	if (size == 0)
-		return -EINVAL;
-
-	gem_object = vgem_gem_create(dev, file, &args->handle, size);
-	if (IS_ERR(gem_object))
-		return PTR_ERR(gem_object);
-
-	args->size = gem_object->size;
-	args->pitch = pitch;
-
-	drm_gem_object_put(gem_object);
-
-	DRM_DEBUG("Created object of size %llu\n", args->size);
-
-	return 0;
-}
-
 static struct drm_ioctl_desc vgem_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl,
			  DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl,
			  DRM_RENDER_ALLOW),
 };
 
-static int vgem_mmap(struct file *filp, struct vm_area_struct *vma)
-{
-	unsigned long flags = vma->vm_flags;
-	int ret;
-
-	ret = drm_gem_mmap(filp, vma);
-	if (ret)
-		return ret;
-
-	/* Keep the WC mmaping set by drm_gem_mmap() but our pages
-	 * are ordinary and not special.
-	 */
-	vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP;
-	return 0;
-}
-
-static const struct file_operations vgem_driver_fops = {
-	.owner		= THIS_MODULE,
-	.open		= drm_open,
-	.mmap		= vgem_mmap,
-	.poll		= drm_poll,
-	.read		= drm_read,
-	.unlocked_ioctl = drm_ioctl,
-	.compat_ioctl	= drm_compat_ioctl,
-	.release	= drm_release,
-};
-
-static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
-{
-	mutex_lock(&bo->pages_lock);
-	if (bo->pages_pin_count++ == 0) {
-		struct page **pages;
-
-		pages = drm_gem_get_pages(&bo->base);
-		if (IS_ERR(pages)) {
-			bo->pages_pin_count--;
-			mutex_unlock(&bo->pages_lock);
-			return pages;
-		}
-
-		bo->pages = pages;
-	}
-	mutex_unlock(&bo->pages_lock);
-
-	return bo->pages;
-}
-
-static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
-{
-	mutex_lock(&bo->pages_lock);
-	if (--bo->pages_pin_count == 0) {
-		drm_gem_put_pages(&bo->base, bo->pages, true, true);
-		bo->pages = NULL;
-	}
-	mutex_unlock(&bo->pages_lock);
-}
-
-static int vgem_prime_pin(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-	long n_pages = obj->size >> PAGE_SHIFT;
-	struct page **pages;
-
-	pages = vgem_pin_pages(bo);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
-	/* Flush the object from the CPU cache so that importers can rely
-	 * on coherent indirect access via the exported dma-address.
-	 */
-	drm_clflush_pages(pages, n_pages);
-
-	return 0;
-}
-
-static void vgem_prime_unpin(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	vgem_unpin_pages(bo);
-}
-
-static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT);
-}
-
-static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
-						struct dma_buf *dma_buf)
-{
-	struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm);
-
-	return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev);
-}
-
-static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
-			struct dma_buf_attachment *attach, struct sg_table *sg)
-{
-	struct drm_vgem_gem_object *obj;
-	int npages;
-
-	obj = __vgem_gem_create(dev, attach->dmabuf->size);
-	if (IS_ERR(obj))
-		return ERR_CAST(obj);
-
-	npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE;
-
-	obj->table = sg;
-	obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL);
-	if (!obj->pages) {
-		__vgem_gem_destroy(obj);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	obj->pages_pin_count++; /* perma-pinned */
-	drm_prime_sg_to_page_array(obj->table, obj->pages, npages);
-	return &obj->base;
-}
-
-static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-	long n_pages = obj->size >> PAGE_SHIFT;
-	struct page **pages;
-	void *vaddr;
-
-	pages = vgem_pin_pages(bo);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
-	vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
-	if (!vaddr)
-		return -ENOMEM;
-	dma_buf_map_set_vaddr(map, vaddr);
-
-	return 0;
-}
-
-static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map)
-{
-	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
-
-	vunmap(map->vaddr);
-	vgem_unpin_pages(bo);
-}
-
-static int vgem_prime_mmap(struct drm_gem_object *obj,
-			   struct vm_area_struct *vma)
-{
-	int ret;
-
-	if (obj->size < vma->vm_end - vma->vm_start)
-		return -EINVAL;
-
-	if (!obj->filp)
-		return -ENODEV;
-
-	ret = call_mmap(obj->filp, vma);
-	if (ret)
-		return ret;
-
-	vma_set_file(vma, obj->filp);
-	vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP;
-	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
-
-	return 0;
-}
-
-static const struct drm_gem_object_funcs vgem_gem_object_funcs = {
-	.free = vgem_gem_free_object,
-	.pin = vgem_prime_pin,
-	.unpin = vgem_prime_unpin,
-	.get_sg_table = vgem_prime_get_sg_table,
-	.vmap = vgem_prime_vmap,
-	.vunmap = vgem_prime_vunmap,
-	.vm_ops = &vgem_gem_vm_ops,
-};
+DEFINE_DRM_GEM_FOPS(vgem_driver_fops);
 
 static const struct drm_driver vgem_driver = {
 	.driver_features		= DRIVER_GEM | DRIVER_RENDER,
@@ -427,13 +159,7 @@ static const struct drm_driver vgem_driver = {
 	.num_ioctls			= ARRAY_SIZE(vgem_ioctls),
 	.fops				= &vgem_driver_fops,
 
-	.dumb_create			= vgem_gem_dumb_create,
-
-	.prime_handle_to_fd		= drm_gem_prime_handle_to_fd,
-	.prime_fd_to_handle		= drm_gem_prime_fd_to_handle,
-	.gem_prime_import		= vgem_prime_import,
-	.gem_prime_import_sg_table	= vgem_prime_import_sg_table,
-	.gem_prime_mmap			= vgem_prime_mmap,
+	DRM_GEM_SHMEM_DRIVER_OPS,
 
 	.name				= DRIVER_NAME,
 	.desc				= DRIVER_DESC,
-- 
2.30.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: [PATCH 2/2] drm/vgem: use shmem helpers 2021-02-23 10:59 ` [Intel-gfx] " Daniel Vetter @ 2021-02-23 11:19 ` Thomas Zimmermann -1 siblings, 0 replies; 110+ messages in thread From: Thomas Zimmermann @ 2021-02-23 11:19 UTC (permalink / raw) To: Daniel Vetter, DRI Development Cc: Melissa Wen, Daniel Vetter, Intel Graphics Development, Christian König, Chris Wilson [-- Attachment #1.1.1: Type: text/plain, Size: 10420 bytes --] Hi Am 23.02.21 um 11:59 schrieb Daniel Vetter: > Aside from deleting lots of code the real motivation here is to switch > the mmap over to VM_PFNMAP, to be more consistent with what real gpu > drivers do. They're all VM_PFNMP, which means get_user_pages doesn't > work, and even if you try and there's a struct page behind that, > touching it and mucking around with its refcount can upset drivers > real bad. > > Cc: John Stultz <john.stultz@linaro.org> > Cc: Sumit Semwal <sumit.semwal@linaro.org> > Cc: "Christian König" <christian.koenig@amd.com> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > Cc: Melissa Wen <melissa.srw@gmail.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/vgem/vgem_drv.c | 280 +------------------------------- > 1 file changed, 3 insertions(+), 277 deletions(-) > > diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c > index a0e75f1d5d01..88b3d125a610 100644 > --- a/drivers/gpu/drm/vgem/vgem_drv.c > +++ b/drivers/gpu/drm/vgem/vgem_drv.c > @@ -40,6 +40,7 @@ > #include <drm/drm_file.h> > #include <drm/drm_ioctl.h> > #include <drm/drm_managed.h> > +#include <drm/drm_gem_shmem_helper.h> This should be between file.h and ioctl.h > #include <drm/drm_prime.h> > > #include "vgem_drv.h" > @@ -50,27 +51,11 @@ > #define DRIVER_MAJOR 1 > #define DRIVER_MINOR 0 > > -static const struct drm_gem_object_funcs vgem_gem_object_funcs; > - > static struct vgem_device { > struct drm_device drm; > struct platform_device *platform; > } *vgem_device; > > -static void 
vgem_gem_free_object(struct drm_gem_object *obj) > -{ > - struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); > - > - kvfree(vgem_obj->pages); > - mutex_destroy(&vgem_obj->pages_lock); > - > - if (obj->import_attach) > - drm_prime_gem_destroy(obj, vgem_obj->table); > - > - drm_gem_object_release(obj); > - kfree(vgem_obj); > -} > - > static vm_fault_t vgem_gem_fault(struct vm_fault *vmf) From a quick grep it looks like you should be able to remove this function and vgem_gem_vm_ops as well. The rest of the patch looks good to me. Acked-by: Thomas Zimmermann <tzimmermann@suse.de> > { > struct vm_area_struct *vma = vmf->vma; > @@ -159,265 +144,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file) > kfree(vfile); > } > > -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev, > - unsigned long size) > -{ > - struct drm_vgem_gem_object *obj; > - int ret; > - > - obj = kzalloc(sizeof(*obj), GFP_KERNEL); > - if (!obj) > - return ERR_PTR(-ENOMEM); > - > - obj->base.funcs = &vgem_gem_object_funcs; > - > - ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE)); > - if (ret) { > - kfree(obj); > - return ERR_PTR(ret); > - } > - > - mutex_init(&obj->pages_lock); > - > - return obj; > -} > - > -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj) > -{ > - drm_gem_object_release(&obj->base); > - kfree(obj); > -} > - > -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, > - struct drm_file *file, > - unsigned int *handle, > - unsigned long size) > -{ > - struct drm_vgem_gem_object *obj; > - int ret; > - > - obj = __vgem_gem_create(dev, size); > - if (IS_ERR(obj)) > - return ERR_CAST(obj); > - > - ret = drm_gem_handle_create(file, &obj->base, handle); > - if (ret) { > - drm_gem_object_put(&obj->base); > - return ERR_PTR(ret); > - } > - > - return &obj->base; > -} > - > -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev, > - struct drm_mode_create_dumb *args) >
-{ > - struct drm_gem_object *gem_object; > - u64 pitch, size; > - > - pitch = args->width * DIV_ROUND_UP(args->bpp, 8); > - size = args->height * pitch; > - if (size == 0) > - return -EINVAL; > - > - gem_object = vgem_gem_create(dev, file, &args->handle, size); > - if (IS_ERR(gem_object)) > - return PTR_ERR(gem_object); > - > - args->size = gem_object->size; > - args->pitch = pitch; > - > - drm_gem_object_put(gem_object); > - > - DRM_DEBUG("Created object of size %llu\n", args->size); > - > - return 0; > -} > - > static struct drm_ioctl_desc vgem_ioctls[] = { > DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW), > }; > > -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma) > -{ > - unsigned long flags = vma->vm_flags; > - int ret; > - > - ret = drm_gem_mmap(filp, vma); > - if (ret) > - return ret; > - > - /* Keep the WC mmaping set by drm_gem_mmap() but our pages > - * are ordinary and not special. 
> - */ > - vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP; > - return 0; > -} > - > -static const struct file_operations vgem_driver_fops = { > - .owner = THIS_MODULE, > - .open = drm_open, > - .mmap = vgem_mmap, > - .poll = drm_poll, > - .read = drm_read, > - .unlocked_ioctl = drm_ioctl, > - .compat_ioctl = drm_compat_ioctl, > - .release = drm_release, > -}; > - > -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo) > -{ > - mutex_lock(&bo->pages_lock); > - if (bo->pages_pin_count++ == 0) { > - struct page **pages; > - > - pages = drm_gem_get_pages(&bo->base); > - if (IS_ERR(pages)) { > - bo->pages_pin_count--; > - mutex_unlock(&bo->pages_lock); > - return pages; > - } > - > - bo->pages = pages; > - } > - mutex_unlock(&bo->pages_lock); > - > - return bo->pages; > -} > - > -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo) > -{ > - mutex_lock(&bo->pages_lock); > - if (--bo->pages_pin_count == 0) { > - drm_gem_put_pages(&bo->base, bo->pages, true, true); > - bo->pages = NULL; > - } > - mutex_unlock(&bo->pages_lock); > -} > - > -static int vgem_prime_pin(struct drm_gem_object *obj) > -{ > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > - long n_pages = obj->size >> PAGE_SHIFT; > - struct page **pages; > - > - pages = vgem_pin_pages(bo); > - if (IS_ERR(pages)) > - return PTR_ERR(pages); > - > - /* Flush the object from the CPU cache so that importers can rely > - * on coherent indirect access via the exported dma-address. 
> - */ > - drm_clflush_pages(pages, n_pages); > - > - return 0; > -} > - > -static void vgem_prime_unpin(struct drm_gem_object *obj) > -{ > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > - > - vgem_unpin_pages(bo); > -} > - > -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj) > -{ > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > - > - return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT); > -} > - > -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev, > - struct dma_buf *dma_buf) > -{ > - struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm); > - > - return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev); > -} > - > -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev, > - struct dma_buf_attachment *attach, struct sg_table *sg) > -{ > - struct drm_vgem_gem_object *obj; > - int npages; > - > - obj = __vgem_gem_create(dev, attach->dmabuf->size); > - if (IS_ERR(obj)) > - return ERR_CAST(obj); > - > - npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE; > - > - obj->table = sg; > - obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL); > - if (!obj->pages) { > - __vgem_gem_destroy(obj); > - return ERR_PTR(-ENOMEM); > - } > - > - obj->pages_pin_count++; /* perma-pinned */ > - drm_prime_sg_to_page_array(obj->table, obj->pages, npages); > - return &obj->base; > -} > - > -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map) > -{ > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > - long n_pages = obj->size >> PAGE_SHIFT; > - struct page **pages; > - void *vaddr; > - > - pages = vgem_pin_pages(bo); > - if (IS_ERR(pages)) > - return PTR_ERR(pages); > - > - vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL)); > - if (!vaddr) > - return -ENOMEM; > - dma_buf_map_set_vaddr(map, vaddr); > - > - return 0; > -} > - > -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct 
dma_buf_map *map) > -{ > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > - > - vunmap(map->vaddr); > - vgem_unpin_pages(bo); > -} > - > -static int vgem_prime_mmap(struct drm_gem_object *obj, > - struct vm_area_struct *vma) > -{ > - int ret; > - > - if (obj->size < vma->vm_end - vma->vm_start) > - return -EINVAL; > - > - if (!obj->filp) > - return -ENODEV; > - > - ret = call_mmap(obj->filp, vma); > - if (ret) > - return ret; > - > - vma_set_file(vma, obj->filp); > - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; > - vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); > - > - return 0; > -} > - > -static const struct drm_gem_object_funcs vgem_gem_object_funcs = { > - .free = vgem_gem_free_object, > - .pin = vgem_prime_pin, > - .unpin = vgem_prime_unpin, > - .get_sg_table = vgem_prime_get_sg_table, > - .vmap = vgem_prime_vmap, > - .vunmap = vgem_prime_vunmap, > - .vm_ops = &vgem_gem_vm_ops, > -}; > +DEFINE_DRM_GEM_FOPS(vgem_driver_fops); > > static const struct drm_driver vgem_driver = { > .driver_features = DRIVER_GEM | DRIVER_RENDER, > @@ -427,13 +159,7 @@ static const struct drm_driver vgem_driver = { > .num_ioctls = ARRAY_SIZE(vgem_ioctls), > .fops = &vgem_driver_fops, > > - .dumb_create = vgem_gem_dumb_create, > - > - .prime_handle_to_fd = drm_gem_prime_handle_to_fd, > - .prime_fd_to_handle = drm_gem_prime_fd_to_handle, > - .gem_prime_import = vgem_prime_import, > - .gem_prime_import_sg_table = vgem_prime_import_sg_table, > - .gem_prime_mmap = vgem_prime_mmap, > + DRM_GEM_SHMEM_DRIVER_OPS, > > .name = DRIVER_NAME, > .desc = DRIVER_DESC, > -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 
5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer [-- Attachment #1.2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 840 bytes --] [-- Attachment #2: Type: text/plain, Size: 160 bytes --] _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH] drm/vgem: use shmem helpers 2021-02-23 10:59 ` [Intel-gfx] " Daniel Vetter @ 2021-02-23 11:51 ` Daniel Vetter 1 sibling, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-02-23 11:51 UTC (permalink / raw) To: DRI Development Cc: Daniel Vetter, Intel Graphics Development, Christian König, Melissa Wen, Thomas Zimmermann, Daniel Vetter, Chris Wilson Aside from deleting lots of code the real motivation here is to switch the mmap over to VM_PFNMAP, to be more consistent with what real gpu drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't work, and even if you try and there's a struct page behind that, touching it and mucking around with its refcount can upset drivers real bad. v2: Review from Thomas: - sort #include - drop more dead code that I didn't spot somehow Cc: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Melissa Wen <melissa.srw@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/vgem/vgem_drv.c | 340 +------------------------------- 1 file changed, 3 insertions(+), 337 deletions(-) diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c index a0e75f1d5d01..b1b3a5ffc542 100644 --- a/drivers/gpu/drm/vgem/vgem_drv.c +++ b/drivers/gpu/drm/vgem/vgem_drv.c @@ -38,6 +38,7 @@ #include <drm/drm_drv.h> #include <drm/drm_file.h> +#include <drm/drm_gem_shmem_helper.h> #include <drm/drm_ioctl.h> #include <drm/drm_managed.h> #include <drm/drm_prime.h> @@ -50,87 +51,11 @@ #define DRIVER_MAJOR 1 #define DRIVER_MINOR 0 -static const struct drm_gem_object_funcs vgem_gem_object_funcs; - static struct vgem_device { struct drm_device drm; struct platform_device *platform; } *vgem_device; -static void vgem_gem_free_object(struct drm_gem_object *obj) -{ - struct
drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); - - kvfree(vgem_obj->pages); - mutex_destroy(&vgem_obj->pages_lock); - - if (obj->import_attach) - drm_prime_gem_destroy(obj, vgem_obj->table); - - drm_gem_object_release(obj); - kfree(vgem_obj); -} - -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf) -{ - struct vm_area_struct *vma = vmf->vma; - struct drm_vgem_gem_object *obj = vma->vm_private_data; - /* We don't use vmf->pgoff since that has the fake offset */ - unsigned long vaddr = vmf->address; - vm_fault_t ret = VM_FAULT_SIGBUS; - loff_t num_pages; - pgoff_t page_offset; - page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT; - - num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE); - - if (page_offset >= num_pages) - return VM_FAULT_SIGBUS; - - mutex_lock(&obj->pages_lock); - if (obj->pages) { - get_page(obj->pages[page_offset]); - vmf->page = obj->pages[page_offset]; - ret = 0; - } - mutex_unlock(&obj->pages_lock); - if (ret) { - struct page *page; - - page = shmem_read_mapping_page( - file_inode(obj->base.filp)->i_mapping, - page_offset); - if (!IS_ERR(page)) { - vmf->page = page; - ret = 0; - } else switch (PTR_ERR(page)) { - case -ENOSPC: - case -ENOMEM: - ret = VM_FAULT_OOM; - break; - case -EBUSY: - ret = VM_FAULT_RETRY; - break; - case -EFAULT: - case -EINVAL: - ret = VM_FAULT_SIGBUS; - break; - default: - WARN_ON(PTR_ERR(page)); - ret = VM_FAULT_SIGBUS; - break; - } - - } - return ret; -} - -static const struct vm_operations_struct vgem_gem_vm_ops = { - .fault = vgem_gem_fault, - .open = drm_gem_vm_open, - .close = drm_gem_vm_close, -}; - static int vgem_open(struct drm_device *dev, struct drm_file *file) { struct vgem_file *vfile; @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file) kfree(vfile); } -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev, - unsigned long size) -{ - struct drm_vgem_gem_object *obj; - int ret; - - obj = kzalloc(sizeof(*obj), GFP_KERNEL); - if (!obj) - 
return ERR_PTR(-ENOMEM); - - obj->base.funcs = &vgem_gem_object_funcs; - - ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE)); - if (ret) { - kfree(obj); - return ERR_PTR(ret); - } - - mutex_init(&obj->pages_lock); - - return obj; -} - -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj) -{ - drm_gem_object_release(&obj->base); - kfree(obj); -} - -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, - struct drm_file *file, - unsigned int *handle, - unsigned long size) -{ - struct drm_vgem_gem_object *obj; - int ret; - - obj = __vgem_gem_create(dev, size); - if (IS_ERR(obj)) - return ERR_CAST(obj); - - ret = drm_gem_handle_create(file, &obj->base, handle); - if (ret) { - drm_gem_object_put(&obj->base); - return ERR_PTR(ret); - } - - return &obj->base; -} - -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev, - struct drm_mode_create_dumb *args) -{ - struct drm_gem_object *gem_object; - u64 pitch, size; - - pitch = args->width * DIV_ROUND_UP(args->bpp, 8); - size = args->height * pitch; - if (size == 0) - return -EINVAL; - - gem_object = vgem_gem_create(dev, file, &args->handle, size); - if (IS_ERR(gem_object)) - return PTR_ERR(gem_object); - - args->size = gem_object->size; - args->pitch = pitch; - - drm_gem_object_put(gem_object); - - DRM_DEBUG("Created object of size %llu\n", args->size); - - return 0; -} - static struct drm_ioctl_desc vgem_ioctls[] = { DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW), }; -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma) -{ - unsigned long flags = vma->vm_flags; - int ret; - - ret = drm_gem_mmap(filp, vma); - if (ret) - return ret; - - /* Keep the WC mmaping set by drm_gem_mmap() but our pages - * are ordinary and not special. 
- */ - vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP; - return 0; -} - -static const struct file_operations vgem_driver_fops = { - .owner = THIS_MODULE, - .open = drm_open, - .mmap = vgem_mmap, - .poll = drm_poll, - .read = drm_read, - .unlocked_ioctl = drm_ioctl, - .compat_ioctl = drm_compat_ioctl, - .release = drm_release, -}; - -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo) -{ - mutex_lock(&bo->pages_lock); - if (bo->pages_pin_count++ == 0) { - struct page **pages; - - pages = drm_gem_get_pages(&bo->base); - if (IS_ERR(pages)) { - bo->pages_pin_count--; - mutex_unlock(&bo->pages_lock); - return pages; - } - - bo->pages = pages; - } - mutex_unlock(&bo->pages_lock); - - return bo->pages; -} - -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo) -{ - mutex_lock(&bo->pages_lock); - if (--bo->pages_pin_count == 0) { - drm_gem_put_pages(&bo->base, bo->pages, true, true); - bo->pages = NULL; - } - mutex_unlock(&bo->pages_lock); -} - -static int vgem_prime_pin(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - long n_pages = obj->size >> PAGE_SHIFT; - struct page **pages; - - pages = vgem_pin_pages(bo); - if (IS_ERR(pages)) - return PTR_ERR(pages); - - /* Flush the object from the CPU cache so that importers can rely - * on coherent indirect access via the exported dma-address. 
- */ - drm_clflush_pages(pages, n_pages); - - return 0; -} - -static void vgem_prime_unpin(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - vgem_unpin_pages(bo); -} - -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT); -} - -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev, - struct dma_buf *dma_buf) -{ - struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm); - - return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev); -} - -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev, - struct dma_buf_attachment *attach, struct sg_table *sg) -{ - struct drm_vgem_gem_object *obj; - int npages; - - obj = __vgem_gem_create(dev, attach->dmabuf->size); - if (IS_ERR(obj)) - return ERR_CAST(obj); - - npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE; - - obj->table = sg; - obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL); - if (!obj->pages) { - __vgem_gem_destroy(obj); - return ERR_PTR(-ENOMEM); - } - - obj->pages_pin_count++; /* perma-pinned */ - drm_prime_sg_to_page_array(obj->table, obj->pages, npages); - return &obj->base; -} - -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - long n_pages = obj->size >> PAGE_SHIFT; - struct page **pages; - void *vaddr; - - pages = vgem_pin_pages(bo); - if (IS_ERR(pages)) - return PTR_ERR(pages); - - vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL)); - if (!vaddr) - return -ENOMEM; - dma_buf_map_set_vaddr(map, vaddr); - - return 0; -} - -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - vunmap(map->vaddr); - vgem_unpin_pages(bo); -} - -static int 
vgem_prime_mmap(struct drm_gem_object *obj, - struct vm_area_struct *vma) -{ - int ret; - - if (obj->size < vma->vm_end - vma->vm_start) - return -EINVAL; - - if (!obj->filp) - return -ENODEV; - - ret = call_mmap(obj->filp, vma); - if (ret) - return ret; - - vma_set_file(vma, obj->filp); - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; - vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); - - return 0; -} - -static const struct drm_gem_object_funcs vgem_gem_object_funcs = { - .free = vgem_gem_free_object, - .pin = vgem_prime_pin, - .unpin = vgem_prime_unpin, - .get_sg_table = vgem_prime_get_sg_table, - .vmap = vgem_prime_vmap, - .vunmap = vgem_prime_vunmap, - .vm_ops = &vgem_gem_vm_ops, -}; +DEFINE_DRM_GEM_FOPS(vgem_driver_fops); static const struct drm_driver vgem_driver = { .driver_features = DRIVER_GEM | DRIVER_RENDER, @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = { .num_ioctls = ARRAY_SIZE(vgem_ioctls), .fops = &vgem_driver_fops, - .dumb_create = vgem_gem_dumb_create, - - .prime_handle_to_fd = drm_gem_prime_handle_to_fd, - .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_import = vgem_prime_import, - .gem_prime_import_sg_table = vgem_prime_import_sg_table, - .gem_prime_mmap = vgem_prime_mmap, + DRM_GEM_SHMEM_DRIVER_OPS, .name = DRIVER_NAME, .desc = DRIVER_DESC, -- 2.30.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply related [flat|nested] 110+ messages in thread
- */ - vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP; - return 0; -} - -static const struct file_operations vgem_driver_fops = { - .owner = THIS_MODULE, - .open = drm_open, - .mmap = vgem_mmap, - .poll = drm_poll, - .read = drm_read, - .unlocked_ioctl = drm_ioctl, - .compat_ioctl = drm_compat_ioctl, - .release = drm_release, -}; - -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo) -{ - mutex_lock(&bo->pages_lock); - if (bo->pages_pin_count++ == 0) { - struct page **pages; - - pages = drm_gem_get_pages(&bo->base); - if (IS_ERR(pages)) { - bo->pages_pin_count--; - mutex_unlock(&bo->pages_lock); - return pages; - } - - bo->pages = pages; - } - mutex_unlock(&bo->pages_lock); - - return bo->pages; -} - -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo) -{ - mutex_lock(&bo->pages_lock); - if (--bo->pages_pin_count == 0) { - drm_gem_put_pages(&bo->base, bo->pages, true, true); - bo->pages = NULL; - } - mutex_unlock(&bo->pages_lock); -} - -static int vgem_prime_pin(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - long n_pages = obj->size >> PAGE_SHIFT; - struct page **pages; - - pages = vgem_pin_pages(bo); - if (IS_ERR(pages)) - return PTR_ERR(pages); - - /* Flush the object from the CPU cache so that importers can rely - * on coherent indirect access via the exported dma-address. 
- */ - drm_clflush_pages(pages, n_pages); - - return 0; -} - -static void vgem_prime_unpin(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - vgem_unpin_pages(bo); -} - -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT); -} - -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev, - struct dma_buf *dma_buf) -{ - struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm); - - return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev); -} - -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev, - struct dma_buf_attachment *attach, struct sg_table *sg) -{ - struct drm_vgem_gem_object *obj; - int npages; - - obj = __vgem_gem_create(dev, attach->dmabuf->size); - if (IS_ERR(obj)) - return ERR_CAST(obj); - - npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE; - - obj->table = sg; - obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL); - if (!obj->pages) { - __vgem_gem_destroy(obj); - return ERR_PTR(-ENOMEM); - } - - obj->pages_pin_count++; /* perma-pinned */ - drm_prime_sg_to_page_array(obj->table, obj->pages, npages); - return &obj->base; -} - -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - long n_pages = obj->size >> PAGE_SHIFT; - struct page **pages; - void *vaddr; - - pages = vgem_pin_pages(bo); - if (IS_ERR(pages)) - return PTR_ERR(pages); - - vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL)); - if (!vaddr) - return -ENOMEM; - dma_buf_map_set_vaddr(map, vaddr); - - return 0; -} - -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - vunmap(map->vaddr); - vgem_unpin_pages(bo); -} - -static int 
vgem_prime_mmap(struct drm_gem_object *obj, - struct vm_area_struct *vma) -{ - int ret; - - if (obj->size < vma->vm_end - vma->vm_start) - return -EINVAL; - - if (!obj->filp) - return -ENODEV; - - ret = call_mmap(obj->filp, vma); - if (ret) - return ret; - - vma_set_file(vma, obj->filp); - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; - vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); - - return 0; -} - -static const struct drm_gem_object_funcs vgem_gem_object_funcs = { - .free = vgem_gem_free_object, - .pin = vgem_prime_pin, - .unpin = vgem_prime_unpin, - .get_sg_table = vgem_prime_get_sg_table, - .vmap = vgem_prime_vmap, - .vunmap = vgem_prime_vunmap, - .vm_ops = &vgem_gem_vm_ops, -}; +DEFINE_DRM_GEM_FOPS(vgem_driver_fops); static const struct drm_driver vgem_driver = { .driver_features = DRIVER_GEM | DRIVER_RENDER, @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = { .num_ioctls = ARRAY_SIZE(vgem_ioctls), .fops = &vgem_driver_fops, - .dumb_create = vgem_gem_dumb_create, - - .prime_handle_to_fd = drm_gem_prime_handle_to_fd, - .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_import = vgem_prime_import, - .gem_prime_import_sg_table = vgem_prime_import_sg_table, - .gem_prime_mmap = vgem_prime_mmap, + DRM_GEM_SHMEM_DRIVER_OPS, .name = DRIVER_NAME, .desc = DRIVER_DESC, -- 2.30.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply related [flat|nested] 110+ messages in thread
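The deleted vgem_gem_dumb_create() computes the buffer geometry that DRM_GEM_SHMEM_DRIVER_OPS now handles via drm_gem_shmem_dumb_create(): pitch is width times bytes-per-pixel (rounded up), size is height times pitch, and the backing GEM object is rounded up to whole pages. A standalone sketch of that arithmetic (helper names here are local stand-ins mirroring the kernel's DIV_ROUND_UP() and roundup() macros, not kernel API):

```c
#include <stdint.h>

/* Assumed page size for this sketch; the kernel uses PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/* Local mirror of the kernel's DIV_ROUND_UP() macro. */
static uint64_t div_round_up(uint64_t n, uint64_t d)
{
	return (n + d - 1) / d;
}

/* pitch = width * bytes-per-pixel, as in the deleted dumb_create. */
static uint64_t dumb_pitch(uint32_t width, uint32_t bpp)
{
	return (uint64_t)width * div_round_up(bpp, 8);
}

/* size = height * pitch, rounded up to whole pages, matching the
 * roundup(size, PAGE_SIZE) done when the GEM object is initialized. */
static uint64_t dumb_size(uint32_t width, uint32_t height, uint32_t bpp)
{
	uint64_t size = (uint64_t)height * dumb_pitch(width, bpp);

	return div_round_up(size, SKETCH_PAGE_SIZE) * SKETCH_PAGE_SIZE;
}
```

For a 1920x1080 XRGB8888 dumb buffer this yields a 7680-byte pitch, and the object size happens to be page-aligned already.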
* Re: [PATCH 2/2] drm/vgem: use shmem helpers 2021-02-23 10:59 ` [Intel-gfx] " Daniel Vetter (?) @ 2021-02-23 14:21 ` kernel test robot -1 siblings, 0 replies; 110+ messages in thread From: kernel test robot @ 2021-02-23 14:21 UTC (permalink / raw) To: Daniel Vetter, DRI Development Cc: kbuild-all, Daniel Vetter, Intel Graphics Development, Chris Wilson, Melissa Wen, Christian König [-- Attachment #1: Type: text/plain, Size: 2076 bytes --] Hi Daniel, I love your patch! Yet something to improve: [auto build test ERROR on drm-intel/for-linux-next] [also build test ERROR on drm-tip/drm-tip linus/master next-20210223] [cannot apply to drm/drm-next v5.11] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Daniel-Vetter/dma-buf-Require-VM_PFNMAP-vma-for-mmap/20210223-190209 base: git://anongit.freedesktop.org/drm-intel for-linux-next config: openrisc-randconfig-r026-20210223 (attached as .config) compiler: or1k-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/5c544c63e333016d58d3e6f4802093906ef5456e git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Daniel-Vetter/dma-buf-Require-VM_PFNMAP-vma-for-mmap/20210223-190209 git checkout 5c544c63e333016d58d3e6f4802093906ef5456e # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=openrisc If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): or1k-linux-ld: arch/openrisc/kernel/entry.o: in function `_external_irq_handler': (.text+0x83c): undefined reference to `printk' (.text+0x83c): 
relocation truncated to fit: R_OR1K_INSN_REL_26 against undefined symbol `printk' >> or1k-linux-ld: drivers/gpu/drm/vgem/vgem_drv.o:(.rodata+0x44): undefined reference to `drm_gem_shmem_prime_import_sg_table' >> or1k-linux-ld: drivers/gpu/drm/vgem/vgem_drv.o:(.rodata+0x4c): undefined reference to `drm_gem_shmem_dumb_create' --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 25334 bytes --] [-- Attachment #3: Type: text/plain, Size: 160 bytes --] _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 2/2] drm/vgem: use shmem helpers 2021-02-23 10:59 ` [Intel-gfx] " Daniel Vetter (?) @ 2021-02-23 15:07 ` kernel test robot -1 siblings, 0 replies; 110+ messages in thread From: kernel test robot @ 2021-02-23 15:07 UTC (permalink / raw) To: Daniel Vetter, DRI Development Cc: kbuild-all, Daniel Vetter, Intel Graphics Development, Chris Wilson, Melissa Wen, Christian König [-- Attachment #1: Type: text/plain, Size: 1876 bytes --] Hi Daniel, I love your patch! Yet something to improve: [auto build test ERROR on drm-intel/for-linux-next] [also build test ERROR on drm-tip/drm-tip linus/master next-20210223] [cannot apply to tegra-drm/drm/tegra/for-next drm-exynos/exynos-drm-next drm/drm-next v5.11] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Daniel-Vetter/dma-buf-Require-VM_PFNMAP-vma-for-mmap/20210223-190209 base: git://anongit.freedesktop.org/drm-intel for-linux-next config: microblaze-randconfig-r013-20210223 (attached as .config) compiler: microblaze-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/5c544c63e333016d58d3e6f4802093906ef5456e git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Daniel-Vetter/dma-buf-Require-VM_PFNMAP-vma-for-mmap/20210223-190209 git checkout 5c544c63e333016d58d3e6f4802093906ef5456e # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=microblaze If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>, old ones prefixed by <<): >> ERROR: modpost: "drm_gem_shmem_prime_import_sg_table" 
[drivers/gpu/drm/vgem/vgem.ko] undefined! >> ERROR: modpost: "drm_gem_shmem_dumb_create" [drivers/gpu/drm/vgem/vgem.ko] undefined! --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 26303 bytes --] [-- Attachment #3: Type: text/plain, Size: 160 bytes --] _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 110+ messages in thread
* [PATCH] drm/vgem: use shmem helpers 2021-02-23 10:59 ` [Intel-gfx] " Daniel Vetter @ 2021-02-25 10:23 ` Daniel Vetter -1 siblings, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-02-25 10:23 UTC (permalink / raw) To: DRI Development Cc: Daniel Vetter, Intel Graphics Development, Christian König, Melissa Wen, Thomas Zimmermann, Daniel Vetter, Chris Wilson Aside from deleting lots of code the real motivation here is to switch the mmap over to VM_PFNMAP, to be more consistent with what real gpu drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't work, and even if you try and there's a struct page behind that, touching it and mucking around with its refcount can upset drivers real bad. v2: Review from Thomas: - sort #include - drop more dead code that I didn't spot somehow v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci) Cc: Thomas Zimmermann <tzimmermann@suse.de> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Cc: John Stultz <john.stultz@linaro.org> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Melissa Wen <melissa.srw@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/gpu/drm/Kconfig | 1 + drivers/gpu/drm/vgem/vgem_drv.c | 340 +------------------------------- 2 files changed, 4 insertions(+), 337 deletions(-) diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 8e73311de583..94e4ac830283 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig" config DRM_VGEM tristate "Virtual GEM provider" depends on DRM + select DRM_GEM_SHMEM_HELPER help Choose this option to get a virtual graphics memory manager, as used by Mesa's software renderer for enhanced performance.
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c index a0e75f1d5d01..b1b3a5ffc542 100644 --- a/drivers/gpu/drm/vgem/vgem_drv.c +++ b/drivers/gpu/drm/vgem/vgem_drv.c @@ -38,6 +38,7 @@ #include <drm/drm_drv.h> #include <drm/drm_file.h> +#include <drm/drm_gem_shmem_helper.h> #include <drm/drm_ioctl.h> #include <drm/drm_managed.h> #include <drm/drm_prime.h> @@ -50,87 +51,11 @@ #define DRIVER_MAJOR 1 #define DRIVER_MINOR 0 -static const struct drm_gem_object_funcs vgem_gem_object_funcs; - static struct vgem_device { struct drm_device drm; struct platform_device *platform; } *vgem_device; -static void vgem_gem_free_object(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); - - kvfree(vgem_obj->pages); - mutex_destroy(&vgem_obj->pages_lock); - - if (obj->import_attach) - drm_prime_gem_destroy(obj, vgem_obj->table); - - drm_gem_object_release(obj); - kfree(vgem_obj); -} - -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf) -{ - struct vm_area_struct *vma = vmf->vma; - struct drm_vgem_gem_object *obj = vma->vm_private_data; - /* We don't use vmf->pgoff since that has the fake offset */ - unsigned long vaddr = vmf->address; - vm_fault_t ret = VM_FAULT_SIGBUS; - loff_t num_pages; - pgoff_t page_offset; - page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT; - - num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE); - - if (page_offset >= num_pages) - return VM_FAULT_SIGBUS; - - mutex_lock(&obj->pages_lock); - if (obj->pages) { - get_page(obj->pages[page_offset]); - vmf->page = obj->pages[page_offset]; - ret = 0; - } - mutex_unlock(&obj->pages_lock); - if (ret) { - struct page *page; - - page = shmem_read_mapping_page( - file_inode(obj->base.filp)->i_mapping, - page_offset); - if (!IS_ERR(page)) { - vmf->page = page; - ret = 0; - } else switch (PTR_ERR(page)) { - case -ENOSPC: - case -ENOMEM: - ret = VM_FAULT_OOM; - break; - case -EBUSY: - ret = VM_FAULT_RETRY; - break; - case -EFAULT: - case -EINVAL: - ret = 
VM_FAULT_SIGBUS; - break; - default: - WARN_ON(PTR_ERR(page)); - ret = VM_FAULT_SIGBUS; - break; - } - - } - return ret; -} - -static const struct vm_operations_struct vgem_gem_vm_ops = { - .fault = vgem_gem_fault, - .open = drm_gem_vm_open, - .close = drm_gem_vm_close, -}; - static int vgem_open(struct drm_device *dev, struct drm_file *file) { struct vgem_file *vfile; @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file) kfree(vfile); } -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev, - unsigned long size) -{ - struct drm_vgem_gem_object *obj; - int ret; - - obj = kzalloc(sizeof(*obj), GFP_KERNEL); - if (!obj) - return ERR_PTR(-ENOMEM); - - obj->base.funcs = &vgem_gem_object_funcs; - - ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE)); - if (ret) { - kfree(obj); - return ERR_PTR(ret); - } - - mutex_init(&obj->pages_lock); - - return obj; -} - -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj) -{ - drm_gem_object_release(&obj->base); - kfree(obj); -} - -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, - struct drm_file *file, - unsigned int *handle, - unsigned long size) -{ - struct drm_vgem_gem_object *obj; - int ret; - - obj = __vgem_gem_create(dev, size); - if (IS_ERR(obj)) - return ERR_CAST(obj); - - ret = drm_gem_handle_create(file, &obj->base, handle); - if (ret) { - drm_gem_object_put(&obj->base); - return ERR_PTR(ret); - } - - return &obj->base; -} - -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev, - struct drm_mode_create_dumb *args) -{ - struct drm_gem_object *gem_object; - u64 pitch, size; - - pitch = args->width * DIV_ROUND_UP(args->bpp, 8); - size = args->height * pitch; - if (size == 0) - return -EINVAL; - - gem_object = vgem_gem_create(dev, file, &args->handle, size); - if (IS_ERR(gem_object)) - return PTR_ERR(gem_object); - - args->size = gem_object->size; - args->pitch = pitch; - - 
drm_gem_object_put(gem_object); - - DRM_DEBUG("Created object of size %llu\n", args->size); - - return 0; -} - static struct drm_ioctl_desc vgem_ioctls[] = { DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW), }; -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma) -{ - unsigned long flags = vma->vm_flags; - int ret; - - ret = drm_gem_mmap(filp, vma); - if (ret) - return ret; - - /* Keep the WC mmaping set by drm_gem_mmap() but our pages - * are ordinary and not special. - */ - vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP; - return 0; -} - -static const struct file_operations vgem_driver_fops = { - .owner = THIS_MODULE, - .open = drm_open, - .mmap = vgem_mmap, - .poll = drm_poll, - .read = drm_read, - .unlocked_ioctl = drm_ioctl, - .compat_ioctl = drm_compat_ioctl, - .release = drm_release, -}; - -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo) -{ - mutex_lock(&bo->pages_lock); - if (bo->pages_pin_count++ == 0) { - struct page **pages; - - pages = drm_gem_get_pages(&bo->base); - if (IS_ERR(pages)) { - bo->pages_pin_count--; - mutex_unlock(&bo->pages_lock); - return pages; - } - - bo->pages = pages; - } - mutex_unlock(&bo->pages_lock); - - return bo->pages; -} - -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo) -{ - mutex_lock(&bo->pages_lock); - if (--bo->pages_pin_count == 0) { - drm_gem_put_pages(&bo->base, bo->pages, true, true); - bo->pages = NULL; - } - mutex_unlock(&bo->pages_lock); -} - -static int vgem_prime_pin(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - long n_pages = obj->size >> PAGE_SHIFT; - struct page **pages; - - pages = vgem_pin_pages(bo); - if (IS_ERR(pages)) - return PTR_ERR(pages); - - /* Flush the object from the CPU cache so that importers can rely - * on coherent indirect access via the exported dma-address. 
- */ - drm_clflush_pages(pages, n_pages); - - return 0; -} - -static void vgem_prime_unpin(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - vgem_unpin_pages(bo); -} - -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT); -} - -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev, - struct dma_buf *dma_buf) -{ - struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm); - - return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev); -} - -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev, - struct dma_buf_attachment *attach, struct sg_table *sg) -{ - struct drm_vgem_gem_object *obj; - int npages; - - obj = __vgem_gem_create(dev, attach->dmabuf->size); - if (IS_ERR(obj)) - return ERR_CAST(obj); - - npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE; - - obj->table = sg; - obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL); - if (!obj->pages) { - __vgem_gem_destroy(obj); - return ERR_PTR(-ENOMEM); - } - - obj->pages_pin_count++; /* perma-pinned */ - drm_prime_sg_to_page_array(obj->table, obj->pages, npages); - return &obj->base; -} - -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - long n_pages = obj->size >> PAGE_SHIFT; - struct page **pages; - void *vaddr; - - pages = vgem_pin_pages(bo); - if (IS_ERR(pages)) - return PTR_ERR(pages); - - vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL)); - if (!vaddr) - return -ENOMEM; - dma_buf_map_set_vaddr(map, vaddr); - - return 0; -} - -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - vunmap(map->vaddr); - vgem_unpin_pages(bo); -} - -static int 
vgem_prime_mmap(struct drm_gem_object *obj, - struct vm_area_struct *vma) -{ - int ret; - - if (obj->size < vma->vm_end - vma->vm_start) - return -EINVAL; - - if (!obj->filp) - return -ENODEV; - - ret = call_mmap(obj->filp, vma); - if (ret) - return ret; - - vma_set_file(vma, obj->filp); - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; - vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); - - return 0; -} - -static const struct drm_gem_object_funcs vgem_gem_object_funcs = { - .free = vgem_gem_free_object, - .pin = vgem_prime_pin, - .unpin = vgem_prime_unpin, - .get_sg_table = vgem_prime_get_sg_table, - .vmap = vgem_prime_vmap, - .vunmap = vgem_prime_vunmap, - .vm_ops = &vgem_gem_vm_ops, -}; +DEFINE_DRM_GEM_FOPS(vgem_driver_fops); static const struct drm_driver vgem_driver = { .driver_features = DRIVER_GEM | DRIVER_RENDER, @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = { .num_ioctls = ARRAY_SIZE(vgem_ioctls), .fops = &vgem_driver_fops, - .dumb_create = vgem_gem_dumb_create, - - .prime_handle_to_fd = drm_gem_prime_handle_to_fd, - .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_import = vgem_prime_import, - .gem_prime_import_sg_table = vgem_prime_import_sg_table, - .gem_prime_mmap = vgem_prime_mmap, + DRM_GEM_SHMEM_DRIVER_OPS, .name = DRIVER_NAME, .desc = DRIVER_DESC, -- 2.30.0 _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
* [Intel-gfx] [PATCH] drm/vgem: use shmem helpers
@ 2021-02-25 10:23 ` Daniel Vetter
From: Daniel Vetter @ 2021-02-25 10:23 UTC (permalink / raw)
To: DRI Development
Cc: Daniel Vetter, Intel Graphics Development, Christian König, Melissa Wen, John Stultz, Thomas Zimmermann, Daniel Vetter, Chris Wilson, Sumit Semwal

Aside from deleting lots of code the real motivation here is to switch
the mmap over to VM_PFNMAP, to be more consistent with what real gpu
drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
work, and even if you try and there's a struct page behind that,
touching it and mucking around with its refcount can upset drivers
real bad.

v2: Review from Thomas:
- sort #include
- drop more dead code that I didn't spot somehow

v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)

Cc: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Melissa Wen <melissa.srw@gmail.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/Kconfig         |   1 +
 drivers/gpu/drm/vgem/vgem_drv.c | 340 +-------------------------------
 2 files changed, 4 insertions(+), 337 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 8e73311de583..94e4ac830283 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig"
 config DRM_VGEM
 	tristate "Virtual GEM provider"
 	depends on DRM
+	select DRM_GEM_SHMEM_HELPER
 	help
 	  Choose this option to get a virtual graphics memory manager,
 	  as used by Mesa's software renderer for enhanced performance.
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c index a0e75f1d5d01..b1b3a5ffc542 100644 --- a/drivers/gpu/drm/vgem/vgem_drv.c +++ b/drivers/gpu/drm/vgem/vgem_drv.c @@ -38,6 +38,7 @@ #include <drm/drm_drv.h> #include <drm/drm_file.h> +#include <drm/drm_gem_shmem_helper.h> #include <drm/drm_ioctl.h> #include <drm/drm_managed.h> #include <drm/drm_prime.h> @@ -50,87 +51,11 @@ #define DRIVER_MAJOR 1 #define DRIVER_MINOR 0 -static const struct drm_gem_object_funcs vgem_gem_object_funcs; - static struct vgem_device { struct drm_device drm; struct platform_device *platform; } *vgem_device; -static void vgem_gem_free_object(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); - - kvfree(vgem_obj->pages); - mutex_destroy(&vgem_obj->pages_lock); - - if (obj->import_attach) - drm_prime_gem_destroy(obj, vgem_obj->table); - - drm_gem_object_release(obj); - kfree(vgem_obj); -} - -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf) -{ - struct vm_area_struct *vma = vmf->vma; - struct drm_vgem_gem_object *obj = vma->vm_private_data; - /* We don't use vmf->pgoff since that has the fake offset */ - unsigned long vaddr = vmf->address; - vm_fault_t ret = VM_FAULT_SIGBUS; - loff_t num_pages; - pgoff_t page_offset; - page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT; - - num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE); - - if (page_offset >= num_pages) - return VM_FAULT_SIGBUS; - - mutex_lock(&obj->pages_lock); - if (obj->pages) { - get_page(obj->pages[page_offset]); - vmf->page = obj->pages[page_offset]; - ret = 0; - } - mutex_unlock(&obj->pages_lock); - if (ret) { - struct page *page; - - page = shmem_read_mapping_page( - file_inode(obj->base.filp)->i_mapping, - page_offset); - if (!IS_ERR(page)) { - vmf->page = page; - ret = 0; - } else switch (PTR_ERR(page)) { - case -ENOSPC: - case -ENOMEM: - ret = VM_FAULT_OOM; - break; - case -EBUSY: - ret = VM_FAULT_RETRY; - break; - case -EFAULT: - case -EINVAL: - ret = 
VM_FAULT_SIGBUS; - break; - default: - WARN_ON(PTR_ERR(page)); - ret = VM_FAULT_SIGBUS; - break; - } - - } - return ret; -} - -static const struct vm_operations_struct vgem_gem_vm_ops = { - .fault = vgem_gem_fault, - .open = drm_gem_vm_open, - .close = drm_gem_vm_close, -}; - static int vgem_open(struct drm_device *dev, struct drm_file *file) { struct vgem_file *vfile; @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file) kfree(vfile); } -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev, - unsigned long size) -{ - struct drm_vgem_gem_object *obj; - int ret; - - obj = kzalloc(sizeof(*obj), GFP_KERNEL); - if (!obj) - return ERR_PTR(-ENOMEM); - - obj->base.funcs = &vgem_gem_object_funcs; - - ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE)); - if (ret) { - kfree(obj); - return ERR_PTR(ret); - } - - mutex_init(&obj->pages_lock); - - return obj; -} - -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj) -{ - drm_gem_object_release(&obj->base); - kfree(obj); -} - -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, - struct drm_file *file, - unsigned int *handle, - unsigned long size) -{ - struct drm_vgem_gem_object *obj; - int ret; - - obj = __vgem_gem_create(dev, size); - if (IS_ERR(obj)) - return ERR_CAST(obj); - - ret = drm_gem_handle_create(file, &obj->base, handle); - if (ret) { - drm_gem_object_put(&obj->base); - return ERR_PTR(ret); - } - - return &obj->base; -} - -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev, - struct drm_mode_create_dumb *args) -{ - struct drm_gem_object *gem_object; - u64 pitch, size; - - pitch = args->width * DIV_ROUND_UP(args->bpp, 8); - size = args->height * pitch; - if (size == 0) - return -EINVAL; - - gem_object = vgem_gem_create(dev, file, &args->handle, size); - if (IS_ERR(gem_object)) - return PTR_ERR(gem_object); - - args->size = gem_object->size; - args->pitch = pitch; - - 
drm_gem_object_put(gem_object); - - DRM_DEBUG("Created object of size %llu\n", args->size); - - return 0; -} - static struct drm_ioctl_desc vgem_ioctls[] = { DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW), DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW), }; -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma) -{ - unsigned long flags = vma->vm_flags; - int ret; - - ret = drm_gem_mmap(filp, vma); - if (ret) - return ret; - - /* Keep the WC mmaping set by drm_gem_mmap() but our pages - * are ordinary and not special. - */ - vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP; - return 0; -} - -static const struct file_operations vgem_driver_fops = { - .owner = THIS_MODULE, - .open = drm_open, - .mmap = vgem_mmap, - .poll = drm_poll, - .read = drm_read, - .unlocked_ioctl = drm_ioctl, - .compat_ioctl = drm_compat_ioctl, - .release = drm_release, -}; - -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo) -{ - mutex_lock(&bo->pages_lock); - if (bo->pages_pin_count++ == 0) { - struct page **pages; - - pages = drm_gem_get_pages(&bo->base); - if (IS_ERR(pages)) { - bo->pages_pin_count--; - mutex_unlock(&bo->pages_lock); - return pages; - } - - bo->pages = pages; - } - mutex_unlock(&bo->pages_lock); - - return bo->pages; -} - -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo) -{ - mutex_lock(&bo->pages_lock); - if (--bo->pages_pin_count == 0) { - drm_gem_put_pages(&bo->base, bo->pages, true, true); - bo->pages = NULL; - } - mutex_unlock(&bo->pages_lock); -} - -static int vgem_prime_pin(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - long n_pages = obj->size >> PAGE_SHIFT; - struct page **pages; - - pages = vgem_pin_pages(bo); - if (IS_ERR(pages)) - return PTR_ERR(pages); - - /* Flush the object from the CPU cache so that importers can rely - * on coherent indirect access via the exported dma-address. 
- */ - drm_clflush_pages(pages, n_pages); - - return 0; -} - -static void vgem_prime_unpin(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - vgem_unpin_pages(bo); -} - -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT); -} - -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev, - struct dma_buf *dma_buf) -{ - struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm); - - return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev); -} - -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev, - struct dma_buf_attachment *attach, struct sg_table *sg) -{ - struct drm_vgem_gem_object *obj; - int npages; - - obj = __vgem_gem_create(dev, attach->dmabuf->size); - if (IS_ERR(obj)) - return ERR_CAST(obj); - - npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE; - - obj->table = sg; - obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL); - if (!obj->pages) { - __vgem_gem_destroy(obj); - return ERR_PTR(-ENOMEM); - } - - obj->pages_pin_count++; /* perma-pinned */ - drm_prime_sg_to_page_array(obj->table, obj->pages, npages); - return &obj->base; -} - -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - long n_pages = obj->size >> PAGE_SHIFT; - struct page **pages; - void *vaddr; - - pages = vgem_pin_pages(bo); - if (IS_ERR(pages)) - return PTR_ERR(pages); - - vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL)); - if (!vaddr) - return -ENOMEM; - dma_buf_map_set_vaddr(map, vaddr); - - return 0; -} - -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map) -{ - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); - - vunmap(map->vaddr); - vgem_unpin_pages(bo); -} - -static int 
vgem_prime_mmap(struct drm_gem_object *obj, - struct vm_area_struct *vma) -{ - int ret; - - if (obj->size < vma->vm_end - vma->vm_start) - return -EINVAL; - - if (!obj->filp) - return -ENODEV; - - ret = call_mmap(obj->filp, vma); - if (ret) - return ret; - - vma_set_file(vma, obj->filp); - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; - vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); - - return 0; -} - -static const struct drm_gem_object_funcs vgem_gem_object_funcs = { - .free = vgem_gem_free_object, - .pin = vgem_prime_pin, - .unpin = vgem_prime_unpin, - .get_sg_table = vgem_prime_get_sg_table, - .vmap = vgem_prime_vmap, - .vunmap = vgem_prime_vunmap, - .vm_ops = &vgem_gem_vm_ops, -}; +DEFINE_DRM_GEM_FOPS(vgem_driver_fops); static const struct drm_driver vgem_driver = { .driver_features = DRIVER_GEM | DRIVER_RENDER, @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = { .num_ioctls = ARRAY_SIZE(vgem_ioctls), .fops = &vgem_driver_fops, - .dumb_create = vgem_gem_dumb_create, - - .prime_handle_to_fd = drm_gem_prime_handle_to_fd, - .prime_fd_to_handle = drm_gem_prime_fd_to_handle, - .gem_prime_import = vgem_prime_import, - .gem_prime_import_sg_table = vgem_prime_import_sg_table, - .gem_prime_mmap = vgem_prime_mmap, + DRM_GEM_SHMEM_DRIVER_OPS, .name = DRIVER_NAME, .desc = DRIVER_DESC, -- 2.30.0 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [PATCH] drm/vgem: use shmem helpers
  2021-02-25 10:23 ` [Intel-gfx] " Daniel Vetter
@ 2021-02-26  9:19 ` Thomas Zimmermann
From: Thomas Zimmermann @ 2021-02-26 9:19 UTC (permalink / raw)
To: Daniel Vetter, DRI Development
Cc: Intel Graphics Development, Christian König, Melissa Wen, Daniel Vetter, Chris Wilson

Hi

On 25.02.21 at 11:23, Daniel Vetter wrote:
> Aside from deleting lots of code the real motivation here is to switch
> the mmap over to VM_PFNMAP, to be more consistent with what real gpu
> drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't
> work, and even if you try and there's a struct page behind that,
> touching it and mucking around with its refcount can upset drivers
> real bad.
>
> v2: Review from Thomas:
> - sort #include
> - drop more dead code that I didn't spot somehow
>
> v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci)

Since you're working on it, could you move the config item into a Kconfig file under vgem?
Best regards Thomas > > Cc: Thomas Zimmermann <tzimmermann@suse.de> > Acked-by: Thomas Zimmermann <tzimmermann@suse.de> > Cc: John Stultz <john.stultz@linaro.org> > Cc: Sumit Semwal <sumit.semwal@linaro.org> > Cc: "Christian König" <christian.koenig@amd.com> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > Cc: Melissa Wen <melissa.srw@gmail.com> > Cc: Chris Wilson <chris@chris-wilson.co.uk> > --- > drivers/gpu/drm/Kconfig | 1 + > drivers/gpu/drm/vgem/vgem_drv.c | 340 +------------------------------- > 2 files changed, 4 insertions(+), 337 deletions(-) > > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig > index 8e73311de583..94e4ac830283 100644 > --- a/drivers/gpu/drm/Kconfig > +++ b/drivers/gpu/drm/Kconfig > @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig" > config DRM_VGEM > tristate "Virtual GEM provider" > depends on DRM > + select DRM_GEM_SHMEM_HELPER > help > Choose this option to get a virtual graphics memory manager, > as used by Mesa's software renderer for enhanced performance. 
> diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c > index a0e75f1d5d01..b1b3a5ffc542 100644 > --- a/drivers/gpu/drm/vgem/vgem_drv.c > +++ b/drivers/gpu/drm/vgem/vgem_drv.c > @@ -38,6 +38,7 @@ > > #include <drm/drm_drv.h> > #include <drm/drm_file.h> > +#include <drm/drm_gem_shmem_helper.h> > #include <drm/drm_ioctl.h> > #include <drm/drm_managed.h> > #include <drm/drm_prime.h> > @@ -50,87 +51,11 @@ > #define DRIVER_MAJOR 1 > #define DRIVER_MINOR 0 > > -static const struct drm_gem_object_funcs vgem_gem_object_funcs; > - > static struct vgem_device { > struct drm_device drm; > struct platform_device *platform; > } *vgem_device; > > -static void vgem_gem_free_object(struct drm_gem_object *obj) > -{ > - struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); > - > - kvfree(vgem_obj->pages); > - mutex_destroy(&vgem_obj->pages_lock); > - > - if (obj->import_attach) > - drm_prime_gem_destroy(obj, vgem_obj->table); > - > - drm_gem_object_release(obj); > - kfree(vgem_obj); > -} > - > -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf) > -{ > - struct vm_area_struct *vma = vmf->vma; > - struct drm_vgem_gem_object *obj = vma->vm_private_data; > - /* We don't use vmf->pgoff since that has the fake offset */ > - unsigned long vaddr = vmf->address; > - vm_fault_t ret = VM_FAULT_SIGBUS; > - loff_t num_pages; > - pgoff_t page_offset; > - page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT; > - > - num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE); > - > - if (page_offset >= num_pages) > - return VM_FAULT_SIGBUS; > - > - mutex_lock(&obj->pages_lock); > - if (obj->pages) { > - get_page(obj->pages[page_offset]); > - vmf->page = obj->pages[page_offset]; > - ret = 0; > - } > - mutex_unlock(&obj->pages_lock); > - if (ret) { > - struct page *page; > - > - page = shmem_read_mapping_page( > - file_inode(obj->base.filp)->i_mapping, > - page_offset); > - if (!IS_ERR(page)) { > - vmf->page = page; > - ret = 0; > - } else switch (PTR_ERR(page)) { > - case 
-ENOSPC: > - case -ENOMEM: > - ret = VM_FAULT_OOM; > - break; > - case -EBUSY: > - ret = VM_FAULT_RETRY; > - break; > - case -EFAULT: > - case -EINVAL: > - ret = VM_FAULT_SIGBUS; > - break; > - default: > - WARN_ON(PTR_ERR(page)); > - ret = VM_FAULT_SIGBUS; > - break; > - } > - > - } > - return ret; > -} > - > -static const struct vm_operations_struct vgem_gem_vm_ops = { > - .fault = vgem_gem_fault, > - .open = drm_gem_vm_open, > - .close = drm_gem_vm_close, > -}; > - > static int vgem_open(struct drm_device *dev, struct drm_file *file) > { > struct vgem_file *vfile; > @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file) > kfree(vfile); > } > > -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev, > - unsigned long size) > -{ > - struct drm_vgem_gem_object *obj; > - int ret; > - > - obj = kzalloc(sizeof(*obj), GFP_KERNEL); > - if (!obj) > - return ERR_PTR(-ENOMEM); > - > - obj->base.funcs = &vgem_gem_object_funcs; > - > - ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE)); > - if (ret) { > - kfree(obj); > - return ERR_PTR(ret); > - } > - > - mutex_init(&obj->pages_lock); > - > - return obj; > -} > - > -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj) > -{ > - drm_gem_object_release(&obj->base); > - kfree(obj); > -} > - > -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, > - struct drm_file *file, > - unsigned int *handle, > - unsigned long size) > -{ > - struct drm_vgem_gem_object *obj; > - int ret; > - > - obj = __vgem_gem_create(dev, size); > - if (IS_ERR(obj)) > - return ERR_CAST(obj); > - > - ret = drm_gem_handle_create(file, &obj->base, handle); > - if (ret) { > - drm_gem_object_put(&obj->base); > - return ERR_PTR(ret); > - } > - > - return &obj->base; > -} > - > -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev, > - struct drm_mode_create_dumb *args) > -{ > - struct drm_gem_object *gem_object; > - u64 pitch, 
size; > - > - pitch = args->width * DIV_ROUND_UP(args->bpp, 8); > - size = args->height * pitch; > - if (size == 0) > - return -EINVAL; > - > - gem_object = vgem_gem_create(dev, file, &args->handle, size); > - if (IS_ERR(gem_object)) > - return PTR_ERR(gem_object); > - > - args->size = gem_object->size; > - args->pitch = pitch; > - > - drm_gem_object_put(gem_object); > - > - DRM_DEBUG("Created object of size %llu\n", args->size); > - > - return 0; > -} > - > static struct drm_ioctl_desc vgem_ioctls[] = { > DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW), > DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW), > }; > > -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma) > -{ > - unsigned long flags = vma->vm_flags; > - int ret; > - > - ret = drm_gem_mmap(filp, vma); > - if (ret) > - return ret; > - > - /* Keep the WC mmaping set by drm_gem_mmap() but our pages > - * are ordinary and not special. > - */ > - vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP; > - return 0; > -} > - > -static const struct file_operations vgem_driver_fops = { > - .owner = THIS_MODULE, > - .open = drm_open, > - .mmap = vgem_mmap, > - .poll = drm_poll, > - .read = drm_read, > - .unlocked_ioctl = drm_ioctl, > - .compat_ioctl = drm_compat_ioctl, > - .release = drm_release, > -}; > - > -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo) > -{ > - mutex_lock(&bo->pages_lock); > - if (bo->pages_pin_count++ == 0) { > - struct page **pages; > - > - pages = drm_gem_get_pages(&bo->base); > - if (IS_ERR(pages)) { > - bo->pages_pin_count--; > - mutex_unlock(&bo->pages_lock); > - return pages; > - } > - > - bo->pages = pages; > - } > - mutex_unlock(&bo->pages_lock); > - > - return bo->pages; > -} > - > -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo) > -{ > - mutex_lock(&bo->pages_lock); > - if (--bo->pages_pin_count == 0) { > - drm_gem_put_pages(&bo->base, bo->pages, true, true); > - bo->pages = 
NULL; > - } > - mutex_unlock(&bo->pages_lock); > -} > - > -static int vgem_prime_pin(struct drm_gem_object *obj) > -{ > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > - long n_pages = obj->size >> PAGE_SHIFT; > - struct page **pages; > - > - pages = vgem_pin_pages(bo); > - if (IS_ERR(pages)) > - return PTR_ERR(pages); > - > - /* Flush the object from the CPU cache so that importers can rely > - * on coherent indirect access via the exported dma-address. > - */ > - drm_clflush_pages(pages, n_pages); > - > - return 0; > -} > - > -static void vgem_prime_unpin(struct drm_gem_object *obj) > -{ > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > - > - vgem_unpin_pages(bo); > -} > - > -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj) > -{ > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > - > - return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT); > -} > - > -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev, > - struct dma_buf *dma_buf) > -{ > - struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm); > - > - return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev); > -} > - > -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev, > - struct dma_buf_attachment *attach, struct sg_table *sg) > -{ > - struct drm_vgem_gem_object *obj; > - int npages; > - > - obj = __vgem_gem_create(dev, attach->dmabuf->size); > - if (IS_ERR(obj)) > - return ERR_CAST(obj); > - > - npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE; > - > - obj->table = sg; > - obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL); > - if (!obj->pages) { > - __vgem_gem_destroy(obj); > - return ERR_PTR(-ENOMEM); > - } > - > - obj->pages_pin_count++; /* perma-pinned */ > - drm_prime_sg_to_page_array(obj->table, obj->pages, npages); > - return &obj->base; > -} > - > -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map) > -{ > - 
struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > - long n_pages = obj->size >> PAGE_SHIFT; > - struct page **pages; > - void *vaddr; > - > - pages = vgem_pin_pages(bo); > - if (IS_ERR(pages)) > - return PTR_ERR(pages); > - > - vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL)); > - if (!vaddr) > - return -ENOMEM; > - dma_buf_map_set_vaddr(map, vaddr); > - > - return 0; > -} > - > -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map) > -{ > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > - > - vunmap(map->vaddr); > - vgem_unpin_pages(bo); > -} > - > -static int vgem_prime_mmap(struct drm_gem_object *obj, > - struct vm_area_struct *vma) > -{ > - int ret; > - > - if (obj->size < vma->vm_end - vma->vm_start) > - return -EINVAL; > - > - if (!obj->filp) > - return -ENODEV; > - > - ret = call_mmap(obj->filp, vma); > - if (ret) > - return ret; > - > - vma_set_file(vma, obj->filp); > - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; > - vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); > - > - return 0; > -} > - > -static const struct drm_gem_object_funcs vgem_gem_object_funcs = { > - .free = vgem_gem_free_object, > - .pin = vgem_prime_pin, > - .unpin = vgem_prime_unpin, > - .get_sg_table = vgem_prime_get_sg_table, > - .vmap = vgem_prime_vmap, > - .vunmap = vgem_prime_vunmap, > - .vm_ops = &vgem_gem_vm_ops, > -}; > +DEFINE_DRM_GEM_FOPS(vgem_driver_fops); > > static const struct drm_driver vgem_driver = { > .driver_features = DRIVER_GEM | DRIVER_RENDER, > @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = { > .num_ioctls = ARRAY_SIZE(vgem_ioctls), > .fops = &vgem_driver_fops, > > - .dumb_create = vgem_gem_dumb_create, > - > - .prime_handle_to_fd = drm_gem_prime_handle_to_fd, > - .prime_fd_to_handle = drm_gem_prime_fd_to_handle, > - .gem_prime_import = vgem_prime_import, > - .gem_prime_import_sg_table = vgem_prime_import_sg_table, > - .gem_prime_mmap = vgem_prime_mmap, > + 
DRM_GEM_SHMEM_DRIVER_OPS, > > .name = DRIVER_NAME, > .desc = DRIVER_DESC, > -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer
DRM_GEM_SHMEM_DRIVER_OPS, > > .name = DRIVER_NAME, > .desc = DRIVER_DESC, > -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx
* Re: [PATCH] drm/vgem: use shmem helpers 2021-02-26 9:19 ` [Intel-gfx] " Thomas Zimmermann @ 2021-02-26 13:30 ` Daniel Vetter -1 siblings, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-02-26 13:30 UTC (permalink / raw) To: Thomas Zimmermann Cc: Intel Graphics Development, DRI Development, Christian König, Melissa Wen, Daniel Vetter, Chris Wilson On Fri, Feb 26, 2021 at 10:19 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > Hi > > Am 25.02.21 um 11:23 schrieb Daniel Vetter: > > Aside from deleting lots of code the real motivation here is to switch > > the mmap over to VM_PFNMAP, to be more consistent with what real gpu > > drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't > > work, and even if you try and there's a struct page behind that, > > touching it and mucking around with its refcount can upset drivers > > real bad. > > > > v2: Review from Thomas: > > - sort #include > > - drop more dead code that I didn't spot somehow > > > > v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci) > > Since you're working on it, could you move the config item into a > Kconfig file under vgem? We have a lot of drivers still without their own Kconfig. I thought we're only doing that for drivers which have multiple options, or otherwise would clutter up the main drm/Kconfig file? Not opposed to this, just feels like if we do this, should do it for all of them.
-Daniel > > Best regards > Thomas > > > > > Cc: Thomas Zimmermann <tzimmermann@suse.de> > > Acked-by: Thomas Zimmermann <tzimmermann@suse.de> > > Cc: John Stultz <john.stultz@linaro.org> > > Cc: Sumit Semwal <sumit.semwal@linaro.org> > > Cc: "Christian König" <christian.koenig@amd.com> > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > > Cc: Melissa Wen <melissa.srw@gmail.com> > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > --- > > drivers/gpu/drm/Kconfig | 1 + > > drivers/gpu/drm/vgem/vgem_drv.c | 340 +------------------------------- > > 2 files changed, 4 insertions(+), 337 deletions(-) > > > > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig > > index 8e73311de583..94e4ac830283 100644 > > --- a/drivers/gpu/drm/Kconfig > > +++ b/drivers/gpu/drm/Kconfig > > @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig" > > config DRM_VGEM > > tristate "Virtual GEM provider" > > depends on DRM > > + select DRM_GEM_SHMEM_HELPER > > help > > Choose this option to get a virtual graphics memory manager, > > as used by Mesa's software renderer for enhanced performance. 
> > diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c > > index a0e75f1d5d01..b1b3a5ffc542 100644 > > --- a/drivers/gpu/drm/vgem/vgem_drv.c > > +++ b/drivers/gpu/drm/vgem/vgem_drv.c > > @@ -38,6 +38,7 @@ > > > > #include <drm/drm_drv.h> > > #include <drm/drm_file.h> > > +#include <drm/drm_gem_shmem_helper.h> > > #include <drm/drm_ioctl.h> > > #include <drm/drm_managed.h> > > #include <drm/drm_prime.h> > > @@ -50,87 +51,11 @@ > > #define DRIVER_MAJOR 1 > > #define DRIVER_MINOR 0 > > > > -static const struct drm_gem_object_funcs vgem_gem_object_funcs; > > - > > static struct vgem_device { > > struct drm_device drm; > > struct platform_device *platform; > > } *vgem_device; > > > > -static void vgem_gem_free_object(struct drm_gem_object *obj) > > -{ > > - struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); > > - > > - kvfree(vgem_obj->pages); > > - mutex_destroy(&vgem_obj->pages_lock); > > - > > - if (obj->import_attach) > > - drm_prime_gem_destroy(obj, vgem_obj->table); > > - > > - drm_gem_object_release(obj); > > - kfree(vgem_obj); > > -} > > - > > -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf) > > -{ > > - struct vm_area_struct *vma = vmf->vma; > > - struct drm_vgem_gem_object *obj = vma->vm_private_data; > > - /* We don't use vmf->pgoff since that has the fake offset */ > > - unsigned long vaddr = vmf->address; > > - vm_fault_t ret = VM_FAULT_SIGBUS; > > - loff_t num_pages; > > - pgoff_t page_offset; > > - page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT; > > - > > - num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE); > > - > > - if (page_offset >= num_pages) > > - return VM_FAULT_SIGBUS; > > - > > - mutex_lock(&obj->pages_lock); > > - if (obj->pages) { > > - get_page(obj->pages[page_offset]); > > - vmf->page = obj->pages[page_offset]; > > - ret = 0; > > - } > > - mutex_unlock(&obj->pages_lock); > > - if (ret) { > > - struct page *page; > > - > > - page = shmem_read_mapping_page( > > - 
file_inode(obj->base.filp)->i_mapping, > > - page_offset); > > - if (!IS_ERR(page)) { > > - vmf->page = page; > > - ret = 0; > > - } else switch (PTR_ERR(page)) { > > - case -ENOSPC: > > - case -ENOMEM: > > - ret = VM_FAULT_OOM; > > - break; > > - case -EBUSY: > > - ret = VM_FAULT_RETRY; > > - break; > > - case -EFAULT: > > - case -EINVAL: > > - ret = VM_FAULT_SIGBUS; > > - break; > > - default: > > - WARN_ON(PTR_ERR(page)); > > - ret = VM_FAULT_SIGBUS; > > - break; > > - } > > - > > - } > > - return ret; > > -} > > - > > -static const struct vm_operations_struct vgem_gem_vm_ops = { > > - .fault = vgem_gem_fault, > > - .open = drm_gem_vm_open, > > - .close = drm_gem_vm_close, > > -}; > > - > > static int vgem_open(struct drm_device *dev, struct drm_file *file) > > { > > struct vgem_file *vfile; > > @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file) > > kfree(vfile); > > } > > > > -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev, > > - unsigned long size) > > -{ > > - struct drm_vgem_gem_object *obj; > > - int ret; > > - > > - obj = kzalloc(sizeof(*obj), GFP_KERNEL); > > - if (!obj) > > - return ERR_PTR(-ENOMEM); > > - > > - obj->base.funcs = &vgem_gem_object_funcs; > > - > > - ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE)); > > - if (ret) { > > - kfree(obj); > > - return ERR_PTR(ret); > > - } > > - > > - mutex_init(&obj->pages_lock); > > - > > - return obj; > > -} > > - > > -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj) > > -{ > > - drm_gem_object_release(&obj->base); > > - kfree(obj); > > -} > > - > > -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, > > - struct drm_file *file, > > - unsigned int *handle, > > - unsigned long size) > > -{ > > - struct drm_vgem_gem_object *obj; > > - int ret; > > - > > - obj = __vgem_gem_create(dev, size); > > - if (IS_ERR(obj)) > > - return ERR_CAST(obj); > > - > > - ret = 
drm_gem_handle_create(file, &obj->base, handle); > > - if (ret) { > > - drm_gem_object_put(&obj->base); > > - return ERR_PTR(ret); > > - } > > - > > - return &obj->base; > > -} > > - > > -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev, > > - struct drm_mode_create_dumb *args) > > -{ > > - struct drm_gem_object *gem_object; > > - u64 pitch, size; > > - > > - pitch = args->width * DIV_ROUND_UP(args->bpp, 8); > > - size = args->height * pitch; > > - if (size == 0) > > - return -EINVAL; > > - > > - gem_object = vgem_gem_create(dev, file, &args->handle, size); > > - if (IS_ERR(gem_object)) > > - return PTR_ERR(gem_object); > > - > > - args->size = gem_object->size; > > - args->pitch = pitch; > > - > > - drm_gem_object_put(gem_object); > > - > > - DRM_DEBUG("Created object of size %llu\n", args->size); > > - > > - return 0; > > -} > > - > > static struct drm_ioctl_desc vgem_ioctls[] = { > > DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW), > > DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW), > > }; > > > > -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma) > > -{ > > - unsigned long flags = vma->vm_flags; > > - int ret; > > - > > - ret = drm_gem_mmap(filp, vma); > > - if (ret) > > - return ret; > > - > > - /* Keep the WC mmaping set by drm_gem_mmap() but our pages > > - * are ordinary and not special. 
> > - */ > > - vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP; > > - return 0; > > -} > > - > > -static const struct file_operations vgem_driver_fops = { > > - .owner = THIS_MODULE, > > - .open = drm_open, > > - .mmap = vgem_mmap, > > - .poll = drm_poll, > > - .read = drm_read, > > - .unlocked_ioctl = drm_ioctl, > > - .compat_ioctl = drm_compat_ioctl, > > - .release = drm_release, > > -}; > > - > > -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo) > > -{ > > - mutex_lock(&bo->pages_lock); > > - if (bo->pages_pin_count++ == 0) { > > - struct page **pages; > > - > > - pages = drm_gem_get_pages(&bo->base); > > - if (IS_ERR(pages)) { > > - bo->pages_pin_count--; > > - mutex_unlock(&bo->pages_lock); > > - return pages; > > - } > > - > > - bo->pages = pages; > > - } > > - mutex_unlock(&bo->pages_lock); > > - > > - return bo->pages; > > -} > > - > > -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo) > > -{ > > - mutex_lock(&bo->pages_lock); > > - if (--bo->pages_pin_count == 0) { > > - drm_gem_put_pages(&bo->base, bo->pages, true, true); > > - bo->pages = NULL; > > - } > > - mutex_unlock(&bo->pages_lock); > > -} > > - > > -static int vgem_prime_pin(struct drm_gem_object *obj) > > -{ > > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > > - long n_pages = obj->size >> PAGE_SHIFT; > > - struct page **pages; > > - > > - pages = vgem_pin_pages(bo); > > - if (IS_ERR(pages)) > > - return PTR_ERR(pages); > > - > > - /* Flush the object from the CPU cache so that importers can rely > > - * on coherent indirect access via the exported dma-address. 
> > - */ > > - drm_clflush_pages(pages, n_pages); > > - > > - return 0; > > -} > > - > > -static void vgem_prime_unpin(struct drm_gem_object *obj) > > -{ > > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > > - > > - vgem_unpin_pages(bo); > > -} > > - > > -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj) > > -{ > > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > > - > > - return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT); > > -} > > - > > -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev, > > - struct dma_buf *dma_buf) > > -{ > > - struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm); > > - > > - return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev); > > -} > > - > > -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev, > > - struct dma_buf_attachment *attach, struct sg_table *sg) > > -{ > > - struct drm_vgem_gem_object *obj; > > - int npages; > > - > > - obj = __vgem_gem_create(dev, attach->dmabuf->size); > > - if (IS_ERR(obj)) > > - return ERR_CAST(obj); > > - > > - npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE; > > - > > - obj->table = sg; > > - obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL); > > - if (!obj->pages) { > > - __vgem_gem_destroy(obj); > > - return ERR_PTR(-ENOMEM); > > - } > > - > > - obj->pages_pin_count++; /* perma-pinned */ > > - drm_prime_sg_to_page_array(obj->table, obj->pages, npages); > > - return &obj->base; > > -} > > - > > -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map) > > -{ > > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > > - long n_pages = obj->size >> PAGE_SHIFT; > > - struct page **pages; > > - void *vaddr; > > - > > - pages = vgem_pin_pages(bo); > > - if (IS_ERR(pages)) > > - return PTR_ERR(pages); > > - > > - vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL)); > > - if (!vaddr) > > - return -ENOMEM; > > 
- dma_buf_map_set_vaddr(map, vaddr); > > - > > - return 0; > > -} > > - > > -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map) > > -{ > > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > > - > > - vunmap(map->vaddr); > > - vgem_unpin_pages(bo); > > -} > > - > > -static int vgem_prime_mmap(struct drm_gem_object *obj, > > - struct vm_area_struct *vma) > > -{ > > - int ret; > > - > > - if (obj->size < vma->vm_end - vma->vm_start) > > - return -EINVAL; > > - > > - if (!obj->filp) > > - return -ENODEV; > > - > > - ret = call_mmap(obj->filp, vma); > > - if (ret) > > - return ret; > > - > > - vma_set_file(vma, obj->filp); > > - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; > > - vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); > > - > > - return 0; > > -} > > - > > -static const struct drm_gem_object_funcs vgem_gem_object_funcs = { > > - .free = vgem_gem_free_object, > > - .pin = vgem_prime_pin, > > - .unpin = vgem_prime_unpin, > > - .get_sg_table = vgem_prime_get_sg_table, > > - .vmap = vgem_prime_vmap, > > - .vunmap = vgem_prime_vunmap, > > - .vm_ops = &vgem_gem_vm_ops, > > -}; > > +DEFINE_DRM_GEM_FOPS(vgem_driver_fops); > > > > static const struct drm_driver vgem_driver = { > > .driver_features = DRIVER_GEM | DRIVER_RENDER, > > @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = { > > .num_ioctls = ARRAY_SIZE(vgem_ioctls), > > .fops = &vgem_driver_fops, > > > > - .dumb_create = vgem_gem_dumb_create, > > - > > - .prime_handle_to_fd = drm_gem_prime_handle_to_fd, > > - .prime_fd_to_handle = drm_gem_prime_fd_to_handle, > > - .gem_prime_import = vgem_prime_import, > > - .gem_prime_import_sg_table = vgem_prime_import_sg_table, > > - .gem_prime_mmap = vgem_prime_mmap, > > + DRM_GEM_SHMEM_DRIVER_OPS, > > > > .name = DRIVER_NAME, > > .desc = DRIVER_DESC, > > > > -- > Thomas Zimmermann > Graphics Driver Developer > SUSE Software Solutions Germany GmbH > Maxfeldstr. 
5, 90409 Nürnberg, Germany > (HRB 36809, AG Nürnberg) > Geschäftsführer: Felix Imendörffer > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
* Re: [PATCH] drm/vgem: use shmem helpers 2021-02-26 13:30 ` [Intel-gfx] " Daniel Vetter @ 2021-02-26 13:51 ` Thomas Zimmermann -1 siblings, 0 replies; 110+ messages in thread From: Thomas Zimmermann @ 2021-02-26 13:51 UTC (permalink / raw) To: Daniel Vetter Cc: Intel Graphics Development, DRI Development, Chris Wilson, Melissa Wen, Daniel Vetter, Christian König [-- Attachment #1.1.1: Type: text/plain, Size: 16362 bytes --] Hi Am 26.02.21 um 14:30 schrieb Daniel Vetter: > On Fri, Feb 26, 2021 at 10:19 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: >> >> Hi >> >> Am 25.02.21 um 11:23 schrieb Daniel Vetter: >>> Aside from deleting lots of code the real motivation here is to switch >>> the mmap over to VM_PFNMAP, to be more consistent with what real gpu >>> drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't >>> work, and even if you try and there's a struct page behind that, >>> touching it and mucking around with its refcount can upset drivers >>> real bad. >>> >>> v2: Review from Thomas: >>> - sort #include >>> - drop more dead code that I didn't spot somehow >>> >>> v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci) >> >> Since you're working on it, could you move the config item into a >> Kconfig file under vgem? > > We have a lot of drivers still without their own Kconfig. I thought > we're only doing that for drivers which have multiple options, or > otherwise would clutter up the main drm/Kconfig file? > > Not opposed to this, just feels like if we do this, should do it for > all of them. I didn't know that there was a rule for how to handle this. I just didn't like to have driver config rules in the main Kconfig file. But yeah, maybe let's change this consistently in a separate patchset. 
Best regards Thomas > -Daniel > > >> >> Best regards >> Thomas >> >>> >>> Cc: Thomas Zimmermann <tzimmermann@suse.de> >>> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> >>> Cc: John Stultz <john.stultz@linaro.org> >>> Cc: Sumit Semwal <sumit.semwal@linaro.org> >>> Cc: "Christian König" <christian.koenig@amd.com> >>> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> >>> Cc: Melissa Wen <melissa.srw@gmail.com> >>> Cc: Chris Wilson <chris@chris-wilson.co.uk> >>> --- >>> drivers/gpu/drm/Kconfig | 1 + >>> drivers/gpu/drm/vgem/vgem_drv.c | 340 +------------------------------- >>> 2 files changed, 4 insertions(+), 337 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig >>> index 8e73311de583..94e4ac830283 100644 >>> --- a/drivers/gpu/drm/Kconfig >>> +++ b/drivers/gpu/drm/Kconfig >>> @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig" >>> config DRM_VGEM >>> tristate "Virtual GEM provider" >>> depends on DRM >>> + select DRM_GEM_SHMEM_HELPER >>> help >>> Choose this option to get a virtual graphics memory manager, >>> as used by Mesa's software renderer for enhanced performance. 
>>> diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c >>> index a0e75f1d5d01..b1b3a5ffc542 100644 >>> --- a/drivers/gpu/drm/vgem/vgem_drv.c >>> +++ b/drivers/gpu/drm/vgem/vgem_drv.c >>> @@ -38,6 +38,7 @@ >>> >>> #include <drm/drm_drv.h> >>> #include <drm/drm_file.h> >>> +#include <drm/drm_gem_shmem_helper.h> >>> #include <drm/drm_ioctl.h> >>> #include <drm/drm_managed.h> >>> #include <drm/drm_prime.h> >>> @@ -50,87 +51,11 @@ >>> #define DRIVER_MAJOR 1 >>> #define DRIVER_MINOR 0 >>> >>> -static const struct drm_gem_object_funcs vgem_gem_object_funcs; >>> - >>> static struct vgem_device { >>> struct drm_device drm; >>> struct platform_device *platform; >>> } *vgem_device; >>> >>> -static void vgem_gem_free_object(struct drm_gem_object *obj) >>> -{ >>> - struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); >>> - >>> - kvfree(vgem_obj->pages); >>> - mutex_destroy(&vgem_obj->pages_lock); >>> - >>> - if (obj->import_attach) >>> - drm_prime_gem_destroy(obj, vgem_obj->table); >>> - >>> - drm_gem_object_release(obj); >>> - kfree(vgem_obj); >>> -} >>> - >>> -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf) >>> -{ >>> - struct vm_area_struct *vma = vmf->vma; >>> - struct drm_vgem_gem_object *obj = vma->vm_private_data; >>> - /* We don't use vmf->pgoff since that has the fake offset */ >>> - unsigned long vaddr = vmf->address; >>> - vm_fault_t ret = VM_FAULT_SIGBUS; >>> - loff_t num_pages; >>> - pgoff_t page_offset; >>> - page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT; >>> - >>> - num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE); >>> - >>> - if (page_offset >= num_pages) >>> - return VM_FAULT_SIGBUS; >>> - >>> - mutex_lock(&obj->pages_lock); >>> - if (obj->pages) { >>> - get_page(obj->pages[page_offset]); >>> - vmf->page = obj->pages[page_offset]; >>> - ret = 0; >>> - } >>> - mutex_unlock(&obj->pages_lock); >>> - if (ret) { >>> - struct page *page; >>> - >>> - page = shmem_read_mapping_page( >>> - 
file_inode(obj->base.filp)->i_mapping, >>> - page_offset); >>> - if (!IS_ERR(page)) { >>> - vmf->page = page; >>> - ret = 0; >>> - } else switch (PTR_ERR(page)) { >>> - case -ENOSPC: >>> - case -ENOMEM: >>> - ret = VM_FAULT_OOM; >>> - break; >>> - case -EBUSY: >>> - ret = VM_FAULT_RETRY; >>> - break; >>> - case -EFAULT: >>> - case -EINVAL: >>> - ret = VM_FAULT_SIGBUS; >>> - break; >>> - default: >>> - WARN_ON(PTR_ERR(page)); >>> - ret = VM_FAULT_SIGBUS; >>> - break; >>> - } >>> - >>> - } >>> - return ret; >>> -} >>> - >>> -static const struct vm_operations_struct vgem_gem_vm_ops = { >>> - .fault = vgem_gem_fault, >>> - .open = drm_gem_vm_open, >>> - .close = drm_gem_vm_close, >>> -}; >>> - >>> static int vgem_open(struct drm_device *dev, struct drm_file *file) >>> { >>> struct vgem_file *vfile; >>> @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file) >>> kfree(vfile); >>> } >>> >>> -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev, >>> - unsigned long size) >>> -{ >>> - struct drm_vgem_gem_object *obj; >>> - int ret; >>> - >>> - obj = kzalloc(sizeof(*obj), GFP_KERNEL); >>> - if (!obj) >>> - return ERR_PTR(-ENOMEM); >>> - >>> - obj->base.funcs = &vgem_gem_object_funcs; >>> - >>> - ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE)); >>> - if (ret) { >>> - kfree(obj); >>> - return ERR_PTR(ret); >>> - } >>> - >>> - mutex_init(&obj->pages_lock); >>> - >>> - return obj; >>> -} >>> - >>> -static void __vgem_gem_destroy(struct drm_vgem_gem_object *obj) >>> -{ >>> - drm_gem_object_release(&obj->base); >>> - kfree(obj); >>> -} >>> - >>> -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, >>> - struct drm_file *file, >>> - unsigned int *handle, >>> - unsigned long size) >>> -{ >>> - struct drm_vgem_gem_object *obj; >>> - int ret; >>> - >>> - obj = __vgem_gem_create(dev, size); >>> - if (IS_ERR(obj)) >>> - return ERR_CAST(obj); >>> - >>> - ret = 
drm_gem_handle_create(file, &obj->base, handle); >>> - if (ret) { >>> - drm_gem_object_put(&obj->base); >>> - return ERR_PTR(ret); >>> - } >>> - >>> - return &obj->base; >>> -} >>> - >>> -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev, >>> - struct drm_mode_create_dumb *args) >>> -{ >>> - struct drm_gem_object *gem_object; >>> - u64 pitch, size; >>> - >>> - pitch = args->width * DIV_ROUND_UP(args->bpp, 8); >>> - size = args->height * pitch; >>> - if (size == 0) >>> - return -EINVAL; >>> - >>> - gem_object = vgem_gem_create(dev, file, &args->handle, size); >>> - if (IS_ERR(gem_object)) >>> - return PTR_ERR(gem_object); >>> - >>> - args->size = gem_object->size; >>> - args->pitch = pitch; >>> - >>> - drm_gem_object_put(gem_object); >>> - >>> - DRM_DEBUG("Created object of size %llu\n", args->size); >>> - >>> - return 0; >>> -} >>> - >>> static struct drm_ioctl_desc vgem_ioctls[] = { >>> DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW), >>> DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW), >>> }; >>> >>> -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma) >>> -{ >>> - unsigned long flags = vma->vm_flags; >>> - int ret; >>> - >>> - ret = drm_gem_mmap(filp, vma); >>> - if (ret) >>> - return ret; >>> - >>> - /* Keep the WC mmaping set by drm_gem_mmap() but our pages >>> - * are ordinary and not special. 
>>> - */ >>> - vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP; >>> - return 0; >>> -} >>> - >>> -static const struct file_operations vgem_driver_fops = { >>> - .owner = THIS_MODULE, >>> - .open = drm_open, >>> - .mmap = vgem_mmap, >>> - .poll = drm_poll, >>> - .read = drm_read, >>> - .unlocked_ioctl = drm_ioctl, >>> - .compat_ioctl = drm_compat_ioctl, >>> - .release = drm_release, >>> -}; >>> - >>> -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo) >>> -{ >>> - mutex_lock(&bo->pages_lock); >>> - if (bo->pages_pin_count++ == 0) { >>> - struct page **pages; >>> - >>> - pages = drm_gem_get_pages(&bo->base); >>> - if (IS_ERR(pages)) { >>> - bo->pages_pin_count--; >>> - mutex_unlock(&bo->pages_lock); >>> - return pages; >>> - } >>> - >>> - bo->pages = pages; >>> - } >>> - mutex_unlock(&bo->pages_lock); >>> - >>> - return bo->pages; >>> -} >>> - >>> -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo) >>> -{ >>> - mutex_lock(&bo->pages_lock); >>> - if (--bo->pages_pin_count == 0) { >>> - drm_gem_put_pages(&bo->base, bo->pages, true, true); >>> - bo->pages = NULL; >>> - } >>> - mutex_unlock(&bo->pages_lock); >>> -} >>> - >>> -static int vgem_prime_pin(struct drm_gem_object *obj) >>> -{ >>> - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); >>> - long n_pages = obj->size >> PAGE_SHIFT; >>> - struct page **pages; >>> - >>> - pages = vgem_pin_pages(bo); >>> - if (IS_ERR(pages)) >>> - return PTR_ERR(pages); >>> - >>> - /* Flush the object from the CPU cache so that importers can rely >>> - * on coherent indirect access via the exported dma-address. 
>>> - */ >>> - drm_clflush_pages(pages, n_pages); >>> - >>> - return 0; >>> -} >>> - >>> -static void vgem_prime_unpin(struct drm_gem_object *obj) >>> -{ >>> - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); >>> - >>> - vgem_unpin_pages(bo); >>> -} >>> - >>> -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj) >>> -{ >>> - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); >>> - >>> - return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT); >>> -} >>> - >>> -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev, >>> - struct dma_buf *dma_buf) >>> -{ >>> - struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm); >>> - >>> - return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev); >>> -} >>> - >>> -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev, >>> - struct dma_buf_attachment *attach, struct sg_table *sg) >>> -{ >>> - struct drm_vgem_gem_object *obj; >>> - int npages; >>> - >>> - obj = __vgem_gem_create(dev, attach->dmabuf->size); >>> - if (IS_ERR(obj)) >>> - return ERR_CAST(obj); >>> - >>> - npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE; >>> - >>> - obj->table = sg; >>> - obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL); >>> - if (!obj->pages) { >>> - __vgem_gem_destroy(obj); >>> - return ERR_PTR(-ENOMEM); >>> - } >>> - >>> - obj->pages_pin_count++; /* perma-pinned */ >>> - drm_prime_sg_to_page_array(obj->table, obj->pages, npages); >>> - return &obj->base; >>> -} >>> - >>> -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map) >>> -{ >>> - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); >>> - long n_pages = obj->size >> PAGE_SHIFT; >>> - struct page **pages; >>> - void *vaddr; >>> - >>> - pages = vgem_pin_pages(bo); >>> - if (IS_ERR(pages)) >>> - return PTR_ERR(pages); >>> - >>> - vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL)); >>> - if (!vaddr) >>> - return -ENOMEM; >>> 
- dma_buf_map_set_vaddr(map, vaddr); >>> - >>> - return 0; >>> -} >>> - >>> -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map) >>> -{ >>> - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); >>> - >>> - vunmap(map->vaddr); >>> - vgem_unpin_pages(bo); >>> -} >>> - >>> -static int vgem_prime_mmap(struct drm_gem_object *obj, >>> - struct vm_area_struct *vma) >>> -{ >>> - int ret; >>> - >>> - if (obj->size < vma->vm_end - vma->vm_start) >>> - return -EINVAL; >>> - >>> - if (!obj->filp) >>> - return -ENODEV; >>> - >>> - ret = call_mmap(obj->filp, vma); >>> - if (ret) >>> - return ret; >>> - >>> - vma_set_file(vma, obj->filp); >>> - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; >>> - vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); >>> - >>> - return 0; >>> -} >>> - >>> -static const struct drm_gem_object_funcs vgem_gem_object_funcs = { >>> - .free = vgem_gem_free_object, >>> - .pin = vgem_prime_pin, >>> - .unpin = vgem_prime_unpin, >>> - .get_sg_table = vgem_prime_get_sg_table, >>> - .vmap = vgem_prime_vmap, >>> - .vunmap = vgem_prime_vunmap, >>> - .vm_ops = &vgem_gem_vm_ops, >>> -}; >>> +DEFINE_DRM_GEM_FOPS(vgem_driver_fops); >>> >>> static const struct drm_driver vgem_driver = { >>> .driver_features = DRIVER_GEM | DRIVER_RENDER, >>> @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = { >>> .num_ioctls = ARRAY_SIZE(vgem_ioctls), >>> .fops = &vgem_driver_fops, >>> >>> - .dumb_create = vgem_gem_dumb_create, >>> - >>> - .prime_handle_to_fd = drm_gem_prime_handle_to_fd, >>> - .prime_fd_to_handle = drm_gem_prime_fd_to_handle, >>> - .gem_prime_import = vgem_prime_import, >>> - .gem_prime_import_sg_table = vgem_prime_import_sg_table, >>> - .gem_prime_mmap = vgem_prime_mmap, >>> + DRM_GEM_SHMEM_DRIVER_OPS, >>> >>> .name = DRIVER_NAME, >>> .desc = DRIVER_DESC, >>> >> >> -- >> Thomas Zimmermann >> Graphics Driver Developer >> SUSE Software Solutions Germany GmbH >> Maxfeldstr. 
5, 90409 Nürnberg, Germany >> (HRB 36809, AG Nürnberg) >> Geschäftsführer: Felix Imendörffer >> > > -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Maxfeldstr. 5, 90409 Nürnberg, Germany (HRB 36809, AG Nürnberg) Geschäftsführer: Felix Imendörffer [-- Attachment #1.2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 840 bytes --] [-- Attachment #2: Type: text/plain, Size: 160 bytes --] _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH] drm/vgem: use shmem helpers 2021-02-26 13:51 ` [Intel-gfx] " Thomas Zimmermann @ 2021-02-26 14:04 ` Daniel Vetter -1 siblings, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-02-26 14:04 UTC (permalink / raw) To: Thomas Zimmermann Cc: Daniel Vetter, Intel Graphics Development, DRI Development, Chris Wilson, Melissa Wen, Daniel Vetter, Christian König On Fri, Feb 26, 2021 at 02:51:58PM +0100, Thomas Zimmermann wrote: > Hi > > Am 26.02.21 um 14:30 schrieb Daniel Vetter: > > On Fri, Feb 26, 2021 at 10:19 AM Thomas Zimmermann <tzimmermann@suse.de> wrote: > > > > > > Hi > > > > > > Am 25.02.21 um 11:23 schrieb Daniel Vetter: > > > > Aside from deleting lots of code the real motivation here is to switch > > > > the mmap over to VM_PFNMAP, to be more consistent with what real gpu > > > > drivers do. They're all VM_PFNMAP, which means get_user_pages doesn't > > > > work, and even if you try and there's a struct page behind that, > > > > touching it and mucking around with its refcount can upset drivers > > > > real bad. > > > > > > > > v2: Review from Thomas: > > > > - sort #include > > > > - drop more dead code that I didn't spot somehow > > > > > > > > v3: select DRM_GEM_SHMEM_HELPER to make it build (intel-gfx-ci) > > > > > > Since you're working on it, could you move the config item into a > > > Kconfig file under vgem? > > > > We have a lot of drivers still without their own Kconfig. I thought > > we're only doing that for drivers which have multiple options, or > > otherwise would clutter up the main drm/Kconfig file? > > > > Not opposed to this, just feels like if we do this, should do it for > > all of them. > > I didn't know that there was a rule for how to handle this. I just didn't > like to have driver config rules in the main Kconfig file. I don't think it is an actual rule, just how the driver Kconfig files started out. > But yeah, maybe let's change this consistently in a separate patchset. 
Yeah I looked, we should also put all the driver files at the bottom, and maybe sort them alphabetically or something like that. It's a bit a mess right now. -Daniel > > Best regards > Thomas > > > -Daniel > > > > > > > > > > Best regards > > > Thomas > > > > > > > > > > > Cc: Thomas Zimmermann <tzimmermann@suse.de> > > > > Acked-by: Thomas Zimmermann <tzimmermann@suse.de> > > > > Cc: John Stultz <john.stultz@linaro.org> > > > > Cc: Sumit Semwal <sumit.semwal@linaro.org> > > > > Cc: "Christian König" <christian.koenig@amd.com> > > > > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > > > > Cc: Melissa Wen <melissa.srw@gmail.com> > > > > Cc: Chris Wilson <chris@chris-wilson.co.uk> > > > > --- > > > > drivers/gpu/drm/Kconfig | 1 + > > > > drivers/gpu/drm/vgem/vgem_drv.c | 340 +------------------------------- > > > > 2 files changed, 4 insertions(+), 337 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig > > > > index 8e73311de583..94e4ac830283 100644 > > > > --- a/drivers/gpu/drm/Kconfig > > > > +++ b/drivers/gpu/drm/Kconfig > > > > @@ -274,6 +274,7 @@ source "drivers/gpu/drm/kmb/Kconfig" > > > > config DRM_VGEM > > > > tristate "Virtual GEM provider" > > > > depends on DRM > > > > + select DRM_GEM_SHMEM_HELPER > > > > help > > > > Choose this option to get a virtual graphics memory manager, > > > > as used by Mesa's software renderer for enhanced performance. 
> > > > diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c > > > > index a0e75f1d5d01..b1b3a5ffc542 100644 > > > > --- a/drivers/gpu/drm/vgem/vgem_drv.c > > > > +++ b/drivers/gpu/drm/vgem/vgem_drv.c > > > > @@ -38,6 +38,7 @@ > > > > > > > > #include <drm/drm_drv.h> > > > > #include <drm/drm_file.h> > > > > +#include <drm/drm_gem_shmem_helper.h> > > > > #include <drm/drm_ioctl.h> > > > > #include <drm/drm_managed.h> > > > > #include <drm/drm_prime.h> > > > > @@ -50,87 +51,11 @@ > > > > #define DRIVER_MAJOR 1 > > > > #define DRIVER_MINOR 0 > > > > > > > > -static const struct drm_gem_object_funcs vgem_gem_object_funcs; > > > > - > > > > static struct vgem_device { > > > > struct drm_device drm; > > > > struct platform_device *platform; > > > > } *vgem_device; > > > > > > > > -static void vgem_gem_free_object(struct drm_gem_object *obj) > > > > -{ > > > > - struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj); > > > > - > > > > - kvfree(vgem_obj->pages); > > > > - mutex_destroy(&vgem_obj->pages_lock); > > > > - > > > > - if (obj->import_attach) > > > > - drm_prime_gem_destroy(obj, vgem_obj->table); > > > > - > > > > - drm_gem_object_release(obj); > > > > - kfree(vgem_obj); > > > > -} > > > > - > > > > -static vm_fault_t vgem_gem_fault(struct vm_fault *vmf) > > > > -{ > > > > - struct vm_area_struct *vma = vmf->vma; > > > > - struct drm_vgem_gem_object *obj = vma->vm_private_data; > > > > - /* We don't use vmf->pgoff since that has the fake offset */ > > > > - unsigned long vaddr = vmf->address; > > > > - vm_fault_t ret = VM_FAULT_SIGBUS; > > > > - loff_t num_pages; > > > > - pgoff_t page_offset; > > > > - page_offset = (vaddr - vma->vm_start) >> PAGE_SHIFT; > > > > - > > > > - num_pages = DIV_ROUND_UP(obj->base.size, PAGE_SIZE); > > > > - > > > > - if (page_offset >= num_pages) > > > > - return VM_FAULT_SIGBUS; > > > > - > > > > - mutex_lock(&obj->pages_lock); > > > > - if (obj->pages) { > > > > - get_page(obj->pages[page_offset]); > > > 
> - vmf->page = obj->pages[page_offset]; > > > > - ret = 0; > > > > - } > > > > - mutex_unlock(&obj->pages_lock); > > > > - if (ret) { > > > > - struct page *page; > > > > - > > > > - page = shmem_read_mapping_page( > > > > - file_inode(obj->base.filp)->i_mapping, > > > > - page_offset); > > > > - if (!IS_ERR(page)) { > > > > - vmf->page = page; > > > > - ret = 0; > > > > - } else switch (PTR_ERR(page)) { > > > > - case -ENOSPC: > > > > - case -ENOMEM: > > > > - ret = VM_FAULT_OOM; > > > > - break; > > > > - case -EBUSY: > > > > - ret = VM_FAULT_RETRY; > > > > - break; > > > > - case -EFAULT: > > > > - case -EINVAL: > > > > - ret = VM_FAULT_SIGBUS; > > > > - break; > > > > - default: > > > > - WARN_ON(PTR_ERR(page)); > > > > - ret = VM_FAULT_SIGBUS; > > > > - break; > > > > - } > > > > - > > > > - } > > > > - return ret; > > > > -} > > > > - > > > > -static const struct vm_operations_struct vgem_gem_vm_ops = { > > > > - .fault = vgem_gem_fault, > > > > - .open = drm_gem_vm_open, > > > > - .close = drm_gem_vm_close, > > > > -}; > > > > - > > > > static int vgem_open(struct drm_device *dev, struct drm_file *file) > > > > { > > > > struct vgem_file *vfile; > > > > @@ -159,265 +84,12 @@ static void vgem_postclose(struct drm_device *dev, struct drm_file *file) > > > > kfree(vfile); > > > > } > > > > > > > > -static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev, > > > > - unsigned long size) > > > > -{ > > > > - struct drm_vgem_gem_object *obj; > > > > - int ret; > > > > - > > > > - obj = kzalloc(sizeof(*obj), GFP_KERNEL); > > > > - if (!obj) > > > > - return ERR_PTR(-ENOMEM); > > > > - > > > > - obj->base.funcs = &vgem_gem_object_funcs; > > > > - > > > > - ret = drm_gem_object_init(dev, &obj->base, roundup(size, PAGE_SIZE)); > > > > - if (ret) { > > > > - kfree(obj); > > > > - return ERR_PTR(ret); > > > > - } > > > > - > > > > - mutex_init(&obj->pages_lock); > > > > - > > > > - return obj; > > > > -} > > > > - > > > > -static void 
__vgem_gem_destroy(struct drm_vgem_gem_object *obj) > > > > -{ > > > > - drm_gem_object_release(&obj->base); > > > > - kfree(obj); > > > > -} > > > > - > > > > -static struct drm_gem_object *vgem_gem_create(struct drm_device *dev, > > > > - struct drm_file *file, > > > > - unsigned int *handle, > > > > - unsigned long size) > > > > -{ > > > > - struct drm_vgem_gem_object *obj; > > > > - int ret; > > > > - > > > > - obj = __vgem_gem_create(dev, size); > > > > - if (IS_ERR(obj)) > > > > - return ERR_CAST(obj); > > > > - > > > > - ret = drm_gem_handle_create(file, &obj->base, handle); > > > > - if (ret) { > > > > - drm_gem_object_put(&obj->base); > > > > - return ERR_PTR(ret); > > > > - } > > > > - > > > > - return &obj->base; > > > > -} > > > > - > > > > -static int vgem_gem_dumb_create(struct drm_file *file, struct drm_device *dev, > > > > - struct drm_mode_create_dumb *args) > > > > -{ > > > > - struct drm_gem_object *gem_object; > > > > - u64 pitch, size; > > > > - > > > > - pitch = args->width * DIV_ROUND_UP(args->bpp, 8); > > > > - size = args->height * pitch; > > > > - if (size == 0) > > > > - return -EINVAL; > > > > - > > > > - gem_object = vgem_gem_create(dev, file, &args->handle, size); > > > > - if (IS_ERR(gem_object)) > > > > - return PTR_ERR(gem_object); > > > > - > > > > - args->size = gem_object->size; > > > > - args->pitch = pitch; > > > > - > > > > - drm_gem_object_put(gem_object); > > > > - > > > > - DRM_DEBUG("Created object of size %llu\n", args->size); > > > > - > > > > - return 0; > > > > -} > > > > - > > > > static struct drm_ioctl_desc vgem_ioctls[] = { > > > > DRM_IOCTL_DEF_DRV(VGEM_FENCE_ATTACH, vgem_fence_attach_ioctl, DRM_RENDER_ALLOW), > > > > DRM_IOCTL_DEF_DRV(VGEM_FENCE_SIGNAL, vgem_fence_signal_ioctl, DRM_RENDER_ALLOW), > > > > }; > > > > > > > > -static int vgem_mmap(struct file *filp, struct vm_area_struct *vma) > > > > -{ > > > > - unsigned long flags = vma->vm_flags; > > > > - int ret; > > > > - > > > > - ret = drm_gem_mmap(filp, 
vma); > > > > - if (ret) > > > > - return ret; > > > > - > > > > - /* Keep the WC mmaping set by drm_gem_mmap() but our pages > > > > - * are ordinary and not special. > > > > - */ > > > > - vma->vm_flags = flags | VM_DONTEXPAND | VM_DONTDUMP; > > > > - return 0; > > > > -} > > > > - > > > > -static const struct file_operations vgem_driver_fops = { > > > > - .owner = THIS_MODULE, > > > > - .open = drm_open, > > > > - .mmap = vgem_mmap, > > > > - .poll = drm_poll, > > > > - .read = drm_read, > > > > - .unlocked_ioctl = drm_ioctl, > > > > - .compat_ioctl = drm_compat_ioctl, > > > > - .release = drm_release, > > > > -}; > > > > - > > > > -static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo) > > > > -{ > > > > - mutex_lock(&bo->pages_lock); > > > > - if (bo->pages_pin_count++ == 0) { > > > > - struct page **pages; > > > > - > > > > - pages = drm_gem_get_pages(&bo->base); > > > > - if (IS_ERR(pages)) { > > > > - bo->pages_pin_count--; > > > > - mutex_unlock(&bo->pages_lock); > > > > - return pages; > > > > - } > > > > - > > > > - bo->pages = pages; > > > > - } > > > > - mutex_unlock(&bo->pages_lock); > > > > - > > > > - return bo->pages; > > > > -} > > > > - > > > > -static void vgem_unpin_pages(struct drm_vgem_gem_object *bo) > > > > -{ > > > > - mutex_lock(&bo->pages_lock); > > > > - if (--bo->pages_pin_count == 0) { > > > > - drm_gem_put_pages(&bo->base, bo->pages, true, true); > > > > - bo->pages = NULL; > > > > - } > > > > - mutex_unlock(&bo->pages_lock); > > > > -} > > > > - > > > > -static int vgem_prime_pin(struct drm_gem_object *obj) > > > > -{ > > > > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > > > > - long n_pages = obj->size >> PAGE_SHIFT; > > > > - struct page **pages; > > > > - > > > > - pages = vgem_pin_pages(bo); > > > > - if (IS_ERR(pages)) > > > > - return PTR_ERR(pages); > > > > - > > > > - /* Flush the object from the CPU cache so that importers can rely > > > > - * on coherent indirect access via the exported dma-address. 
> > > > - */ > > > > - drm_clflush_pages(pages, n_pages); > > > > - > > > > - return 0; > > > > -} > > > > - > > > > -static void vgem_prime_unpin(struct drm_gem_object *obj) > > > > -{ > > > > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > > > > - > > > > - vgem_unpin_pages(bo); > > > > -} > > > > - > > > > -static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj) > > > > -{ > > > > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > > > > - > > > > - return drm_prime_pages_to_sg(obj->dev, bo->pages, bo->base.size >> PAGE_SHIFT); > > > > -} > > > > - > > > > -static struct drm_gem_object* vgem_prime_import(struct drm_device *dev, > > > > - struct dma_buf *dma_buf) > > > > -{ > > > > - struct vgem_device *vgem = container_of(dev, typeof(*vgem), drm); > > > > - > > > > - return drm_gem_prime_import_dev(dev, dma_buf, &vgem->platform->dev); > > > > -} > > > > - > > > > -static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev, > > > > - struct dma_buf_attachment *attach, struct sg_table *sg) > > > > -{ > > > > - struct drm_vgem_gem_object *obj; > > > > - int npages; > > > > - > > > > - obj = __vgem_gem_create(dev, attach->dmabuf->size); > > > > - if (IS_ERR(obj)) > > > > - return ERR_CAST(obj); > > > > - > > > > - npages = PAGE_ALIGN(attach->dmabuf->size) / PAGE_SIZE; > > > > - > > > > - obj->table = sg; > > > > - obj->pages = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL); > > > > - if (!obj->pages) { > > > > - __vgem_gem_destroy(obj); > > > > - return ERR_PTR(-ENOMEM); > > > > - } > > > > - > > > > - obj->pages_pin_count++; /* perma-pinned */ > > > > - drm_prime_sg_to_page_array(obj->table, obj->pages, npages); > > > > - return &obj->base; > > > > -} > > > > - > > > > -static int vgem_prime_vmap(struct drm_gem_object *obj, struct dma_buf_map *map) > > > > -{ > > > > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > > > > - long n_pages = obj->size >> PAGE_SHIFT; > > > > - struct page **pages; > > > > 
- void *vaddr; > > > > - > > > > - pages = vgem_pin_pages(bo); > > > > - if (IS_ERR(pages)) > > > > - return PTR_ERR(pages); > > > > - > > > > - vaddr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL)); > > > > - if (!vaddr) > > > > - return -ENOMEM; > > > > - dma_buf_map_set_vaddr(map, vaddr); > > > > - > > > > - return 0; > > > > -} > > > > - > > > > -static void vgem_prime_vunmap(struct drm_gem_object *obj, struct dma_buf_map *map) > > > > -{ > > > > - struct drm_vgem_gem_object *bo = to_vgem_bo(obj); > > > > - > > > > - vunmap(map->vaddr); > > > > - vgem_unpin_pages(bo); > > > > -} > > > > - > > > > -static int vgem_prime_mmap(struct drm_gem_object *obj, > > > > - struct vm_area_struct *vma) > > > > -{ > > > > - int ret; > > > > - > > > > - if (obj->size < vma->vm_end - vma->vm_start) > > > > - return -EINVAL; > > > > - > > > > - if (!obj->filp) > > > > - return -ENODEV; > > > > - > > > > - ret = call_mmap(obj->filp, vma); > > > > - if (ret) > > > > - return ret; > > > > - > > > > - vma_set_file(vma, obj->filp); > > > > - vma->vm_flags |= VM_DONTEXPAND | VM_DONTDUMP; > > > > - vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags)); > > > > - > > > > - return 0; > > > > -} > > > > - > > > > -static const struct drm_gem_object_funcs vgem_gem_object_funcs = { > > > > - .free = vgem_gem_free_object, > > > > - .pin = vgem_prime_pin, > > > > - .unpin = vgem_prime_unpin, > > > > - .get_sg_table = vgem_prime_get_sg_table, > > > > - .vmap = vgem_prime_vmap, > > > > - .vunmap = vgem_prime_vunmap, > > > > - .vm_ops = &vgem_gem_vm_ops, > > > > -}; > > > > +DEFINE_DRM_GEM_FOPS(vgem_driver_fops); > > > > > > > > static const struct drm_driver vgem_driver = { > > > > .driver_features = DRIVER_GEM | DRIVER_RENDER, > > > > @@ -427,13 +99,7 @@ static const struct drm_driver vgem_driver = { > > > > .num_ioctls = ARRAY_SIZE(vgem_ioctls), > > > > .fops = &vgem_driver_fops, > > > > > > > > - .dumb_create = vgem_gem_dumb_create, > > > > - > > > > - 
.prime_handle_to_fd = drm_gem_prime_handle_to_fd, > > > > - .prime_fd_to_handle = drm_gem_prime_fd_to_handle, > > > > - .gem_prime_import = vgem_prime_import, > > > > - .gem_prime_import_sg_table = vgem_prime_import_sg_table, > > > > - .gem_prime_mmap = vgem_prime_mmap, > > > > + DRM_GEM_SHMEM_DRIVER_OPS, > > > > > > > > .name = DRIVER_NAME, > > > > .desc = DRIVER_DESC, > > > > > > > > > > -- > > > Thomas Zimmermann > > > Graphics Driver Developer > > > SUSE Software Solutions Germany GmbH > > > Maxfeldstr. 5, 90409 Nürnberg, Germany > > > (HRB 36809, AG Nürnberg) > > > Geschäftsführer: Felix Imendörffer > > > > > > > > > -- > Thomas Zimmermann > Graphics Driver Developer > SUSE Software Solutions Germany GmbH > Maxfeldstr. 5, 90409 Nürnberg, Germany > (HRB 36809, AG Nürnberg) > Geschäftsführer: Felix Imendörffer > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 110+ messages in thread
* [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-23 10:59 ` Daniel Vetter ` (2 preceding siblings ...) (?) @ 2021-02-23 11:19 ` Patchwork -1 siblings, 0 replies; 110+ messages in thread From: Patchwork @ 2021-02-23 11:19 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx == Series Details == Series: series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap URL : https://patchwork.freedesktop.org/series/87313/ State : failure == Summary == CALL scripts/checksyscalls.sh CALL scripts/atomic/check-atomics.sh DESCEND objtool CHK include/generated/compile.h Kernel: arch/x86/boot/bzImage is ready (#1) MODPOST Module.symvers ERROR: modpost: "drm_gem_shmem_prime_import_sg_table" [drivers/gpu/drm/vgem/vgem.ko] undefined! ERROR: modpost: "drm_gem_shmem_dumb_create" [drivers/gpu/drm/vgem/vgem.ko] undefined! scripts/Makefile.modpost:111: recipe for target 'Module.symvers' failed make[1]: *** [Module.symvers] Error 1 make[1]: *** Deleting file 'Module.symvers' Makefile:1391: recipe for target 'modules' failed make: *** [modules] Error 2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 110+ messages in thread
* [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev2) 2021-02-23 10:59 ` Daniel Vetter ` (3 preceding siblings ...) (?) @ 2021-02-23 13:11 ` Patchwork -1 siblings, 0 replies; 110+ messages in thread From: Patchwork @ 2021-02-23 13:11 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx == Series Details == Series: series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev2) URL : https://patchwork.freedesktop.org/series/87313/ State : failure == Summary == CALL scripts/checksyscalls.sh CALL scripts/atomic/check-atomics.sh DESCEND objtool CHK include/generated/compile.h Kernel: arch/x86/boot/bzImage is ready (#1) MODPOST Module.symvers ERROR: modpost: "drm_gem_shmem_prime_import_sg_table" [drivers/gpu/drm/vgem/vgem.ko] undefined! ERROR: modpost: "drm_gem_shmem_dumb_create" [drivers/gpu/drm/vgem/vgem.ko] undefined! scripts/Makefile.modpost:111: recipe for target 'Module.symvers' failed make[1]: *** [Module.symvers] Error 1 make[1]: *** Deleting file 'Module.symvers' Makefile:1391: recipe for target 'modules' failed make: *** [modules] Error 2 _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-23 10:59 ` Daniel Vetter (?) @ 2021-02-24 7:46 ` Thomas Hellström (Intel) -1 siblings, 0 replies; 110+ messages in thread From: Thomas Hellström (Intel) @ 2021-02-24 7:46 UTC (permalink / raw) To: Daniel Vetter, DRI Development Cc: Intel Graphics Development, Matthew Wilcox, linaro-mm-sig, Jason Gunthorpe, John Stultz, Daniel Vetter, Suren Baghdasaryan, Christian König, linux-media

On 2/23/21 11:59 AM, Daniel Vetter wrote:
> tldr; DMA buffers aren't normal memory; expecting that you can use them
> like that (e.g. that calling get_user_pages works, or that they're
> accounted like any other normal memory) cannot be guaranteed.
>
> Since some userspace only runs on integrated devices, where all buffers
> are actually resident system memory, there's a huge temptation to assume
> that a struct page is always present and usable like for any other
> pagecache-backed mmap. This has the potential to result in a uapi
> nightmare.
>
> To close this gap, require that DMA buffer mmaps are VM_PFNMAP, which
> blocks get_user_pages and all the other struct-page-based infrastructure
> for everyone. In spirit this is the uapi counterpart to the
> kernel-internal CONFIG_DMABUF_DEBUG.
>
> Motivated by a recent patch which wanted to switch the system dma-buf
> heap to vm_insert_page instead of vm_insert_pfn.
>
> v2:
>
> Jason brought up that we also want to guarantee that all ptes have the
> pte_special flag set, to catch fast get_user_pages (on architectures
> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would
> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
>
> From auditing the various functions that insert pfn pte entries
> (vm_insert_pfn_prot, remap_pfn_range and all its callers like
> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so this
> should be the correct flag to check for.

If we require VM_PFNMAP for ordinary page mappings, we also need to disallow COW mappings, since they will not work on architectures that don't have CONFIG_ARCH_HAS_PTE_SPECIAL (see the docs for vm_normal_page()).

Also worth noting is the comment in ttm_bo_mmap_vma_setup() about possible performance implications with x86 + PAT + VM_PFNMAP + normal pages. That's a very old comment, though, and might not be valid anymore.

/Thomas

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-24 7:46 ` Thomas Hellström (Intel) (?) @ 2021-02-24 8:45 ` Daniel Vetter -1 siblings, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-02-24 8:45 UTC (permalink / raw) To: Thomas Hellström (Intel) Cc: DRI Development, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel) <thomas_os@shipmail.org> wrote:
>
> On 2/23/21 11:59 AM, Daniel Vetter wrote:
> > [...]
>
> If we require VM_PFNMAP for ordinary page mappings, we also need to
> disallow COW mappings, since they will not work on architectures that
> don't have CONFIG_ARCH_HAS_PTE_SPECIAL (see the docs for vm_normal_page()).

Hm, I figured everyone just uses MAP_SHARED for buffer objects, since COW really makes absolutely no sense. How would we enforce this?

> Also worth noting is the comment in ttm_bo_mmap_vma_setup() about
> possible performance implications with x86 + PAT + VM_PFNMAP + normal
> pages. That's a very old comment, though, and might not be valid anymore.

I think that's why ttm has a page cache for these, because it indeed sucks. The PAT changes on pages are rather expensive.

There is still an issue for iomem mappings, because the PAT validation does a linear walk of the resource tree (lol) for every vm_insert_pfn. But for i915 at least this is fixed by using the io_mapping infrastructure, which does the PAT reservation only once when you set up the mapping area at driver load.

Also TTM uses VM_PFNMAP right now for everything, so it can't be a problem that hurts much :-)
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-24 8:45 ` Daniel Vetter (?) @ 2021-02-24 9:15 ` Thomas Hellström (Intel) -1 siblings, 0 replies; 110+ messages in thread From: Thomas Hellström (Intel) @ 2021-02-24 9:15 UTC (permalink / raw) To: Daniel Vetter Cc: DRI Development, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On 2/24/21 9:45 AM, Daniel Vetter wrote:
> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
>>> [...]
>>
>> If we require VM_PFNMAP for ordinary page mappings, we also need to
>> disallow COW mappings, since they will not work on architectures that
>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL (see the docs for vm_normal_page()).
>
> Hm, I figured everyone just uses MAP_SHARED for buffer objects, since
> COW really makes absolutely no sense. How would we enforce this?

Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that, or allowing MIXEDMAP.

>> Also worth noting is the comment in ttm_bo_mmap_vma_setup() about
>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
>> pages. That's a very old comment, though, and might not be valid anymore.
>
> I think that's why ttm has a page cache for these, because it indeed
> sucks. The PAT changes on pages are rather expensive.

IIRC the page cache was implemented because of the slowness of the caching mode transition itself, more specifically the wbinvd() call + global TLB flush.

> There is still an issue for iomem mappings, because the PAT validation
> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
> But for i915 at least this is fixed by using the io_mapping
> infrastructure, which does the PAT reservation only once when you set
> up the mapping area at driver load.

Yes, I guess that was the issue the comment describes, but the issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.

> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> problem that hurts much :-)

Hmm, both 5.11 and drm-tip appear to still use MIXEDMAP?

https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554

/Thomas

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-24 9:15 ` Thomas Hellström (Intel) (?) @ 2021-02-24 9:31 ` Daniel Vetter -1 siblings, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-02-24 9:31 UTC (permalink / raw) To: Thomas Hellström (Intel) Cc: DRI Development, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel) <thomas_os@shipmail.org> wrote:
>
> On 2/24/21 9:45 AM, Daniel Vetter wrote:
> > [...]
> > Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> > problem that hurts much :-)
>
> Hmm, both 5.11 and drm-tip appear to still use MIXEDMAP?
>
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554

Uh, that's bad, because mixed maps pointing at a struct page won't stop gup, at least afaik. Christian, do we need to patch this up, and maybe fix up the ttm fault handler to use io_mapping so the vm_insert_pfn stuff is fast?
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
From: Christian König @ 2021-02-25 10:28 UTC
To: Daniel Vetter, Thomas Hellström (Intel)
Cc: Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On 24.02.21 at 10:31, Daniel Vetter wrote:
> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
> <thomas_os@shipmail.org> wrote:
>>
>> On 2/24/21 9:45 AM, Daniel Vetter wrote:
>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
>>> <thomas_os@shipmail.org> wrote:
>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote:
>>>>> [...]
>>>> If we require VM_PFNMAP for ordinary page mappings, we also need to
>>>> disallow COW mappings, since they will not work on architectures that
>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL (see the docs for
>>>> vm_normal_page()).
>>> Hm, I figured everyone just uses MAP_SHARED for buffer objects since
>>> COW really makes absolutely no sense. How would we enforce this?
>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that
>> or allowing MIXEDMAP.
>>
>>>> Also worth noting is the comment in ttm_bo_mmap_vma_setup() about
>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal
>>>> pages. That's a very old comment, though, and might not be valid
>>>> anymore.
>>> I think that's why ttm has a page cache for these, because it indeed
>>> sucks. The PAT changes on pages are rather expensive.
>> IIRC the page cache was implemented because of the slowness of the
>> caching mode transition itself, more specifically the wbinvd() call +
>> global TLB flush.

Yes, exactly that. The global TLB flush is what really breaks our neck
here from a performance perspective.

>>> There is still an issue for iomem mappings, because the PAT validation
>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn.
>>> But for i915 at least this is fixed by using the io_mapping
>>> infrastructure, which does the PAT reservation only once when you set
>>> up the mapping area at driver load.
>> Yes, I guess that was the issue that the comment describes, but the
>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
>>
>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a
>>> problem that hurts much :-)
>> Hmm, both 5.11 and drm-tip appear to still use MIXEDMAP?
>>
>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
> Uh, that's bad, because mixed maps pointing at struct page won't stop
> gup. At least afaik.

Huh? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have
already seen tons of problems with the page cache.

Regards,
Christian.

> Christian, do we need to patch this up, and maybe fix up the ttm fault
> handler to use io_mapping so the vm_insert_pfn stuff is fast?
> -Daniel
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
From: Daniel Vetter @ 2021-02-25 10:44 UTC
To: Christian König
Cc: Daniel Vetter, Thomas Hellström (Intel), Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote:
> On 24.02.21 at 10:31, Daniel Vetter wrote:
> > On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel)
> > <thomas_os@shipmail.org> wrote:
> > > On 2/24/21 9:45 AM, Daniel Vetter wrote:
> > > > On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel)
> > > > <thomas_os@shipmail.org> wrote:
> > > > > On 2/23/21 11:59 AM, Daniel Vetter wrote:
> > > > > > [...]
> > > > > If we require VM_PFNMAP for ordinary page mappings, we also need
> > > > > to disallow COW mappings, since they will not work on
> > > > > architectures that don't have CONFIG_ARCH_HAS_PTE_SPECIAL (see
> > > > > the docs for vm_normal_page()).
> > > > Hm, I figured everyone just uses MAP_SHARED for buffer objects
> > > > since COW really makes absolutely no sense. How would we enforce
> > > > this?
> > > Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either
> > > that or allowing MIXEDMAP.
> > >
> > > > > Also worth noting is the comment in ttm_bo_mmap_vma_setup()
> > > > > about possible performance implications with x86 + PAT +
> > > > > VM_PFNMAP + normal pages. That's a very old comment, though, and
> > > > > might not be valid anymore.
> > > > I think that's why ttm has a page cache for these, because it
> > > > indeed sucks. The PAT changes on pages are rather expensive.
> > > IIRC the page cache was implemented because of the slowness of the
> > > caching mode transition itself, more specifically the wbinvd() call
> > > + global TLB flush.
>
> Yes, exactly that. The global TLB flush is what really breaks our neck
> here from a performance perspective.
>
> > > > There is still an issue for iomem mappings, because the PAT
> > > > validation does a linear walk of the resource tree (lol) for every
> > > > vm_insert_pfn. But for i915 at least this is fixed by using the
> > > > io_mapping infrastructure, which does the PAT reservation only
> > > > once when you set up the mapping area at driver load.
> > > Yes, I guess that was the issue that the comment describes, but the
> > > issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
> > >
> > > > Also TTM uses VM_PFNMAP right now for everything, so it can't be a
> > > > problem that hurts much :-)
> > > Hmm, both 5.11 and drm-tip appear to still use MIXEDMAP?
> > >
> > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
> > Uh, that's bad, because mixed maps pointing at struct page won't stop
> > gup. At least afaik.
>
> Huh? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have
> already seen tons of problems with the page cache.

On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL,
vm_insert_mixed boils down to vm_insert_pfn wrt gup, and the special pte
stops the gup fast path. But if you don't have VM_IO or VM_PFNMAP set,
then I'm not seeing how you're stopping the gup slow path. See
check_vma_flags() in mm/gup.c.

Also, if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL, then I don't think
vm_insert_mixed even works on iomem pfns. There's the devmap exception,
but we're not devmap. Worse, ttm abuses some accidental codepath to
smuggle in hugepte support by intentionally not being devmap.

So I'm really not sure this works as we think it should. Maybe good to
do a quick test program on amdgpu with a buffer in system memory only
and try to do direct io into it. If it works, you have a problem, and a
bad one.
-Daniel

> Regards,
> Christian.
>
> > Christian, do we need to patch this up, and maybe fix up the ttm fault
> > handler to use io_mapping so the vm_insert_pfn stuff is fast?
> > -Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-02-25 10:44 ` Daniel Vetter
@ 2021-02-25 15:49 ` Daniel Vetter
  -1 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-02-25 15:49 UTC (permalink / raw)
To: Christian König
Cc: Thomas Hellström (Intel), Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK

On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote:
> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote:
> > Am 24.02.21 um 10:31 schrieb Daniel Vetter:
> > > On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel) <thomas_os@shipmail.org> wrote:
> > > > On 2/24/21 9:45 AM, Daniel Vetter wrote:
> > > > > On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel) <thomas_os@shipmail.org> wrote:
> > > > > > On 2/23/21 11:59 AM, Daniel Vetter wrote:
> > > > > > > tldr; DMA buffers aren't normal memory; expecting that you can use them like that (like calling get_user_pages works, or that they're accounted like any other normal memory) cannot be guaranteed.
> > > > > > >
> > > > > > > Since some userspace only runs on integrated devices, where all buffers are actually resident system memory, there's a huge temptation to assume that a struct page is always present and usable like for any other pagecache-backed mmap. This has the potential to result in a uapi nightmare.
> > > > > > >
> > > > > > > To close this gap, require that DMA buffer mmaps are VM_PFNMAP, which blocks get_user_pages and all the other struct-page-based infrastructure for everyone. In spirit this is the uapi counterpart to the kernel-internal CONFIG_DMABUF_DEBUG.
> > > > > > >
> > > > > > > Motivated by a recent patch which wanted to switch the system dma-buf heap to vm_insert_page instead of vm_insert_pfn.
> > > > > > >
> > > > > > > v2:
> > > > > > >
> > > > > > > Jason brought up that we also want to guarantee that all ptes have the pte_special flag set, to catch fast get_user_pages (on architectures that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would still allow vm_insert_page, but limiting to VM_PFNMAP will catch that.
> > > > > > >
> > > > > > > From auditing the various functions that insert pfn pte entries (vm_insert_pfn_prot, remap_pfn_range and all its callers like dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so this should be the correct flag to check for.
> > > > > > >
> > > > > > If we require VM_PFNMAP for ordinary page mappings, we also need to disallow COW mappings, since they will not work on architectures that don't have CONFIG_ARCH_HAS_PTE_SPECIAL (see the docs for vm_normal_page()).
> > > > > Hm, I figured everyone just uses MAP_SHARED for buffer objects, since COW really makes absolutely no sense. How would we enforce this?
> > > > Perhaps by returning -EINVAL on is_cow_mapping() at mmap time. Either that or allowing MIXEDMAP.
> > > >
> > > > > > Also worth noting is the comment in ttm_bo_mmap_vma_setup() about possible performance implications with x86 + PAT + VM_PFNMAP + normal pages. That's a very old comment, though, and might not be valid anymore.
> > > > > I think that's why ttm has a page cache for these, because it indeed sucks. The PAT changes on pages are rather expensive.
> > > > IIRC the page cache was implemented because of the slowness of the caching mode transition itself, more specifically the wbinvd() call + global TLB flush.
> > Yes, exactly that. The global TLB flush is what really breaks our neck here from a performance perspective.
> >
> > > > > > There is still an issue for iomem mappings, because the PAT validation does a linear walk of the resource tree (lol) for every vm_insert_pfn. But for i915 at least this is fixed by using the io_mapping infrastructure, which does the PAT reservation only once, when you set up the mapping area at driver load.
> > > > Yes, I guess that was the issue that the comment describes, but the issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP.
> > > >
> > > > > > Also TTM uses VM_PFNMAP right now for everything, so it can't be a problem that hurts much :-)
> > > > Hmm, both 5.11 and drm-tip appear to still use MIXEDMAP?
> > > >
> > > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554
> > > Uh, that's bad, because mixed maps pointing at struct page won't stop gup. At least afaik.
> > Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have already seen tons of problems with the page cache.
>
> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL, vm_insert_mixed boils down to vm_insert_pfn wrt gup, and the special pte stops the gup fast path.
>
> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how you're stopping the gup slow path. See check_vma_flags() in mm/gup.c.
>
> Also, if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL, then I don't think vm_insert_mixed even works on iomem pfns. There's the devmap exception, but we're not devmap. Worse, ttm abuses some accidental codepath to smuggle in hugepte support by intentionally not being devmap.
>
> So I'm really not sure this works as we think it should. Maybe it would be good to do a quick test program on amdgpu with a buffer in system memory only and try to do direct I/O into it. If it works, you have a problem, and a bad one.

That's probably impossible, since a quick git grep shows that pretty much anything reasonable has special ptes: arc, arm, arm64, powerpc, riscv, s390, sh, sparc, x86. I don't think you'll have a platform where you can plug an amdgpu in and actually exercise the bug :-)

So maybe we should just switch over to VM_PFNMAP for ttm, for more clarity?
-Daniel

>
> > Regards,
> > Christian.
> >
> > > Christian, do we need to patch this up, and maybe fix up the ttm fault handler to use io_mapping so the vm_insert_pfn stuff is fast?
> > > -Daniel
>
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-25 15:49 ` Daniel Vetter (?) @ 2021-02-25 16:53 ` Christian König -1 siblings, 0 replies; 110+ messages in thread From: Christian König @ 2021-02-25 16:53 UTC (permalink / raw) To: Daniel Vetter, Christian König Cc: Thomas Hellström (Intel), Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, open list:DMA BUFFER SHARING FRAMEWORK Am 25.02.21 um 16:49 schrieb Daniel Vetter: > On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: >> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote: >>> Am 24.02.21 um 10:31 schrieb Daniel Vetter: >>>> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel) >>>> <thomas_os@shipmail.org> wrote: >>>>> On 2/24/21 9:45 AM, Daniel Vetter wrote: >>>>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel) >>>>>> <thomas_os@shipmail.org> wrote: >>>>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote: >>>>>>>> tldr; DMA buffers aren't normal memory, expecting that you can use >>>>>>>> them like that (like calling get_user_pages works, or that they're >>>>>>>> accounting like any other normal memory) cannot be guaranteed. >>>>>>>> >>>>>>>> Since some userspace only runs on integrated devices, where all >>>>>>>> buffers are actually all resident system memory, there's a huge >>>>>>>> temptation to assume that a struct page is always present and useable >>>>>>>> like for any more pagecache backed mmap. This has the potential to >>>>>>>> result in a uapi nightmare. >>>>>>>> >>>>>>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which >>>>>>>> blocks get_user_pages and all the other struct page based >>>>>>>> infrastructure for everyone. In spirit this is the uapi counterpart to >>>>>>>> the kernel-internal CONFIG_DMABUF_DEBUG. 
>>>>>>>> >>>>>>>> Motivated by a recent patch which wanted to swich the system dma-buf >>>>>>>> heap to vm_insert_page instead of vm_insert_pfn. >>>>>>>> >>>>>>>> v2: >>>>>>>> >>>>>>>> Jason brought up that we also want to guarantee that all ptes have the >>>>>>>> pte_special flag set, to catch fast get_user_pages (on architectures >>>>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would >>>>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that. >>>>>>>> >>>>>>>> From auditing the various functions to insert pfn pte entires >>>>>>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like >>>>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so >>>>>>>> this should be the correct flag to check for. >>>>>>>> >>>>>>> If we require VM_PFNMAP, for ordinary page mappings, we also need to >>>>>>> disallow COW mappings, since it will not work on architectures that >>>>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()). >>>>>> Hm I figured everyone just uses MAP_SHARED for buffer objects since >>>>>> COW really makes absolutely no sense. How would we enforce this? >>>>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that >>>>> or allowing MIXEDMAP. >>>>> >>>>>>> Also worth noting is the comment in ttm_bo_mmap_vma_setup() with >>>>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal >>>>>>> pages. That's a very old comment, though, and might not be valid anymore. >>>>>> I think that's why ttm has a page cache for these, because it indeed >>>>>> sucks. The PAT changes on pages are rather expensive. >>>>> IIRC the page cache was implemented because of the slowness of the >>>>> caching mode transition itself, more specifically the wbinvd() call + >>>>> global TLB flush. >>> Yes, exactly that. The global TLB flush is what really breaks our neck here >>> from a performance perspective. 
>>> >>>>>> There is still an issue for iomem mappings, because the PAT validation >>>>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn. >>>>>> But for i915 at least this is fixed by using the io_mapping >>>>>> infrastructure, which does the PAT reservation only once when you set >>>>>> up the mapping area at driver load. >>>>> Yes, I guess that was the issue that the comment describes, but the >>>>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP. >>>>> >>>>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a >>>>>> problem that hurts much :-) >>>>> Hmm, both 5.11 and drm-tip appear to still use MIXEDMAP? >>>>> >>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554 >>>> Uh, that's bad, because mixed maps pointing at struct page won't stop >>>> gup. At least afaik. >>> Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have >>> already seen tons of problems with the page cache. >> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL, vm_insert_mixed >> boils down to vm_insert_pfn wrt gup. And a special pte stops the gup fast path. >> >> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how >> you're stopping the gup slow path. See check_vma_flags() in mm/gup.c. >> >> Also, if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL, then I don't think >> vm_insert_mixed even works on iomem pfns. There's the devmap exception, >> but we're not devmap. Worse, ttm abuses some accidental codepath to smuggle >> in hugepte support by intentionally not being devmap.
>> >> So I'm really not sure this works as we think it should. Maybe good to do >> a quick test program on amdgpu with a buffer in system memory only and try >> to do direct io into it. If it works, you have a problem, and a bad one. > That's probably impossible, since a quick git grep shows that pretty > much anything reasonable has special ptes: arc, arm, arm64, powerpc, > riscv, s390, sh, sparc, x86. I don't think you'll have a platform > where you can plug an amdgpu in and actually exercise the bug :-) > > So maybe we should just switch over to VM_PFNMAP for ttm for more clarity? Maybe yes, but not sure. I once had a request to do this from some Google guys, but rejected it because I wasn't sure of the consequences. Christian. > -Daniel > > >>> Regards, >>> Christian. >>> >>>> Christian, do we need to patch this up, and maybe fix up the ttm fault >>>> handler to use io_mapping so the vm_insert_pfn stuff is fast? >>>> -Daniel >> -- >> Daniel Vetter >> Software Engineer, Intel Corporation >> http://blog.ffwll.ch > > ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-25 15:49 ` Daniel Vetter @ 2021-02-26 9:41 ` Thomas Hellström (Intel) 0 siblings, 0 replies; 110+ messages in thread From: Thomas Hellström (Intel) @ 2021-02-26 9:41 UTC (permalink / raw) To: Daniel Vetter, Christian König Cc: Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK On 2/25/21 4:49 PM, Daniel Vetter wrote: > On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: >> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote: >>> On 24.02.21 at 10:31, Daniel Vetter wrote: >>>> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel) >>>> <thomas_os@shipmail.org> wrote: >>>>> On 2/24/21 9:45 AM, Daniel Vetter wrote: >>>>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel) >>>>>> <thomas_os@shipmail.org> wrote: >>>>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote: >>>>>>>> tldr; DMA buffers aren't normal memory; expecting that you can use >>>>>>>> them like that (that calling get_user_pages works, or that they're >>>>>>>> accounted like any other normal memory) cannot be guaranteed. >>>>>>>> >>>>>>>> Since some userspace only runs on integrated devices, where all >>>>>>>> buffers are actually resident system memory, there's a huge >>>>>>>> temptation to assume that a struct page is always present and usable >>>>>>>> like for any other pagecache-backed mmap. This has the potential to >>>>>>>> result in a uapi nightmare. >>>>>>>> >>>>>>>> To close this gap, require that DMA buffer mmaps are VM_PFNMAP, which >>>>>>>> blocks get_user_pages and all the other struct page based >>>>>>>> infrastructure for everyone. In spirit this is the uapi counterpart to >>>>>>>> the kernel-internal CONFIG_DMABUF_DEBUG.
>>>>>>>> >>>>>>>> Motivated by a recent patch which wanted to switch the system dma-buf >>>>>>>> heap to vm_insert_page instead of vm_insert_pfn. >>>>>>>> >>>>>>>> v2: >>>>>>>> >>>>>>>> Jason brought up that we also want to guarantee that all ptes have the >>>>>>>> pte_special flag set, to catch fast get_user_pages (on architectures >>>>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would >>>>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that. >>>>>>>> >>>>>>>> From auditing the various functions that insert pfn pte entries >>>>>>>> (vm_insert_pfn_prot, remap_pfn_range and all its callers like >>>>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so >>>>>>>> this should be the correct flag to check for. >>>>>>>> >>>>>>> If we require VM_PFNMAP for ordinary page mappings, we also need to >>>>>>> disallow COW mappings, since they will not work on architectures that >>>>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL (see the docs for vm_normal_page()). >>>>>> Hm, I figured everyone just uses MAP_SHARED for buffer objects since >>>>>> COW really makes absolutely no sense. How would we enforce this? >>>>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that >>>>> or allowing MIXEDMAP. >>>>> >>>>>>> Also worth noting is the comment in ttm_bo_mmap_vma_setup() about >>>>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal >>>>>>> pages. That's a very old comment, though, and might not be valid anymore. >>>>>> I think that's why ttm has a page cache for these, because it indeed >>>>>> sucks. The PAT changes on pages are rather expensive. >>>>> IIRC the page cache was implemented because of the slowness of the >>>>> caching mode transition itself, more specifically the wbinvd() call + >>>>> global TLB flush. >>> Yes, exactly that. The global TLB flush is what really breaks our neck here >>> from a performance perspective.
>>> >>>>>> There is still an issue for iomem mappings, because the PAT validation >>>>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn. >>>>>> But for i915 at least this is fixed by using the io_mapping >>>>>> infrastructure, which does the PAT reservation only once when you set >>>>>> up the mapping area at driver load. >>>>> Yes, I guess that was the issue that the comment describes, but the >>>>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP. >>>>> >>>>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a >>>>>> problem that hurts much :-) >>>>> Hmm, both 5.11 and drm-tip appear to still use MIXEDMAP? >>>>> >>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554 >>>> Uh, that's bad, because mixed maps pointing at struct page won't stop >>>> gup. At least afaik. >>> Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have >>> already seen tons of problems with the page cache. >> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL, vm_insert_mixed >> boils down to vm_insert_pfn wrt gup. And a special pte stops the gup fast path. >> >> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how >> you're stopping the gup slow path. See check_vma_flags() in mm/gup.c. >> >> Also, if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL, then I don't think >> vm_insert_mixed even works on iomem pfns. There's the devmap exception, >> but we're not devmap. Worse, ttm abuses some accidental codepath to smuggle >> in hugepte support by intentionally not being devmap. >> >> So I'm really not sure this works as we think it should. Maybe good to do >> a quick test program on amdgpu with a buffer in system memory only and try >> to do direct io into it. If it works, you have a problem, and a bad one. > That's probably impossible, since a quick git grep shows that pretty > much anything reasonable has special ptes: arc, arm, arm64, powerpc, > riscv, s390, sh, sparc, x86.
> I don't think you'll have a platform > where you can plug an amdgpu in and actually exercise the bug :-) Hm. AFAIK vm_insert_mixed() doesn't set PTE_SPECIAL on system pages, so I don't see what should be stopping gup to those? /Thomas > > So maybe we should just switch over to VM_PFNMAP for ttm for more clarity? > -Daniel > > >>> Regards, >>> Christian. >>> >>>> Christian, do we need to patch this up, and maybe fix up the ttm fault >>>> handler to use io_mapping so the vm_insert_pfn stuff is fast? >>>> -Daniel >> -- >> Daniel Vetter >> Software Engineer, Intel Corporation >> http://blog.ffwll.ch > > ^ permalink raw reply [flat|nested] 110+ messages in thread
I don't think you'll have a platform > where you can plug an amdgpu in and actually exercise the bug :-) Hm. AFAIK _insert_mixed() doesn't set PTE_SPECIAL on system pages, so I don't see what should be stopping gup to those? /Thomas > > So maybe we should just switch over to VM_PFNMAP for ttm for more clarity? > -Daniel > > >>> Regards, >>> Christian. >>> >>>> Christian, do we need to patch this up, and maybe fix up ttm fault >>>> handler to use io_mapping so the vm_insert_pfn stuff is fast? >>>> -Daniel >> -- >> Daniel Vetter >> Software Engineer, Intel Corporation >> http://blog.ffwll.ch > > _______________________________________________ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-26 9:41 ` Thomas Hellström (Intel) (?) @ 2021-02-26 13:28 ` Daniel Vetter -1 siblings, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-02-26 13:28 UTC (permalink / raw) To: Thomas Hellström (Intel) Cc: Christian König, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK On Fri, Feb 26, 2021 at 10:41 AM Thomas Hellström (Intel) <thomas_os@shipmail.org> wrote: > > > On 2/25/21 4:49 PM, Daniel Vetter wrote: > > On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: > >> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote: > >>> Am 24.02.21 um 10:31 schrieb Daniel Vetter: > >>>> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel) > >>>> <thomas_os@shipmail.org> wrote: > >>>>> On 2/24/21 9:45 AM, Daniel Vetter wrote: > >>>>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel) > >>>>>> <thomas_os@shipmail.org> wrote: > >>>>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote: > >>>>>>>> tldr; DMA buffers aren't normal memory, expecting that you can use > >>>>>>>> them like that (like calling get_user_pages works, or that they're > >>>>>>>> accounting like any other normal memory) cannot be guaranteed. > >>>>>>>> > >>>>>>>> Since some userspace only runs on integrated devices, where all > >>>>>>>> buffers are actually all resident system memory, there's a huge > >>>>>>>> temptation to assume that a struct page is always present and useable > >>>>>>>> like for any more pagecache backed mmap. This has the potential to > >>>>>>>> result in a uapi nightmare. > >>>>>>>> > >>>>>>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which > >>>>>>>> blocks get_user_pages and all the other struct page based > >>>>>>>> infrastructure for everyone. 
In spirit this is the uapi counterpart to > >>>>>>>> the kernel-internal CONFIG_DMABUF_DEBUG. > >>>>>>>> > >>>>>>>> Motivated by a recent patch which wanted to swich the system dma-buf > >>>>>>>> heap to vm_insert_page instead of vm_insert_pfn. > >>>>>>>> > >>>>>>>> v2: > >>>>>>>> > >>>>>>>> Jason brought up that we also want to guarantee that all ptes have the > >>>>>>>> pte_special flag set, to catch fast get_user_pages (on architectures > >>>>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would > >>>>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that. > >>>>>>>> > >>>>>>>> From auditing the various functions to insert pfn pte entires > >>>>>>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like > >>>>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so > >>>>>>>> this should be the correct flag to check for. > >>>>>>>> > >>>>>>> If we require VM_PFNMAP, for ordinary page mappings, we also need to > >>>>>>> disallow COW mappings, since it will not work on architectures that > >>>>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()). > >>>>>> Hm I figured everyone just uses MAP_SHARED for buffer objects since > >>>>>> COW really makes absolutely no sense. How would we enforce this? > >>>>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that > >>>>> or allowing MIXEDMAP. > >>>>> > >>>>>>> Also worth noting is the comment in ttm_bo_mmap_vma_setup() with > >>>>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal > >>>>>>> pages. That's a very old comment, though, and might not be valid anymore. > >>>>>> I think that's why ttm has a page cache for these, because it indeed > >>>>>> sucks. The PAT changes on pages are rather expensive. > >>>>> IIRC the page cache was implemented because of the slowness of the > >>>>> caching mode transition itself, more specifically the wbinvd() call + > >>>>> global TLB flush. 
> >>> Yes, exactly that. The global TLB flush is what really breaks our neck here > >>> from a performance perspective. > >>> > >>>>>> There is still an issue for iomem mappings, because the PAT validation > >>>>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn. > >>>>>> But for i915 at least this is fixed by using the io_mapping > >>>>>> infrastructure, which does the PAT reservation only once when you set > >>>>>> up the mapping area at driver load. > >>>>> Yes, I guess that was the issue that the comment describes, but the > >>>>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP. > >>>>> > >>>>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a > >>>>>> problem that hurts much :-) > >>>>> Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP? > >>>>> > >>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554 > >>>> Uh that's bad, because mixed maps pointing at struct page wont stop > >>>> gup. At least afaik. > >>> Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have > >>> already seen tons of problems with the page cache. > >> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL vm_insert_mixed > >> boils down to vm_insert_pfn wrt gup. And special pte stops gup fast path. > >> > >> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how > >> you're stopping gup slow path. See check_vma_flags() in mm/gup.c. > >> > >> Also if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL then I don't think > >> vm_insert_mixed even works on iomem pfns. There's the devmap exception, > >> but we're not devmap. Worse ttm abuses some accidental codepath to smuggle > >> in hugepte support by intentionally not being devmap. > >> > >> So I'm really not sure this works as we think it should. Maybe good to do > >> a quick test program on amdgpu with a buffer in system memory only and try > >> to do direct io into it. If it works, you have a problem, and a bad one. 
> > That's probably impossible, since a quick git grep shows that pretty > > much anything reasonable has special ptes: arc, arm, arm64, powerpc, > > riscv, s390, sh, sparc, x86. I don't think you'll have a platform > > where you can plug an amdgpu in and actually exercise the bug :-) > > Hm. AFAIK _insert_mixed() doesn't set PTE_SPECIAL on system pages, so I > don't see what should be stopping gup to those? If you have an arch with pte special we use insert_pfn(), which afaict will use pte_mkspecial for the !devmap case. And ttm isn't devmap (otherwise our hugepte abuse of devmap hugeptes would go rather wrong). So I think it stops gup. But I haven't verified at all. Would be good if Christian can check this with some direct io to a buffer in system memory. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-26 13:28 ` Daniel Vetter (?) @ 2021-02-27 8:06 ` Thomas Hellström (Intel) -1 siblings, 0 replies; 110+ messages in thread From: Thomas Hellström (Intel) @ 2021-02-27 8:06 UTC (permalink / raw) To: Daniel Vetter Cc: Christian König, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK On 2/26/21 2:28 PM, Daniel Vetter wrote: > On Fri, Feb 26, 2021 at 10:41 AM Thomas Hellström (Intel) > <thomas_os@shipmail.org> wrote: >> >> On 2/25/21 4:49 PM, Daniel Vetter wrote: >>> On Thu, Feb 25, 2021 at 11:44 AM Daniel Vetter <daniel@ffwll.ch> wrote: >>>> On Thu, Feb 25, 2021 at 11:28:31AM +0100, Christian König wrote: >>>>> Am 24.02.21 um 10:31 schrieb Daniel Vetter: >>>>>> On Wed, Feb 24, 2021 at 10:16 AM Thomas Hellström (Intel) >>>>>> <thomas_os@shipmail.org> wrote: >>>>>>> On 2/24/21 9:45 AM, Daniel Vetter wrote: >>>>>>>> On Wed, Feb 24, 2021 at 8:46 AM Thomas Hellström (Intel) >>>>>>>> <thomas_os@shipmail.org> wrote: >>>>>>>>> On 2/23/21 11:59 AM, Daniel Vetter wrote: >>>>>>>>>> tldr; DMA buffers aren't normal memory, expecting that you can use >>>>>>>>>> them like that (like calling get_user_pages works, or that they're >>>>>>>>>> accounting like any other normal memory) cannot be guaranteed. >>>>>>>>>> >>>>>>>>>> Since some userspace only runs on integrated devices, where all >>>>>>>>>> buffers are actually all resident system memory, there's a huge >>>>>>>>>> temptation to assume that a struct page is always present and useable >>>>>>>>>> like for any more pagecache backed mmap. This has the potential to >>>>>>>>>> result in a uapi nightmare. 
>>>>>>>>>> >>>>>>>>>> To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which >>>>>>>>>> blocks get_user_pages and all the other struct page based >>>>>>>>>> infrastructure for everyone. In spirit this is the uapi counterpart to >>>>>>>>>> the kernel-internal CONFIG_DMABUF_DEBUG. >>>>>>>>>> >>>>>>>>>> Motivated by a recent patch which wanted to swich the system dma-buf >>>>>>>>>> heap to vm_insert_page instead of vm_insert_pfn. >>>>>>>>>> >>>>>>>>>> v2: >>>>>>>>>> >>>>>>>>>> Jason brought up that we also want to guarantee that all ptes have the >>>>>>>>>> pte_special flag set, to catch fast get_user_pages (on architectures >>>>>>>>>> that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would >>>>>>>>>> still allow vm_insert_page, but limiting to VM_PFNMAP will catch that. >>>>>>>>>> >>>>>>>>>> From auditing the various functions to insert pfn pte entires >>>>>>>>>> (vm_insert_pfn_prot, remap_pfn_range and all it's callers like >>>>>>>>>> dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so >>>>>>>>>> this should be the correct flag to check for. >>>>>>>>>> >>>>>>>>> If we require VM_PFNMAP, for ordinary page mappings, we also need to >>>>>>>>> disallow COW mappings, since it will not work on architectures that >>>>>>>>> don't have CONFIG_ARCH_HAS_PTE_SPECIAL, (see the docs for vm_normal_page()). >>>>>>>> Hm I figured everyone just uses MAP_SHARED for buffer objects since >>>>>>>> COW really makes absolutely no sense. How would we enforce this? >>>>>>> Perhaps returning -EINVAL on is_cow_mapping() at mmap time. Either that >>>>>>> or allowing MIXEDMAP. >>>>>>> >>>>>>>>> Also worth noting is the comment in ttm_bo_mmap_vma_setup() with >>>>>>>>> possible performance implications with x86 + PAT + VM_PFNMAP + normal >>>>>>>>> pages. That's a very old comment, though, and might not be valid anymore. >>>>>>>> I think that's why ttm has a page cache for these, because it indeed >>>>>>>> sucks. 
The PAT changes on pages are rather expensive. >>>>>>> IIRC the page cache was implemented because of the slowness of the >>>>>>> caching mode transition itself, more specifically the wbinvd() call + >>>>>>> global TLB flush. >>>>> Yes, exactly that. The global TLB flush is what really breaks our neck here >>>>> from a performance perspective. >>>>> >>>>>>>> There is still an issue for iomem mappings, because the PAT validation >>>>>>>> does a linear walk of the resource tree (lol) for every vm_insert_pfn. >>>>>>>> But for i915 at least this is fixed by using the io_mapping >>>>>>>> infrastructure, which does the PAT reservation only once when you set >>>>>>>> up the mapping area at driver load. >>>>>>> Yes, I guess that was the issue that the comment describes, but the >>>>>>> issue wasn't there with vm_insert_mixed() + VM_MIXEDMAP. >>>>>>> >>>>>>>> Also TTM uses VM_PFNMAP right now for everything, so it can't be a >>>>>>>> problem that hurts much :-) >>>>>>> Hmm, both 5.11 and drm-tip appears to still use MIXEDMAP? >>>>>>> >>>>>>> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/ttm/ttm_bo_vm.c#L554 >>>>>> Uh that's bad, because mixed maps pointing at struct page wont stop >>>>>> gup. At least afaik. >>>>> Hui? I'm pretty sure MIXEDMAP stops gup as well. Otherwise we would have >>>>> already seen tons of problems with the page cache. >>>> On any architecture which has CONFIG_ARCH_HAS_PTE_SPECIAL vm_insert_mixed >>>> boils down to vm_insert_pfn wrt gup. And special pte stops gup fast path. >>>> >>>> But if you don't have VM_IO or VM_PFNMAP set, then I'm not seeing how >>>> you're stopping gup slow path. See check_vma_flags() in mm/gup.c. >>>> >>>> Also if you don't have CONFIG_ARCH_HAS_PTE_SPECIAL then I don't think >>>> vm_insert_mixed even works on iomem pfns. There's the devmap exception, >>>> but we're not devmap. Worse ttm abuses some accidental codepath to smuggle >>>> in hugepte support by intentionally not being devmap. 
>>>> >>>> So I'm really not sure this works as we think it should. Maybe good to do >>>> a quick test program on amdgpu with a buffer in system memory only and try >>>> to do direct io into it. If it works, you have a problem, and a bad one. >>> That's probably impossible, since a quick git grep shows that pretty >>> much anything reasonable has special ptes: arc, arm, arm64, powerpc, >>> riscv, s390, sh, sparc, x86. I don't think you'll have a platform >>> where you can plug an amdgpu in and actually exercise the bug :-) >> Hm. AFAIK _insert_mixed() doesn't set PTE_SPECIAL on system pages, so I >> don't see what should be stopping gup to those? > If you have an arch with pte special we use insert_pfn(), which afaict > will use pte_mkspecial for the !devmap case. And ttm isn't devmap > (otherwise our hugepte abuse of devmap hugeptes would go rather > wrong). > > So I think it stops gup. But I haven't verified at all. Would be good > if Christian can check this with some direct io to a buffer in system > memory. Hmm, the docs (for vm_normal_page(), again) say:

 * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
 * page" backing, however the difference is that _all_ pages with a struct
 * page (that is, those where pfn_valid is true) are refcounted and considered
 * normal pages by the VM. The disadvantage is that pages are refcounted
 * (which can be slower and simply not an option for some PFNMAP users). The
 * advantage is that we don't have to follow the strict linearity rule of
 * PFNMAP mappings in order to support COWable mappings.

but it's true that __vm_insert_mixed() ends up in the insert_pfn() path, so the above isn't really accurate anymore, which makes me wonder whether, and if so why, there could still be a significant performance difference between MIXEDMAP and PFNMAP. BTW regarding the TTM hugeptes, I don't think we ever landed that devmap hack, so they are (for the non-gup case) relying on vma_is_special_huge(). 
For the gup case, I think the bug is still there. /Thomas > -Daniel ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-27 8:06 ` Thomas Hellström (Intel) (?) @ 2021-03-01 8:28 ` Daniel Vetter -1 siblings, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-03-01 8:28 UTC (permalink / raw) To: Thomas Hellström (Intel) Cc: Christian König, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel) <thomas_os@shipmail.org> wrote: > On 2/26/21 2:28 PM, Daniel Vetter wrote: > > So I think it stops gup. But I haven't verified at all. Would be good > > if Christian can check this with some direct io to a buffer in system > > memory. > > Hmm, > > Docs (again vm_normal_page() say) > > * VM_MIXEDMAP mappings can likewise contain memory with or without "struct > * page" backing, however the difference is that _all_ pages with a struct > * page (that is, those where pfn_valid is true) are refcounted and > considered > * normal pages by the VM. The disadvantage is that pages are refcounted > * (which can be slower and simply not an option for some PFNMAP > users). The > * advantage is that we don't have to follow the strict linearity rule of > * PFNMAP mappings in order to support COWable mappings. > > but it's true __vm_insert_mixed() ends up in the insert_pfn() path, so > the above isn't really true, which makes me wonder if and in that case > why there could any longer ever be a significant performance difference > between MIXEDMAP and PFNMAP. Yeah it's definitely confusing. I guess I'll hack up a patch and see what sticks. > BTW regarding the TTM hugeptes, I don't think we ever landed that devmap > hack, so they are (for the non-gup case) relying on > vma_is_special_huge(). For the gup case, I think the bug is still there. 
Maybe there's another devmap hack, but the ttm_vm_insert functions do use PFN_DEV and all that. And I think that stops gup_fast from trying to find the underlying page. -Daniel -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-03-01 8:28 ` Daniel Vetter (?) @ 2021-03-01 8:39 ` Thomas Hellström (Intel) -1 siblings, 0 replies; 110+ messages in thread From: Thomas Hellström (Intel) @ 2021-03-01 8:39 UTC (permalink / raw) To: Daniel Vetter Cc: Christian König, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK Hi, On 3/1/21 9:28 AM, Daniel Vetter wrote: > On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel) > <thomas_os@shipmail.org> wrote: >> On 2/26/21 2:28 PM, Daniel Vetter wrote: >>> So I think it stops gup. But I haven't verified at all. Would be good >>> if Christian can check this with some direct io to a buffer in system >>> memory. >> Hmm, >> >> Docs (again vm_normal_page() say) >> >> * VM_MIXEDMAP mappings can likewise contain memory with or without "struct >> * page" backing, however the difference is that _all_ pages with a struct >> * page (that is, those where pfn_valid is true) are refcounted and >> considered >> * normal pages by the VM. The disadvantage is that pages are refcounted >> * (which can be slower and simply not an option for some PFNMAP >> users). The >> * advantage is that we don't have to follow the strict linearity rule of >> * PFNMAP mappings in order to support COWable mappings. >> >> but it's true __vm_insert_mixed() ends up in the insert_pfn() path, so >> the above isn't really true, which makes me wonder if and in that case >> why there could any longer ever be a significant performance difference >> between MIXEDMAP and PFNMAP. > Yeah it's definitely confusing. I guess I'll hack up a patch and see > what sticks. > >> BTW regarding the TTM hugeptes, I don't think we ever landed that devmap >> hack, so they are (for the non-gup case) relying on >> vma_is_special_huge(). 
For the gup case, I think the bug is still there. > Maybe there's another devmap hack, but the ttm_vm_insert functions do > use PFN_DEV and all that. And I think that stops gup_fast from trying > to find the underlying page. > -Daniel Hmm perhaps it might, but I don't think so. The fix I tried out was to set PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be true, and then follow_devmap_pmd()->get_dev_pagemap() which returns NULL and gup_fast() backs off, in the end that would mean setting in stone that "if there is a huge devmap page table entry for which we haven't registered any devmap struct pages (get_dev_pagemap returns NULL), we should treat that as a "special" huge page table entry". From what I can tell, all code calling get_dev_pagemap() already does that, it's just a question of getting it accepted and formalizing it. /Thomas ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-01  8:39 ` Thomas Hellström (Intel)
@ 2021-03-01  9:05 ` Daniel Vetter
  0 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-03-01 9:05 UTC (permalink / raw)
To: Thomas Hellström (Intel)
Cc: Daniel Vetter, Christian König, Intel Graphics Development,
	Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter,
	Suren Baghdasaryan, Christian König,
	open list:DMA BUFFER SHARING FRAMEWORK

On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) wrote:
> Hi,
>
> On 3/1/21 9:28 AM, Daniel Vetter wrote:
> > On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
> > <thomas_os@shipmail.org> wrote:
> > > On 2/26/21 2:28 PM, Daniel Vetter wrote:
> > > > So I think it stops gup. But I haven't verified at all. Would be good
> > > > if Christian can check this with some direct io to a buffer in system
> > > > memory.
> > > Hmm,
> > >
> > > Docs (again vm_normal_page() say)
> > >
> > >  * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
> > >  * page" backing, however the difference is that _all_ pages with a struct
> > >  * page (that is, those where pfn_valid is true) are refcounted and considered
> > >  * normal pages by the VM. The disadvantage is that pages are refcounted
> > >  * (which can be slower and simply not an option for some PFNMAP users). The
> > >  * advantage is that we don't have to follow the strict linearity rule of
> > >  * PFNMAP mappings in order to support COWable mappings.
> > >
> > > but it's true __vm_insert_mixed() ends up in the insert_pfn() path, so
> > > the above isn't really true, which makes me wonder if and in that case
> > > why there could any longer ever be a significant performance difference
> > > between MIXEDMAP and PFNMAP.
> > Yeah it's definitely confusing. I guess I'll hack up a patch and see
> > what sticks.
> >
> > > BTW regarding the TTM hugeptes, I don't think we ever landed that devmap
> > > hack, so they are (for the non-gup case) relying on
> > > vma_is_special_huge(). For the gup case, I think the bug is still there.
> > Maybe there's another devmap hack, but the ttm_vm_insert functions do
> > use PFN_DEV and all that. And I think that stops gup_fast from trying
> > to find the underlying page.
> > -Daniel
>
> Hmm perhaps it might, but I don't think so. The fix I tried out was to set
>
> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be true, and
> then
>
> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and gup_fast()
> backs off,
>
> in the end that would mean setting in stone that "if there is a huge devmap
> page table entry for which we haven't registered any devmap struct pages
> (get_dev_pagemap returns NULL), we should treat that as a "special" huge
> page table entry".
>
> From what I can tell, all code calling get_dev_pagemap() already does that,
> it's just a question of getting it accepted and formalizing it.

Oh I thought that's already how it works, since I didn't spot anything
else that would block gup_fast from falling over. I guess really would
need some testcases to make sure direct i/o (that's the easiest to test)
fails like we expect.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-01  9:05 ` Daniel Vetter
@ 2021-03-01  9:21 ` Thomas Hellström (Intel)
  0 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-01 9:21 UTC (permalink / raw)
To: Daniel Vetter
Cc: Daniel Vetter, Christian König, Intel Graphics Development,
	Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter,
	Suren Baghdasaryan, Christian König,
	open list:DMA BUFFER SHARING FRAMEWORK

On 3/1/21 10:05 AM, Daniel Vetter wrote:
> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) wrote:
>> Hi,
>>
>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>> <thomas_os@shipmail.org> wrote:
>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>> So I think it stops gup. But I haven't verified at all. Would be good
>>>>> if Christian can check this with some direct io to a buffer in system
>>>>> memory.
>>>> Hmm,
>>>>
>>>> Docs (again vm_normal_page() say)
>>>>
>>>>  * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
>>>>  * page" backing, however the difference is that _all_ pages with a struct
>>>>  * page (that is, those where pfn_valid is true) are refcounted and considered
>>>>  * normal pages by the VM. The disadvantage is that pages are refcounted
>>>>  * (which can be slower and simply not an option for some PFNMAP users). The
>>>>  * advantage is that we don't have to follow the strict linearity rule of
>>>>  * PFNMAP mappings in order to support COWable mappings.
>>>>
>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() path, so
>>>> the above isn't really true, which makes me wonder if and in that case
>>>> why there could any longer ever be a significant performance difference
>>>> between MIXEDMAP and PFNMAP.
>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>> what sticks.
>>>
>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that devmap
>>>> hack, so they are (for the non-gup case) relying on
>>>> vma_is_special_huge(). For the gup case, I think the bug is still there.
>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>> to find the underlying page.
>>> -Daniel
>> Hmm perhaps it might, but I don't think so. The fix I tried out was to set
>>
>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be true, and
>> then
>>
>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and gup_fast()
>> backs off,
>>
>> in the end that would mean setting in stone that "if there is a huge devmap
>> page table entry for which we haven't registered any devmap struct pages
>> (get_dev_pagemap returns NULL), we should treat that as a "special" huge
>> page table entry".
>>
>> From what I can tell, all code calling get_dev_pagemap() already does that,
>> it's just a question of getting it accepted and formalizing it.
> Oh I thought that's already how it works, since I didn't spot anything
> else that would block gup_fast from falling over. I guess really would
> need some testcases to make sure direct i/o (that's the easiest to test)
> fails like we expect.

Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
Otherwise pmd_devmap() will not return true and since there is no
pmd_special() things break.

/Thomas

> -Daniel

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-01  9:21 ` Thomas Hellström (Intel)
@ 2021-03-01 10:17 ` Christian König
  0 siblings, 0 replies; 110+ messages in thread
From: Christian König @ 2021-03-01 10:17 UTC (permalink / raw)
To: Thomas Hellström (Intel), Daniel Vetter
Cc: Daniel Vetter, Christian König, Intel Graphics Development,
	Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK,
	Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter,
	Suren Baghdasaryan, open list:DMA BUFFER SHARING FRAMEWORK

Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
>
> On 3/1/21 10:05 AM, Daniel Vetter wrote:
>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
>> wrote:
>>> Hi,
>>>
>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>>> <thomas_os@shipmail.org> wrote:
>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>>> So I think it stops gup. But I haven't verified at all. Would be good
>>>>>> if Christian can check this with some direct io to a buffer in system
>>>>>> memory.
>>>>> Hmm,
>>>>>
>>>>> Docs (again vm_normal_page() say)
>>>>>
>>>>>  * VM_MIXEDMAP mappings can likewise contain memory with or without "struct
>>>>>  * page" backing, however the difference is that _all_ pages with a struct
>>>>>  * page (that is, those where pfn_valid is true) are refcounted and considered
>>>>>  * normal pages by the VM. The disadvantage is that pages are refcounted
>>>>>  * (which can be slower and simply not an option for some PFNMAP users). The
>>>>>  * advantage is that we don't have to follow the strict linearity rule of
>>>>>  * PFNMAP mappings in order to support COWable mappings.
>>>>>
>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() path, so
>>>>> the above isn't really true, which makes me wonder if and in that case
>>>>> why there could any longer ever be a significant performance difference
>>>>> between MIXEDMAP and PFNMAP.
>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>>> what sticks.
>>>>
>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that devmap
>>>>> hack, so they are (for the non-gup case) relying on
>>>>> vma_is_special_huge(). For the gup case, I think the bug is still there.
>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>>> to find the underlying page.
>>>> -Daniel
>>> Hmm perhaps it might, but I don't think so. The fix I tried out was to set
>>>
>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be true, and
>>> then
>>>
>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and gup_fast()
>>> backs off,
>>>
>>> in the end that would mean setting in stone that "if there is a huge devmap
>>> page table entry for which we haven't registered any devmap struct pages
>>> (get_dev_pagemap returns NULL), we should treat that as a "special" huge
>>> page table entry".
>>>
>>> From what I can tell, all code calling get_dev_pagemap() already does that,
>>> it's just a question of getting it accepted and formalizing it.
>> Oh I thought that's already how it works, since I didn't spot anything
>> else that would block gup_fast from falling over. I guess really would
>> need some testcases to make sure direct i/o (that's the easiest to test)
>> fails like we expect.
>
> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
> Otherwise pmd_devmap() will not return true and since there is no
> pmd_special() things break.

Is that maybe the issue we have seen with amdgpu and huge pages?

Apart from that I'm lost guys, that devmap and gup stuff is not
something I have a good knowledge of apart from a one mile high view.

Christian.

>
> /Thomas
>
>> -Daniel

^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-01 14:09 ` Daniel Vetter
  0 siblings, 0 replies; 110+ messages in thread
From: Daniel Vetter @ 2021-03-01 14:09 UTC (permalink / raw)
To: Christian König
Cc: Thomas Hellström (Intel), Christian König,
	Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK

On Mon, Mar 1, 2021 at 11:17 AM Christian König
<christian.koenig@amd.com> wrote:
>
>
> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
> >
> > On 3/1/21 10:05 AM, Daniel Vetter wrote:
> >> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
> >> wrote:
> >>> Hi,
> >>>
> >>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
> >>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
> >>>> <thomas_os@shipmail.org> wrote:
> >>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
> >>>>>> So I think it stops gup. But I haven't verified at all. Would be
> >>>>>> good if Christian can check this with some direct io to a buffer
> >>>>>> in system memory.
> >>>>> Hmm,
> >>>>>
> >>>>> Docs (again vm_normal_page() say)
> >>>>>
> >>>>>  * VM_MIXEDMAP mappings can likewise contain memory with or without
> >>>>>  * "struct page" backing, however the difference is that _all_ pages
> >>>>>  * with a struct page (that is, those where pfn_valid is true) are
> >>>>>  * refcounted and considered normal pages by the VM. The disadvantage
> >>>>>  * is that pages are refcounted (which can be slower and simply not
> >>>>>  * an option for some PFNMAP users). The advantage is that we don't
> >>>>>  * have to follow the strict linearity rule of PFNMAP mappings in
> >>>>>  * order to support COWable mappings.
> >>>>>
> >>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() path,
> >>>>> so the above isn't really true, which makes me wonder if and in that
> >>>>> case why there could any longer ever be a significant performance
> >>>>> difference between MIXEDMAP and PFNMAP.
> >>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
> >>>> what sticks.
> >>>>
> >>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
> >>>>> devmap hack, so they are (for the non-gup case) relying on
> >>>>> vma_is_special_huge(). For the gup case, I think the bug is still
> >>>>> there.
> >>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
> >>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
> >>>> to find the underlying page.
> >>>> -Daniel
> >>> Hmm perhaps it might, but I don't think so. The fix I tried out was
> >>> to set
> >>>
> >>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be true,
> >>> and then
> >>>
> >>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
> >>> gup_fast() backs off,
> >>>
> >>> in the end that would mean setting in stone that "if there is a huge
> >>> devmap page table entry for which we haven't registered any devmap
> >>> struct pages (get_dev_pagemap returns NULL), we should treat that as
> >>> a "special" huge page table entry".
> >>>
> >>> From what I can tell, all code calling get_dev_pagemap() already does
> >>> that, it's just a question of getting it accepted and formalizing it.
> >> Oh I thought that's already how it works, since I didn't spot anything
> >> else that would block gup_fast from falling over. I guess really would
> >> need some testcases to make sure direct i/o (that's the easiest to
> >> test) fails like we expect.
> >
> > Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
> > Otherwise pmd_devmap() will not return true and since there is no
> > pmd_special() things break.
>
> Is that maybe the issue we have seen with amdgpu and huge pages?

Yeah, essentially when you have a hugepte inserted by ttm, and it
happens to point at system memory, then gup will work on that. And
create all kinds of havoc.

> Apart from that I'm lost guys, that devmap and gup stuff is not
> something I have a good knowledge of apart from a one mile high view.

I'm not really better, hence would be good to do a testcase and see.
This should provoke it:
- allocate nicely aligned bo in system memory
- mmap, again nicely aligned to 2M
- do some direct io from a filesystem into that mmap, that should
  trigger gup
- before the gup completes free the mmap and bo so that ttm recycles
  the pages, which should trip up on the elevated refcount. If you wait
  until the direct io is complete, then I think nothing bad can be
  observed.

Ofc if your amdgpu+hugepte issue is something else, then maybe we have
another issue.

Also usual caveat: I'm not an mm hacker either, so might be completely
wrong.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap
  2021-03-11 10:22 ` Thomas Hellström (Intel)
  0 siblings, 0 replies; 110+ messages in thread
From: Thomas Hellström (Intel) @ 2021-03-11 10:22 UTC (permalink / raw)
To: Daniel Vetter, Christian König
Cc: Christian König, Intel Graphics Development, Matthew Wilcox,
	moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe,
	John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan,
	open list:DMA BUFFER SHARING FRAMEWORK

On 3/1/21 3:09 PM, Daniel Vetter wrote:
> On Mon, Mar 1, 2021 at 11:17 AM Christian König
> <christian.koenig@amd.com> wrote:
>>
>>
>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel):
>>> On 3/1/21 10:05 AM, Daniel Vetter wrote:
>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel)
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote:
>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel)
>>>>>> <thomas_os@shipmail.org> wrote:
>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote:
>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be
>>>>>>>> good if Christian can check this with some direct io to a buffer
>>>>>>>> in system memory.
>>>>>>> Hmm,
>>>>>>>
>>>>>>> Docs (again vm_normal_page() say)
>>>>>>>
>>>>>>>  * VM_MIXEDMAP mappings can likewise contain memory with or without
>>>>>>>  * "struct page" backing, however the difference is that _all_ pages
>>>>>>>  * with a struct page (that is, those where pfn_valid is true) are
>>>>>>>  * refcounted and considered normal pages by the VM. The disadvantage
>>>>>>>  * is that pages are refcounted (which can be slower and simply not
>>>>>>>  * an option for some PFNMAP users). The advantage is that we don't
>>>>>>>  * have to follow the strict linearity rule of PFNMAP mappings in
>>>>>>>  * order to support COWable mappings.
>>>>>>>
>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() path,
>>>>>>> so the above isn't really true, which makes me wonder if and in that
>>>>>>> case why there could any longer ever be a significant performance
>>>>>>> difference between MIXEDMAP and PFNMAP.
>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see
>>>>>> what sticks.
>>>>>>
>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that
>>>>>>> devmap hack, so they are (for the non-gup case) relying on
>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still
>>>>>>> there.
>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do
>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying
>>>>>> to find the underlying page.
>>>>>> -Daniel
>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was
>>>>> to set
>>>>>
>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be true,
>>>>> and then
>>>>>
>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and
>>>>> gup_fast() backs off,
>>>>>
>>>>> in the end that would mean setting in stone that "if there is a huge
>>>>> devmap page table entry for which we haven't registered any devmap
>>>>> struct pages (get_dev_pagemap returns NULL), we should treat that as
>>>>> a "special" huge page table entry".
>>>>>
>>>>> From what I can tell, all code calling get_dev_pagemap() already does
>>>>> that, it's just a question of getting it accepted and formalizing it.
>>>> Oh I thought that's already how it works, since I didn't spot anything
>>>> else that would block gup_fast from falling over. I guess really would
>>>> need some testcases to make sure direct i/o (that's the easiest to
>>>> test) fails like we expect.
>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes.
>>> Otherwise pmd_devmap() will not return true and since there is no
>>> pmd_special() things break.
>> Is that maybe the issue we have seen with amdgpu and huge pages?
> Yeah, essentially when you have a hugepte inserted by ttm, and it
> happens to point at system memory, then gup will work on that. And
> create all kinds of havoc.
>
>> Apart from that I'm lost guys, that devmap and gup stuff is not
>> something I have a good knowledge of apart from a one mile high view.
> I'm not really better, hence would be good to do a testcase and see.
> This should provoke it:
> - allocate nicely aligned bo in system memory
> - mmap, again nicely aligned to 2M
> - do some direct io from a filesystem into that mmap, that should
>   trigger gup
> - before the gup completes free the mmap and bo so that ttm recycles
>   the pages, which should trip up on the elevated refcount. If you wait
>   until the direct io is complete, then I think nothing bad can be
>   observed.
>
> Ofc if your amdgpu+hugepte issue is something else, then maybe we have
> another issue.
>
> Also usual caveat: I'm not an mm hacker either, so might be completely
> wrong.
> -Daniel

So I did the following quick experiment on vmwgfx, and it turns out that
with it, fast gup never succeeds. Without the "| PFN_MAP", it typically
succeeds. I should probably craft an RFC formalizing this.

/Thomas

diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index 6dc96cf66744..72b6fb17c984 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
 	pfn_t pfnt;
 	struct ttm_tt *ttm = bo->ttm;
 	bool write = vmf->flags & FAULT_FLAG_WRITE;
+	struct dev_pagemap *pagemap;
 
 	/* Fault should not cross bo boundary. */
 	page_offset &= ~(fault_page_size - 1);
@@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
 	if ((pfn & (fault_page_size - 1)) != 0)
 		goto out_fallback;
 
+	/*
+	 * Huge entries must be special, that is marking them as devmap
+	 * with no backing device map range. If there is a backing
+	 * range, don't insert a huge entry.
+	 */
+	pagemap = get_dev_pagemap(pfn, NULL);
+	if (pagemap) {
+		put_dev_pagemap(pagemap);
+		goto out_fallback;
+	}
+
 	/* Check that memory is contiguous. */
 	if (!bo->mem.bus.is_iomem) {
 		for (i = 1; i < fault_page_size; ++i) {
@@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
 		}
 	}
 
-	pfnt = __pfn_to_pfn_t(pfn, PFN_DEV);
+	pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP);
 	if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT))
 		ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write);
 #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
@@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault *vmf,
 	if (ret != VM_FAULT_NOPAGE)
 		goto out_fallback;
 
+#if 1
+	{
+		int npages;
+		struct page *page;
+
+		npages = get_user_pages_fast_only(vmf->address, 1, 0, &page);
+		if (npages == 1) {
+			DRM_WARN("Fast gup succeeded. Bad.\n");
+			put_page(page);
+		} else {
+			DRM_INFO("Fast gup failed. Good.\n");
+		}
+	}
+#endif
+
 	return VM_FAULT_NOPAGE;
 out_fallback:
 	count_vm_event(THP_FAULT_FALLBACK);

^ permalink raw reply related	[flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-03-11 10:22 ` Thomas Hellström (Intel) (?) @ 2021-03-11 13:00 ` Daniel Vetter -1 siblings, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-03-11 13:00 UTC (permalink / raw) To: Thomas Hellström (Intel) Cc: Daniel Vetter, Christian König, Christian König, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, open list:DMA BUFFER SHARING FRAMEWORK On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote: > > On 3/1/21 3:09 PM, Daniel Vetter wrote: > > On Mon, Mar 1, 2021 at 11:17 AM Christian König > > <christian.koenig@amd.com> wrote: > > > > > > > > > Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel): > > > > On 3/1/21 10:05 AM, Daniel Vetter wrote: > > > > > On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) > > > > > wrote: > > > > > > Hi, > > > > > > > > > > > > On 3/1/21 9:28 AM, Daniel Vetter wrote: > > > > > > > On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel) > > > > > > > <thomas_os@shipmail.org> wrote: > > > > > > > > On 2/26/21 2:28 PM, Daniel Vetter wrote: > > > > > > > > > So I think it stops gup. But I haven't verified at all. Would be > > > > > > > > > good > > > > > > > > > if Christian can check this with some direct io to a buffer in > > > > > > > > > system > > > > > > > > > memory. > > > > > > > > Hmm, > > > > > > > > > > > > > > > > Docs (again vm_normal_page() say) > > > > > > > > > > > > > > > > * VM_MIXEDMAP mappings can likewise contain memory with or > > > > > > > > without "struct > > > > > > > > * page" backing, however the difference is that _all_ pages > > > > > > > > with a struct > > > > > > > > * page (that is, those where pfn_valid is true) are refcounted > > > > > > > > and > > > > > > > > considered > > > > > > > > * normal pages by the VM. 
The disadvantage is that pages are > > > > > > > > refcounted > > > > > > > > * (which can be slower and simply not an option for some PFNMAP > > > > > > > > users). The > > > > > > > > * advantage is that we don't have to follow the strict > > > > > > > > linearity rule of > > > > > > > > * PFNMAP mappings in order to support COWable mappings. > > > > > > > > > > > > > > > > but it's true __vm_insert_mixed() ends up in the insert_pfn() > > > > > > > > path, so > > > > > > > > the above isn't really true, which makes me wonder if and in that > > > > > > > > case > > > > > > > > why there could any longer ever be a significant performance > > > > > > > > difference > > > > > > > > between MIXEDMAP and PFNMAP. > > > > > > > Yeah it's definitely confusing. I guess I'll hack up a patch and see > > > > > > > what sticks. > > > > > > > > > > > > > > > BTW regarding the TTM hugeptes, I don't think we ever landed that > > > > > > > > devmap > > > > > > > > hack, so they are (for the non-gup case) relying on > > > > > > > > vma_is_special_huge(). For the gup case, I think the bug is still > > > > > > > > there. > > > > > > > Maybe there's another devmap hack, but the ttm_vm_insert functions do > > > > > > > use PFN_DEV and all that. And I think that stops gup_fast from trying > > > > > > > to find the underlying page. > > > > > > > -Daniel > > > > > > Hmm perhaps it might, but I don't think so. 
The fix I tried out was > > > > > > to set > > > > > > > > > > > > PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be > > > > > > true, and > > > > > > then > > > > > > > > > > > > follow_devmap_pmd()->get_dev_pagemap() which returns NULL and > > > > > > gup_fast() > > > > > > backs off, > > > > > > > > > > > > in the end that would mean setting in stone that "if there is a huge > > > > > > devmap > > > > > > page table entry for which we haven't registered any devmap struct > > > > > > pages > > > > > > (get_dev_pagemap returns NULL), we should treat that as a "special" > > > > > > huge > > > > > > page table entry". > > > > > > > > > > > > From what I can tell, all code calling get_dev_pagemap() already > > > > > > does that, > > > > > > it's just a question of getting it accepted and formalizing it. > > > > > Oh I thought that's already how it works, since I didn't spot anything > > > > > else that would block gup_fast from falling over. I guess really would > > > > > need some testcases to make sure direct i/o (that's the easiest to test) > > > > > fails like we expect. > > > > Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes. > > > > Otherwise pmd_devmap() will not return true and since there is no > > > > pmd_special() things break. > > > Is that maybe the issue we have seen with amdgpu and huge pages? > > Yeah, essentially when you have a hugepte inserted by ttm, and it > > happens to point at system memory, then gup will work on that. And > > create all kinds of havoc. > > > > > Apart from that I'm lost guys, that devmap and gup stuff is not > > > something I have a good knowledge of apart from a one mile high view. > > I'm not really better, hence would be good to do a testcase and see. 
> > This should provoke it: > > - allocate nicely aligned bo in system memory > > - mmap, again nicely aligned to 2M > > - do some direct io from a filesystem into that mmap, that should trigger gup > > - before the gup completes free the mmap and bo so that ttm recycles > > the pages, which should trip up on the elevated refcount. If you wait > > until the direct io is completely, then I think nothing bad can be > > observed. > > > > Ofc if your amdgpu+hugepte issue is something else, then maybe we have > > another issue. > > > > Also usual caveat: I'm not an mm hacker either, so might be completely wrong. > > -Daniel > > So I did the following quick experiment on vmwgfx, and it turns out that > with it, > fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds > > I should probably craft an RFC formalizing this. Yeah I think that would be good. Maybe even more formalized if we also switch over to VM_PFNMAP, since afaiui these pte flags here only stop the fast gup path. And slow gup can still peek through VM_MIXEDMAP. Or something like that. Otoh your description of when it only sometimes succeeds would indicate my understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here. Christian, what's your take? -Daniel > > /Thomas > > diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c > b/drivers/gpu/drm/ttm/ttm_bo_vm.c > index 6dc96cf66744..72b6fb17c984 100644 > --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c > +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c > @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > *vmf, > pfn_t pfnt; > struct ttm_tt *ttm = bo->ttm; > bool write = vmf->flags & FAULT_FLAG_WRITE; > + struct dev_pagemap *pagemap; > > /* Fault should not cross bo boundary.
*/ > page_offset &= ~(fault_page_size - 1); > @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > *vmf, > if ((pfn & (fault_page_size - 1)) != 0) > goto out_fallback; > > + /* > + * Huge entries must be special, that is marking them as devmap > + * with no backing device map range. If there is a backing > + * range, Don't insert a huge entry. > + */ > + pagemap = get_dev_pagemap(pfn, NULL); > + if (pagemap) { > + put_dev_pagemap(pagemap); > + goto out_fallback; > + } > + > /* Check that memory is contiguous. */ > if (!bo->mem.bus.is_iomem) { > for (i = 1; i < fault_page_size; ++i) { > @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > *vmf, > } > } > > - pfnt = __pfn_to_pfn_t(pfn, PFN_DEV); > + pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP); > if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT)) > ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write); > #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD > @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > *vmf, > if (ret != VM_FAULT_NOPAGE) > goto out_fallback; > > +#if 1 > + { > + int npages; > + struct page *page; > + > + npages = get_user_pages_fast_only(vmf->address, 1, 0, > &page); > + if (npages == 1) { > + DRM_WARN("Fast gup succeeded. Bad.\n"); > + put_page(page); > + } else { > + DRM_INFO("Fast gup failed. Good.\n"); > + } > + } > +#endif > + > return VM_FAULT_NOPAGE; > out_fallback: > count_vm_event(THP_FAULT_FALLBACK); > > > > > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-03-11 13:00 ` Daniel Vetter (?) @ 2021-03-11 13:12 ` Thomas Hellström (Intel) -1 siblings, 0 replies; 110+ messages in thread From: Thomas Hellström (Intel) @ 2021-03-11 13:12 UTC (permalink / raw) To: Daniel Vetter Cc: Christian König, Christian König, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, open list:DMA BUFFER SHARING FRAMEWORK Hi! On 3/11/21 2:00 PM, Daniel Vetter wrote: > On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote: >> On 3/1/21 3:09 PM, Daniel Vetter wrote: >>> On Mon, Mar 1, 2021 at 11:17 AM Christian König >>> <christian.koenig@amd.com> wrote: >>>> >>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel): >>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote: >>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) >>>>>> wrote: >>>>>>> Hi, >>>>>>> >>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote: >>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel) >>>>>>>> <thomas_os@shipmail.org> wrote: >>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote: >>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be >>>>>>>>>> good >>>>>>>>>> if Christian can check this with some direct io to a buffer in >>>>>>>>>> system >>>>>>>>>> memory. >>>>>>>>> Hmm, >>>>>>>>> >>>>>>>>> Docs (again vm_normal_page() say) >>>>>>>>> >>>>>>>>> * VM_MIXEDMAP mappings can likewise contain memory with or >>>>>>>>> without "struct >>>>>>>>> * page" backing, however the difference is that _all_ pages >>>>>>>>> with a struct >>>>>>>>> * page (that is, those where pfn_valid is true) are refcounted >>>>>>>>> and >>>>>>>>> considered >>>>>>>>> * normal pages by the VM. The disadvantage is that pages are >>>>>>>>> refcounted >>>>>>>>> * (which can be slower and simply not an option for some PFNMAP >>>>>>>>> users). 
The >>>>>>>>> * advantage is that we don't have to follow the strict >>>>>>>>> linearity rule of >>>>>>>>> * PFNMAP mappings in order to support COWable mappings. >>>>>>>>> >>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() >>>>>>>>> path, so >>>>>>>>> the above isn't really true, which makes me wonder if and in that >>>>>>>>> case >>>>>>>>> why there could any longer ever be a significant performance >>>>>>>>> difference >>>>>>>>> between MIXEDMAP and PFNMAP. >>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see >>>>>>>> what sticks. >>>>>>>> >>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that >>>>>>>>> devmap >>>>>>>>> hack, so they are (for the non-gup case) relying on >>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still >>>>>>>>> there. >>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do >>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying >>>>>>>> to find the underlying page. >>>>>>>> -Daniel >>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was >>>>>>> to set >>>>>>> >>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be >>>>>>> true, and >>>>>>> then >>>>>>> >>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and >>>>>>> gup_fast() >>>>>>> backs off, >>>>>>> >>>>>>> in the end that would mean setting in stone that "if there is a huge >>>>>>> devmap >>>>>>> page table entry for which we haven't registered any devmap struct >>>>>>> pages >>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special" >>>>>>> huge >>>>>>> page table entry". >>>>>>> >>>>>>> From what I can tell, all code calling get_dev_pagemap() already >>>>>>> does that, >>>>>>> it's just a question of getting it accepted and formalizing it. >>>>>> Oh I thought that's already how it works, since I didn't spot anything >>>>>> else that would block gup_fast from falling over. 
I guess really would >>>>>> need some testcases to make sure direct i/o (that's the easiest to test) >>>>>> fails like we expect. >>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes. >>>>> Otherwise pmd_devmap() will not return true and since there is no >>>>> pmd_special() things break. >>>> Is that maybe the issue we have seen with amdgpu and huge pages? >>> Yeah, essentially when you have a hugepte inserted by ttm, and it >>> happens to point at system memory, then gup will work on that. And >>> create all kinds of havoc. >>> >>>> Apart from that I'm lost guys, that devmap and gup stuff is not >>>> something I have a good knowledge of apart from a one mile high view. >>> I'm not really better, hence would be good to do a testcase and see. >>> This should provoke it: >>> - allocate nicely aligned bo in system memory >>> - mmap, again nicely aligned to 2M >>> - do some direct io from a filesystem into that mmap, that should trigger gup >>> - before the gup completes free the mmap and bo so that ttm recycles >>> the pages, which should trip up on the elevated refcount. If you wait >>> until the direct io is completely, then I think nothing bad can be >>> observed. >>> >>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have >>> another issue. >>> >>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong. >>> -Daniel >> So I did the following quick experiment on vmwgfx, and it turns out that >> with it, >> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds >> >> I should probably craft an RFC formalizing this. > Yeah I think that would be good. Maybe even more formalized if we also > switch over to VM_PFNMAP, since afaiui these pte flags here only stop the > fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or > something like that. > > Otoh your description of when it only sometimes succeeds would indicate my > understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here. 
My understanding from reading the vmf_insert_mixed() code is that iff the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's not consistent with the vm_normal_page() doc. For architectures without pte_special, VM_PFNMAP must be used, and then we must also block COW mappings. If we can get someone to commit to verifying that the potential PAT WC performance issue is gone with PFNMAP, I can put together a series with that included. As for existing userspace using COW TTM mappings, I once had a couple of test cases to verify that it actually worked, in particular together with huge PMDs and PUDs where breaking COW would imply splitting those, but I can't think of anything else actually wanting to do that other than by mistake. /Thomas > > Christian, what's your take? > -Daniel > >> /Thomas >> >> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c >> b/drivers/gpu/drm/ttm/ttm_bo_vm.c >> index 6dc96cf66744..72b6fb17c984 100644 >> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c >> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c >> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >> *vmf, >> pfn_t pfnt; >> struct ttm_tt *ttm = bo->ttm; >> bool write = vmf->flags & FAULT_FLAG_WRITE; >> + struct dev_pagemap *pagemap; >> >> /* Fault should not cross bo boundary. */ >> page_offset &= ~(fault_page_size - 1); >> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >> *vmf, >> if ((pfn & (fault_page_size - 1)) != 0) >> goto out_fallback; >> >> + /* >> + * Huge entries must be special, that is marking them as devmap >> + * with no backing device map range. If there is a backing >> + * range, Don't insert a huge entry. >> + */ >> + pagemap = get_dev_pagemap(pfn, NULL); >> + if (pagemap) { >> + put_dev_pagemap(pagemap); >> + goto out_fallback; >> + } >> + >> /* Check that memory is contiguous.
*/ >> if (!bo->mem.bus.is_iomem) { >> for (i = 1; i < fault_page_size; ++i) { >> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >> *vmf, >> } >> } >> >> - pfnt = __pfn_to_pfn_t(pfn, PFN_DEV); >> + pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP); >> if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT)) >> ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write); >> #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD >> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >> *vmf, >> if (ret != VM_FAULT_NOPAGE) >> goto out_fallback; >> >> +#if 1 >> + { >> + int npages; >> + struct page *page; >> + >> + npages = get_user_pages_fast_only(vmf->address, 1, 0, >> &page); >> + if (npages == 1) { >> + DRM_WARN("Fast gup succeeded. Bad.\n"); >> + put_page(page); >> + } else { >> + DRM_INFO("Fast gup failed. Good.\n"); >> + } >> + } >> +#endif >> + >> return VM_FAULT_NOPAGE; >> out_fallback: >> count_vm_event(THP_FAULT_FALLBACK); >> >> >> >> >> ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-03-11 13:12 ` Thomas Hellström (Intel) (?) @ 2021-03-11 13:17 ` Daniel Vetter -1 siblings, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-03-11 13:17 UTC (permalink / raw) To: Thomas Hellström (Intel) Cc: Christian König, Christian König, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, open list:DMA BUFFER SHARING FRAMEWORK On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel) <thomas_os@shipmail.org> wrote: > > Hi! > > On 3/11/21 2:00 PM, Daniel Vetter wrote: > > On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote: > >> On 3/1/21 3:09 PM, Daniel Vetter wrote: > >>> On Mon, Mar 1, 2021 at 11:17 AM Christian König > >>> <christian.koenig@amd.com> wrote: > >>>> > >>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel): > >>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote: > >>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) > >>>>>> wrote: > >>>>>>> Hi, > >>>>>>> > >>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote: > >>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel) > >>>>>>>> <thomas_os@shipmail.org> wrote: > >>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote: > >>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be > >>>>>>>>>> good > >>>>>>>>>> if Christian can check this with some direct io to a buffer in > >>>>>>>>>> system > >>>>>>>>>> memory. > >>>>>>>>> Hmm, > >>>>>>>>> > >>>>>>>>> Docs (again vm_normal_page() say) > >>>>>>>>> > >>>>>>>>> * VM_MIXEDMAP mappings can likewise contain memory with or > >>>>>>>>> without "struct > >>>>>>>>> * page" backing, however the difference is that _all_ pages > >>>>>>>>> with a struct > >>>>>>>>> * page (that is, those where pfn_valid is true) are refcounted > >>>>>>>>> and > >>>>>>>>> considered > >>>>>>>>> * normal pages by the VM. 
The disadvantage is that pages are > >>>>>>>>> refcounted > >>>>>>>>> * (which can be slower and simply not an option for some PFNMAP > >>>>>>>>> users). The > >>>>>>>>> * advantage is that we don't have to follow the strict > >>>>>>>>> linearity rule of > >>>>>>>>> * PFNMAP mappings in order to support COWable mappings. > >>>>>>>>> > >>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() > >>>>>>>>> path, so > >>>>>>>>> the above isn't really true, which makes me wonder if and in that > >>>>>>>>> case > >>>>>>>>> why there could any longer ever be a significant performance > >>>>>>>>> difference > >>>>>>>>> between MIXEDMAP and PFNMAP. > >>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see > >>>>>>>> what sticks. > >>>>>>>> > >>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that > >>>>>>>>> devmap > >>>>>>>>> hack, so they are (for the non-gup case) relying on > >>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still > >>>>>>>>> there. > >>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do > >>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying > >>>>>>>> to find the underlying page. > >>>>>>>> -Daniel > >>>>>>> Hmm perhaps it might, but I don't think so. The fix I tried out was > >>>>>>> to set > >>>>>>> > >>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be > >>>>>>> true, and > >>>>>>> then > >>>>>>> > >>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and > >>>>>>> gup_fast() > >>>>>>> backs off, > >>>>>>> > >>>>>>> in the end that would mean setting in stone that "if there is a huge > >>>>>>> devmap > >>>>>>> page table entry for which we haven't registered any devmap struct > >>>>>>> pages > >>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special" > >>>>>>> huge > >>>>>>> page table entry". 
> >>>>>>> > >>>>>>> From what I can tell, all code calling get_dev_pagemap() already > >>>>>>> does that, > >>>>>>> it's just a question of getting it accepted and formalizing it. > >>>>>> Oh I thought that's already how it works, since I didn't spot anything > >>>>>> else that would block gup_fast from falling over. I guess really would > >>>>>> need some testcases to make sure direct i/o (that's the easiest to test) > >>>>>> fails like we expect. > >>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes. > >>>>> Otherwise pmd_devmap() will not return true and since there is no > >>>>> pmd_special() things break. > >>>> Is that maybe the issue we have seen with amdgpu and huge pages? > >>> Yeah, essentially when you have a hugepte inserted by ttm, and it > >>> happens to point at system memory, then gup will work on that. And > >>> create all kinds of havoc. > >>> > >>>> Apart from that I'm lost guys, that devmap and gup stuff is not > >>>> something I have a good knowledge of apart from a one mile high view. > >>> I'm not really better, hence would be good to do a testcase and see. > >>> This should provoke it: > >>> - allocate nicely aligned bo in system memory > >>> - mmap, again nicely aligned to 2M > >>> - do some direct io from a filesystem into that mmap, that should trigger gup > >>> - before the gup completes free the mmap and bo so that ttm recycles > >>> the pages, which should trip up on the elevated refcount. If you wait > >>> until the direct io is completely, then I think nothing bad can be > >>> observed. > >>> > >>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have > >>> another issue. > >>> > >>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong. > >>> -Daniel > >> So I did the following quick experiment on vmwgfx, and it turns out that > >> with it, > >> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds > >> > >> I should probably craft an RFC formalizing this. 
> > Yeah I think that would be good. Maybe even more formalized if we also > > switch over to VM_PFNMAP, since afaiui these pte flags here only stop the > > fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or > > something like that. > > > > Otoh your description of when it only sometimes succeeds would indicate my > > understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here. > > My understanding from reading the vmf_insert_mixed() code is that iff > the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's > not consistent with the vm_normal_page() doc. For architectures without > pte_special, VM_PFNMAP must be used, and then we must also block COW > mappings. > > If we can get someone can commit to verify that the potential PAT WC > performance issue is gone with PFNMAP, I can put together a series with > that included. Iirc when I checked there aren't many archs without pte_special, so I guess that's why we luck out. Hopefully. > As for existing userspace using COW TTM mappings, I once had a couple of > test cases to verify that it actually worked, in particular together > with huge PMDs and PUDs where breaking COW would imply splitting those, > but I can't think of anything else actually wanting to do that other > than by mistake. Yeah disallowing MAP_PRIVATE mappings would be another good thing to lock down. Really doesn't make much sense. -Daniel > /Thomas > > > > > > Christian, what's your take? > > -Daniel > > > >> /Thomas > >> > >> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c > >> b/drivers/gpu/drm/ttm/ttm_bo_vm.c > >> index 6dc96cf66744..72b6fb17c984 100644 > >> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c > >> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c > >> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > >> *vmf, > >> pfn_t pfnt; > >> struct ttm_tt *ttm = bo->ttm; > >> bool write = vmf->flags & FAULT_FLAG_WRITE; > >> + struct dev_pagemap *pagemap; > >> > >> /* Fault should not cross bo boundary.
*/ > >> page_offset &= ~(fault_page_size - 1); > >> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > >> *vmf, > >> if ((pfn & (fault_page_size - 1)) != 0) > >> goto out_fallback; > >> > >> + /* > >> + * Huge entries must be special, that is marking them as devmap > >> + * with no backing device map range. If there is a backing > >> + * range, Don't insert a huge entry. > >> + */ > >> + pagemap = get_dev_pagemap(pfn, NULL); > >> + if (pagemap) { > >> + put_dev_pagemap(pagemap); > >> + goto out_fallback; > >> + } > >> + > >> /* Check that memory is contiguous. */ > >> if (!bo->mem.bus.is_iomem) { > >> for (i = 1; i < fault_page_size; ++i) { > >> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > >> *vmf, > >> } > >> } > >> > >> - pfnt = __pfn_to_pfn_t(pfn, PFN_DEV); > >> + pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP); > >> if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT)) > >> ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write); > >> #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD > >> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > >> *vmf, > >> if (ret != VM_FAULT_NOPAGE) > >> goto out_fallback; > >> > >> +#if 1 > >> + { > >> + int npages; > >> + struct page *page; > >> + > >> + npages = get_user_pages_fast_only(vmf->address, 1, 0, > >> &page); > >> + if (npages == 1) { > >> + DRM_WARN("Fast gup succeeded. Bad.\n"); > >> + put_page(page); > >> + } else { > >> + DRM_INFO("Fast gup failed. Good.\n"); > >> + } > >> + } > >> +#endif > >> + > >> return VM_FAULT_NOPAGE; > >> out_fallback: > >> count_vm_event(THP_FAULT_FALLBACK); > >> > >> > >> > >> > >> -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 110+ messages in thread
> > Yeah I think that would be good. Maybe even more formalized if we also > > switch over to VM_PFNMAP, since afaiui these pte flags here only stop the > > fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or > > something like that. > > > > Otoh your description of when it only sometimes succeeds would indicate my > > understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here. > > My understanding from reading the vmf_insert_mixed() code is that iff > the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's > not consistent with the vm_normal_page() doc. For architectures without > pte_special, VM_PFNMAP must be used, and then we must also block COW > mappings. > > If we can get someone can commit to verify that the potential PAT WC > performance issue is gone with PFNMAP, I can put together a series with > that included. Iirc when I checked there's not much archs without pte_special, so I guess that's why we luck out. Hopefully. > As for existing userspace using COW TTM mappings, I once had a couple of > test cases to verify that it actually worked, in particular together > with huge PMDs and PUDs where breaking COW would imply splitting those, > but I can't think of anything else actually wanting to do that other > than by mistake. Yeah disallowing MAP_PRIVATE mappings would be another good thing to lock down. Really doesn't make much sense. -Daniel > /Thomas > > > > > > Christian, what's your take? > > -Daniel > > > >> /Thomas > >> > >> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c > >> b/drivers/gpu/drm/ttm/ttm_bo_vm.c > >> index 6dc96cf66744..72b6fb17c984 100644 > >> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c > >> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c > >> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > >> *vmf, > >> pfn_t pfnt; > >> struct ttm_tt *ttm = bo->ttm; > >> bool write = vmf->flags & FAULT_FLAG_WRITE; > >> + struct dev_pagemap *pagemap; > >> > >> /* Fault should not cross bo boundary. 
*/ > >> page_offset &= ~(fault_page_size - 1); > >> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > >> *vmf, > >> if ((pfn & (fault_page_size - 1)) != 0) > >> goto out_fallback; > >> > >> + /* > >> + * Huge entries must be special, that is marking them as devmap > >> + * with no backing device map range. If there is a backing > >> + * range, Don't insert a huge entry. > >> + */ > >> + pagemap = get_dev_pagemap(pfn, NULL); > >> + if (pagemap) { > >> + put_dev_pagemap(pagemap); > >> + goto out_fallback; > >> + } > >> + > >> /* Check that memory is contiguous. */ > >> if (!bo->mem.bus.is_iomem) { > >> for (i = 1; i < fault_page_size; ++i) { > >> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > >> *vmf, > >> } > >> } > >> > >> - pfnt = __pfn_to_pfn_t(pfn, PFN_DEV); > >> + pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP); > >> if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT)) > >> ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write); > >> #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD > >> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault > >> *vmf, > >> if (ret != VM_FAULT_NOPAGE) > >> goto out_fallback; > >> > >> +#if 1 > >> + { > >> + int npages; > >> + struct page *page; > >> + > >> + npages = get_user_pages_fast_only(vmf->address, 1, 0, > >> &page); > >> + if (npages == 1) { > >> + DRM_WARN("Fast gup succeeded. Bad.\n"); > >> + put_page(page); > >> + } else { > >> + DRM_INFO("Fast gup failed. Good.\n"); > >> + } > >> + } > >> +#endif > >> + > >> return VM_FAULT_NOPAGE; > >> out_fallback: > >> count_vm_event(THP_FAULT_FALLBACK); > >> > >> > >> > >> > >> -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-03-11 13:17 ` Daniel Vetter (?) @ 2021-03-11 15:37 ` Thomas Hellström (Intel) -1 siblings, 0 replies; 110+ messages in thread From: Thomas Hellström (Intel) @ 2021-03-11 15:37 UTC (permalink / raw) To: Daniel Vetter Cc: Christian König, Christian König, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, open list:DMA BUFFER SHARING FRAMEWORK On 3/11/21 2:17 PM, Daniel Vetter wrote: > On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel) > <thomas_os@shipmail.org> wrote: >> Hi! >> >> On 3/11/21 2:00 PM, Daniel Vetter wrote: >>> On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote: >>>> On 3/1/21 3:09 PM, Daniel Vetter wrote: >>>>> On Mon, Mar 1, 2021 at 11:17 AM Christian König >>>>> <christian.koenig@amd.com> wrote: >>>>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel): >>>>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote: >>>>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) >>>>>>>> wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote: >>>>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel) >>>>>>>>>> <thomas_os@shipmail.org> wrote: >>>>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote: >>>>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be >>>>>>>>>>>> good >>>>>>>>>>>> if Christian can check this with some direct io to a buffer in >>>>>>>>>>>> system >>>>>>>>>>>> memory. 
>>>>>>>>>>> Hmm, >>>>>>>>>>> >>>>>>>>>>> Docs (again vm_normal_page() say) >>>>>>>>>>> >>>>>>>>>>> * VM_MIXEDMAP mappings can likewise contain memory with or >>>>>>>>>>> without "struct >>>>>>>>>>> * page" backing, however the difference is that _all_ pages >>>>>>>>>>> with a struct >>>>>>>>>>> * page (that is, those where pfn_valid is true) are refcounted >>>>>>>>>>> and >>>>>>>>>>> considered >>>>>>>>>>> * normal pages by the VM. The disadvantage is that pages are >>>>>>>>>>> refcounted >>>>>>>>>>> * (which can be slower and simply not an option for some PFNMAP >>>>>>>>>>> users). The >>>>>>>>>>> * advantage is that we don't have to follow the strict >>>>>>>>>>> linearity rule of >>>>>>>>>>> * PFNMAP mappings in order to support COWable mappings. >>>>>>>>>>> >>>>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() >>>>>>>>>>> path, so >>>>>>>>>>> the above isn't really true, which makes me wonder if and in that >>>>>>>>>>> case >>>>>>>>>>> why there could any longer ever be a significant performance >>>>>>>>>>> difference >>>>>>>>>>> between MIXEDMAP and PFNMAP. >>>>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see >>>>>>>>>> what sticks. >>>>>>>>>> >>>>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that >>>>>>>>>>> devmap >>>>>>>>>>> hack, so they are (for the non-gup case) relying on >>>>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still >>>>>>>>>>> there. >>>>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do >>>>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying >>>>>>>>>> to find the underlying page. >>>>>>>>>> -Daniel >>>>>>>>> Hmm perhaps it might, but I don't think so. 
The fix I tried out was >>>>>>>>> to set >>>>>>>>> >>>>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be >>>>>>>>> true, and >>>>>>>>> then >>>>>>>>> >>>>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and >>>>>>>>> gup_fast() >>>>>>>>> backs off, >>>>>>>>> >>>>>>>>> in the end that would mean setting in stone that "if there is a huge >>>>>>>>> devmap >>>>>>>>> page table entry for which we haven't registered any devmap struct >>>>>>>>> pages >>>>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special" >>>>>>>>> huge >>>>>>>>> page table entry". >>>>>>>>> >>>>>>>>> From what I can tell, all code calling get_dev_pagemap() already >>>>>>>>> does that, >>>>>>>>> it's just a question of getting it accepted and formalizing it. >>>>>>>> Oh I thought that's already how it works, since I didn't spot anything >>>>>>>> else that would block gup_fast from falling over. I guess really would >>>>>>>> need some testcases to make sure direct i/o (that's the easiest to test) >>>>>>>> fails like we expect. >>>>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes. >>>>>>> Otherwise pmd_devmap() will not return true and since there is no >>>>>>> pmd_special() things break. >>>>>> Is that maybe the issue we have seen with amdgpu and huge pages? >>>>> Yeah, essentially when you have a hugepte inserted by ttm, and it >>>>> happens to point at system memory, then gup will work on that. And >>>>> create all kinds of havoc. >>>>> >>>>>> Apart from that I'm lost guys, that devmap and gup stuff is not >>>>>> something I have a good knowledge of apart from a one mile high view. >>>>> I'm not really better, hence would be good to do a testcase and see. 
>>>>> This should provoke it: >>>>> - allocate nicely aligned bo in system memory >>>>> - mmap, again nicely aligned to 2M >>>>> - do some direct io from a filesystem into that mmap, that should trigger gup >>>>> - before the gup completes free the mmap and bo so that ttm recycles >>>>> the pages, which should trip up on the elevated refcount. If you wait >>>>> until the direct io is completely, then I think nothing bad can be >>>>> observed. >>>>> >>>>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have >>>>> another issue. >>>>> >>>>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong. >>>>> -Daniel >>>> So I did the following quick experiment on vmwgfx, and it turns out that >>>> with it, >>>> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds >>>> >>>> I should probably craft an RFC formalizing this. >>> Yeah I think that would be good. Maybe even more formalized if we also >>> switch over to VM_PFNMAP, since afaiui these pte flags here only stop the >>> fast gup path. And slow gup can still peak through VM_MIXEDMAP. Or >>> something like that. >>> >>> Otoh your description of when it only sometimes succeeds would indicate my >>> understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here. >> My understanding from reading the vmf_insert_mixed() code is that iff >> the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's >> not consistent with the vm_normal_page() doc. For architectures without >> pte_special, VM_PFNMAP must be used, and then we must also block COW >> mappings. >> >> If we can get someone can commit to verify that the potential PAT WC >> performance issue is gone with PFNMAP, I can put together a series with >> that included. > Iirc when I checked there's not much archs without pte_special, so I > guess that's why we luck out. Hopefully. 
> >> As for existing userspace using COW TTM mappings, I once had a couple of >> test cases to verify that it actually worked, in particular together >> with huge PMDs and PUDs where breaking COW would imply splitting those, >> but I can't think of anything else actually wanting to do that other >> than by mistake. > Yeah disallowing MAP_PRIVATE mappings would be another good thing to > lock down. Really doesn't make much sense. > -Daniel Yes, we can't allow them with PFNMAP + a non-linear address space... /Thomas >> /Thomas >> >> >>> Christian, what's your take? >>> -Daniel >>> >>>> /Thomas >>>> >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c >>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c >>>> index 6dc96cf66744..72b6fb17c984 100644 >>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c >>>> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >>>> *vmf, >>>> pfn_t pfnt; >>>> struct ttm_tt *ttm = bo->ttm; >>>> bool write = vmf->flags & FAULT_FLAG_WRITE; >>>> + struct dev_pagemap *pagemap; >>>> >>>> /* Fault should not cross bo boundary. */ >>>> page_offset &= ~(fault_page_size - 1); >>>> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >>>> *vmf, >>>> if ((pfn & (fault_page_size - 1)) != 0) >>>> goto out_fallback; >>>> >>>> + /* >>>> + * Huge entries must be special, that is marking them as devmap >>>> + * with no backing device map range. If there is a backing >>>> + * range, Don't insert a huge entry. >>>> + */ >>>> + pagemap = get_dev_pagemap(pfn, NULL); >>>> + if (pagemap) { >>>> + put_dev_pagemap(pagemap); >>>> + goto out_fallback; >>>> + } >>>> + >>>> /* Check that memory is contiguous. 
*/ >>>> if (!bo->mem.bus.is_iomem) { >>>> for (i = 1; i < fault_page_size; ++i) { >>>> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >>>> *vmf, >>>> } >>>> } >>>> >>>> - pfnt = __pfn_to_pfn_t(pfn, PFN_DEV); >>>> + pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP); >>>> if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT)) >>>> ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write); >>>> #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD >>>> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >>>> *vmf, >>>> if (ret != VM_FAULT_NOPAGE) >>>> goto out_fallback; >>>> >>>> +#if 1 >>>> + { >>>> + int npages; >>>> + struct page *page; >>>> + >>>> + npages = get_user_pages_fast_only(vmf->address, 1, 0, >>>> &page); >>>> + if (npages == 1) { >>>> + DRM_WARN("Fast gup succeeded. Bad.\n"); >>>> + put_page(page); >>>> + } else { >>>> + DRM_INFO("Fast gup failed. Good.\n"); >>>> + } >>>> + } >>>> +#endif >>>> + >>>> return VM_FAULT_NOPAGE; >>>> out_fallback: >>>> count_vm_event(THP_FAULT_FALLBACK); >>>> >>>> >>>> >>>> >>>> > > ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Intel-gfx] [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap @ 2021-03-11 15:37 ` Thomas Hellström (Intel) 0 siblings, 0 replies; 110+ messages in thread From: Thomas Hellström (Intel) @ 2021-03-11 15:37 UTC (permalink / raw) To: Daniel Vetter Cc: Christian König, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK On 3/11/21 2:17 PM, Daniel Vetter wrote: > On Thu, Mar 11, 2021 at 2:12 PM Thomas Hellström (Intel) > <thomas_os@shipmail.org> wrote: >> Hi! >> >> On 3/11/21 2:00 PM, Daniel Vetter wrote: >>> On Thu, Mar 11, 2021 at 11:22:06AM +0100, Thomas Hellström (Intel) wrote: >>>> On 3/1/21 3:09 PM, Daniel Vetter wrote: >>>>> On Mon, Mar 1, 2021 at 11:17 AM Christian König >>>>> <christian.koenig@amd.com> wrote: >>>>>> Am 01.03.21 um 10:21 schrieb Thomas Hellström (Intel): >>>>>>> On 3/1/21 10:05 AM, Daniel Vetter wrote: >>>>>>>> On Mon, Mar 01, 2021 at 09:39:53AM +0100, Thomas Hellström (Intel) >>>>>>>> wrote: >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> On 3/1/21 9:28 AM, Daniel Vetter wrote: >>>>>>>>>> On Sat, Feb 27, 2021 at 9:06 AM Thomas Hellström (Intel) >>>>>>>>>> <thomas_os@shipmail.org> wrote: >>>>>>>>>>> On 2/26/21 2:28 PM, Daniel Vetter wrote: >>>>>>>>>>>> So I think it stops gup. But I haven't verified at all. Would be >>>>>>>>>>>> good >>>>>>>>>>>> if Christian can check this with some direct io to a buffer in >>>>>>>>>>>> system >>>>>>>>>>>> memory. 
>>>>>>>>>>> Hmm, >>>>>>>>>>> >>>>>>>>>>> Docs (again vm_normal_page() say) >>>>>>>>>>> >>>>>>>>>>> * VM_MIXEDMAP mappings can likewise contain memory with or >>>>>>>>>>> without "struct >>>>>>>>>>> * page" backing, however the difference is that _all_ pages >>>>>>>>>>> with a struct >>>>>>>>>>> * page (that is, those where pfn_valid is true) are refcounted >>>>>>>>>>> and >>>>>>>>>>> considered >>>>>>>>>>> * normal pages by the VM. The disadvantage is that pages are >>>>>>>>>>> refcounted >>>>>>>>>>> * (which can be slower and simply not an option for some PFNMAP >>>>>>>>>>> users). The >>>>>>>>>>> * advantage is that we don't have to follow the strict >>>>>>>>>>> linearity rule of >>>>>>>>>>> * PFNMAP mappings in order to support COWable mappings. >>>>>>>>>>> >>>>>>>>>>> but it's true __vm_insert_mixed() ends up in the insert_pfn() >>>>>>>>>>> path, so >>>>>>>>>>> the above isn't really true, which makes me wonder if and in that >>>>>>>>>>> case >>>>>>>>>>> why there could any longer ever be a significant performance >>>>>>>>>>> difference >>>>>>>>>>> between MIXEDMAP and PFNMAP. >>>>>>>>>> Yeah it's definitely confusing. I guess I'll hack up a patch and see >>>>>>>>>> what sticks. >>>>>>>>>> >>>>>>>>>>> BTW regarding the TTM hugeptes, I don't think we ever landed that >>>>>>>>>>> devmap >>>>>>>>>>> hack, so they are (for the non-gup case) relying on >>>>>>>>>>> vma_is_special_huge(). For the gup case, I think the bug is still >>>>>>>>>>> there. >>>>>>>>>> Maybe there's another devmap hack, but the ttm_vm_insert functions do >>>>>>>>>> use PFN_DEV and all that. And I think that stops gup_fast from trying >>>>>>>>>> to find the underlying page. >>>>>>>>>> -Daniel >>>>>>>>> Hmm perhaps it might, but I don't think so. 
The fix I tried out was >>>>>>>>> to set >>>>>>>>> >>>>>>>>> PFN_DEV | PFN_MAP for huge PTEs which causes pfn_devmap() to be >>>>>>>>> true, and >>>>>>>>> then >>>>>>>>> >>>>>>>>> follow_devmap_pmd()->get_dev_pagemap() which returns NULL and >>>>>>>>> gup_fast() >>>>>>>>> backs off, >>>>>>>>> >>>>>>>>> in the end that would mean setting in stone that "if there is a huge >>>>>>>>> devmap >>>>>>>>> page table entry for which we haven't registered any devmap struct >>>>>>>>> pages >>>>>>>>> (get_dev_pagemap returns NULL), we should treat that as a "special" >>>>>>>>> huge >>>>>>>>> page table entry". >>>>>>>>> >>>>>>>>> From what I can tell, all code calling get_dev_pagemap() already >>>>>>>>> does that, >>>>>>>>> it's just a question of getting it accepted and formalizing it. >>>>>>>> Oh I thought that's already how it works, since I didn't spot anything >>>>>>>> else that would block gup_fast from falling over. I guess really would >>>>>>>> need some testcases to make sure direct i/o (that's the easiest to test) >>>>>>>> fails like we expect. >>>>>>> Yeah, IIRC the "| PFN_MAP" is the missing piece for TTM huge ptes. >>>>>>> Otherwise pmd_devmap() will not return true and since there is no >>>>>>> pmd_special() things break. >>>>>> Is that maybe the issue we have seen with amdgpu and huge pages? >>>>> Yeah, essentially when you have a hugepte inserted by ttm, and it >>>>> happens to point at system memory, then gup will work on that. And >>>>> create all kinds of havoc. >>>>> >>>>>> Apart from that I'm lost guys, that devmap and gup stuff is not >>>>>> something I have a good knowledge of apart from a one mile high view. >>>>> I'm not really better, hence would be good to do a testcase and see. 
>>>>> This should provoke it: >>>>> - allocate nicely aligned bo in system memory >>>>> - mmap, again nicely aligned to 2M >>>>> - do some direct io from a filesystem into that mmap, that should trigger gup >>>>> - before the gup completes free the mmap and bo so that ttm recycles >>>>> the pages, which should trip up on the elevated refcount. If you wait >>>>> until the direct io is complete, then I think nothing bad can be >>>>> observed. >>>>> >>>>> Ofc if your amdgpu+hugepte issue is something else, then maybe we have >>>>> another issue. >>>>> >>>>> Also usual caveat: I'm not an mm hacker either, so might be completely wrong. >>>>> -Daniel >>>> So I did the following quick experiment on vmwgfx, and it turns out that >>>> with it, >>>> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds. >>>> >>>> I should probably craft an RFC formalizing this. >>> Yeah I think that would be good. Maybe even more formalized if we also >>> switch over to VM_PFNMAP, since afaiui these pte flags here only stop the >>> fast gup path. And slow gup can still peek through VM_MIXEDMAP. Or >>> something like that. >>> >>> Otoh your description of when it only sometimes succeeds would indicate my >>> understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here. >> My understanding from reading the vmf_insert_mixed() code is that iff >> the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's >> not consistent with the vm_normal_page() doc. For architectures without >> pte_special, VM_PFNMAP must be used, and then we must also block COW >> mappings. >> >> If we can get someone to commit to verify that the potential PAT WC >> performance issue is gone with PFNMAP, I can put together a series with >> that included. > Iirc when I checked there's not much archs without pte_special, so I > guess that's why we luck out. Hopefully.
> >> As for existing userspace using COW TTM mappings, I once had a couple of >> test cases to verify that it actually worked, in particular together >> with huge PMDs and PUDs where breaking COW would imply splitting those, >> but I can't think of anything else actually wanting to do that other >> than by mistake. > Yeah disallowing MAP_PRIVATE mappings would be another good thing to > lock down. Really doesn't make much sense. > -Daniel Yes, we can't allow them with PFNMAP + a non-linear address space... /Thomas >> /Thomas >> >> >>> Christian, what's your take? >>> -Daniel >>> >>>> /Thomas >>>> >>>> diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c >>>> b/drivers/gpu/drm/ttm/ttm_bo_vm.c >>>> index 6dc96cf66744..72b6fb17c984 100644 >>>> --- a/drivers/gpu/drm/ttm/ttm_bo_vm.c >>>> +++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c >>>> @@ -195,6 +195,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >>>> *vmf, >>>> pfn_t pfnt; >>>> struct ttm_tt *ttm = bo->ttm; >>>> bool write = vmf->flags & FAULT_FLAG_WRITE; >>>> + struct dev_pagemap *pagemap; >>>> >>>> /* Fault should not cross bo boundary. */ >>>> page_offset &= ~(fault_page_size - 1); >>>> @@ -210,6 +211,17 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >>>> *vmf, >>>> if ((pfn & (fault_page_size - 1)) != 0) >>>> goto out_fallback; >>>> >>>> + /* >>>> + * Huge entries must be special, that is marking them as devmap >>>> + * with no backing device map range. If there is a backing >>>> + * range, Don't insert a huge entry. >>>> + */ >>>> + pagemap = get_dev_pagemap(pfn, NULL); >>>> + if (pagemap) { >>>> + put_dev_pagemap(pagemap); >>>> + goto out_fallback; >>>> + } >>>> + >>>> /* Check that memory is contiguous. 
*/ >>>> if (!bo->mem.bus.is_iomem) { >>>> for (i = 1; i < fault_page_size; ++i) { >>>> @@ -223,7 +235,7 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >>>> *vmf, >>>> } >>>> } >>>> >>>> - pfnt = __pfn_to_pfn_t(pfn, PFN_DEV); >>>> + pfnt = __pfn_to_pfn_t(pfn, PFN_DEV | PFN_MAP); >>>> if (fault_page_size == (HPAGE_PMD_SIZE >> PAGE_SHIFT)) >>>> ret = vmf_insert_pfn_pmd_prot(vmf, pfnt, pgprot, write); >>>> #ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD >>>> @@ -236,6 +248,21 @@ static vm_fault_t ttm_bo_vm_insert_huge(struct vm_fault >>>> *vmf, >>>> if (ret != VM_FAULT_NOPAGE) >>>> goto out_fallback; >>>> >>>> +#if 1 >>>> + { >>>> + int npages; >>>> + struct page *page; >>>> + >>>> + npages = get_user_pages_fast_only(vmf->address, 1, 0, >>>> &page); >>>> + if (npages == 1) { >>>> + DRM_WARN("Fast gup succeeded. Bad.\n"); >>>> + put_page(page); >>>> + } else { >>>> + DRM_INFO("Fast gup failed. Good.\n"); >>>> + } >>>> + } >>>> +#endif >>>> + >>>> return VM_FAULT_NOPAGE; >>>> out_fallback: >>>> count_vm_event(THP_FAULT_FALLBACK); >>>> >>>> >>>> >>>> >>>> > > _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-03-11 13:17 ` Daniel Vetter (?) @ 2021-03-12 7:51 ` Christian König -1 siblings, 0 replies; 110+ messages in thread From: Christian König @ 2021-03-12 7:51 UTC (permalink / raw) To: Daniel Vetter, Thomas Hellström (Intel) Cc: Christian König, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, Jason Gunthorpe, John Stultz, DRI Development, Daniel Vetter, Suren Baghdasaryan, open list:DMA BUFFER SHARING FRAMEWORK Am 11.03.21 um 14:17 schrieb Daniel Vetter: > [SNIP] >>>> So I did the following quick experiment on vmwgfx, and it turns out that >>>> with it, >>>> fast gup never succeeds. Without the "| PFN_MAP", it typically succeeds. >>>> >>>> I should probably craft an RFC formalizing this. >>> Yeah I think that would be good. Maybe even more formalized if we also >>> switch over to VM_PFNMAP, since afaiui these pte flags here only stop the >>> fast gup path. And slow gup can still peek through VM_MIXEDMAP. Or >>> something like that. >>> >>> Otoh your description of when it only sometimes succeeds would indicate my >>> understanding of VM_PFNMAP vs VM_MIXEDMAP is wrong here. >> My understanding from reading the vmf_insert_mixed() code is that iff >> the arch has pte_special(), VM_MIXEDMAP should be harmless. But that's >> not consistent with the vm_normal_page() doc. For architectures without >> pte_special, VM_PFNMAP must be used, and then we must also block COW >> mappings. >> >> If we can get someone to commit to verify that the potential PAT WC >> performance issue is gone with PFNMAP, I can put together a series with >> that included. > Iirc when I checked there's not much archs without pte_special, so I > guess that's why we luck out. Hopefully. I still need to read up a bit on what you guys are discussing here, but it starts to make a picture. Especially my understanding of what VM_MIXEDMAP means seems to have been slightly off.
I would say just go ahead and provide patches to always use VM_PFNMAP in TTM and we can test it and see if there are still some issues. >> As for existing userspace using COW TTM mappings, I once had a couple of >> test cases to verify that it actually worked, in particular together >> with huge PMDs and PUDs where breaking COW would imply splitting those, >> but I can't think of anything else actually wanting to do that other >> than by mistake. > Yeah disallowing MAP_PRIVATE mappings would be another good thing to > lock down. Really doesn't make much sense. Completely agree. That sounds like something we should try to avoid. Regards, Christian. > -Daniel > ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-24 8:45 ` Daniel Vetter @ 2021-02-24 18:46 ` Jason Gunthorpe -1 siblings, 0 replies; 110+ messages in thread From: Jason Gunthorpe @ 2021-02-24 18:46 UTC (permalink / raw) To: Daniel Vetter Cc: Thomas Hellström (Intel), DRI Development, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, John Stultz, Daniel Vetter, Suren Baghdasaryan, Christian König, open list:DMA BUFFER SHARING FRAMEWORK On Wed, Feb 24, 2021 at 09:45:51AM +0100, Daniel Vetter wrote: > Hm I figured everyone just uses MAP_SHARED for buffer objects since > COW really makes absolutely no sense. How would we enforce this? In RDMA we test drivers/infiniband/core/ib_core_uverbs.c: if (!(vma->vm_flags & VM_SHARED)) During mmap to reject use of MAP_PRIVATE on BAR pages. Jason ^ permalink raw reply [flat|nested] 110+ messages in thread
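The RDMA check quoted above is a single test of vma->vm_flags at mmap time. A minimal user-space sketch of the same rejection logic follows; struct fake_vma and check_mmap_flags are illustrative stand-ins for the kernel's struct vm_area_struct and a driver mmap handler, with VM_SHARED given the same bit value the kernel uses:

```c
#include <errno.h>

#define VM_SHARED 0x00000008UL /* same bit the kernel keeps in vm_flags */

/* Illustrative stand-in for struct vm_area_struct. */
struct fake_vma {
    unsigned long vm_flags;
};

/* Mirror of the quoted ib_core_uverbs.c test: refuse COW (MAP_PRIVATE)
 * mappings up front, so only MAP_SHARED reaches the pfn-insertion path. */
static int check_mmap_flags(const struct fake_vma *vma)
{
    if (!(vma->vm_flags & VM_SHARED))
        return -EINVAL;
    return 0;
}
```

An mmap handler doing this returns -EINVAL before inserting any pfns, which is what makes MAP_PRIVATE (and therefore COW) impossible to set up in the first place.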
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-24 18:46 ` Jason Gunthorpe (?) @ 2021-02-25 10:30 ` Christian König -1 siblings, 0 replies; 110+ messages in thread From: Christian König @ 2021-02-25 10:30 UTC (permalink / raw) To: Jason Gunthorpe, Daniel Vetter Cc: Thomas Hellström (Intel), DRI Development, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, John Stultz, Daniel Vetter, Suren Baghdasaryan, open list:DMA BUFFER SHARING FRAMEWORK Am 24.02.21 um 19:46 schrieb Jason Gunthorpe: > On Wed, Feb 24, 2021 at 09:45:51AM +0100, Daniel Vetter wrote: > >> Hm I figured everyone just uses MAP_SHARED for buffer objects since >> COW really makes absolutely no sense. How would we enforce this? > In RDMA we test > > drivers/infiniband/core/ib_core_uverbs.c: if (!(vma->vm_flags & VM_SHARED)) > > During mmap to reject use of MAP_PRIVATE on BAR pages. That's a really good idea. MAP_PRIVATE and any driver mappings doesn't really work at all. Christian. > > Jason ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-25 10:30 ` Christian König (?) @ 2021-02-25 10:45 ` Daniel Vetter -1 siblings, 0 replies; 110+ messages in thread From: Daniel Vetter @ 2021-02-25 10:45 UTC (permalink / raw) To: Christian König Cc: Jason Gunthorpe, Daniel Vetter, Thomas Hellström (Intel), DRI Development, Intel Graphics Development, Matthew Wilcox, moderated list:DMA BUFFER SHARING FRAMEWORK, John Stultz, Daniel Vetter, Suren Baghdasaryan, open list:DMA BUFFER SHARING FRAMEWORK On Thu, Feb 25, 2021 at 11:30:23AM +0100, Christian König wrote: > > > Am 24.02.21 um 19:46 schrieb Jason Gunthorpe: > > On Wed, Feb 24, 2021 at 09:45:51AM +0100, Daniel Vetter wrote: > > > > > Hm I figured everyone just uses MAP_SHARED for buffer objects since > > > COW really makes absolutely no sense. How would we enforce this? > > In RDMA we test > > > > drivers/infiniband/core/ib_core_uverbs.c: if (!(vma->vm_flags & VM_SHARED)) > > > > During mmap to reject use of MAP_PRIVATE on BAR pages. > > That's a really good idea. MAP_PRIVATE and any driver mappings doesn't > really work at all. Yeah I feel like this is the next patch we need to add on this little series of locking down dma-buf mmap semantics. Probably should also push these into drm gem mmap code (and maybe ttm can switch over to that, it's really the same). One at a time. -Daniel > > Christian. > > > > > Jason > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch ^ permalink raw reply [flat|nested] 110+ messages in thread
* [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3) 2021-02-23 10:59 ` Daniel Vetter ` (5 preceding siblings ...) (?) @ 2021-02-25 10:38 ` Patchwork -1 siblings, 0 replies; 110+ messages in thread From: Patchwork @ 2021-02-25 10:38 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx == Series Details == Series: series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3) URL : https://patchwork.freedesktop.org/series/87313/ State : warning == Summary == $ dim checkpatch origin/drm-tip b71cc38b23b9 dma-buf: Require VM_PFNMAP vma for mmap -:34: WARNING:TYPO_SPELLING: 'entires' may be misspelled - perhaps 'entries'? #34: From auditing the various functions to insert pfn pte entires ^^^^^^^ -:39: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line) #39: References: https://lore.kernel.org/lkml/CAKMK7uHi+mG0z0HUmNt13QCCvutuRVjpcR0NjRL12k-WbWzkRg@mail.gmail.com/ -:97: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>' total: 0 errors, 3 warnings, 0 checks, 39 lines checked 93fc58ee63d1 drm/vgem: use shmem helpers -:424: WARNING:FROM_SIGN_OFF_MISMATCH: From:/Signed-off-by: email address mismatch: 'From: Daniel Vetter <daniel.vetter@ffwll.ch>' != 'Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>' total: 0 errors, 1 warnings, 0 checks, 381 lines checked _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 110+ messages in thread
* [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3) 2021-02-23 10:59 ` Daniel Vetter ` (6 preceding siblings ...) (?) @ 2021-02-25 11:19 ` Patchwork -1 siblings, 0 replies; 110+ messages in thread From: Patchwork @ 2021-02-25 11:19 UTC (permalink / raw) To: Daniel Vetter; +Cc: intel-gfx [-- Attachment #1.1: Type: text/plain, Size: 26735 bytes --] == Series Details == Series: series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3) URL : https://patchwork.freedesktop.org/series/87313/ State : failure == Summary == CI Bug Log - changes from CI_DRM_9804 -> Patchwork_19728 ==================================================== Summary ------- **FAILURE** Serious unknown changes coming with Patchwork_19728 absolutely need to be verified manually. If you think the reported changes have nothing to do with the changes introduced in Patchwork_19728, please notify your bug team to allow them to document this new failure mode, which will reduce false positives in CI. 
External URL: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/index.html Possible new issues ------------------- Here are the unknown changes that may have been introduced in Patchwork_19728: ### IGT changes ### #### Possible regressions #### * igt@prime_vgem@basic-fence-mmap: - fi-byt-j1900: [PASS][1] -> [FAIL][2] +3 similar issues [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-byt-j1900/igt@prime_vgem@basic-fence-mmap.html [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-byt-j1900/igt@prime_vgem@basic-fence-mmap.html * igt@prime_vgem@basic-fence-read: - fi-bsw-kefka: [PASS][3] -> [INCOMPLETE][4] +1 similar issue [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bsw-kefka/igt@prime_vgem@basic-fence-read.html [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-kefka/igt@prime_vgem@basic-fence-read.html - fi-ilk-650: [PASS][5] -> [INCOMPLETE][6] +1 similar issue [5]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ilk-650/igt@prime_vgem@basic-fence-read.html [6]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ilk-650/igt@prime_vgem@basic-fence-read.html - fi-byt-j1900: [PASS][7] -> [INCOMPLETE][8] [7]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-byt-j1900/igt@prime_vgem@basic-fence-read.html [8]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-byt-j1900/igt@prime_vgem@basic-fence-read.html * igt@prime_vgem@basic-gtt: - fi-ilk-650: [PASS][9] -> [FAIL][10] +3 similar issues [9]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ilk-650/igt@prime_vgem@basic-gtt.html [10]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ilk-650/igt@prime_vgem@basic-gtt.html - fi-elk-e7500: [PASS][11] -> [FAIL][12] +5 similar issues [11]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-elk-e7500/igt@prime_vgem@basic-gtt.html [12]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-elk-e7500/igt@prime_vgem@basic-gtt.html * 
igt@prime_vgem@basic-read: - fi-bsw-nick: [PASS][13] -> [FAIL][14] +4 similar issues [13]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bsw-nick/igt@prime_vgem@basic-read.html [14]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-nick/igt@prime_vgem@basic-read.html * igt@prime_vgem@basic-write: - fi-pnv-d510: [PASS][15] -> [FAIL][16] +2 similar issues [15]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-pnv-d510/igt@prime_vgem@basic-write.html [16]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-pnv-d510/igt@prime_vgem@basic-write.html * igt@runner@aborted: - fi-ilk-650: NOTRUN -> [FAIL][17] [17]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ilk-650/igt@runner@aborted.html - fi-kbl-x1275: NOTRUN -> [FAIL][18] [18]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-x1275/igt@runner@aborted.html - fi-bsw-kefka: NOTRUN -> [FAIL][19] [19]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-kefka/igt@runner@aborted.html - fi-cfl-8700k: NOTRUN -> [FAIL][20] [20]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8700k/igt@runner@aborted.html - fi-tgl-y: NOTRUN -> [FAIL][21] [21]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-y/igt@runner@aborted.html - fi-skl-6600u: NOTRUN -> [FAIL][22] [22]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6600u/igt@runner@aborted.html - fi-cfl-8109u: NOTRUN -> [FAIL][23] [23]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8109u/igt@runner@aborted.html - fi-bsw-nick: NOTRUN -> [FAIL][24] [24]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-nick/igt@runner@aborted.html - fi-snb-2520m: NOTRUN -> [FAIL][25] [25]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2520m/igt@runner@aborted.html - fi-kbl-soraka: NOTRUN -> [FAIL][26] [26]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-soraka/igt@runner@aborted.html - fi-kbl-7500u: NOTRUN -> 
[FAIL][27] [27]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-7500u/igt@runner@aborted.html - fi-kbl-guc: NOTRUN -> [FAIL][28] [28]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-guc/igt@runner@aborted.html - fi-cml-u2: NOTRUN -> [FAIL][29] [29]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-u2/igt@runner@aborted.html - fi-ivb-3770: NOTRUN -> [FAIL][30] [30]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ivb-3770/igt@runner@aborted.html - fi-bxt-dsi: NOTRUN -> [FAIL][31] [31]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bxt-dsi/igt@runner@aborted.html - fi-elk-e7500: NOTRUN -> [FAIL][32] [32]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-elk-e7500/igt@runner@aborted.html - fi-cml-s: NOTRUN -> [FAIL][33] [33]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-s/igt@runner@aborted.html - fi-cfl-guc: NOTRUN -> [FAIL][34] [34]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-guc/igt@runner@aborted.html - fi-skl-guc: NOTRUN -> [FAIL][35] [35]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-guc/igt@runner@aborted.html - fi-skl-6700k2: NOTRUN -> [FAIL][36] [36]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6700k2/igt@runner@aborted.html - fi-tgl-u2: NOTRUN -> [FAIL][37] [37]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-u2/igt@runner@aborted.html * igt@vgem_basic@create: - fi-skl-6700k2: [PASS][38] -> [FAIL][39] [38]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-6700k2/igt@vgem_basic@create.html [39]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6700k2/igt@vgem_basic@create.html - fi-glk-dsi: [PASS][40] -> [FAIL][41] [40]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-glk-dsi/igt@vgem_basic@create.html [41]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-glk-dsi/igt@vgem_basic@create.html - fi-kbl-x1275: [PASS][42] -> [FAIL][43] [42]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-x1275/igt@vgem_basic@create.html [43]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-x1275/igt@vgem_basic@create.html - fi-bsw-kefka: [PASS][44] -> [FAIL][45] +3 similar issues [44]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bsw-kefka/igt@vgem_basic@create.html [45]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-kefka/igt@vgem_basic@create.html - fi-snb-2600: [PASS][46] -> [FAIL][47] [46]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-snb-2600/igt@vgem_basic@create.html [47]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2600/igt@vgem_basic@create.html - fi-bdw-5557u: [PASS][48] -> [FAIL][49] [48]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bdw-5557u/igt@vgem_basic@create.html [49]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bdw-5557u/igt@vgem_basic@create.html - fi-tgl-y: [PASS][50] -> [FAIL][51] [50]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-y/igt@vgem_basic@create.html [51]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-y/igt@vgem_basic@create.html - fi-skl-guc: [PASS][52] -> [FAIL][53] [52]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-guc/igt@vgem_basic@create.html [53]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-guc/igt@vgem_basic@create.html - fi-cfl-8109u: [PASS][54] -> [FAIL][55] [54]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-8109u/igt@vgem_basic@create.html [55]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8109u/igt@vgem_basic@create.html - fi-kbl-7500u: [PASS][56] -> [FAIL][57] [56]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-7500u/igt@vgem_basic@create.html [57]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-7500u/igt@vgem_basic@create.html - fi-kbl-guc: [PASS][58] -> [FAIL][59] [58]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-guc/igt@vgem_basic@create.html [59]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-guc/igt@vgem_basic@create.html - fi-cml-u2: [PASS][60] -> [FAIL][61] [60]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cml-u2/igt@vgem_basic@create.html [61]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-u2/igt@vgem_basic@create.html - fi-cfl-8700k: [PASS][62] -> [FAIL][63] [62]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-8700k/igt@vgem_basic@create.html [63]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8700k/igt@vgem_basic@create.html - fi-bxt-dsi: [PASS][64] -> [FAIL][65] [64]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bxt-dsi/igt@vgem_basic@create.html [65]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bxt-dsi/igt@vgem_basic@create.html - fi-hsw-4770: [PASS][66] -> [FAIL][67] [66]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-hsw-4770/igt@vgem_basic@create.html [67]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-hsw-4770/igt@vgem_basic@create.html - fi-snb-2520m: [PASS][68] -> [FAIL][69] [68]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-snb-2520m/igt@vgem_basic@create.html [69]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2520m/igt@vgem_basic@create.html - fi-cml-s: [PASS][70] -> [FAIL][71] [70]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cml-s/igt@vgem_basic@create.html [71]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-s/igt@vgem_basic@create.html - fi-cfl-guc: [PASS][72] -> [FAIL][73] [72]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-guc/igt@vgem_basic@create.html [73]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-guc/igt@vgem_basic@create.html - fi-kbl-soraka: [PASS][74] -> [FAIL][75] [74]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-soraka/igt@vgem_basic@create.html 
[75]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-soraka/igt@vgem_basic@create.html - fi-tgl-u2: [PASS][76] -> [FAIL][77] [76]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-u2/igt@vgem_basic@create.html [77]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-u2/igt@vgem_basic@create.html - fi-skl-6600u: [PASS][78] -> [FAIL][79] [78]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-6600u/igt@vgem_basic@create.html [79]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6600u/igt@vgem_basic@create.html - fi-ivb-3770: [PASS][80] -> [FAIL][81] [80]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ivb-3770/igt@vgem_basic@create.html [81]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ivb-3770/igt@vgem_basic@create.html * igt@vgem_basic@dmabuf-mmap: - fi-ivb-3770: [PASS][82] -> [DMESG-WARN][83] [82]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ivb-3770/igt@vgem_basic@dmabuf-mmap.html [83]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ivb-3770/igt@vgem_basic@dmabuf-mmap.html - fi-glk-dsi: [PASS][84] -> [DMESG-WARN][85] [84]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-glk-dsi/igt@vgem_basic@dmabuf-mmap.html [85]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-glk-dsi/igt@vgem_basic@dmabuf-mmap.html - fi-kbl-soraka: [PASS][86] -> [DMESG-WARN][87] [86]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-soraka/igt@vgem_basic@dmabuf-mmap.html [87]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-soraka/igt@vgem_basic@dmabuf-mmap.html - fi-elk-e7500: [PASS][88] -> [DMESG-WARN][89] [88]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-elk-e7500/igt@vgem_basic@dmabuf-mmap.html [89]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-elk-e7500/igt@vgem_basic@dmabuf-mmap.html - fi-skl-6700k2: [PASS][90] -> [DMESG-WARN][91] [90]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-6700k2/igt@vgem_basic@dmabuf-mmap.html [91]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6700k2/igt@vgem_basic@dmabuf-mmap.html - fi-cml-s: [PASS][92] -> [DMESG-WARN][93] [92]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cml-s/igt@vgem_basic@dmabuf-mmap.html [93]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-s/igt@vgem_basic@dmabuf-mmap.html - fi-cfl-guc: [PASS][94] -> [DMESG-WARN][95] [94]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-guc/igt@vgem_basic@dmabuf-mmap.html [95]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-guc/igt@vgem_basic@dmabuf-mmap.html - fi-hsw-4770: [PASS][96] -> [DMESG-WARN][97] [96]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-hsw-4770/igt@vgem_basic@dmabuf-mmap.html [97]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-hsw-4770/igt@vgem_basic@dmabuf-mmap.html - fi-ilk-650: [PASS][98] -> [DMESG-WARN][99] [98]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ilk-650/igt@vgem_basic@dmabuf-mmap.html [99]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ilk-650/igt@vgem_basic@dmabuf-mmap.html - fi-tgl-u2: [PASS][100] -> [DMESG-WARN][101] [100]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-u2/igt@vgem_basic@dmabuf-mmap.html [101]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-u2/igt@vgem_basic@dmabuf-mmap.html - fi-byt-j1900: [PASS][102] -> [DMESG-WARN][103] [102]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-byt-j1900/igt@vgem_basic@dmabuf-mmap.html [103]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-byt-j1900/igt@vgem_basic@dmabuf-mmap.html - fi-pnv-d510: [PASS][104] -> [DMESG-WARN][105] [104]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-pnv-d510/igt@vgem_basic@dmabuf-mmap.html [105]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-pnv-d510/igt@vgem_basic@dmabuf-mmap.html - 
fi-cml-u2: [PASS][106] -> [DMESG-WARN][107] [106]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cml-u2/igt@vgem_basic@dmabuf-mmap.html [107]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cml-u2/igt@vgem_basic@dmabuf-mmap.html - fi-skl-6600u: [PASS][108] -> [DMESG-WARN][109] [108]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-6600u/igt@vgem_basic@dmabuf-mmap.html [109]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-6600u/igt@vgem_basic@dmabuf-mmap.html - fi-bxt-dsi: [PASS][110] -> [DMESG-WARN][111] [110]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bxt-dsi/igt@vgem_basic@dmabuf-mmap.html [111]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bxt-dsi/igt@vgem_basic@dmabuf-mmap.html - fi-cfl-8700k: [PASS][112] -> [DMESG-WARN][113] [112]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-8700k/igt@vgem_basic@dmabuf-mmap.html [113]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8700k/igt@vgem_basic@dmabuf-mmap.html - fi-snb-2520m: [PASS][114] -> [DMESG-WARN][115] [114]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-snb-2520m/igt@vgem_basic@dmabuf-mmap.html [115]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2520m/igt@vgem_basic@dmabuf-mmap.html - fi-cfl-8109u: [PASS][116] -> [DMESG-WARN][117] [116]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-cfl-8109u/igt@vgem_basic@dmabuf-mmap.html [117]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-cfl-8109u/igt@vgem_basic@dmabuf-mmap.html - fi-bdw-5557u: [PASS][118] -> [DMESG-WARN][119] [118]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bdw-5557u/igt@vgem_basic@dmabuf-mmap.html [119]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bdw-5557u/igt@vgem_basic@dmabuf-mmap.html - fi-bsw-nick: [PASS][120] -> [DMESG-WARN][121] [120]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bsw-nick/igt@vgem_basic@dmabuf-mmap.html [121]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-nick/igt@vgem_basic@dmabuf-mmap.html - fi-skl-guc: [PASS][122] -> [DMESG-WARN][123] [122]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-skl-guc/igt@vgem_basic@dmabuf-mmap.html [123]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-skl-guc/igt@vgem_basic@dmabuf-mmap.html - fi-bsw-kefka: [PASS][124] -> [DMESG-WARN][125] [124]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-bsw-kefka/igt@vgem_basic@dmabuf-mmap.html [125]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bsw-kefka/igt@vgem_basic@dmabuf-mmap.html - fi-kbl-guc: [PASS][126] -> [DMESG-WARN][127] [126]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-guc/igt@vgem_basic@dmabuf-mmap.html [127]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-guc/igt@vgem_basic@dmabuf-mmap.html - fi-kbl-7500u: [PASS][128] -> [DMESG-WARN][129] [128]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-kbl-7500u/igt@vgem_basic@dmabuf-mmap.html [129]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-kbl-7500u/igt@vgem_basic@dmabuf-mmap.html - fi-tgl-y: [PASS][130] -> [DMESG-WARN][131] [130]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-y/igt@vgem_basic@dmabuf-mmap.html [131]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-y/igt@vgem_basic@dmabuf-mmap.html - fi-snb-2600: [PASS][132] -> [DMESG-WARN][133] [132]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-snb-2600/igt@vgem_basic@dmabuf-mmap.html [133]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2600/igt@vgem_basic@dmabuf-mmap.html #### Suppressed #### The following results come from untrusted machines, tests, or statuses. They do not affect the overall result. 
* igt@runner@aborted: - {fi-rkl-11500t}: NOTRUN -> [FAIL][134] [134]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-rkl-11500t/igt@runner@aborted.html - {fi-tgl-dsi}: NOTRUN -> [FAIL][135] [135]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-dsi/igt@runner@aborted.html - {fi-jsl-1}: NOTRUN -> [FAIL][136] [136]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-jsl-1/igt@runner@aborted.html * igt@vgem_basic@create: - {fi-rkl-11500t}: [PASS][137] -> [FAIL][138] [137]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-rkl-11500t/igt@vgem_basic@create.html [138]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-rkl-11500t/igt@vgem_basic@create.html - {fi-ehl-2}: NOTRUN -> [FAIL][139] +1 similar issue [139]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ehl-2/igt@vgem_basic@create.html - {fi-jsl-1}: [PASS][140] -> [FAIL][141] [140]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-jsl-1/igt@vgem_basic@create.html [141]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-jsl-1/igt@vgem_basic@create.html - {fi-tgl-dsi}: [PASS][142] -> [FAIL][143] [142]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-dsi/igt@vgem_basic@create.html [143]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-dsi/igt@vgem_basic@create.html - {fi-hsw-gt1}: [PASS][144] -> [FAIL][145] [144]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-hsw-gt1/igt@vgem_basic@create.html [145]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-hsw-gt1/igt@vgem_basic@create.html - {fi-ehl-1}: [PASS][146] -> [FAIL][147] [146]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ehl-1/igt@vgem_basic@create.html [147]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ehl-1/igt@vgem_basic@create.html * igt@vgem_basic@dmabuf-mmap: - {fi-ehl-1}: [PASS][148] -> [DMESG-WARN][149] [148]: 
https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-ehl-1/igt@vgem_basic@dmabuf-mmap.html [149]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ehl-1/igt@vgem_basic@dmabuf-mmap.html - {fi-jsl-1}: [PASS][150] -> [DMESG-WARN][151] [150]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-jsl-1/igt@vgem_basic@dmabuf-mmap.html [151]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-jsl-1/igt@vgem_basic@dmabuf-mmap.html - {fi-hsw-gt1}: [PASS][152] -> [DMESG-WARN][153] [152]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-hsw-gt1/igt@vgem_basic@dmabuf-mmap.html [153]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-hsw-gt1/igt@vgem_basic@dmabuf-mmap.html - {fi-tgl-dsi}: [PASS][154] -> [DMESG-WARN][155] [154]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-dsi/igt@vgem_basic@dmabuf-mmap.html [155]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-dsi/igt@vgem_basic@dmabuf-mmap.html - {fi-ehl-2}: NOTRUN -> [DMESG-WARN][156] [156]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-ehl-2/igt@vgem_basic@dmabuf-mmap.html - {fi-rkl-11500t}: [PASS][157] -> [DMESG-WARN][158] [157]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-rkl-11500t/igt@vgem_basic@dmabuf-mmap.html [158]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-rkl-11500t/igt@vgem_basic@dmabuf-mmap.html Known issues ------------ Here are the changes found in Patchwork_19728 that come from known issues: ### IGT changes ### #### Issues hit #### * igt@prime_self_import@basic-with_one_bo_two_files: - fi-tgl-y: [PASS][159] -> [DMESG-WARN][160] ([i915#402]) [159]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-y/igt@prime_self_import@basic-with_one_bo_two_files.html [160]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-y/igt@prime_self_import@basic-with_one_bo_two_files.html * igt@runner@aborted: - fi-pnv-d510: NOTRUN -> [FAIL][161] ([i915#2403] / [i915#2505]) [161]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-pnv-d510/igt@runner@aborted.html - fi-glk-dsi: NOTRUN -> [FAIL][162] ([k.org#202321]) [162]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-glk-dsi/igt@runner@aborted.html - fi-bdw-5557u: NOTRUN -> [FAIL][163] ([i915#2369]) [163]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-bdw-5557u/igt@runner@aborted.html - fi-hsw-4770: NOTRUN -> [FAIL][164] ([i915#2505]) [164]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-hsw-4770/igt@runner@aborted.html - fi-snb-2600: NOTRUN -> [FAIL][165] ([i915#698]) [165]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-snb-2600/igt@runner@aborted.html - fi-byt-j1900: NOTRUN -> [FAIL][166] ([i915#2505]) [166]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-byt-j1900/igt@runner@aborted.html #### Possible fixes #### * igt@prime_vgem@basic-fence-flip: - fi-tgl-y: [DMESG-WARN][167] ([i915#402]) -> [PASS][168] [167]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_9804/fi-tgl-y/igt@prime_vgem@basic-fence-flip.html [168]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/fi-tgl-y/igt@prime_vgem@basic-fence-flip.html {name}: This element is suppressed. This means it is ignored when computing the status of the difference (SUCCESS, WARNING, or FAILURE). 
[fdo#109285]: https://bugs.freedesktop.org/show_bug.cgi?id=109285 [fdo#111827]: https://bugs.freedesktop.org/show_bug.cgi?id=111827 [i915#1222]: https://gitlab.freedesktop.org/drm/intel/issues/1222 [i915#2190]: https://gitlab.freedesktop.org/drm/intel/issues/2190 [i915#2369]: https://gitlab.freedesktop.org/drm/intel/issues/2369 [i915#2403]: https://gitlab.freedesktop.org/drm/intel/issues/2403 [i915#2505]: https://gitlab.freedesktop.org/drm/intel/issues/2505 [i915#402]: https://gitlab.freedesktop.org/drm/intel/issues/402 [i915#533]: https://gitlab.freedesktop.org/drm/intel/issues/533 [i915#698]: https://gitlab.freedesktop.org/drm/intel/issues/698 [k.org#202321]: https://bugzilla.kernel.org/show_bug.cgi?id=202321 Participating hosts (42 -> 38) ------------------------------ Additional (1): fi-ehl-2 Missing (5): fi-ilk-m540 fi-hsw-4200u fi-bsw-cyan fi-ctg-p8600 fi-bdw-samus Build changes ------------- * Linux: CI_DRM_9804 -> Patchwork_19728 CI-20190529: 20190529 CI_DRM_9804: 0ed1d18cdc37ecf5e07f009a9788ea9ad74677a8 @ git://anongit.freedesktop.org/gfx-ci/linux IGT_6015: aa44cddf4ef689f8a3726fcbeedc03f08b12bd82 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools Patchwork_19728: 93fc58ee63d1e8a1289b265f4d6b75a18b222945 @ git://anongit.freedesktop.org/gfx-ci/linux == Linux commits == 93fc58ee63d1 drm/vgem: use shmem helpers b71cc38b23b9 dma-buf: Require VM_PFNMAP vma for mmap == Logs == For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_19728/index.html [-- Attachment #1.2: Type: text/html, Size: 28700 bytes --] [-- Attachment #2: Type: text/plain, Size: 160 bytes --] _______________________________________________ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/intel-gfx ^ permalink raw reply [flat|nested] 110+ messages in thread
* Re: [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap 2021-02-23 10:59 ` Daniel Vetter (?) @ 2021-02-26 3:57 ` John Stultz -1 siblings, 0 replies; 110+ messages in thread From: John Stultz @ 2021-02-26 3:57 UTC (permalink / raw) To: Daniel Vetter Cc: DRI Development, Intel Graphics Development, Christian König, Jason Gunthorpe, Suren Baghdasaryan, Matthew Wilcox, Daniel Vetter, Sumit Semwal, linux-media, moderated list:DMA BUFFER SHARING FRAMEWORK, Hridya Valsaraju On Tue, Feb 23, 2021 at 3:00 AM Daniel Vetter <daniel.vetter@ffwll.ch> wrote: > > tldr; DMA buffers aren't normal memory, expecting that you can use > them like that (like calling get_user_pages works, or that they're > accounting like any other normal memory) cannot be guaranteed. > > Since some userspace only runs on integrated devices, where all > buffers are actually all resident system memory, there's a huge > temptation to assume that a struct page is always present and useable > like for any more pagecache backed mmap. This has the potential to > result in a uapi nightmare. > > To stop this gap require that DMA buffer mmaps are VM_PFNMAP, which > blocks get_user_pages and all the other struct page based > infrastructure for everyone. In spirit this is the uapi counterpart to > the kernel-internal CONFIG_DMABUF_DEBUG. > > Motivated by a recent patch which wanted to swich the system dma-buf > heap to vm_insert_page instead of vm_insert_pfn. > > v2: > > Jason brought up that we also want to guarantee that all ptes have the > pte_special flag set, to catch fast get_user_pages (on architectures > that support this). Allowing VM_MIXEDMAP (like VM_SPECIAL does) would > still allow vm_insert_page, but limiting to VM_PFNMAP will catch that. > > From auditing the various functions to insert pfn pte entires > (vm_insert_pfn_prot, remap_pfn_range and all it's callers like > dma_mmap_wc) it looks like VM_PFNMAP is already required anyway, so > this should be the correct flag to check for. 
> > References: https://lore.kernel.org/lkml/CAKMK7uHi+mG0z0HUmNt13QCCvutuRVjpcR0NjRL12k-WbWzkRg@mail.gmail.com/ > Acked-by: Christian König <christian.koenig@amd.com> > Cc: Jason Gunthorpe <jgg@ziepe.ca> > Cc: Suren Baghdasaryan <surenb@google.com> > Cc: Matthew Wilcox <willy@infradead.org> > Cc: John Stultz <john.stultz@linaro.org> > Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> > Cc: Sumit Semwal <sumit.semwal@linaro.org> > Cc: "Christian König" <christian.koenig@amd.com> > Cc: linux-media@vger.kernel.org > Cc: linaro-mm-sig@lists.linaro.org > --- > drivers/dma-buf/dma-buf.c | 15 +++++++++++++-- > 1 file changed, 13 insertions(+), 2 deletions(-) So I gave this a spin in a few of my environments, and with the current dmabuf heaps it spews a lot of warnings. I'm testing some simple fixes to add: vma->vm_flags |= VM_PFNMAP; to the dmabuf heap mmap ops, which we might want to queue along side of this. So assuming those can land together. Acked-by: John Stultz <john.stultz@linaro.org> thanks -john ^ permalink raw reply [flat|nested] 110+ messages in thread
* [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [Linaro-mm-sig,1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev4)

From: Patchwork @ 2021-03-11 10:58 UTC (permalink / raw)
To: Thomas Hellström (Intel); +Cc: intel-gfx

== Series Details ==

Series: series starting with [Linaro-mm-sig,1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev4)
URL:    https://patchwork.freedesktop.org/series/87313/
State:  failure

== Summary ==

Applying: dma-buf: Require VM_PFNMAP vma for mmap
error: git diff header lacks filename information when removing 1 leading pathname component (line 2)
error: could not build fake ancestor
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 dma-buf: Require VM_PFNMAP vma for mmap
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

^ permalink raw reply [flat|nested] 110+ messages in thread
end of thread, other threads:[~2021-03-12 7:56 UTC | newest]

Thread overview: 110+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-02-23 10:59 [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap Daniel Vetter
2021-02-23 10:59 ` [PATCH 2/2] drm/vgem: use shmem helpers Daniel Vetter
2021-02-23 11:19   ` Thomas Zimmermann
2021-02-23 11:51     ` [PATCH] drm/vgem: use shmem helpers Daniel Vetter
2021-02-23 14:21   ` [PATCH 2/2] drm/vgem: use shmem helpers kernel test robot
2021-02-23 15:07   ` kernel test robot
2021-02-25 10:23   ` [PATCH] drm/vgem: use shmem helpers Daniel Vetter
2021-02-26  9:19     ` Thomas Zimmermann
2021-02-26 13:30       ` Daniel Vetter
2021-02-26 13:51         ` Thomas Zimmermann
2021-02-26 14:04           ` Daniel Vetter
2021-02-23 11:19 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap Patchwork
2021-02-23 13:11 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev2) Patchwork
2021-02-24  7:46 ` [Linaro-mm-sig] [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap Thomas Hellström (Intel)
2021-02-24  8:45   ` Daniel Vetter
2021-02-24  9:15     ` Thomas Hellström (Intel)
2021-02-24  9:31       ` Daniel Vetter
2021-02-25 10:28         ` Christian König
2021-02-25 10:44           ` Daniel Vetter
2021-02-25 15:49             ` Daniel Vetter
2021-02-25 16:53               ` Christian König
2021-02-26  9:41                 ` Thomas Hellström (Intel)
2021-02-26 13:28                   ` Daniel Vetter
2021-02-27  8:06                     ` Thomas Hellström (Intel)
2021-03-01  8:28                       ` Daniel Vetter
2021-03-01  8:39                         ` Thomas Hellström (Intel)
2021-03-01  9:05                           ` Daniel Vetter
2021-03-01  9:21                             ` Thomas Hellström (Intel)
2021-03-01 10:17                               ` Christian König
2021-03-01 14:09                                 ` Daniel Vetter
2021-03-11 10:22                                   ` Thomas Hellström (Intel)
2021-03-11 13:00                                     ` Daniel Vetter
2021-03-11 13:12                                       ` Thomas Hellström (Intel)
2021-03-11 13:17                                         ` Daniel Vetter
2021-03-11 15:37                                           ` Thomas Hellström (Intel)
2021-03-12  7:51                                             ` Christian König
2021-02-24 18:46   ` Jason Gunthorpe
2021-02-25 10:30     ` Christian König
2021-02-25 10:45       ` Daniel Vetter
2021-02-25 10:38 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3) Patchwork
2021-02-25 11:19 ` [Intel-gfx] ✗ Fi.CI.BAT: failure for series starting with [1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev3) Patchwork
2021-02-26  3:57 ` [PATCH 1/2] dma-buf: Require VM_PFNMAP vma for mmap John Stultz
2021-03-11 10:58 ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for series starting with [Linaro-mm-sig,1/2] dma-buf: Require VM_PFNMAP vma for mmap (rev4) Patchwork