* [PATCH 0/4] updated exynos-drm-next
@ 2012-04-23 13:43 Inki Dae
  2012-04-23 13:43 ` [PATCH 1/4] drm/exynos: added cache attribute support for gem Inki Dae
                   ` (3 more replies)
  0 siblings, 4 replies; 77+ messages in thread
From: Inki Dae @ 2012-04-23 13:43 UTC
  To: airlied, dri-devel; +Cc: Inki Dae, kyungmin.park, sw0312.kim

this patch set adds the following features:
cache attribute support
- user can set cache attributes (such as cacheable, non-cacheable or
  write-combine) on the buffer to be allocated by the gem framework.

drm prime support
- this feature was recently introduced to share a buffer between drivers
  using dmabuf (known as UMM), and this patch adds the exynos-specific code
  for it. this module has already been tested by Tomsz; for details, refer
  to the link below:
	http://www.spinics.net/lists/linux-media/msg46292.html

userptr feature support
- this feature can be used to access a memory region allocated by malloc() in
  user mode, or an mmapped memory region allocated by other memory allocators.

gem-info-get feature
- this feature can be used to get information about a gem buffer allocated
  at run time.

for now, this patch set is based on the exynos-drm-fixes branch just for review,
but after that it will be rebased onto exynos-drm-next for drm-next.

and we will soon introduce new features such as the following:
Graphics 2D Accelerator
- used to draw 2d graphics primitives by direct rendering.
  this module was introduced before but not merged because of
  a security issue.

Rotator
- used to rotate an image in a gem buffer by 0, 90, 180 or 270 degrees.
  this module hasn't been introduced yet, so it may be the first to reach
  mainline.

Post Processor
- used for scaling up/down, colorspace conversion (RGB<->YUV) and also rotation.
  this module has mainly been used for v4l2-based drivers such as camera,
  video hardware codec and so on.

Inki Dae (4):
  drm/exynos: added cache attribute support for gem.
  drm/exynos: added drm prime feature.
  drm/exynos: added userptr feature.
  drm/exynos: added a feature to get gem buffer information.

 drivers/gpu/drm/exynos/Kconfig             |    6 +
 drivers/gpu/drm/exynos/Makefile            |    1 +
 drivers/gpu/drm/exynos/exynos_drm_buf.c    |   12 +-
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |  272 ++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.h |   39 +++
 drivers/gpu/drm/exynos/exynos_drm_drv.c    |   14 +-
 drivers/gpu/drm/exynos/exynos_drm_gem.c    |  347 +++++++++++++++++++++++++++-
 drivers/gpu/drm/exynos/exynos_drm_gem.h    |   25 ++-
 include/drm/exynos_drm.h                   |   55 +++++-
 9 files changed, 749 insertions(+), 22 deletions(-)
 create mode 100644 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
 create mode 100644 drivers/gpu/drm/exynos/exynos_drm_dmabuf.h

-- 
1.7.4.1


* [PATCH 1/4] drm/exynos: added cache attribute support for gem.
  2012-04-23 13:43 [PATCH 0/4] updated exynos-drm-next Inki Dae
@ 2012-04-23 13:43 ` Inki Dae
  2012-04-23 13:43 ` [PATCH 2/4] drm/exynos: added drm prime feature Inki Dae
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-04-23 13:43 UTC
  To: airlied, dri-devel; +Cc: Inki Dae, kyungmin.park, sw0312.kim

with this patch, a user application can set the cache attribute (such as
cacheable, write-combine or non-cacheable) of a memory region allocated
by the gem framework.
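
the precedence applied by update_vm_cache_attr() in the diff below can be
sketched as a standalone userspace fragment. this is only an illustrative
model, not driver code: the flag values are copied from the exynos_drm.h hunk
of this patch, while enum mapping_kind and pick_mapping_kind() are hypothetical
names standing in for the pgprot_*() calls.

```c
#include <assert.h>

/* flag values as defined in the include/drm/exynos_drm.h hunk of this patch */
enum {
	EXYNOS_BO_NONCACHABLE	= 0 << 1,
	EXYNOS_BO_CACHABLE	= 1 << 1,
	EXYNOS_BO_WC		= 1 << 2,
};

/* hypothetical stand-in for the page protection chosen by the driver */
enum mapping_kind {
	MAPPING_NONCACHED,
	MAPPING_CACHED,
	MAPPING_WRITECOMBINE,
};

/* same order of checks as update_vm_cache_attr(): cachable, WC, else uncached */
enum mapping_kind pick_mapping_kind(unsigned int flags)
{
	if (flags & EXYNOS_BO_CACHABLE)
		return MAPPING_CACHED;
	if (flags & EXYNOS_BO_WC)
		return MAPPING_WRITECOMBINE;

	/* EXYNOS_BO_NONCACHABLE is 0, so it is the fallback, not a testable bit */
	return MAPPING_NONCACHED;
}
```

in this model EXYNOS_BO_CACHABLE wins when both EXYNOS_BO_CACHABLE and
EXYNOS_BO_WC are set, and a flags value of 0 selects the non-cached mapping.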

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/gpu/drm/exynos/exynos_drm_gem.c |   49 ++++++++++++++++++++++++------
 include/drm/exynos_drm.h                |   11 ++++++-
 2 files changed, 49 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index 01139c8..c85f3a7 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -66,6 +66,22 @@ static int check_gem_flags(unsigned int flags)
 	return 0;
 }
 
+static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
+					struct vm_area_struct *vma)
+{
+	DRM_DEBUG_KMS("flags = 0x%x\n", obj->flags);
+
+	/* non-cachable as default. */
+	if (obj->flags & EXYNOS_BO_CACHABLE)
+		vma->vm_page_prot = vm_get_page_prot(vma->vm_flags);
+	else if (obj->flags & EXYNOS_BO_WC)
+		vma->vm_page_prot =
+			pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+	else
+		vma->vm_page_prot =
+			pgprot_noncached(vm_get_page_prot(vma->vm_flags));
+}
+
 static unsigned long roundup_gem_size(unsigned long size, unsigned int flags)
 {
 	if (!IS_NONCONTIG_BUFFER(flags)) {
@@ -262,24 +278,24 @@ static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
 void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
 {
 	struct drm_gem_object *obj;
+	struct exynos_drm_gem_buf *buf;
 
 	DRM_DEBUG_KMS("%s\n", __FILE__);
 
-	if (!exynos_gem_obj)
-		return;
-
 	obj = &exynos_gem_obj->base;
+	buf = exynos_gem_obj->buffer;
 
 	DRM_DEBUG_KMS("handle count = %d\n", atomic_read(&obj->handle_count));
 
-	if ((exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG) &&
-			exynos_gem_obj->buffer->pages)
+	if (!buf->pages)
+		return;
+
+	if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
 		exynos_drm_gem_put_pages(obj);
 	else
-		exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags,
-					exynos_gem_obj->buffer);
+		exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
 
-	exynos_drm_fini_buf(obj->dev, exynos_gem_obj->buffer);
+	exynos_drm_fini_buf(obj->dev, buf);
 	exynos_gem_obj->buffer = NULL;
 
 	if (obj->map_list.map)
@@ -493,8 +509,7 @@ static int exynos_drm_gem_mmap_buffer(struct file *filp,
 
 	vma->vm_flags |= (VM_IO | VM_RESERVED);
 
-	/* in case of direct mapping, always having non-cachable attribute */
-	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
+	update_vm_cache_attr(exynos_gem_obj, vma);
 
 	vm_size = usize = vma->vm_end - vma->vm_start;
 
@@ -726,6 +741,8 @@ int exynos_drm_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 int exynos_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 {
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct drm_gem_object *obj;
 	int ret;
 
 	DRM_DEBUG_KMS("%s\n", __FILE__);
@@ -737,8 +754,20 @@ int exynos_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 		return ret;
 	}
 
+	obj = vma->vm_private_data;
+	exynos_gem_obj = to_exynos_gem_obj(obj);
+
+	ret = check_gem_flags(exynos_gem_obj->flags);
+	if (ret) {
+		drm_gem_vm_close(vma);
+		drm_gem_free_mmap_offset(obj);
+		return ret;
+	}
+
 	vma->vm_flags &= ~VM_PFNMAP;
 	vma->vm_flags |= VM_MIXEDMAP;
 
+	update_vm_cache_attr(exynos_gem_obj, vma);
+
 	return ret;
 }
diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
index e478de4..2d6eb06 100644
--- a/include/drm/exynos_drm.h
+++ b/include/drm/exynos_drm.h
@@ -95,9 +95,18 @@ struct drm_exynos_plane_set_zpos {
 
 /* memory type definitions. */
 enum e_drm_exynos_gem_mem_type {
+	/* Physically Continuous memory and used as default. */
+	EXYNOS_BO_CONTIG	= 0 << 0,
 	/* Physically Non-Continuous memory. */
 	EXYNOS_BO_NONCONTIG	= 1 << 0,
-	EXYNOS_BO_MASK		= EXYNOS_BO_NONCONTIG
+	/* non-cachable mapping and used as default. */
+	EXYNOS_BO_NONCACHABLE	= 0 << 1,
+	/* cachable mapping. */
+	EXYNOS_BO_CACHABLE	= 1 << 1,
+	/* write-combine mapping. */
+	EXYNOS_BO_WC		= 1 << 2,
+	EXYNOS_BO_MASK		= EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
+					EXYNOS_BO_WC
 };
 
 #define DRM_EXYNOS_GEM_CREATE		0x00
-- 
1.7.4.1


* [PATCH 2/4] drm/exynos: added drm prime feature.
  2012-04-23 13:43 [PATCH 0/4] updated exynos-drm-next Inki Dae
  2012-04-23 13:43 ` [PATCH 1/4] drm/exynos: added cache attribute support for gem Inki Dae
@ 2012-04-23 13:43 ` Inki Dae
  2012-04-23 13:43 ` [PATCH 3/4] drm/exynos: added userptr feature Inki Dae
  2012-04-23 13:43 ` [PATCH 4/4] drm/exynos: added a feature to get gem buffer information Inki Dae
  3 siblings, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-04-23 13:43 UTC
  To: airlied, dri-devel; +Cc: Inki Dae, kyungmin.park, sw0312.kim

this patch adds exynos-specific code for the DRM Prime feature.
with this patch, a user application can get a file descriptor
from a gem handle through the DRM_IOCTL_PRIME_HANDLE_TO_FD ioctl
command (export) and also a gem handle from a file descriptor
through the DRM_IOCTL_PRIME_FD_TO_HANDLE ioctl command (import).
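
on the import side, exynos_dmabuf_prime_import() in the diff below first checks
whether the dma-buf was exported by this same driver on this same device, and
in that case only takes a reference instead of attaching and mapping. a minimal
userspace model of that decision might look like the following; every struct
and function name here (model_*) is a hypothetical, simplified stand-in and not
a kernel type.

```c
#include <assert.h>
#include <stddef.h>

/* hypothetical, simplified stand-ins for the kernel structures */
struct model_device { int id; };
struct model_ops { int unused; };

struct model_gem_object {
	const struct model_device *dev;
	int refcount;
};

struct model_dma_buf {
	const struct model_ops *ops;
	struct model_gem_object *priv;
};

/*
 * returns the existing object, with one extra reference, when the dma-buf
 * was exported by this driver on this device; returns NULL when the caller
 * would have to fall back to the attach-and-map path.
 */
struct model_gem_object *model_try_self_import(const struct model_device *dev,
					       const struct model_ops *own_ops,
					       const struct model_dma_buf *buf)
{
	assert(dev && own_ops && buf && buf->priv);

	if (buf->ops != own_ops)
		return NULL;		/* exported by some other driver */

	if (buf->priv->dev != dev)
		return NULL;		/* our driver, but a different device */

	buf->priv->refcount++;		/* mirrors drm_gem_object_reference() */
	return buf->priv;
}
```

the point of the shortcut is that re-importing one's own buffer yields the
original gem object rather than a second object wrapping the same memory.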

Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
---
 drivers/gpu/drm/exynos/Kconfig             |    6 +
 drivers/gpu/drm/exynos/Makefile            |    1 +
 drivers/gpu/drm/exynos/exynos_drm_buf.c    |   12 +-
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c |  272 ++++++++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_dmabuf.h |   39 ++++
 drivers/gpu/drm/exynos/exynos_drm_drv.c    |   10 +-
 drivers/gpu/drm/exynos/exynos_drm_gem.c    |   14 ++-
 drivers/gpu/drm/exynos/exynos_drm_gem.h    |    8 +
 8 files changed, 353 insertions(+), 9 deletions(-)
 create mode 100644 drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
 create mode 100644 drivers/gpu/drm/exynos/exynos_drm_dmabuf.h

diff --git a/drivers/gpu/drm/exynos/Kconfig b/drivers/gpu/drm/exynos/Kconfig
index 3343ac4..135b618 100644
--- a/drivers/gpu/drm/exynos/Kconfig
+++ b/drivers/gpu/drm/exynos/Kconfig
@@ -10,6 +10,12 @@ config DRM_EXYNOS
 	  Choose this option if you have a Samsung SoC EXYNOS chipset.
 	  If M is selected the module will be called exynosdrm.
 
+config DRM_EXYNOS_DMABUF
+	bool "EXYNOS DRM DMABUF"
+	depends on DRM_EXYNOS
+	help
+	  Choose this option if you want to use DMABUF feature for DRM.
+
 config DRM_EXYNOS_FIMD
 	bool "Exynos DRM FIMD"
 	depends on DRM_EXYNOS && !FB_S3C
diff --git a/drivers/gpu/drm/exynos/Makefile b/drivers/gpu/drm/exynos/Makefile
index 9e0bff8..353e1b7 100644
--- a/drivers/gpu/drm/exynos/Makefile
+++ b/drivers/gpu/drm/exynos/Makefile
@@ -8,6 +8,7 @@ exynosdrm-y := exynos_drm_drv.o exynos_drm_encoder.o exynos_drm_connector.o \
 		exynos_drm_buf.o exynos_drm_gem.o exynos_drm_core.o \
 		exynos_drm_plane.o
 
+exynosdrm-$(CONFIG_DRM_EXYNOS_DMABUF) += exynos_drm_dmabuf.o
 exynosdrm-$(CONFIG_DRM_EXYNOS_FIMD)	+= exynos_drm_fimd.o
 exynosdrm-$(CONFIG_DRM_EXYNOS_HDMI)	+= exynos_hdmi.o exynos_mixer.o \
 					   exynos_ddc.o exynos_hdmiphy.o \
diff --git a/drivers/gpu/drm/exynos/exynos_drm_buf.c b/drivers/gpu/drm/exynos/exynos_drm_buf.c
index de8d209..b3cb0a6 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_buf.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_buf.c
@@ -35,7 +35,7 @@ static int lowlevel_buffer_allocate(struct drm_device *dev,
 		unsigned int flags, struct exynos_drm_gem_buf *buf)
 {
 	dma_addr_t start_addr;
-	unsigned int npages, page_size, i = 0;
+	unsigned int npages, i = 0;
 	struct scatterlist *sgl;
 	int ret = 0;
 
@@ -53,13 +53,13 @@ static int lowlevel_buffer_allocate(struct drm_device *dev,
 
 	if (buf->size >= SZ_1M) {
 		npages = buf->size >> SECTION_SHIFT;
-		page_size = SECTION_SIZE;
+		buf->page_size = SECTION_SIZE;
 	} else if (buf->size >= SZ_64K) {
 		npages = buf->size >> 16;
-		page_size = SZ_64K;
+		buf->page_size = SZ_64K;
 	} else {
 		npages = buf->size >> PAGE_SHIFT;
-		page_size = PAGE_SIZE;
+		buf->page_size = PAGE_SIZE;
 	}
 
 	buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
@@ -96,9 +96,9 @@ static int lowlevel_buffer_allocate(struct drm_device *dev,
 
 	while (i < npages) {
 		buf->pages[i] = phys_to_page(start_addr);
-		sg_set_page(sgl, buf->pages[i], page_size, 0);
+		sg_set_page(sgl, buf->pages[i], buf->page_size, 0);
 		sg_dma_address(sgl) = start_addr;
-		start_addr += page_size;
+		start_addr += buf->page_size;
 		sgl = sg_next(sgl);
 		i++;
 	}
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
new file mode 100644
index 0000000..2749092
--- /dev/null
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.c
@@ -0,0 +1,272 @@
+/* exynos_drm_dmabuf.c
+ *
+ * Copyright (c) 2012 Samsung Electronics Co., Ltd.
+ * Author: Inki Dae <inki.dae@samsung.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * VA LINUX SYSTEMS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#include "drmP.h"
+#include "drm.h"
+#include "exynos_drm_drv.h"
+#include "exynos_drm_gem.h"
+
+#include <linux/dma-buf.h>
+
+static struct sg_table *exynos_pages_to_sg(struct page **pages, int nr_pages,
+		unsigned int page_size)
+{
+	struct sg_table *sgt = NULL;
+	struct scatterlist *sgl;
+	int i, ret;
+
+	sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
+	if (!sgt)
+		goto out;
+
+	ret = sg_alloc_table(sgt, nr_pages, GFP_KERNEL);
+	if (ret)
+		goto err_free_sgt;
+
+	if (page_size < PAGE_SIZE)
+		page_size = PAGE_SIZE;
+
+	for_each_sg(sgt->sgl, sgl, nr_pages, i)
+		sg_set_page(sgl, pages[i], page_size, 0);
+
+	return sgt;
+
+err_free_sgt:
+	kfree(sgt);
+	sgt = NULL;
+out:
+	return NULL;
+}
+
+static struct sg_table *
+		exynos_gem_map_dma_buf(struct dma_buf_attachment *attach,
+					enum dma_data_direction dir)
+{
+	struct exynos_drm_gem_obj *gem_obj = attach->dmabuf->priv;
+	struct drm_device *dev = gem_obj->base.dev;
+	struct exynos_drm_gem_buf *buf;
+	struct sg_table *sgt = NULL;
+	unsigned int npages;
+	int nents;
+
+	DRM_DEBUG_PRIME("%s\n", __FILE__);
+
+	mutex_lock(&dev->struct_mutex);
+
+	buf = gem_obj->buffer;
+
+	/* there should always be pages allocated. */
+	if (!buf->pages) {
+		DRM_ERROR("pages is null.\n");
+		goto err_unlock;
+	}
+
+	npages = buf->size / buf->page_size;
+
+	sgt = exynos_pages_to_sg(buf->pages, npages, buf->page_size);
+	nents = dma_map_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+
+	DRM_DEBUG_PRIME("npages = %d buffer size = 0x%lx page_size = 0x%lx\n",
+			npages, buf->size, buf->page_size);
+
+err_unlock:
+	mutex_unlock(&dev->struct_mutex);
+	return sgt;
+}
+
+static void exynos_gem_unmap_dma_buf(struct dma_buf_attachment *attach,
+						struct sg_table *sgt,
+						enum dma_data_direction dir)
+{
+	dma_unmap_sg(attach->dev, sgt->sgl, sgt->nents, dir);
+	sg_free_table(sgt);
+	kfree(sgt);
+	sgt = NULL;
+}
+
+static void exynos_dmabuf_release(struct dma_buf *dmabuf)
+{
+	struct exynos_drm_gem_obj *exynos_gem_obj = dmabuf->priv;
+
+	DRM_DEBUG_PRIME("%s\n", __FILE__);
+
+	/*
+	 * exynos_dmabuf_release() call means that file object's
+	 * f_count is 0 and it calls drm_gem_object_handle_unreference()
+	 * to drop the references that these values had been increased
+	 * at drm_prime_handle_to_fd()
+	 */
+	if (exynos_gem_obj->base.export_dma_buf == dmabuf) {
+		exynos_gem_obj->base.export_dma_buf = NULL;
+
+		/*
+		 * drop this gem object refcount to release allocated buffer
+		 * and resources.
+		 */
+		drm_gem_object_unreference_unlocked(&exynos_gem_obj->base);
+	}
+}
+
+static void *exynos_gem_dmabuf_kmap_atomic(struct dma_buf *dma_buf,
+						unsigned long page_num)
+{
+	/* TODO */
+
+	return NULL;
+}
+
+static void exynos_gem_dmabuf_kunmap_atomic(struct dma_buf *dma_buf,
+						unsigned long page_num,
+						void *addr)
+{
+	/* TODO */
+}
+
+static void *exynos_gem_dmabuf_kmap(struct dma_buf *dma_buf,
+					unsigned long page_num)
+{
+	/* TODO */
+
+	return NULL;
+}
+
+static void exynos_gem_dmabuf_kunmap(struct dma_buf *dma_buf,
+					unsigned long page_num, void *addr)
+{
+	/* TODO */
+}
+
+static struct dma_buf_ops exynos_dmabuf_ops = {
+	.map_dma_buf		= exynos_gem_map_dma_buf,
+	.unmap_dma_buf		= exynos_gem_unmap_dma_buf,
+	.kmap			= exynos_gem_dmabuf_kmap,
+	.kmap_atomic		= exynos_gem_dmabuf_kmap_atomic,
+	.kunmap			= exynos_gem_dmabuf_kunmap,
+	.kunmap_atomic		= exynos_gem_dmabuf_kunmap_atomic,
+	.release		= exynos_dmabuf_release,
+};
+
+struct dma_buf *exynos_dmabuf_prime_export(struct drm_device *drm_dev,
+				struct drm_gem_object *obj, int flags)
+{
+	struct exynos_drm_gem_obj *exynos_gem_obj = to_exynos_gem_obj(obj);
+
+	return dma_buf_export(exynos_gem_obj, &exynos_dmabuf_ops,
+				exynos_gem_obj->base.size, 0600);
+}
+
+struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev,
+				struct dma_buf *dma_buf)
+{
+	struct dma_buf_attachment *attach;
+	struct sg_table *sgt;
+	struct scatterlist *sgl;
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct exynos_drm_gem_buf *buffer;
+	struct page *page;
+	int ret, i = 0;
+
+	DRM_DEBUG_PRIME("%s\n", __FILE__);
+
+	/* is this one of our own objects? */
+	if (dma_buf->ops == &exynos_dmabuf_ops) {
+		struct drm_gem_object *obj;
+
+		exynos_gem_obj = dma_buf->priv;
+		obj = &exynos_gem_obj->base;
+
+		/* is it from our device? */
+		if (obj->dev == drm_dev) {
+			drm_gem_object_reference(obj);
+			return obj;
+		}
+	}
+
+	attach = dma_buf_attach(dma_buf, drm_dev->dev);
+	if (IS_ERR(attach))
+		return ERR_PTR(-EINVAL);
+
+
+	sgt = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
+	if (IS_ERR(sgt)) {
+		ret = PTR_ERR(sgt);
+		goto err_buf_detach;
+	}
+
+	buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
+	if (!buffer) {
+		DRM_ERROR("failed to allocate exynos_drm_gem_buf.\n");
+		ret = -ENOMEM;
+		goto err_unmap_attach;
+	}
+
+	buffer->pages = kzalloc(sizeof(*page) * sgt->nents, GFP_KERNEL);
+	if (!buffer->pages) {
+		DRM_ERROR("failed to allocate pages.\n");
+		ret = -ENOMEM;
+		goto err_free_buffer;
+	}
+
+	exynos_gem_obj = exynos_drm_gem_init(drm_dev, dma_buf->size);
+	if (!exynos_gem_obj) {
+		ret = -ENOMEM;
+		goto err_free_pages;
+	}
+
+	sgl = sgt->sgl;
+	buffer->dma_addr = sg_dma_address(sgl);
+
+	while (i < sgt->nents) {
+		buffer->pages[i] = sg_page(sgl);
+		buffer->size += sg_dma_len(sgl);
+		sgl = sg_next(sgl);
+		i++;
+	}
+
+	exynos_gem_obj->buffer = buffer;
+	buffer->sgt = sgt;
+	exynos_gem_obj->base.import_attach = attach;
+
+	DRM_DEBUG_PRIME("dma_addr = 0x%x, size = 0x%lx\n", buffer->dma_addr,
+								buffer->size);
+
+	return &exynos_gem_obj->base;
+
+err_free_pages:
+	kfree(buffer->pages);
+	buffer->pages = NULL;
+err_free_buffer:
+	kfree(buffer);
+	buffer = NULL;
+err_unmap_attach:
+	dma_buf_unmap_attachment(attach, sgt, DMA_BIDIRECTIONAL);
+err_buf_detach:
+	dma_buf_detach(dma_buf, attach);
+	return ERR_PTR(ret);
+}
+
+MODULE_AUTHOR("Inki Dae <inki.dae@samsung.com>");
+MODULE_DESCRIPTION("Samsung SoC DRM DMABUF Module");
+MODULE_LICENSE("GPL");
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h
new file mode 100644
index 0000000..662a8f9
--- /dev/null
+++ b/drivers/gpu/drm/exynos/exynos_drm_dmabuf.h
@@ -0,0 +1,39 @@
+/* exynos_drm_dmabuf.h
+ *
+ * Copyright (c) 2012 Samsung Electronics Co., Ltd.
+ * Author: Inki Dae <inki.dae@samsung.com>
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * VA LINUX SYSTEMS AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+#ifndef _EXYNOS_DRM_DMABUF_H_
+#define _EXYNOS_DRM_DMABUF_H_
+
+#ifdef CONFIG_DRM_EXYNOS_DMABUF
+struct dma_buf *exynos_dmabuf_prime_export(struct drm_device *drm_dev,
+				struct drm_gem_object *obj, int flags);
+
+struct drm_gem_object *exynos_dmabuf_prime_import(struct drm_device *drm_dev,
+						struct dma_buf *dma_buf);
+#else
+#define exynos_dmabuf_prime_export		NULL
+#define exynos_dmabuf_prime_import		NULL
+#endif
+#endif
diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index a6819b5..f58a487 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -39,6 +39,7 @@
 #include "exynos_drm_gem.h"
 #include "exynos_drm_plane.h"
 #include "exynos_drm_vidi.h"
+#include "exynos_drm_dmabuf.h"
 
 #define DRIVER_NAME	"exynos"
 #define DRIVER_DESC	"Samsung SoC DRM"
@@ -149,6 +150,8 @@ static int exynos_drm_open(struct drm_device *dev, struct drm_file *file)
 {
 	DRM_DEBUG_DRIVER("%s\n", __FILE__);
 
+	drm_prime_init_file_private(&file->prime);
+
 	return exynos_drm_subdrv_open(dev, file);
 }
 
@@ -170,6 +173,7 @@ static void exynos_drm_preclose(struct drm_device *dev,
 			e->base.destroy(&e->base);
 		}
 	}
+	drm_prime_destroy_file_private(&file->prime);
 	spin_unlock_irqrestore(&dev->event_lock, flags);
 
 	exynos_drm_subdrv_close(dev, file);
@@ -225,7 +229,7 @@ static const struct file_operations exynos_drm_driver_fops = {
 
 static struct drm_driver exynos_drm_driver = {
 	.driver_features	= DRIVER_HAVE_IRQ | DRIVER_BUS_PLATFORM |
-				  DRIVER_MODESET | DRIVER_GEM,
+				  DRIVER_MODESET | DRIVER_GEM | DRIVER_PRIME,
 	.load			= exynos_drm_load,
 	.unload			= exynos_drm_unload,
 	.open			= exynos_drm_open,
@@ -241,6 +245,10 @@ static struct drm_driver exynos_drm_driver = {
 	.dumb_create		= exynos_drm_gem_dumb_create,
 	.dumb_map_offset	= exynos_drm_gem_dumb_map_offset,
 	.dumb_destroy		= exynos_drm_gem_dumb_destroy,
+	.prime_handle_to_fd	= drm_gem_prime_handle_to_fd,
+	.prime_fd_to_handle	= drm_gem_prime_fd_to_handle,
+	.gem_prime_export	= exynos_dmabuf_prime_export,
+	.gem_prime_import	= exynos_dmabuf_prime_import,
 	.ioctls			= exynos_ioctls,
 	.fops			= &exynos_drm_driver_fops,
 	.name	= DRIVER_NAME,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index c85f3a7..afd0cd4 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -96,7 +96,7 @@ out:
 	return roundup(size, PAGE_SIZE);
 }
 
-static struct page **exynos_gem_get_pages(struct drm_gem_object *obj,
+struct page **exynos_gem_get_pages(struct drm_gem_object *obj,
 						gfp_t gfpmask)
 {
 	struct inode *inode;
@@ -196,6 +196,7 @@ static int exynos_drm_gem_get_pages(struct drm_gem_object *obj)
 	}
 
 	npages = obj->size >> PAGE_SHIFT;
+	buf->page_size = PAGE_SIZE;
 
 	buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
 	if (!buf->sgt) {
@@ -308,7 +309,7 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
 	exynos_gem_obj = NULL;
 }
 
-static struct exynos_drm_gem_obj *exynos_drm_gem_init(struct drm_device *dev,
+struct exynos_drm_gem_obj *exynos_drm_gem_init(struct drm_device *dev,
 						      unsigned long size)
 {
 	struct exynos_drm_gem_obj *exynos_gem_obj;
@@ -614,8 +615,17 @@ int exynos_drm_gem_init_object(struct drm_gem_object *obj)
 
 void exynos_drm_gem_free_object(struct drm_gem_object *obj)
 {
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct exynos_drm_gem_buf *buf;
+
 	DRM_DEBUG_KMS("%s\n", __FILE__);
 
+	exynos_gem_obj = to_exynos_gem_obj(obj);
+	buf = exynos_gem_obj->buffer;
+
+	if (obj->import_attach)
+		drm_prime_gem_destroy(obj, buf->sgt);
+
 	exynos_drm_gem_destroy(to_exynos_gem_obj(obj));
 }
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
index 4ed8420..efc8252 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
@@ -40,6 +40,7 @@
  *	device address with IOMMU.
  * @sgt: sg table to transfer page data.
  * @pages: contain all pages to allocated memory region.
+ * @page_size: could be 4K, 64K or 1MB.
  * @size: size of allocated memory region.
  */
 struct exynos_drm_gem_buf {
@@ -47,6 +48,7 @@ struct exynos_drm_gem_buf {
 	dma_addr_t		dma_addr;
 	struct sg_table		*sgt;
 	struct page		**pages;
+	unsigned long		page_size;
 	unsigned long		size;
 };
 
@@ -74,9 +76,15 @@ struct exynos_drm_gem_obj {
 	unsigned int			flags;
 };
 
+struct page **exynos_gem_get_pages(struct drm_gem_object *obj, gfp_t gfpmask);
+
 /* destroy a buffer with gem object */
 void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj);
 
+/* create a private gem object and initialize it. */
+struct exynos_drm_gem_obj *exynos_drm_gem_init(struct drm_device *dev,
+						      unsigned long size);
+
 /* create a new buffer with gem object */
 struct exynos_drm_gem_obj *exynos_drm_gem_create(struct drm_device *dev,
 						unsigned int flags,
-- 
1.7.4.1
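
the chunk size chosen in lowlevel_buffer_allocate() and reused by
exynos_gem_map_dma_buf() (npages = buf->size / buf->page_size) can be modelled
in userspace as follows. this is a rough sketch, assuming 4 KiB base pages and
a 1 MiB section size, matching the "4K, 64K or 1MB" comment added to
exynos_drm_gem.h; the MODEL_* macros, struct chunk_layout and
pick_chunk_layout() are hypothetical names, not part of the patch.

```c
#include <assert.h>

#define MODEL_PAGE_SIZE	(1UL << 12)	/* assumption: 4 KiB base pages */
#define MODEL_SZ_64K	(1UL << 16)
#define MODEL_SZ_1M	(1UL << 20)	/* assumption: 1 MiB section size */

/* hypothetical result type: chunk size and number of chunks for a buffer */
struct chunk_layout {
	unsigned long page_size;
	unsigned long npages;
};

/* pick the largest chunk size that does not exceed the buffer size */
struct chunk_layout pick_chunk_layout(unsigned long size)
{
	struct chunk_layout layout;

	assert(size > 0);

	if (size >= MODEL_SZ_1M)
		layout.page_size = MODEL_SZ_1M;
	else if (size >= MODEL_SZ_64K)
		layout.page_size = MODEL_SZ_64K;
	else
		layout.page_size = MODEL_PAGE_SIZE;

	/* integer division: any remainder smaller than one chunk is dropped */
	layout.npages = size / layout.page_size;

	return layout;
}
```

because the division truncates, the sketch assumes the size was already rounded
up to a multiple of the chosen chunk size; a 1.5 MiB request, for example,
would yield a single 1 MiB chunk here.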


* [PATCH 3/4] drm/exynos: added userptr feature.
  2012-04-23 13:43 [PATCH 0/4] updated exynos-drm-next Inki Dae
  2012-04-23 13:43 ` [PATCH 1/4] drm/exynos: added cache attribute support for gem Inki Dae
  2012-04-23 13:43 ` [PATCH 2/4] drm/exynos: added drm prime feature Inki Dae
@ 2012-04-23 13:43 ` Inki Dae
  2012-04-24  5:17   ` [PATCH v2 " Inki Dae
                     ` (2 more replies)
  2012-04-23 13:43 ` [PATCH 4/4] drm/exynos: added a feature to get gem buffer information Inki Dae
  3 siblings, 3 replies; 77+ messages in thread
From: Inki Dae @ 2012-04-23 13:43 UTC
  To: airlied, dri-devel; +Cc: Inki Dae, kyungmin.park, sw0312.kim

this feature can be used to access a memory region allocated by malloc() in user
mode, or an mmapped memory region allocated by other memory allocators. the
userptr interface identifies the memory type through the vm_flags value and
gets the pages or page frame numbers backing the user-space region accordingly.
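
for the VM_PFNMAP case, exynos_drm_get_userptr() in the diff below walks the
region one page at a time with follow_pfn() and rejects it unless each page
frame number is exactly one more than the previous one. that contiguity check
can be sketched in userspace as follows; contiguous_base() and MODEL_PAGE_SHIFT
are hypothetical names, and the array of pfns stands in for the values that
follow_pfn() would return.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SHIFT	12	/* assumption: 4 KiB pages */

/*
 * returns 0 and stores the physical base address in *base when
 * pfns[0..n-1] are consecutive page frame numbers; returns -1 when the
 * array is empty or a gap is found.
 */
int contiguous_base(const unsigned long *pfns, size_t n, unsigned long *base)
{
	size_t i;

	assert(pfns != NULL && base != NULL);

	if (n == 0)
		return -1;

	for (i = 1; i < n; i++) {
		/* same test as "this_pfn != prev_pfn + 1" in the patch */
		if (pfns[i] != pfns[i - 1] + 1)
			return -1;
	}

	*base = pfns[0] << MODEL_PAGE_SHIFT;
	return 0;
}
```

unlike the patch, which uses prev_pfn == 0 to detect the first iteration, the
sketch indexes from the second element, so a region that starts at pfn 0 is
not treated specially.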

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
 drivers/gpu/drm/exynos/exynos_drm_gem.c |  258 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.h |   13 ++-
 include/drm/exynos_drm.h                |   25 +++-
 4 files changed, 296 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index f58a487..5bb0361 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -211,6 +211,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
 			DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
 			exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
+	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
+			exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS, exynos_plane_set_zpos_ioctl,
 			DRM_UNLOCKED | DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index afd0cd4..b68d4ea 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -66,6 +66,43 @@ static int check_gem_flags(unsigned int flags)
 	return 0;
 }
 
+static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
+{
+	struct vm_area_struct *vma_copy;
+
+	vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
+	if (!vma_copy)
+		return NULL;
+
+	if (vma->vm_ops && vma->vm_ops->open)
+		vma->vm_ops->open(vma);
+
+	if (vma->vm_file)
+		get_file(vma->vm_file);
+
+	memcpy(vma_copy, vma, sizeof(*vma));
+
+	vma_copy->vm_mm = NULL;
+	vma_copy->vm_next = NULL;
+	vma_copy->vm_prev = NULL;
+
+	return vma_copy;
+}
+
+static void put_vma(struct vm_area_struct *vma)
+{
+	if (!vma)
+		return;
+
+	if (vma->vm_ops && vma->vm_ops->close)
+		vma->vm_ops->close(vma);
+
+	if (vma->vm_file)
+		fput(vma->vm_file);
+
+	kfree(vma);
+}
+
 static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
 					struct vm_area_struct *vma)
 {
@@ -254,6 +291,41 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
 	/* add some codes for UNCACHED type here. TODO */
 }
 
+static void exynos_drm_put_userptr(struct drm_gem_object *obj)
+{
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct exynos_drm_gem_buf *buf;
+	struct vm_area_struct *vma;
+	int npages;
+
+	exynos_gem_obj = to_exynos_gem_obj(obj);
+	buf = exynos_gem_obj->buffer;
+	vma = exynos_gem_obj->vma;
+
+	if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
+		put_vma(exynos_gem_obj->vma);
+		goto out;
+	}
+
+	npages = buf->size >> PAGE_SHIFT;
+
+	npages--;
+	while (npages >= 0) {
+		if (buf->write)
+			set_page_dirty_lock(buf->pages[npages]);
+
+		put_page(buf->pages[npages]);
+		npages--;
+	}
+
+out:
+	kfree(buf->pages);
+	buf->pages = NULL;
+
+	kfree(buf->sgt);
+	buf->sgt = NULL;
+}
+
 static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
 					struct drm_file *file_priv,
 					unsigned int *handle)
@@ -293,6 +365,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
 
 	if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
 		exynos_drm_gem_put_pages(obj);
+	else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
+		exynos_drm_put_userptr(obj);
 	else
 		exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
 
@@ -606,6 +680,190 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+static int exynos_drm_get_userptr(struct drm_device *dev,
+				struct exynos_drm_gem_obj *obj,
+				unsigned long userptr,
+				unsigned int write)
+{
+	unsigned int get_npages;
+	unsigned long npages = 0;
+	struct vm_area_struct *vma;
+	struct exynos_drm_gem_buf *buf = obj->buffer;
+
+	vma = find_vma(current->mm, userptr);
+
+	/* the memory region mmaped with VM_PFNMAP. */
+	if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
+		unsigned long this_pfn, prev_pfn, pa;
+		unsigned long start, end, offset;
+		struct scatterlist *sgl;
+
+		start = userptr;
+		offset = userptr & ~PAGE_MASK;
+		end = start + buf->size;
+		sgl = buf->sgt->sgl;
+
+		for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
+			int ret = follow_pfn(vma, start, &this_pfn);
+			if (ret)
+				return ret;
+
+			if (prev_pfn == 0)
+				pa = this_pfn << PAGE_SHIFT;
+			else if (this_pfn != prev_pfn + 1)
+				return -EFAULT;
+
+			sg_dma_address(sgl) = (pa + offset);
+			sg_dma_len(sgl) = PAGE_SIZE;
+			prev_pfn = this_pfn;
+			pa += PAGE_SIZE;
+			npages++;
+			sgl = sg_next(sgl);
+		}
+
+		buf->dma_addr = pa + offset;
+
+		obj->vma = get_vma(vma);
+		if (!obj->vma)
+			return -ENOMEM;
+
+		buf->pfnmap = true;
+
+		return npages;
+	}
+
+	buf->write = write;
+	npages = buf->size >> PAGE_SHIFT;
+
+	down_read(&current->mm->mmap_sem);
+	get_npages = get_user_pages(current, current->mm, userptr,
+					npages, write, 1, buf->pages, NULL);
+	up_read(&current->mm->mmap_sem);
+	if (get_npages != npages)
+		DRM_ERROR("failed to get user_pages.\n");
+
+	buf->pfnmap = false;
+
+	return get_npages;
+}
+
+int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file_priv)
+{
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct drm_exynos_gem_userptr *args = data;
+	struct exynos_drm_gem_buf *buf;
+	struct scatterlist *sgl;
+	unsigned long size, userptr;
+	unsigned int npages;
+	int ret, get_npages;
+
+	DRM_DEBUG_KMS("%s\n", __FILE__);
+
+	if (!args->size) {
+		DRM_ERROR("invalid size.\n");
+		return -EINVAL;
+	}
+
+	ret = check_gem_flags(args->flags);
+	if (ret)
+		return ret;
+
+	size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
+	userptr = args->userptr;
+
+	buf = exynos_drm_init_buf(dev, size);
+	if (!buf)
+		return -ENOMEM;
+
+	exynos_gem_obj = exynos_drm_gem_init(dev, size);
+	if (!exynos_gem_obj) {
+		ret = -ENOMEM;
+		goto err_free_buffer;
+	}
+
+	buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
+	if (!buf->sgt) {
+		DRM_ERROR("failed to allocate buf->sgt.\n");
+		ret = -ENOMEM;
+		goto err_release_gem;
+	}
+
+	npages = size >> PAGE_SHIFT;
+
+	ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
+	if (ret < 0) {
+		DRM_ERROR("failed to initialize sg table.\n");
+		goto err_free_sgt;
+	}
+
+	buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
+	if (!buf->pages) {
+		DRM_ERROR("failed to allocate buf->pages\n");
+		ret = -ENOMEM;
+		goto err_free_table;
+	}
+
+	exynos_gem_obj->buffer = buf;
+
+	get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
+	if (get_npages != npages) {
+		DRM_ERROR("failed to get user_pages.\n");
+		ret = get_npages;
+		goto err_release_userptr;
+	}
+
+	ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
+						&args->handle);
+	if (ret < 0) {
+		DRM_ERROR("failed to create gem handle.\n");
+		goto err_release_userptr;
+	}
+
+	sgl = buf->sgt->sgl;
+
+	/*
+	 * if buf->pfnmap is true then update sgl of sgt with pages but
+	 * if buf->pfnmap is false then it means the sgl was updated already
+	 * so it doesn't need to update the sgl.
+	 */
+	if (!buf->pfnmap) {
+		unsigned int i = 0;
+
+		/* set all pages to sg list. */
+		while (i < npages) {
+			sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
+			sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
+			i++;
+			sgl = sg_next(sgl);
+		}
+	}
+
+	/* always use EXYNOS_BO_USERPTR as memory type for userptr. */
+	exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
+
+	return 0;
+
+err_release_userptr:
+	get_npages--;
+	while (get_npages >= 0)
+		put_page(buf->pages[get_npages--]);
+	kfree(buf->pages);
+	buf->pages = NULL;
+err_free_table:
+	sg_free_table(buf->sgt);
+err_free_sgt:
+	kfree(buf->sgt);
+	buf->sgt = NULL;
+err_release_gem:
+	drm_gem_object_release(&exynos_gem_obj->base);
+	kfree(exynos_gem_obj);
+	exynos_gem_obj = NULL;
+err_free_buffer:
+	exynos_drm_free_buf(dev, 0, buf);
+	return ret;
+}
+
 int exynos_drm_gem_init_object(struct drm_gem_object *obj)
 {
 	DRM_DEBUG_KMS("%s\n", __FILE__);
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
index efc8252..1de2241 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
@@ -29,7 +29,8 @@
 #define to_exynos_gem_obj(x)	container_of(x,\
 			struct exynos_drm_gem_obj, base)
 
-#define IS_NONCONTIG_BUFFER(f)		(f & EXYNOS_BO_NONCONTIG)
+#define IS_NONCONTIG_BUFFER(f)		((f & EXYNOS_BO_NONCONTIG) ||\
+					(f & EXYNOS_BO_USERPTR))
 
 /*
  * exynos drm gem buffer structure.
@@ -38,18 +39,23 @@
  * @dma_addr: bus address(accessed by dma) to allocated memory region.
  *	- this address could be physical address without IOMMU and
  *	device address with IOMMU.
+ * @write: whether pages will be written to by the caller.
  * @sgt: sg table to transfer page data.
  * @pages: contain all pages to allocated memory region.
  * @page_size: could be 4K, 64K or 1MB.
  * @size: size of allocated memory region.
+ * @pfnmap: indicate whether memory region from userptr is mmaped with
+ *	VM_PFNMAP or not.
  */
 struct exynos_drm_gem_buf {
 	void __iomem		*kvaddr;
 	dma_addr_t		dma_addr;
+	unsigned int		write;
 	struct sg_table		*sgt;
 	struct page		**pages;
 	unsigned long		page_size;
 	unsigned long		size;
+	bool			pfnmap;
 };
 
 /*
@@ -73,6 +79,7 @@ struct exynos_drm_gem_obj {
 	struct drm_gem_object		base;
 	struct exynos_drm_gem_buf	*buffer;
 	unsigned long			size;
+	struct vm_area_struct		*vma;
 	unsigned int			flags;
 };
 
@@ -127,6 +134,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
 int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
 			      struct drm_file *file_priv);
 
+/* map user space allocated by malloc to pages. */
+int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file_priv);
+
 /* initialize gem object. */
 int exynos_drm_gem_init_object(struct drm_gem_object *obj);
 
diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
index 2d6eb06..48eda6e 100644
--- a/include/drm/exynos_drm.h
+++ b/include/drm/exynos_drm.h
@@ -75,6 +75,23 @@ struct drm_exynos_gem_mmap {
 };
 
 /**
+ * User-requested user space importing structure
+ *
+ * @userptr: user space address allocated by malloc.
+ * @size: size to the buffer allocated by malloc.
+ * @flags: indicate user-desired cache attribute to map the allocated buffer
+ *	to kernel space.
+ * @handle: a returned handle to created gem object.
+ *	- this handle will be set by gem module of kernel side.
+ */
+struct drm_exynos_gem_userptr {
+	uint64_t userptr;
+	uint64_t size;
+	unsigned int flags;
+	unsigned int handle;
+};
+
+/**
  * A structure for user connection request of virtual display.
  *
  * @connection: indicate whether doing connetion or not by user.
@@ -105,13 +122,16 @@ enum e_drm_exynos_gem_mem_type {
 	EXYNOS_BO_CACHABLE	= 1 << 1,
 	/* write-combine mapping. */
 	EXYNOS_BO_WC		= 1 << 2,
+	/* user space memory allocated by malloc. */
+	EXYNOS_BO_USERPTR	= 1 << 3,
 	EXYNOS_BO_MASK		= EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
-					EXYNOS_BO_WC
+					EXYNOS_BO_WC | EXYNOS_BO_USERPTR
 };
 
 #define DRM_EXYNOS_GEM_CREATE		0x00
 #define DRM_EXYNOS_GEM_MAP_OFFSET	0x01
 #define DRM_EXYNOS_GEM_MMAP		0x02
+#define DRM_EXYNOS_GEM_USERPTR		0x03
 /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
 #define DRM_EXYNOS_PLANE_SET_ZPOS	0x06
 #define DRM_EXYNOS_VIDI_CONNECTION	0x07
@@ -125,6 +145,9 @@ enum e_drm_exynos_gem_mem_type {
 #define DRM_IOCTL_EXYNOS_GEM_MMAP	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
 
+#define DRM_IOCTL_EXYNOS_GEM_USERPTR	DRM_IOWR(DRM_COMMAND_BASE + \
+		DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
+
 #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
 
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH 4/4] drm/exynos: added a feature to get gem buffer information.
  2012-04-23 13:43 [PATCH 0/4] updated exynos-drm-next Inki Dae
                   ` (2 preceding siblings ...)
  2012-04-23 13:43 ` [PATCH 3/4] drm/exynos: added userptr feature Inki Dae
@ 2012-04-23 13:43 ` Inki Dae
  3 siblings, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-04-23 13:43 UTC (permalink / raw)
  To: airlied, dri-devel; +Cc: Inki Dae, kyungmin.park, sw0312.kim

This patch adds a feature to query gem buffer information: a user application
can get a buffer's flags and size at runtime through its gem handle.
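
For illustration, a user application might issue the new ioctl roughly as
sketched below. This is a minimal sketch, not part of the patch: the struct is
mirrored locally instead of being taken from exynos_drm.h, and
EXAMPLE_IOCTL_EXYNOS_GEM_GET and example_gem_get_info() are hypothetical
names. The request number assumes the usual DRM encoding (ioctl type 'd',
driver commands starting at 0x40).

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of the struct added by this patch (normally from exynos_drm.h). */
struct drm_exynos_gem_info {
	unsigned int handle;
	unsigned int flags;
	uint64_t size;
};

/*
 * DRM_IOWR(DRM_COMMAND_BASE + DRM_EXYNOS_GEM_GET, ...) spelled out by hand:
 * DRM ioctls use type 'd' and driver-specific commands start at 0x40.
 */
#define EXAMPLE_IOCTL_EXYNOS_GEM_GET \
	_IOWR('d', 0x40 + 0x04, struct drm_exynos_gem_info)

/*
 * Hypothetical helper: asks the driver for the flags and size of the buffer
 * behind `handle`. Returns 0 on success, -1 on failure (errno set by ioctl).
 */
static int example_gem_get_info(int drm_fd, unsigned int handle,
				struct drm_exynos_gem_info *info)
{
	memset(info, 0, sizeof(*info));
	info->handle = handle;

	if (ioctl(drm_fd, EXAMPLE_IOCTL_EXYNOS_GEM_GET, info) < 0)
		return -1;

	return 0;
}
```

On success, info->flags and info->size hold the values the driver stored for
that handle.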

Signed-off-by: Inki Dae <inki.dae@samsung.com>
---
 drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 ++
 drivers/gpu/drm/exynos/exynos_drm_gem.c |   26 ++++++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.h |    4 ++++
 include/drm/exynos_drm.h                |   21 ++++++++++++++++++++-
 4 files changed, 52 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index 5bb0361..21658fa 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -213,6 +213,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
 			exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
 			exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
+	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_GET,
+			exynos_drm_gem_get_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS, exynos_plane_set_zpos_ioctl,
 			DRM_UNLOCKED | DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index b68d4ea..c3ffc9f 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -864,6 +864,32 @@ err_free_buffer:
 	return ret;
 }
 
+int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file_priv)
+{	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct drm_exynos_gem_info *args = data;
+	struct drm_gem_object *obj;
+
+	mutex_lock(&dev->struct_mutex);
+
+	obj = drm_gem_object_lookup(dev, file_priv, args->handle);
+	if (!obj) {
+		DRM_ERROR("failed to lookup gem object.\n");
+		mutex_unlock(&dev->struct_mutex);
+		return -EINVAL;
+	}
+
+	exynos_gem_obj = to_exynos_gem_obj(obj);
+
+	args->flags = exynos_gem_obj->flags;
+	args->size = exynos_gem_obj->size;
+
+	drm_gem_object_unreference(obj);
+	mutex_unlock(&dev->struct_mutex);
+
+	return 0;
+}
+
 int exynos_drm_gem_init_object(struct drm_gem_object *obj)
 {
 	DRM_DEBUG_KMS("%s\n", __FILE__);
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
index 1de2241..9f5eabe 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
@@ -138,6 +138,10 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
 int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
 				      struct drm_file *file_priv);
 
+/* get buffer information to memory region allocated by gem. */
+int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file_priv);
+
 /* initialize gem object. */
 int exynos_drm_gem_init_object(struct drm_gem_object *obj);
 
diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
index 48eda6e..63e7ee8 100644
--- a/include/drm/exynos_drm.h
+++ b/include/drm/exynos_drm.h
@@ -92,6 +92,21 @@ struct drm_exynos_gem_userptr {
 };
 
 /**
+ * A structure to gem information.
+ *
+ * @handle: a handle to gem object created.
+ * @flags: flag value including memory type and cache attribute and
+ *	this value would be set by driver.
+ * @size: size to memory region allocated by gem and this size would
+ *	be set by driver.
+ */
+struct drm_exynos_gem_info {
+	unsigned int handle;
+	unsigned int flags;
+	uint64_t size;
+};
+
+/**
  * A structure for user connection request of virtual display.
  *
  * @connection: indicate whether doing connetion or not by user.
@@ -132,7 +147,8 @@ enum e_drm_exynos_gem_mem_type {
 #define DRM_EXYNOS_GEM_MAP_OFFSET	0x01
 #define DRM_EXYNOS_GEM_MMAP		0x02
 #define DRM_EXYNOS_GEM_USERPTR		0x03
-/* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
+#define DRM_EXYNOS_GEM_GET		0x04
+/* Reserved 0x05 for exynos specific gem ioctl */
 #define DRM_EXYNOS_PLANE_SET_ZPOS	0x06
 #define DRM_EXYNOS_VIDI_CONNECTION	0x07
 
@@ -148,6 +164,9 @@ enum e_drm_exynos_gem_mem_type {
 #define DRM_IOCTL_EXYNOS_GEM_USERPTR	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
 
+#define DRM_IOCTL_EXYNOS_GEM_GET	DRM_IOWR(DRM_COMMAND_BASE + \
+		DRM_EXYNOS_GEM_GET,	struct drm_exynos_gem_info)
+
 #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
 
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH v2 3/4] drm/exynos: added userptr feature.
  2012-04-23 13:43 ` [PATCH 3/4] drm/exynos: added userptr feature Inki Dae
@ 2012-04-24  5:17   ` Inki Dae
  2012-04-25 10:15     ` Dave Airlie
  2012-05-09  6:17   ` [PATCH 0/2 v3] " Inki Dae
  2012-05-15  7:34   ` [PATCH 3/4] " Rob Clark
  2 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-04-24  5:17 UTC (permalink / raw)
  To: airlied, dri-devel; +Cc: Inki Dae, kyungmin.park, sw0312.kim

This feature allows a memory region allocated by malloc() in user mode, or an
mmaped memory region allocated by another memory allocator, to be used as a
gem buffer. The userptr interface identifies the memory type through the
vm_flags value and gets either the pages or the page frame numbers of the
user space region accordingly.

changelog v2:
A memory region mmaped with VM_PFNMAP is physically contiguous, so the start
address of the region should be set in buf->dma_addr. The previous patch set
the end address in buf->dma_addr instead; v2 fixes that problem.
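
The fix described above can be sketched in isolation as follows. This is a
simplified stand-in for the VM_PFNMAP loop, not the driver code: the pfns[]
array plays the role of the values follow_pfn() would return, and
ex_contig_start() and EX_PAGE_SHIFT are hypothetical names that assume 4 KiB
pages.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Checks that pfns[0..npages-1] are consecutive and, if so, stores the bus
 * address of the FIRST page (plus the offset into that page) in *dma_addr.
 * Returns 0 on success, -1 if the range is empty or not contiguous.
 */
static int ex_contig_start(const unsigned long *pfns, size_t npages,
			   unsigned long offset, uint64_t *dma_addr)
{
	size_t i;

	if (npages == 0)
		return -1;

	for (i = 1; i < npages; i++) {
		if (pfns[i] != pfns[i - 1] + 1)
			return -1;
	}

	/* record the start of the region, not the running end address */
	*dma_addr = ((uint64_t)pfns[0] << EX_PAGE_SHIFT) + offset;
	return 0;
}
```

The earlier version of the patch assigned buf->dma_addr after the loop, when
the running address had already advanced past the last page; v2 assigns it
when the first pfn is seen.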

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
 drivers/gpu/drm/exynos/exynos_drm_gem.c |  265 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.h |   13 ++-
 include/drm/exynos_drm.h                |   25 +++-
 4 files changed, 303 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index f58a487..5bb0361 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -211,6 +211,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
 			DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
 			exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
+	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
+			exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS, exynos_plane_set_zpos_ioctl,
 			DRM_UNLOCKED | DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index afd0cd4..1ee5383 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -66,6 +66,43 @@ static int check_gem_flags(unsigned int flags)
 	return 0;
 }
 
+static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
+{
+	struct vm_area_struct *vma_copy;
+
+	vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
+	if (!vma_copy)
+		return NULL;
+
+	if (vma->vm_ops && vma->vm_ops->open)
+		vma->vm_ops->open(vma);
+
+	if (vma->vm_file)
+		get_file(vma->vm_file);
+
+	memcpy(vma_copy, vma, sizeof(*vma));
+
+	vma_copy->vm_mm = NULL;
+	vma_copy->vm_next = NULL;
+	vma_copy->vm_prev = NULL;
+
+	return vma_copy;
+}
+
+static void put_vma(struct vm_area_struct *vma)
+{
+	if (!vma)
+		return;
+
+	if (vma->vm_ops && vma->vm_ops->close)
+		vma->vm_ops->close(vma);
+
+	if (vma->vm_file)
+		fput(vma->vm_file);
+
+	kfree(vma);
+}
+
 static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
 					struct vm_area_struct *vma)
 {
@@ -254,6 +291,41 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
 	/* add some codes for UNCACHED type here. TODO */
 }
 
+static void exynos_drm_put_userptr(struct drm_gem_object *obj)
+{
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct exynos_drm_gem_buf *buf;
+	struct vm_area_struct *vma;
+	int npages;
+
+	exynos_gem_obj = to_exynos_gem_obj(obj);
+	buf = exynos_gem_obj->buffer;
+	vma = exynos_gem_obj->vma;
+
+	if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
+		put_vma(exynos_gem_obj->vma);
+		goto out;
+	}
+
+	npages = buf->size >> PAGE_SHIFT;
+
+	npages--;
+	while (npages >= 0) {
+		if (buf->write)
+			set_page_dirty_lock(buf->pages[npages]);
+
+		put_page(buf->pages[npages]);
+		npages--;
+	}
+
+out:
+	kfree(buf->pages);
+	buf->pages = NULL;
+
+	kfree(buf->sgt);
+	buf->sgt = NULL;
+}
+
 static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
 					struct drm_file *file_priv,
 					unsigned int *handle)
@@ -293,6 +365,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
 
 	if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
 		exynos_drm_gem_put_pages(obj);
+	else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
+		exynos_drm_put_userptr(obj);
 	else
 		exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
 
@@ -606,6 +680,197 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+static int exynos_drm_get_userptr(struct drm_device *dev,
+				struct exynos_drm_gem_obj *obj,
+				unsigned long userptr,
+				unsigned int write)
+{
+	unsigned int get_npages;
+	unsigned long npages = 0;
+	struct vm_area_struct *vma;
+	struct exynos_drm_gem_buf *buf = obj->buffer;
+
+	vma = find_vma(current->mm, userptr);
+
+	/* the memory region mmaped with VM_PFNMAP. */
+	if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
+		unsigned long this_pfn, prev_pfn, pa;
+		unsigned long start, end, offset;
+		struct scatterlist *sgl;
+		int ret;
+
+		start = userptr;
+		offset = userptr & ~PAGE_MASK;
+		end = start + buf->size;
+		sgl = buf->sgt->sgl;
+
+		for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
+			ret = follow_pfn(vma, start, &this_pfn);
+			if (ret)
+				return ret;
+
+			if (prev_pfn == 0) {
+				pa = this_pfn << PAGE_SHIFT;
+				buf->dma_addr = pa + offset;
+			} else if (this_pfn != prev_pfn + 1) {
+				ret = -EINVAL;
+				goto err;
+			}
+
+			sg_dma_address(sgl) = (pa + offset);
+			sg_dma_len(sgl) = PAGE_SIZE;
+			prev_pfn = this_pfn;
+			pa += PAGE_SIZE;
+			npages++;
+			sgl = sg_next(sgl);
+		}
+
+		obj->vma = get_vma(vma);
+		if (!obj->vma) {
+			ret = -ENOMEM;
+			goto err;
+		}
+
+		buf->pfnmap = true;
+
+		return npages;
+err:
+		buf->dma_addr = 0;
+		return ret;
+	}
+
+	buf->write = write;
+	npages = buf->size >> PAGE_SHIFT;
+
+	down_read(&current->mm->mmap_sem);
+	get_npages = get_user_pages(current, current->mm, userptr,
+					npages, write, 1, buf->pages, NULL);
+	up_read(&current->mm->mmap_sem);
+	if (get_npages != npages)
+		DRM_ERROR("failed to get user_pages.\n");
+
+	buf->pfnmap = false;
+
+	return get_npages;
+}
+
+int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file_priv)
+{
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct drm_exynos_gem_userptr *args = data;
+	struct exynos_drm_gem_buf *buf;
+	struct scatterlist *sgl;
+	unsigned long size, userptr;
+	unsigned int npages;
+	int ret, get_npages;
+
+	DRM_DEBUG_KMS("%s\n", __FILE__);
+
+	if (!args->size) {
+		DRM_ERROR("invalid size.\n");
+		return -EINVAL;
+	}
+
+	ret = check_gem_flags(args->flags);
+	if (ret)
+		return ret;
+
+	size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
+	userptr = args->userptr;
+
+	buf = exynos_drm_init_buf(dev, size);
+	if (!buf)
+		return -ENOMEM;
+
+	exynos_gem_obj = exynos_drm_gem_init(dev, size);
+	if (!exynos_gem_obj) {
+		ret = -ENOMEM;
+		goto err_free_buffer;
+	}
+
+	buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
+	if (!buf->sgt) {
+		DRM_ERROR("failed to allocate buf->sgt.\n");
+		ret = -ENOMEM;
+		goto err_release_gem;
+	}
+
+	npages = size >> PAGE_SHIFT;
+
+	ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
+	if (ret < 0) {
+		DRM_ERROR("failed to initialize sg table.\n");
+		goto err_free_sgt;
+	}
+
+	buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
+	if (!buf->pages) {
+		DRM_ERROR("failed to allocate buf->pages\n");
+		ret = -ENOMEM;
+		goto err_free_table;
+	}
+
+	exynos_gem_obj->buffer = buf;
+
+	get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
+	if (get_npages != npages) {
+		DRM_ERROR("failed to get user_pages.\n");
+		ret = get_npages;
+		goto err_release_userptr;
+	}
+
+	ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
+						&args->handle);
+	if (ret < 0) {
+		DRM_ERROR("failed to create gem handle.\n");
+		goto err_release_userptr;
+	}
+
+	sgl = buf->sgt->sgl;
+
+	/*
+	 * if buf->pfnmap is true then update sgl of sgt with pages but
+	 * if buf->pfnmap is false then it means the sgl was updated already
+	 * so it doesn't need to update the sgl.
+	 */
+	if (!buf->pfnmap) {
+		unsigned int i = 0;
+
+		/* set all pages to sg list. */
+		while (i < npages) {
+			sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
+			sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
+			i++;
+			sgl = sg_next(sgl);
+		}
+	}
+
+	/* always use EXYNOS_BO_USERPTR as memory type for userptr. */
+	exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
+
+	return 0;
+
+err_release_userptr:
+	get_npages--;
+	while (get_npages >= 0)
+		put_page(buf->pages[get_npages--]);
+	kfree(buf->pages);
+	buf->pages = NULL;
+err_free_table:
+	sg_free_table(buf->sgt);
+err_free_sgt:
+	kfree(buf->sgt);
+	buf->sgt = NULL;
+err_release_gem:
+	drm_gem_object_release(&exynos_gem_obj->base);
+	kfree(exynos_gem_obj);
+	exynos_gem_obj = NULL;
+err_free_buffer:
+	exynos_drm_free_buf(dev, 0, buf);
+	return ret;
+}
+
 int exynos_drm_gem_init_object(struct drm_gem_object *obj)
 {
 	DRM_DEBUG_KMS("%s\n", __FILE__);
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
index efc8252..1de2241 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
@@ -29,7 +29,8 @@
 #define to_exynos_gem_obj(x)	container_of(x,\
 			struct exynos_drm_gem_obj, base)
 
-#define IS_NONCONTIG_BUFFER(f)		(f & EXYNOS_BO_NONCONTIG)
+#define IS_NONCONTIG_BUFFER(f)		((f & EXYNOS_BO_NONCONTIG) ||\
+					(f & EXYNOS_BO_USERPTR))
 
 /*
  * exynos drm gem buffer structure.
@@ -38,18 +39,23 @@
  * @dma_addr: bus address(accessed by dma) to allocated memory region.
  *	- this address could be physical address without IOMMU and
  *	device address with IOMMU.
+ * @write: whether pages will be written to by the caller.
  * @sgt: sg table to transfer page data.
  * @pages: contain all pages to allocated memory region.
  * @page_size: could be 4K, 64K or 1MB.
  * @size: size of allocated memory region.
+ * @pfnmap: indicate whether memory region from userptr is mmaped with
+ *	VM_PFNMAP or not.
  */
 struct exynos_drm_gem_buf {
 	void __iomem		*kvaddr;
 	dma_addr_t		dma_addr;
+	unsigned int		write;
 	struct sg_table		*sgt;
 	struct page		**pages;
 	unsigned long		page_size;
 	unsigned long		size;
+	bool			pfnmap;
 };
 
 /*
@@ -73,6 +79,7 @@ struct exynos_drm_gem_obj {
 	struct drm_gem_object		base;
 	struct exynos_drm_gem_buf	*buffer;
 	unsigned long			size;
+	struct vm_area_struct		*vma;
 	unsigned int			flags;
 };
 
@@ -127,6 +134,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
 int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
 			      struct drm_file *file_priv);
 
+/* map user space allocated by malloc to pages. */
+int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file_priv);
+
 /* initialize gem object. */
 int exynos_drm_gem_init_object(struct drm_gem_object *obj);
 
diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
index 2d6eb06..48eda6e 100644
--- a/include/drm/exynos_drm.h
+++ b/include/drm/exynos_drm.h
@@ -75,6 +75,23 @@ struct drm_exynos_gem_mmap {
 };
 
 /**
+ * User-requested user space importing structure
+ *
+ * @userptr: user space address allocated by malloc.
+ * @size: size to the buffer allocated by malloc.
+ * @flags: indicate user-desired cache attribute to map the allocated buffer
+ *	to kernel space.
+ * @handle: a returned handle to created gem object.
+ *	- this handle will be set by gem module of kernel side.
+ */
+struct drm_exynos_gem_userptr {
+	uint64_t userptr;
+	uint64_t size;
+	unsigned int flags;
+	unsigned int handle;
+};
+
+/**
  * A structure for user connection request of virtual display.
  *
  * @connection: indicate whether doing connetion or not by user.
@@ -105,13 +122,16 @@ enum e_drm_exynos_gem_mem_type {
 	EXYNOS_BO_CACHABLE	= 1 << 1,
 	/* write-combine mapping. */
 	EXYNOS_BO_WC		= 1 << 2,
+	/* user space memory allocated by malloc. */
+	EXYNOS_BO_USERPTR	= 1 << 3,
 	EXYNOS_BO_MASK		= EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
-					EXYNOS_BO_WC
+					EXYNOS_BO_WC | EXYNOS_BO_USERPTR
 };
 
 #define DRM_EXYNOS_GEM_CREATE		0x00
 #define DRM_EXYNOS_GEM_MAP_OFFSET	0x01
 #define DRM_EXYNOS_GEM_MMAP		0x02
+#define DRM_EXYNOS_GEM_USERPTR		0x03
 /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
 #define DRM_EXYNOS_PLANE_SET_ZPOS	0x06
 #define DRM_EXYNOS_VIDI_CONNECTION	0x07
@@ -125,6 +145,9 @@ enum e_drm_exynos_gem_mem_type {
 #define DRM_IOCTL_EXYNOS_GEM_MMAP	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
 
+#define DRM_IOCTL_EXYNOS_GEM_USERPTR	DRM_IOWR(DRM_COMMAND_BASE + \
+		DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
+
 #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
 
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 3/4] drm/exynos: added userptr feature.
  2012-04-24  5:17   ` [PATCH v2 " Inki Dae
@ 2012-04-25 10:15     ` Dave Airlie
  2012-04-25 12:46       ` InKi Dae
  2012-05-05 10:19       ` daeinki
  0 siblings, 2 replies; 77+ messages in thread
From: Dave Airlie @ 2012-04-25 10:15 UTC (permalink / raw)
  To: Inki Dae; +Cc: kyungmin.park, sw0312.kim, dri-devel

On Tue, Apr 24, 2012 at 6:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> this feature could be used to use memory region allocated by malloc() in user
> mode and mmaped memory region allocated by other memory allocators. userptr
> interface can identify memory type through vm_flags value and would get
> pages or page frame numbers to user space appropriately.

Is there anything to stop the unprivileged userspace driver locking
all the RAM in the machine inside userptr?

Seems like a very bad plan; I know TTM has code to address this sort
of thing with limits.

Dave.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 3/4] drm/exynos: added userptr feature.
  2012-04-25 10:15     ` Dave Airlie
@ 2012-04-25 12:46       ` InKi Dae
  2012-05-05 10:19       ` daeinki
  1 sibling, 0 replies; 77+ messages in thread
From: InKi Dae @ 2012-04-25 12:46 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Inki Dae, kyungmin.park, sw0312.kim, dri-devel

Hi Dave,

2012/4/25, Dave Airlie <airlied@gmail.com>:
> On Tue, Apr 24, 2012 at 6:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> this feature could be used to use memory region allocated by malloc() in
>> user
>> mode and mmaped memory region allocated by other memory allocators.
>> userptr
>> interface can identify memory type through vm_flags value and would get
>> pages or page frame numbers to user space appropriately.
>
> Is there anything to stop the unprivileged userspace driver locking
> all the RAM in the machine inside userptr?
>
> Seems like a very bad plan; I know TTM has code to address this sort
> of thing with limits.
>

Thank you for pointing this out; I will look into it. Could anyone tell
me which part of TTM I should refer to? That would be helpful to me.

Thanks,
Inki Dae.


> Dave.
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 3/4] drm/exynos: added userptr feature.
  2012-04-25 10:15     ` Dave Airlie
  2012-04-25 12:46       ` InKi Dae
@ 2012-05-05 10:19       ` daeinki
  2012-05-05 10:22         ` Dave Airlie
  1 sibling, 1 reply; 77+ messages in thread
From: daeinki @ 2012-05-05 10:19 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Inki Dae, kyungmin.park, sw0312.kim, dri-devel

Hi Dave,

On Apr 25, 2012, at 7:15 PM, Dave Airlie <airlied@gmail.com> wrote:

> On Tue, Apr 24, 2012 at 6:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> this feature could be used to use memory region allocated by malloc() in user
>> mode and mmaped memory region allocated by other memory allocators. userptr
>> interface can identify memory type through vm_flags value and would get
>> pages or page frame numbers to user space appropriately.
> 
> Is there anything to stop the unprivileged userspace driver locking
> all the RAM in the machine inside userptr?
> 

Do you mean whether anything stops a user space driver from locking regions of RAM, so that once a user space driver has locked a region, nothing else in user space can access it? Could you please describe your concern in more detail so that we can solve the issue? My guess is that you mean a user-level driver, such as a specific EGL library, can allocate a memory region and lock it so that other user space applications cannot access the region until a hardware accelerator such as the 2D/3D core has finished rendering, or the opposite case.

Actually, v4l2 already uses this feature, so I did not consider that it could cause a problem. I have a feeling that I missed something, so I would appreciate any advice from you or anyone else.

Thanks,
Inki Dae.


> Seems like a very bad plan; I know TTM has code to address this sort
> of thing with limits.
> 
> Dave.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 3/4] drm/exynos: added userptr feature.
  2012-05-05 10:19       ` daeinki
@ 2012-05-05 10:22         ` Dave Airlie
  2012-05-07 18:18           ` Jerome Glisse
  2012-05-08  6:48           ` Inki Dae
  0 siblings, 2 replies; 77+ messages in thread
From: Dave Airlie @ 2012-05-05 10:22 UTC (permalink / raw)
  To: daeinki; +Cc: Inki Dae, kyungmin.park, sw0312.kim, dri-devel

On Sat, May 5, 2012 at 11:19 AM,  <daeinki@gmail.com> wrote:
> Hi Dave,
>
> On Apr 25, 2012, at 7:15 PM, Dave Airlie <airlied@gmail.com> wrote:
>
>> On Tue, Apr 24, 2012 at 6:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>>> this feature could be used to use memory region allocated by malloc() in user
>>> mode and mmaped memory region allocated by other memory allocators. userptr
>>> interface can identify memory type through vm_flags value and would get
>>> pages or page frame numbers to user space appropriately.
>>
>> Is there anything to stop the unprivileged userspace driver locking
>> all the RAM in the machine inside userptr?
>>
>
> you mean that there is something that it can stop user space driver locking some memory region  of RAM? and if any user space driver locked some region then anyone on user space can't access the region? could you please tell me about your concerns in more detail so that we can solve the issue? I guess you mean that any user level driver such as specific EGL library can allocate some memory region and also lock the region so that other user space applications can't access the region until rendering is completed by hw accelerator such as 2d/3d core or opposite case.
>
> actually, this feature has already been used by v4l2 so I didn't try to consider we could face with any problem with this and I've got a feeling maybe there is something I missed so I'd be happy for you or anyone give me any advices.

Well, v4l gets to make its own bad design decisions.

The problem is that an unprivileged user accessing the drm can lock all
the pages it allocates into memory by passing them to the kernel as
userptrs, thus bypassing the swap and blocking all other users on the
system.

Dave.
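
A limit of the kind discussed in this thread could be sketched as below. This
is a hypothetical illustration only: it is not taken from TTM or from the
exynos driver, and struct ex_pin_account, ex_pin_charge() and
ex_pin_uncharge() are made-up names. The idea is to charge every userptr
request against a cap before pinning pages and to refuse the request when the
cap would be exceeded.

```c
#include <stddef.h>

/* Hypothetical accounting state, e.g. one instance per user or per file. */
struct ex_pin_account {
	size_t pinned;	/* bytes currently pinned */
	size_t limit;	/* maximum bytes that may be pinned */
};

/*
 * Charge `size` bytes before pinning. Returns 0 on success, -1 if the
 * request would push the total past the limit (nothing is charged then).
 */
static int ex_pin_charge(struct ex_pin_account *acct, size_t size)
{
	/* pinned <= limit always holds, so this subtraction cannot wrap */
	if (size > acct->limit - acct->pinned)
		return -1;

	acct->pinned += size;
	return 0;
}

/* Give back a previous charge once the pages have been unpinned. */
static void ex_pin_uncharge(struct ex_pin_account *acct, size_t size)
{
	acct->pinned -= size;
}
```

With such a check in front of the page-pinning call, a single client can no
longer pin an unbounded amount of memory; where the limit comes from is left
open here.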

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 3/4] drm/exynos: added userptr feature.
  2012-05-05 10:22         ` Dave Airlie
@ 2012-05-07 18:18           ` Jerome Glisse
  2012-05-08  7:59             ` Inki Dae
  2012-05-08  6:48           ` Inki Dae
  1 sibling, 1 reply; 77+ messages in thread
From: Jerome Glisse @ 2012-05-07 18:18 UTC (permalink / raw)
  To: Dave Airlie; +Cc: Inki Dae, kyungmin.park, sw0312.kim, dri-devel

On Sat, May 5, 2012 at 6:22 AM, Dave Airlie <airlied@gmail.com> wrote:
> On Sat, May 5, 2012 at 11:19 AM,  <daeinki@gmail.com> wrote:
>> Hi Dave,
>>
>> On Apr 25, 2012, at 7:15 PM, Dave Airlie <airlied@gmail.com> wrote:
>>
>>> On Tue, Apr 24, 2012 at 6:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>>>> this feature could be used to use memory region allocated by malloc() in user
>>>> mode and mmaped memory region allocated by other memory allocators. userptr
>>>> interface can identify memory type through vm_flags value and would get
>>>> pages or page frame numbers to user space appropriately.
>>>
>>> Is there anything to stop the unpriviledged userspace driver locking
>>> all the RAM in the machine inside userptr?
>>>
>>
>> you mean that there is something that it can stop user space driver locking some memory region  of RAM? and if any user space driver locked some region then anyone on user space can't access the region? could you please tell me about your concerns in more detail so that we can solve the issue? I guess you mean that any user level driver such as specific EGL library can allocate some memory region and also lock the region so that other user space applications can't access the region until rendering is completed by hw accelerator such as 2d/3d core or opposite case.
>>
>> actually, this feature has already been used by v4l2 so I didn't try to consider we could face with any problem with this and I've got a feeling maybe there is something I missed so I'd be happy for you or anyone give me any advices.
>
> Well v4l get to make their own bad design decisions.
>
> The problem is if an unprivledged users accessing the drm can lock all
> the pages it allocates into memory, by passing them to the kernel as
> userptrs., thus bypassing the swap and blocking all other users on the
> system.
>
> Dave.

Besides that, you are not locking the vma, and afaik this means that the
pages backing the vma might change. Yes, you will still own the pages you
got, but userspace might be reading/writing different pages. The vma
would need to be locked, but then userspace might unlock it behind your
back and you are right back at the beginning.

Cheers,
Jerome

^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH v2 3/4] drm/exynos: added userptr feature.
  2012-05-05 10:22         ` Dave Airlie
  2012-05-07 18:18           ` Jerome Glisse
@ 2012-05-08  6:48           ` Inki Dae
  1 sibling, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-08  6:48 UTC (permalink / raw)
  To: 'Dave Airlie', daeinki; +Cc: kyungmin.park, sw0312.kim, dri-devel

Hi Dave,

> -----Original Message-----
> From: Dave Airlie [mailto:airlied@gmail.com]
> Sent: Saturday, May 05, 2012 7:23 PM
> To: daeinki@gmail.com
> Cc: Inki Dae; kyungmin.park@samsung.com; sw0312.kim@samsung.com; dri-
> devel@lists.freedesktop.org
> Subject: Re: [PATCH v2 3/4] drm/exynos: added userptr feature.
> 
> On Sat, May 5, 2012 at 11:19 AM,  <daeinki@gmail.com> wrote:
> > Hi Dave,
> >
> > On Apr 25, 2012, at 7:15 PM, Dave Airlie <airlied@gmail.com> wrote:
> >
> >> On Tue, Apr 24, 2012 at 6:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >>> this feature could be used to use memory region allocated by malloc()
> in user
> >>> mode and mmaped memory region allocated by other memory allocators.
> userptr
> >>> interface can identify memory type through vm_flags value and would
> get
> >>> pages or page frame numbers to user space appropriately.
> >>
> >> Is there anything to stop the unpriviledged userspace driver locking
> >> all the RAM in the machine inside userptr?
> >>
> >
> > you mean that there is something that it can stop user space driver
> locking some memory region  of RAM? and if any user space driver locked
> some region then anyone on user space can't access the region? could you
> please tell me about your concerns in more detail so that we can solve the
> issue? I guess you mean that any user level driver such as specific EGL
> library can allocate some memory region and also lock the region so that
> other user space applications can't access the region until rendering is
> completed by hw accelerator such as 2d/3d core or opposite case.
> >
> > actually, this feature has already been used by v4l2 so I didn't try to
> consider we could face with any problem with this and I've got a feeling
> maybe there is something I missed so I'd be happy for you or anyone give
> me any advices.
> 
> Well v4l get to make their own bad design decisions.
> 
> The problem is if an unprivledged users accessing the drm can lock all
> the pages it allocates into memory, by passing them to the kernel as
> userptrs., thus bypassing the swap and blocking all other users on the
> system.
> 

Thank you for your advice and comments. I will look over this feature
again.

We need this feature for our linux-based platform because the backends
of evas (used by elementary) and pixman (used by Cairo) need it to use
the hardware accelerator with only a user address, so I will re-post the
patch after resolving this issue if possible.

Thanks,
Inki Dae

> Dave.


^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH v2 3/4] drm/exynos: added userptr feature.
  2012-05-07 18:18           ` Jerome Glisse
@ 2012-05-08  7:59             ` Inki Dae
  2012-05-08 15:05               ` Jerome Glisse
  0 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-08  7:59 UTC (permalink / raw)
  To: 'Jerome Glisse', 'Dave Airlie'
  Cc: kyungmin.park, sw0312.kim, dri-devel

Hi Jerome,

> -----Original Message-----
> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> Sent: Tuesday, May 08, 2012 3:18 AM
> To: Dave Airlie
> Cc: daeinki@gmail.com; Inki Dae; kyungmin.park@samsung.com;
> sw0312.kim@samsung.com; dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH v2 3/4] drm/exynos: added userptr feature.
> 
> On Sat, May 5, 2012 at 6:22 AM, Dave Airlie <airlied@gmail.com> wrote:
> > On Sat, May 5, 2012 at 11:19 AM,  <daeinki@gmail.com> wrote:
> >> Hi Dave,
> >>
> >> On Apr 25, 2012, at 7:15 PM, Dave Airlie <airlied@gmail.com> wrote:
> >>
> >>> On Tue, Apr 24, 2012 at 6:17 AM, Inki Dae <inki.dae@samsung.com>
wrote:
> >>>> this feature could be used to use memory region allocated by malloc()
> in user
> >>>> mode and mmaped memory region allocated by other memory allocators.
> userptr
> >>>> interface can identify memory type through vm_flags value and would
> get
> >>>> pages or page frame numbers to user space appropriately.
> >>>
> >>> Is there anything to stop the unpriviledged userspace driver locking
> >>> all the RAM in the machine inside userptr?
> >>>
> >>
> >> you mean that there is something that it can stop user space driver
> locking some memory region  of RAM? and if any user space driver locked
> some region then anyone on user space can't access the region? could you
> please tell me about your concerns in more detail so that we can solve the
> issue? I guess you mean that any user level driver such as specific EGL
> library can allocate some memory region and also lock the region so that
> other user space applications can't access the region until rendering is
> completed by hw accelerator such as 2d/3d core or opposite case.
> >>
> >> actually, this feature has already been used by v4l2 so I didn't try to
> consider we could face with any problem with this and I've got a feeling
> maybe there is something I missed so I'd be happy for you or anyone give
> me any advices.
> >
> > Well v4l get to make their own bad design decisions.
> >
> > The problem is if an unprivledged users accessing the drm can lock all
> > the pages it allocates into memory, by passing them to the kernel as
> > userptrs., thus bypassing the swap and blocking all other users on the
> > system.
> >
> > Dave.
> 
> Beside that you are not locking the vma and afaik this means that the
> page backing the vma might change, yes you will still own the page you
> get but userspace might be reading/writing to different pages. The vma

Yes, right. The vma should be locked because the pages backing the vma
might change after swap-in. Thank you for pointing that out.


> would need to be locked but than the userspace might unlock it in your
> back and you start right from the begining.
> 

I'm not sure I understood your comment. Do you mean that there is some
interface that can unlock the vma (an existing interface on the drm side,
or one we would implement?) so that we can resolve the issue Dave pointed
out?

Thanks,
Inki Dae

> Cheers,
> Jerome


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH v2 3/4] drm/exynos: added userptr feature.
  2012-05-08  7:59             ` Inki Dae
@ 2012-05-08 15:05               ` Jerome Glisse
  0 siblings, 0 replies; 77+ messages in thread
From: Jerome Glisse @ 2012-05-08 15:05 UTC (permalink / raw)
  To: Inki Dae; +Cc: kyungmin.park, dri-devel, sw0312.kim

On Tue, May 8, 2012 at 3:59 AM, Inki Dae <inki.dae@samsung.com> wrote:
> Hi Jerome,
>
>> -----Original Message-----
>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>> Sent: Tuesday, May 08, 2012 3:18 AM
>> To: Dave Airlie
>> Cc: daeinki@gmail.com; Inki Dae; kyungmin.park@samsung.com;
>> sw0312.kim@samsung.com; dri-devel@lists.freedesktop.org
>> Subject: Re: [PATCH v2 3/4] drm/exynos: added userptr feature.
>>
>> On Sat, May 5, 2012 at 6:22 AM, Dave Airlie <airlied@gmail.com> wrote:
>> > On Sat, May 5, 2012 at 11:19 AM,  <daeinki@gmail.com> wrote:
>> >> Hi Dave,
>> >>
>> >> On Apr 25, 2012, at 7:15 PM, Dave Airlie <airlied@gmail.com> wrote:
>> >>
>> >>> On Tue, Apr 24, 2012 at 6:17 AM, Inki Dae <inki.dae@samsung.com>
> wrote:
>> >>>> this feature could be used to use memory region allocated by malloc()
>> in user
>> >>>> mode and mmaped memory region allocated by other memory allocators.
>> userptr
>> >>>> interface can identify memory type through vm_flags value and would
>> get
>> >>>> pages or page frame numbers to user space appropriately.
>> >>>
>> >>> Is there anything to stop the unpriviledged userspace driver locking
>> >>> all the RAM in the machine inside userptr?
>> >>>
>> >>
>> >> you mean that there is something that it can stop user space driver
>> locking some memory region  of RAM? and if any user space driver locked
>> some region then anyone on user space can't access the region? could you
>> please tell me about your concerns in more detail so that we can solve the
>> issue? I guess you mean that any user level driver such as specific EGL
>> library can allocate some memory region and also lock the region so that
>> other user space applications can't access the region until rendering is
>> completed by hw accelerator such as 2d/3d core or opposite case.
>> >>
>> >> actually, this feature has already been used by v4l2 so I didn't try to
>> consider we could face with any problem with this and I've got a feeling
>> maybe there is something I missed so I'd be happy for you or anyone give
>> me any advices.
>> >
>> > Well v4l get to make their own bad design decisions.
>> >
>> > The problem is if an unprivledged users accessing the drm can lock all
>> > the pages it allocates into memory, by passing them to the kernel as
>> > userptrs., thus bypassing the swap and blocking all other users on the
>> > system.
>> >
>> > Dave.
>>
>> Beside that you are not locking the vma and afaik this means that the
>> page backing the vma might change, yes you will still own the page you
>> get but userspace might be reading/writing to different pages. The vma
>
> Yes, right. the vma should be locked because the pages backing the vma
> might be changed after swap-in. thank you for your pointing.
>
>
>> would need to be locked but than the userspace might unlock it in your
>> back and you start right from the begining.
>>
>
> I'm not sure I understood your comments but my understanding is you mean
> that there is some interface that it can unlock(using some interface of drm
> side or implementing it?) the vma so we can resolve the issue Dave pointed
> out?
>
> Thanks,
> Inki Dae
>

Userspace can unlock the vma behind your back with the munlock system
call, and a malicious process might use that to trick your gpu somehow.
The thing is, there is no proper infrastructure in the linux kernel for
locking a vma for a long period of time. Because this messes with core
linux vm stuff, I believe the vm people should be involved and comment
on it. Also, we absolutely don't want a process to abuse the drm to
force locking, so you will need to make sure that what the drm does
won't go over the memlock limit of the process (and I am not even sure
that this limit is not 0 by default).

Anyway, my point is that I believe there is maybe no proper way to do
this right now. Maybe using VM_RESERVED would work, but I am not clear
on the implications of doing so. Note that in all cases you will need to
fragment the vma so that you don't suddenly convert more than you want
from an anonymous vma to something more restrictive.
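
The two userspace facts above (a per-process memlock limit, and munlock()
being callable at any time) can be sketched in plain C. This is only an
illustration under stated assumptions, not code from the patch: the
function names are hypothetical, and fits_memlock_limit() merely shows
the shape of the accounting a driver would have to respect. getrlimit(),
mlock() and munlock() are the standard POSIX/Linux calls.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>

/*
 * Would locking 'request' more bytes keep the caller within 'limit',
 * given that 'pinned' bytes are already accounted? All values are in
 * bytes; RLIM_INFINITY means "no limit". (Hypothetical helper.)
 */
int fits_memlock_limit(rlim_t limit, rlim_t pinned, rlim_t request)
{
	if (limit == RLIM_INFINITY)
		return 1;
	if (pinned > limit)
		return 0;
	return request <= limit - pinned;	/* written to avoid overflow */
}

/* Read the soft RLIMIT_MEMLOCK of the calling process; 0 on success. */
int read_memlock_limit(rlim_t *limit)
{
	struct rlimit rl;

	assert(limit != NULL);
	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	*limit = rl.rlim_cur;
	return 0;
}

/*
 * Lock a range and unlock it again. The munlock() call is the point:
 * nothing stops userspace from issuing it on a range that a driver
 * still believes to be locked.
 */
int lock_then_unlock(void *addr, size_t len)
{
	if (mlock(addr, len) != 0)
		return -1;	/* e.g. the range would exceed RLIMIT_MEMLOCK */
	return munlock(addr, len);
}
```

A caller would read the limit once with read_memlock_limit() and refuse
any request for which fits_memlock_limit() returns 0.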

Cheers,
Jerome

^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 0/2 v3] drm/exynos: added userptr feature
  2012-04-23 13:43 ` [PATCH 3/4] drm/exynos: added userptr feature Inki Dae
  2012-04-24  5:17   ` [PATCH v2 " Inki Dae
@ 2012-05-09  6:17   ` Inki Dae
  2012-05-09  6:17     ` [PATCH 1/2 v3] drm/exynos: added userptr limit ioctl Inki Dae
                       ` (2 more replies)
  2012-05-15  7:34   ` [PATCH 3/4] " Rob Clark
  2 siblings, 3 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-09  6:17 UTC (permalink / raw)
  To: airlied, dri-devel; +Cc: Inki Dae, kyungmin.park, sw0312.kim

this feature allows a memory region allocated by malloc() in user mode, or
an mmaped memory region allocated by another memory allocator, to be used
as a gem buffer. the userptr interface identifies the memory type through
the vm_flags value and gets the pages or page frame numbers backing the
user space region accordingly.

changelog v2:
a memory region mmaped with VM_PFNMAP is physically contiguous, so the
start address of the region should be stored in buf->dma_addr. the
previous patch stored the end address in buf->dma_addr instead;
v2 fixes that problem.
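
The v2 fix can be illustrated with a small standalone sketch. It is not
the driver code: the function name, the flat pfn array and the 4 KiB
page size are assumptions made for the example. It checks that the page
frame numbers are consecutive and returns the address of the first
byte of the region, taken from the first frame rather than the last
one visited.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Return the start address of a physically contiguous region described
 * by pfns[0..npages-1], or -1 if the frames are not consecutive.
 * 'offset' is the offset of the user pointer within its first page.
 */
long long contig_start_addr(const unsigned long *pfns, size_t npages,
			    unsigned long offset)
{
	size_t i;

	assert(pfns != NULL);
	if (npages == 0)
		return -1;

	for (i = 1; i < npages; i++) {
		if (pfns[i] != pfns[i - 1] + 1)
			return -1;	/* hole or reordering: not contiguous */
	}

	/* use the first frame, not the last one the loop looked at. */
	return ((long long)pfns[0] << SKETCH_PAGE_SHIFT) + (long long)offset;
}
```

For frames 0x100, 0x101, 0x102 and an in-page offset of 0x10 this yields
0x100010, the address of the first byte of the region.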

changelog v3:
mitigated the issues pointed out by Dave and Jerome.

to guarantee that the pages backing the user space region are not
swapped out, the VMAs covering the region are locked, and then unlocked
when the pages are released.

however, this lock might significantly degrade system performance
because the locked pages cannot be swapped out. so one more feature was
added: the userptr size a user may request is limited to a pre-defined
value, which can be changed through a userptr limit ioctl that only the
root user can access.
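
As a rough standalone sketch of the limit policy described above (not
the driver code itself): the 16MB default and the 64MB ceiling are the
values used in patch 1/2, while the function names, the 4 KiB page size
and the plain page rounding are assumptions made for this example.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_PAGE_SIZE	4096UL		/* assumption: 4 KiB pages */
#define SKETCH_LIMIT_DEFAULT	(16UL << 20)	/* 16MB default from patch 1/2 */
#define SKETCH_LIMIT_MAX	(64UL << 20)	/* USERPTR_MAX_SIZE in patch 1/2 */

/* round a byte count up to a whole number of pages. */
unsigned long sketch_round_up_pages(unsigned long size)
{
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* a new limit is accepted only within [one page, SKETCH_LIMIT_MAX]. */
int sketch_limit_is_valid(unsigned long new_limit)
{
	return new_limit >= SKETCH_PAGE_SIZE && new_limit <= SKETCH_LIMIT_MAX;
}

/* a request is accepted if it is non-empty and fits after rounding. */
int sketch_request_allowed(unsigned long size, unsigned long limit)
{
	assert(sketch_limit_is_valid(limit));
	if (size == 0 || size > ULONG_MAX - (SKETCH_PAGE_SIZE - 1))
		return 0;
	return sketch_round_up_pages(size) <= limit;
}
```

With the default limit, a 16MB request passes and a request one byte
larger is rejected, because it rounds up to 16MB plus one page.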

Inki Dae (2):
  drm/exynos: added userptr limit ioctl.
  drm/exynos: added userptr feature.

 drivers/gpu/drm/exynos/exynos_drm_drv.c |    8 +
 drivers/gpu/drm/exynos/exynos_drm_drv.h |    6 +
 drivers/gpu/drm/exynos/exynos_drm_gem.c |  356 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.h |   20 ++-
 include/drm/exynos_drm.h                |   43 ++++-
 5 files changed, 430 insertions(+), 3 deletions(-)

-- 
1.7.4.1

^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 1/2 v3] drm/exynos: added userptr limit ioctl.
  2012-05-09  6:17   ` [PATCH 0/2 v3] " Inki Dae
@ 2012-05-09  6:17     ` Inki Dae
  2012-05-09  6:17     ` [PATCH 2/2 v3] drm/exynos: added userptr feature Inki Dae
  2012-05-14  6:17     ` [PATCH 0/2 v4] " Inki Dae
  2 siblings, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-09  6:17 UTC (permalink / raw)
  To: airlied, dri-devel; +Cc: Inki Dae, kyungmin.park, sw0312.kim

this ioctl sets the maximum userptr size a user may request. only the
root user can access it.

with the userptr feature, an unprivileged user could allocate all the
pages on the system, leaving very few free physical pages. once the VMAs
within the user address space are locked, their pages cannot be swapped
out, which may significantly degrade system performance. this limit is
meant to avoid that situation.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/gpu/drm/exynos/exynos_drm_drv.c |    6 ++++++
 drivers/gpu/drm/exynos/exynos_drm_drv.h |    6 ++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.c |   22 ++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.h |    3 +++
 include/drm/exynos_drm.h                |   17 +++++++++++++++++
 5 files changed, 54 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index 9d3204c..1e68ec2 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -64,6 +64,9 @@ static int exynos_drm_load(struct drm_device *dev, unsigned long flags)
 		return -ENOMEM;
 	}
 
+	/* maximum size of userptr is limited to 16MB as default. */
+	private->userptr_limit = SZ_16M;
+
 	INIT_LIST_HEAD(&private->pageflip_event_list);
 	dev->dev_private = (void *)private;
 
@@ -221,6 +224,9 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
 			exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_GET,
 			exynos_drm_gem_get_ioctl, DRM_UNLOCKED),
+	DRM_IOCTL_DEF_DRV(EXYNOS_USER_LIMIT,
+			exynos_drm_gem_user_limit_ioctl, DRM_MASTER |
+			DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS, exynos_plane_set_zpos_ioctl,
 			DRM_UNLOCKED | DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h b/drivers/gpu/drm/exynos/exynos_drm_drv.h
index c82c90c..b38ed6f 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
@@ -235,6 +235,12 @@ struct exynos_drm_private {
 	 * this array is used to be aware of which crtc did it request vblank.
 	 */
 	struct drm_crtc *crtc[MAX_CRTC];
+
+	/*
+	 * maximum size of allocation by userptr feature.
+	 * - as default, this has 16MB and only root user can change it.
+	 */
+	unsigned long userptr_limit;
 };
 
 /*
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index fc91293..e6abb66 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -33,6 +33,8 @@
 #include "exynos_drm_gem.h"
 #include "exynos_drm_buf.h"
 
+#define USERPTR_MAX_SIZE		SZ_64M
+
 static unsigned int convert_to_vm_err_msg(int msg)
 {
 	unsigned int out_msg;
@@ -630,6 +632,26 @@ int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+int exynos_drm_gem_user_limit_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *filp)
+{
+	struct exynos_drm_private *priv = dev->dev_private;
+	struct drm_exynos_user_limit *limit = data;
+
+	if (limit->userptr_limit < PAGE_SIZE ||
+			limit->userptr_limit > USERPTR_MAX_SIZE) {
+		DRM_DEBUG_KMS("invalid userptr_limit size.\n");
+		return -EINVAL;
+	}
+
+	if (priv->userptr_limit == limit->userptr_limit)
+		return 0;
+
+	priv->userptr_limit = limit->userptr_limit;
+
+	return 0;
+}
+
 int exynos_drm_gem_init_object(struct drm_gem_object *obj)
 {
 	DRM_DEBUG_KMS("%s\n", __FILE__);
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
index 14d038b..3334c9f 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
@@ -78,6 +78,9 @@ struct exynos_drm_gem_obj {
 
 struct page **exynos_gem_get_pages(struct drm_gem_object *obj, gfp_t gfpmask);
 
+int exynos_drm_gem_user_limit_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *filp);
+
 /* destroy a buffer with gem object */
 void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj);
 
diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
index 54c97e8..52465dc 100644
--- a/include/drm/exynos_drm.h
+++ b/include/drm/exynos_drm.h
@@ -92,6 +92,19 @@ struct drm_exynos_gem_info {
 };
 
 /**
+ * A structure for userptr limit information.
+ *
+ * @userptr_limit: maximum size of a userptr buffer.
+ *	the buffer could be allocated by an unprivileged user using malloc()
+ *	and the size of the buffer is limited to the userptr_limit value.
+ * @pad: just padding to be 64-bit aligned.
+ */
+struct drm_exynos_user_limit {
+	unsigned int userptr_limit;
+	unsigned int pad;
+};
+
+/**
  * A structure for user connection request of virtual display.
  *
  * @connection: indicate whether doing connetion or not by user.
@@ -162,6 +175,7 @@ struct drm_exynos_g2d_exec {
 #define DRM_EXYNOS_GEM_MMAP		0x02
 /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
 #define DRM_EXYNOS_GEM_GET		0x04
+#define DRM_EXYNOS_USER_LIMIT		0x05
 #define DRM_EXYNOS_PLANE_SET_ZPOS	0x06
 #define DRM_EXYNOS_VIDI_CONNECTION	0x07
 
@@ -182,6 +196,9 @@ struct drm_exynos_g2d_exec {
 #define DRM_IOCTL_EXYNOS_GEM_GET	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_GEM_GET,	struct drm_exynos_gem_info)
 
+#define DRM_IOCTL_EXYNOS_USER_LIMIT	DRM_IOWR(DRM_COMMAND_BASE + \
+		DRM_EXYNOS_USER_LIMIT,	struct drm_exynos_user_limit)
+
 #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
 
-- 
1.7.4.1

^ permalink raw reply related	[flat|nested] 77+ messages in thread

* [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-09  6:17   ` [PATCH 0/2 v3] " Inki Dae
  2012-05-09  6:17     ` [PATCH 1/2 v3] drm/exynos: added userptr limit ioctl Inki Dae
@ 2012-05-09  6:17     ` Inki Dae
  2012-05-09 14:45       ` Jerome Glisse
  2012-05-14  6:17     ` [PATCH 0/2 v4] " Inki Dae
  2 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-09  6:17 UTC (permalink / raw)
  To: airlied, dri-devel; +Cc: Inki Dae, kyungmin.park, sw0312.kim

this feature is used to import a user space region, allocated by malloc()
or mmaped, into a gem. to guarantee that the pages backing the user space
region are not swapped out, the VMAs covering the region are locked, and
then unlocked when the pages are released.

however, this lock might significantly degrade system performance
because the locked pages cannot be swapped out, so we limit the userptr
size a user may request to a pre-defined value.

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
 drivers/gpu/drm/exynos/exynos_drm_gem.c |  334 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.h |   17 ++-
 include/drm/exynos_drm.h                |   26 +++-
 4 files changed, 376 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index 1e68ec2..e8ae3f1 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -220,6 +220,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MAP_OFFSET,
 			exynos_drm_gem_map_offset_ioctl, DRM_UNLOCKED |
 			DRM_AUTH),
+	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
+			exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
 			exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_GET,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index e6abb66..ccc6e3d 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -68,6 +68,83 @@ static int check_gem_flags(unsigned int flags)
 	return 0;
 }
 
+static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
+{
+	struct vm_area_struct *vma_copy;
+
+	vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
+	if (!vma_copy)
+		return NULL;
+
+	if (vma->vm_ops && vma->vm_ops->open)
+		vma->vm_ops->open(vma);
+
+	if (vma->vm_file)
+		get_file(vma->vm_file);
+
+	memcpy(vma_copy, vma, sizeof(*vma));
+
+	vma_copy->vm_mm = NULL;
+	vma_copy->vm_next = NULL;
+	vma_copy->vm_prev = NULL;
+
+	return vma_copy;
+}
+
+
+static void put_vma(struct vm_area_struct *vma)
+{
+	if (!vma)
+		return;
+
+	if (vma->vm_ops && vma->vm_ops->close)
+		vma->vm_ops->close(vma);
+
+	if (vma->vm_file)
+		fput(vma->vm_file);
+
+	kfree(vma);
+}
+
+/*
+ * lock_userptr_vma - lock or unlock the VMAs within a user address range
+ *
+ * this function locks the VMAs covering the userptr range so that the
+ * pages backing the user space region are not swapped out.
+ * if the VMAs aren't locked, those pages could be swapped out; after
+ * swap-in, userspace might access different pages than the device, and
+ * the dma of any device could access a physical memory region not intended.
+ */
+static int lock_userptr_vma(struct exynos_drm_gem_buf *buf, unsigned int lock)
+{
+	struct vm_area_struct *vma;
+	unsigned long start, end;
+
+	start = buf->userptr;
+	end = buf->userptr + buf->size - 1;
+
+	down_write(&current->mm->mmap_sem);
+
+	do {
+		vma = find_vma(current->mm, start);
+		if (!vma) {
+			up_write(&current->mm->mmap_sem);
+			return -EFAULT;
+		}
+
+		if (lock)
+			vma->vm_flags |= VM_LOCKED;
+		else
+			vma->vm_flags &= ~VM_LOCKED;
+
+		start = vma->vm_end + 1;
+	} while (vma->vm_end < end);
+
+	up_write(&current->mm->mmap_sem);
+
+	return 0;
+}
+
 static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
 					struct vm_area_struct *vma)
 {
@@ -256,6 +333,44 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
 	/* add some codes for UNCACHED type here. TODO */
 }
 
+static void exynos_drm_put_userptr(struct drm_gem_object *obj)
+{
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct exynos_drm_gem_buf *buf;
+	struct vm_area_struct *vma;
+	int npages;
+
+	exynos_gem_obj = to_exynos_gem_obj(obj);
+	buf = exynos_gem_obj->buffer;
+	vma = exynos_gem_obj->vma;
+
+	if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
+		put_vma(exynos_gem_obj->vma);
+		goto out;
+	}
+
+	npages = buf->size >> PAGE_SHIFT;
+
+	if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR && !buf->pfnmap)
+		lock_userptr_vma(buf, 0);
+
+	npages--;
+	while (npages >= 0) {
+		if (buf->write)
+			set_page_dirty_lock(buf->pages[npages]);
+
+		put_page(buf->pages[npages]);
+		npages--;
+	}
+
+out:
+	kfree(buf->pages);
+	buf->pages = NULL;
+
+	kfree(buf->sgt);
+	buf->sgt = NULL;
+}
+
 static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
 					struct drm_file *file_priv,
 					unsigned int *handle)
@@ -295,6 +410,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
 
 	if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
 		exynos_drm_gem_put_pages(obj);
+	else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
+		exynos_drm_put_userptr(obj);
 	else
 		exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
 
@@ -606,6 +723,223 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+static int exynos_drm_get_userptr(struct drm_device *dev,
+				struct exynos_drm_gem_obj *obj,
+				unsigned long userptr,
+				unsigned int write)
+{
+	unsigned int get_npages;
+	unsigned long npages = 0;
+	struct vm_area_struct *vma;
+	struct exynos_drm_gem_buf *buf = obj->buffer;
+	int ret;
+
+	down_read(&current->mm->mmap_sem);
+	vma = find_vma(current->mm, userptr);
+
+	/* the memory region mmaped with VM_PFNMAP. */
+	if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
+		unsigned long this_pfn, prev_pfn, pa;
+		unsigned long start, end, offset;
+		struct scatterlist *sgl;
+		int ret;
+
+		start = userptr;
+		offset = userptr & ~PAGE_MASK;
+		end = start + buf->size;
+		sgl = buf->sgt->sgl;
+
+		for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
+			ret = follow_pfn(vma, start, &this_pfn);
+			if (ret)
+				goto err;
+
+			if (prev_pfn == 0) {
+				pa = this_pfn << PAGE_SHIFT;
+				buf->dma_addr = pa + offset;
+			} else if (this_pfn != prev_pfn + 1) {
+				ret = -EINVAL;
+				goto err;
+			}
+
+			sg_dma_address(sgl) = (pa + offset);
+			sg_dma_len(sgl) = PAGE_SIZE;
+			prev_pfn = this_pfn;
+			pa += PAGE_SIZE;
+			npages++;
+			sgl = sg_next(sgl);
+		}
+
+		obj->vma = get_vma(vma);
+		if (!obj->vma) {
+			ret = -ENOMEM;
+			goto err;
+		}
+
+		up_read(&current->mm->mmap_sem);
+		buf->pfnmap = true;
+
+		return npages;
+err:
+		buf->dma_addr = 0;
+		up_read(&current->mm->mmap_sem);
+
+		return ret;
+	}
+
+	up_read(&current->mm->mmap_sem);
+
+	/*
+	 * lock the vma within userptr to avoid userspace buffer
+	 * from being swapped out.
+	 */
+	ret = lock_userptr_vma(buf, 1);
+	if (ret < 0) {
+		DRM_ERROR("failed to lock vma for userptr.\n");
+		lock_userptr_vma(buf, 0);
+		return 0;
+	}
+
+	buf->write = write;
+	npages = buf->size >> PAGE_SHIFT;
+
+	down_read(&current->mm->mmap_sem);
+	get_npages = get_user_pages(current, current->mm, userptr,
+					npages, write, 1, buf->pages, NULL);
+	up_read(&current->mm->mmap_sem);
+	if (get_npages != npages)
+		DRM_ERROR("failed to get user_pages.\n");
+
+	buf->userptr = userptr;
+	buf->pfnmap = false;
+
+	return get_npages;
+}
+
+int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file_priv)
+{
+	struct exynos_drm_private *priv = dev->dev_private;
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct drm_exynos_gem_userptr *args = data;
+	struct exynos_drm_gem_buf *buf;
+	struct scatterlist *sgl;
+	unsigned long size, userptr;
+	unsigned int npages;
+	int ret, get_npages;
+
+	DRM_DEBUG_KMS("%s\n", __FILE__);
+
+	if (!args->size) {
+		DRM_ERROR("invalid size.\n");
+		return -EINVAL;
+	}
+
+	ret = check_gem_flags(args->flags);
+	if (ret)
+		return ret;
+
+	size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
+
+	if (size > priv->userptr_limit) {
+		DRM_ERROR("exceeded maximum size of userptr.\n");
+		return -EINVAL;
+	}
+
+	userptr = args->userptr;
+
+	buf = exynos_drm_init_buf(dev, size);
+	if (!buf)
+		return -ENOMEM;
+
+	exynos_gem_obj = exynos_drm_gem_init(dev, size);
+	if (!exynos_gem_obj) {
+		ret = -ENOMEM;
+		goto err_free_buffer;
+	}
+
+	buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
+	if (!buf->sgt) {
+		DRM_ERROR("failed to allocate buf->sgt.\n");
+		ret = -ENOMEM;
+		goto err_release_gem;
+	}
+
+	npages = size >> PAGE_SHIFT;
+
+	ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
+	if (ret < 0) {
+		DRM_ERROR("failed to initialize sg table.\n");
+		goto err_free_sgt;
+	}
+
+	buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
+	if (!buf->pages) {
+		DRM_ERROR("failed to allocate buf->pages\n");
+		ret = -ENOMEM;
+		goto err_free_table;
+	}
+
+	exynos_gem_obj->buffer = buf;
+
+	get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
+	if (get_npages != npages) {
+		DRM_ERROR("failed to get user_pages.\n");
+		ret = get_npages;
+		goto err_release_userptr;
+	}
+
+	ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
+						&args->handle);
+	if (ret < 0) {
+		DRM_ERROR("failed to create gem handle.\n");
+		goto err_release_userptr;
+	}
+
+	sgl = buf->sgt->sgl;
+
+	/*
+	 * if buf->pfnmap is false then update sgl of sgt with pages but
+	 * if buf->pfnmap is true then it means the sgl was updated already
+	 * so it doesn't need to update the sgl.
+	 */
+	if (!buf->pfnmap) {
+		unsigned int i = 0;
+
+		/* set all pages to sg list. */
+		while (i < npages) {
+			sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
+			sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
+			i++;
+			sgl = sg_next(sgl);
+		}
+	}
+
+	/* always use EXYNOS_BO_USERPTR as memory type for userptr. */
+	exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
+
+	return 0;
+
+err_release_userptr:
+	get_npages--;
+	while (get_npages >= 0)
+		put_page(buf->pages[get_npages--]);
+	kfree(buf->pages);
+	buf->pages = NULL;
+err_free_table:
+	sg_free_table(buf->sgt);
+err_free_sgt:
+	kfree(buf->sgt);
+	buf->sgt = NULL;
+err_release_gem:
+	drm_gem_object_release(&exynos_gem_obj->base);
+	kfree(exynos_gem_obj);
+	exynos_gem_obj = NULL;
+err_free_buffer:
+	exynos_drm_free_buf(dev, 0, buf);
+	return ret;
+}
+
 int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
 				      struct drm_file *file_priv)
 {	struct exynos_drm_gem_obj *exynos_gem_obj;
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
index 3334c9f..72bd993 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
@@ -29,27 +29,35 @@
 #define to_exynos_gem_obj(x)	container_of(x,\
 			struct exynos_drm_gem_obj, base)
 
-#define IS_NONCONTIG_BUFFER(f)		(f & EXYNOS_BO_NONCONTIG)
+#define IS_NONCONTIG_BUFFER(f)	((f & EXYNOS_BO_NONCONTIG) ||\
+					(f & EXYNOS_BO_USERPTR))
 
 /*
  * exynos drm gem buffer structure.
  *
  * @kvaddr: kernel virtual address to allocated memory region.
+ * @userptr: user space address.
  * @dma_addr: bus address(accessed by dma) to allocated memory region.
  *	- this address could be physical address without IOMMU and
  *	device address with IOMMU.
+ * @write: whether pages will be written to by the caller.
  * @sgt: sg table to transfer page data.
  * @pages: contain all pages to allocated memory region.
  * @page_size: could be 4K, 64K or 1MB.
  * @size: size of allocated memory region.
+ * @pfnmap: indicate whether memory region from userptr is mmaped with
+ *	VM_PFNMAP or not.
  */
 struct exynos_drm_gem_buf {
 	void __iomem		*kvaddr;
+	unsigned long		userptr;
 	dma_addr_t		dma_addr;
+	unsigned int		write;
 	struct sg_table		*sgt;
 	struct page		**pages;
 	unsigned long		page_size;
 	unsigned long		size;
+	bool			pfnmap;
 };
 
 /*
@@ -64,6 +72,8 @@ struct exynos_drm_gem_buf {
  *	continuous memory region allocated by user request
  *	or at framebuffer creation.
  * @size: total memory size to physically non-continuous memory region.
+ * @vma: a pointer to the vma to user address space and used to release
+ *	the pages to user space.
  * @flags: indicate memory type to allocated buffer and cache attruibute.
  *
  * P.S. this object would be transfered to user as kms_bo.handle so
@@ -73,6 +83,7 @@ struct exynos_drm_gem_obj {
 	struct drm_gem_object		base;
 	struct exynos_drm_gem_buf	*buffer;
 	unsigned long			size;
+	struct vm_area_struct		*vma;
 	unsigned int			flags;
 };
 
@@ -130,6 +141,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
 int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
 			      struct drm_file *file_priv);
 
+/* map user space allocated by malloc to pages. */
+int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file_priv);
+
 /* get buffer information to memory region allocated by gem. */
 int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
 				      struct drm_file *file_priv);
diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
index 52465dc..33fa1e2 100644
--- a/include/drm/exynos_drm.h
+++ b/include/drm/exynos_drm.h
@@ -77,6 +77,23 @@ struct drm_exynos_gem_mmap {
 };
 
 /**
+ * User-requested user space importing structure
+ *
+ * @userptr: user space address allocated by malloc.
+ * @size: size to the buffer allocated by malloc.
+ * @flags: indicate user-desired cache attribute to map the allocated buffer
+ *	to kernel space.
+ * @handle: a returned handle to created gem object.
+ *	- this handle will be set by gem module of kernel side.
+ */
+struct drm_exynos_gem_userptr {
+	uint64_t userptr;
+	uint64_t size;
+	unsigned int flags;
+	unsigned int handle;
+};
+
+/**
  * A structure to gem information.
  *
  * @handle: a handle to gem object created.
@@ -135,8 +152,10 @@ enum e_drm_exynos_gem_mem_type {
 	EXYNOS_BO_CACHABLE	= 1 << 1,
 	/* write-combine mapping. */
 	EXYNOS_BO_WC		= 1 << 2,
+	/* user space memory allocated by malloc. */
+	EXYNOS_BO_USERPTR	= 1 << 3,
 	EXYNOS_BO_MASK		= EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
-					EXYNOS_BO_WC
+					EXYNOS_BO_WC | EXYNOS_BO_USERPTR
 };
 
 struct drm_exynos_g2d_get_ver {
@@ -173,7 +192,7 @@ struct drm_exynos_g2d_exec {
 #define DRM_EXYNOS_GEM_CREATE		0x00
 #define DRM_EXYNOS_GEM_MAP_OFFSET	0x01
 #define DRM_EXYNOS_GEM_MMAP		0x02
-/* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
+#define DRM_EXYNOS_GEM_USERPTR		0x03
 #define DRM_EXYNOS_GEM_GET		0x04
 #define DRM_EXYNOS_USER_LIMIT		0x05
 #define DRM_EXYNOS_PLANE_SET_ZPOS	0x06
@@ -193,6 +212,9 @@ struct drm_exynos_g2d_exec {
 #define DRM_IOCTL_EXYNOS_GEM_MMAP	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
 
+#define DRM_IOCTL_EXYNOS_GEM_USERPTR	DRM_IOWR(DRM_COMMAND_BASE + \
+		DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
+
 #define DRM_IOCTL_EXYNOS_GEM_GET	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_GEM_GET,	struct drm_exynos_gem_info)
 
-- 
1.7.4.1


* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-09  6:17     ` [PATCH 2/2 v3] drm/exynos: added userptr feature Inki Dae
@ 2012-05-09 14:45       ` Jerome Glisse
  2012-05-09 18:32         ` Jerome Glisse
  2012-05-10  1:39         ` Inki Dae
  0 siblings, 2 replies; 77+ messages in thread
From: Jerome Glisse @ 2012-05-09 14:45 UTC (permalink / raw)
  To: Inki Dae; +Cc: airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm

On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> This feature imports a user space region, allocated by malloc() or
> mmaped, into a gem. To guarantee that the user space pages are not
> swapped out, the VMAs covering the region are locked, then unlocked
> when the pages are released.
>
> But this lock might significantly degrade system performance because
> the pages cannot be swapped out, so we limit the user-requested userptr
> size to a pre-defined value.
>
> Signed-off-by: Inki Dae <inki.dae@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>


Again, I would like feedback from the mm people (adding cc). I am not sure
locking the vma is the right answer. As I said in my previous mail,
userspace can munlock it behind your back; maybe VM_RESERVED is better.
Even setting that aside, you don't check at all that the process
stays under its limit of locked pages; see RLIMIT_MEMLOCK in mm/mlock.c
for how it's done. Also, you mlock the complete vma, but the userptr you get
might sit inside, say, a 16M vma while you only care about 1M of it. If
you mark the whole vma as locked, then any time a new page is faulted in
somewhere in the vma outside the buffer you are interested in, it stays
allocated until the gem buffer is destroyed. I am not sure
what happens to the vma on the next malloc, whether it grows or not (I would
think it won't grow, as it would have different flags than new
anonymous memory).
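
The locked-page accounting mentioned above can be sketched in userspace C. This is only a model under stated assumptions: a 4 KiB page size, and the names `span_pages` and `userptr_lock_allowed` are hypothetical, not part of the exynos driver or the mm API. It follows the shape of the check done for mlock(): pages already locked plus pages requested must stay within RLIMIT_MEMLOCK unless the caller holds CAP_IPC_LOCK. It also counts only the pages the buffer actually touches, not the whole vma.

```c
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Number of pages touched by [addr, addr + len), partial pages included. */
static unsigned long span_pages(unsigned long addr, unsigned long len)
{
	unsigned long first = addr >> MODEL_PAGE_SHIFT;
	unsigned long last = (addr + len - 1) >> MODEL_PAGE_SHIFT;

	return len ? last - first + 1 : 0;
}

/*
 * Hypothetical helper: would pinning [addr, addr + len) keep the process
 * within its locked-memory limit? The limit is given in bytes, as
 * RLIMIT_MEMLOCK is, and converted to pages before comparing.
 */
static bool userptr_lock_allowed(unsigned long already_locked_pages,
				 unsigned long addr, unsigned long len,
				 unsigned long memlock_limit_bytes,
				 bool has_ipc_lock_cap)
{
	unsigned long limit_pages = memlock_limit_bytes >> MODEL_PAGE_SHIFT;
	unsigned long wanted = already_locked_pages + span_pages(addr, len);

	return wanted <= limit_pages || has_ipc_lock_cap;
}
```

With a 64 KiB limit, a 1M userptr (256 pages) would be refused for an unprivileged process in this model, which is why the limit "might be low" for this use.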

The whole business of directly using malloced memory for the gpu is fishy,
and I would really like to get it right rather than rely on never
hitting strange things like page migration or vma merging, or worse
things like over-locking pages and stealing memory.

Cheers,
Jerome


> ---
>  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
>  drivers/gpu/drm/exynos/exynos_drm_gem.c |  334 +++++++++++++++++++++++++++++++
>  drivers/gpu/drm/exynos/exynos_drm_gem.h |   17 ++-
>  include/drm/exynos_drm.h                |   26 +++-
>  4 files changed, 376 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> index 1e68ec2..e8ae3f1 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> @@ -220,6 +220,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
>        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MAP_OFFSET,
>                        exynos_drm_gem_map_offset_ioctl, DRM_UNLOCKED |
>                        DRM_AUTH),
> +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
> +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
>        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
>                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
>        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_GET,
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> index e6abb66..ccc6e3d 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> @@ -68,6 +68,83 @@ static int check_gem_flags(unsigned int flags)
>        return 0;
>  }
>
> +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
> +{
> +       struct vm_area_struct *vma_copy;
> +
> +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
> +       if (!vma_copy)
> +               return NULL;
> +
> +       if (vma->vm_ops && vma->vm_ops->open)
> +               vma->vm_ops->open(vma);
> +
> +       if (vma->vm_file)
> +               get_file(vma->vm_file);
> +
> +       memcpy(vma_copy, vma, sizeof(*vma));
> +
> +       vma_copy->vm_mm = NULL;
> +       vma_copy->vm_next = NULL;
> +       vma_copy->vm_prev = NULL;
> +
> +       return vma_copy;
> +}
> +
> +
> +static void put_vma(struct vm_area_struct *vma)
> +{
> +       if (!vma)
> +               return;
> +
> +       if (vma->vm_ops && vma->vm_ops->close)
> +               vma->vm_ops->close(vma);
> +
> +       if (vma->vm_file)
> +               fput(vma->vm_file);
> +
> +       kfree(vma);
> +}
> +
> +/*
> + * lock_userptr_vma - lock VMAs within user address space
> + *
> + * this function locks vma within user address space to avoid pages
> + * to the userspace from being swapped out.
> + * if this vma isn't locked, the pages to the userspace could be swapped out
> + * so unprivileged user might access different pages and dma of any device
> + * could access physical memory region not intended once swap-in.
> + */
> +static int lock_userptr_vma(struct exynos_drm_gem_buf *buf, unsigned int lock)
> +{
> +       struct vm_area_struct *vma;
> +       unsigned long start, end;
> +
> +       start = buf->userptr;
> +       end = buf->userptr + buf->size - 1;
> +
> +       down_write(&current->mm->mmap_sem);
> +
> +       do {
> +               vma = find_vma(current->mm, start);
> +               if (!vma) {
> +                       up_write(&current->mm->mmap_sem);
> +                       return -EFAULT;
> +               }
> +
> +               if (lock)
> +                       vma->vm_flags |= VM_LOCKED;
> +               else
> +                       vma->vm_flags &= ~VM_LOCKED;
> +
> +               start = vma->vm_end + 1;
> +       } while (vma->vm_end < end);
> +
> +       up_write(&current->mm->mmap_sem);
> +
> +       return 0;
> +}
> +
>  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
>                                        struct vm_area_struct *vma)
>  {
> @@ -256,6 +333,44 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
>        /* add some codes for UNCACHED type here. TODO */
>  }
>
> +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
> +{
> +       struct exynos_drm_gem_obj *exynos_gem_obj;
> +       struct exynos_drm_gem_buf *buf;
> +       struct vm_area_struct *vma;
> +       int npages;
> +
> +       exynos_gem_obj = to_exynos_gem_obj(obj);
> +       buf = exynos_gem_obj->buffer;
> +       vma = exynos_gem_obj->vma;
> +
> +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> +               put_vma(exynos_gem_obj->vma);
> +               goto out;
> +       }
> +
> +       npages = buf->size >> PAGE_SHIFT;
> +
> +       if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR && !buf->pfnmap)
> +               lock_userptr_vma(buf, 0);
> +
> +       npages--;
> +       while (npages >= 0) {
> +               if (buf->write)
> +                       set_page_dirty_lock(buf->pages[npages]);
> +
> +               put_page(buf->pages[npages]);
> +               npages--;
> +       }
> +
> +out:
> +       kfree(buf->pages);
> +       buf->pages = NULL;
> +
> +       kfree(buf->sgt);
> +       buf->sgt = NULL;
> +}
> +
>  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
>                                        struct drm_file *file_priv,
>                                        unsigned int *handle)
> @@ -295,6 +410,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
>
>        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
>                exynos_drm_gem_put_pages(obj);
> +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
> +               exynos_drm_put_userptr(obj);
>        else
>                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
>
> @@ -606,6 +723,223 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>        return 0;
>  }
>
> +static int exynos_drm_get_userptr(struct drm_device *dev,
> +                               struct exynos_drm_gem_obj *obj,
> +                               unsigned long userptr,
> +                               unsigned int write)
> +{
> +       unsigned int get_npages;
> +       unsigned long npages = 0;
> +       struct vm_area_struct *vma;
> +       struct exynos_drm_gem_buf *buf = obj->buffer;
> +       int ret;
> +
> +       down_read(&current->mm->mmap_sem);
> +       vma = find_vma(current->mm, userptr);
> +
> +       /* the memory region mmaped with VM_PFNMAP. */
> +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> +               unsigned long this_pfn, prev_pfn, pa;
> +               unsigned long start, end, offset;
> +               struct scatterlist *sgl;
> +               int ret;
> +
> +               start = userptr;
> +               offset = userptr & ~PAGE_MASK;
> +               end = start + buf->size;
> +               sgl = buf->sgt->sgl;
> +
> +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
> +                       ret = follow_pfn(vma, start, &this_pfn);
> +                       if (ret)
> +                               goto err;
> +
> +                       if (prev_pfn == 0) {
> +                               pa = this_pfn << PAGE_SHIFT;
> +                               buf->dma_addr = pa + offset;
> +                       } else if (this_pfn != prev_pfn + 1) {
> +                               ret = -EINVAL;
> +                               goto err;
> +                       }
> +
> +                       sg_dma_address(sgl) = (pa + offset);
> +                       sg_dma_len(sgl) = PAGE_SIZE;
> +                       prev_pfn = this_pfn;
> +                       pa += PAGE_SIZE;
> +                       npages++;
> +                       sgl = sg_next(sgl);
> +               }
> +
> +               obj->vma = get_vma(vma);
> +               if (!obj->vma) {
> +                       ret = -ENOMEM;
> +                       goto err;
> +               }
> +
> +               up_read(&current->mm->mmap_sem);
> +               buf->pfnmap = true;
> +
> +               return npages;
> +err:
> +               buf->dma_addr = 0;
> +               up_read(&current->mm->mmap_sem);
> +
> +               return ret;
> +       }
> +
> +       up_read(&current->mm->mmap_sem);
> +
> +       /*
> +        * lock the vma within userptr to avoid userspace buffer
> +        * from being swapped out.
> +        */
> +       ret = lock_userptr_vma(buf, 1);
> +       if (ret < 0) {
> +               DRM_ERROR("failed to lock vma for userptr.\n");
> +               lock_userptr_vma(buf, 0);
> +               return 0;
> +       }
> +
> +       buf->write = write;
> +       npages = buf->size >> PAGE_SHIFT;
> +
> +       down_read(&current->mm->mmap_sem);
> +       get_npages = get_user_pages(current, current->mm, userptr,
> +                                       npages, write, 1, buf->pages, NULL);
> +       up_read(&current->mm->mmap_sem);
> +       if (get_npages != npages)
> +               DRM_ERROR("failed to get user_pages.\n");
> +
> +       buf->userptr = userptr;
> +       buf->pfnmap = false;
> +
> +       return get_npages;
> +}
> +
> +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> +                                     struct drm_file *file_priv)
> +{
> +       struct exynos_drm_private *priv = dev->dev_private;
> +       struct exynos_drm_gem_obj *exynos_gem_obj;
> +       struct drm_exynos_gem_userptr *args = data;
> +       struct exynos_drm_gem_buf *buf;
> +       struct scatterlist *sgl;
> +       unsigned long size, userptr;
> +       unsigned int npages;
> +       int ret, get_npages;
> +
> +       DRM_DEBUG_KMS("%s\n", __FILE__);
> +
> +       if (!args->size) {
> +               DRM_ERROR("invalid size.\n");
> +               return -EINVAL;
> +       }
> +
> +       ret = check_gem_flags(args->flags);
> +       if (ret)
> +               return ret;
> +
> +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
> +
> +       if (size > priv->userptr_limit) {
> +               DRM_ERROR("excessed maximum size of userptr.\n");
> +               return -EINVAL;
> +       }
> +
> +       userptr = args->userptr;
> +
> +       buf = exynos_drm_init_buf(dev, size);
> +       if (!buf)
> +               return -ENOMEM;
> +
> +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
> +       if (!exynos_gem_obj) {
> +               ret = -ENOMEM;
> +               goto err_free_buffer;
> +       }
> +
> +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
> +       if (!buf->sgt) {
> +               DRM_ERROR("failed to allocate buf->sgt.\n");
> +               ret = -ENOMEM;
> +               goto err_release_gem;
> +       }
> +
> +       npages = size >> PAGE_SHIFT;
> +
> +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
> +       if (ret < 0) {
> +               DRM_ERROR("failed to initailize sg table.\n");
> +               goto err_free_sgt;
> +       }
> +
> +       buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
> +       if (!buf->pages) {
> +               DRM_ERROR("failed to allocate buf->pages\n");
> +               ret = -ENOMEM;
> +               goto err_free_table;
> +       }
> +
> +       exynos_gem_obj->buffer = buf;
> +
> +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
> +       if (get_npages != npages) {
> +               DRM_ERROR("failed to get user_pages.\n");
> +               ret = get_npages;
> +               goto err_release_userptr;
> +       }
> +
> +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
> +                                               &args->handle);
> +       if (ret < 0) {
> +               DRM_ERROR("failed to create gem handle.\n");
> +               goto err_release_userptr;
> +       }
> +
> +       sgl = buf->sgt->sgl;
> +
> +       /*
> +        * if buf->pfnmap is true then update sgl of sgt with pages but
> +        * if buf->pfnmap is false then it means the sgl was updated already
> +        * so it doesn't need to update the sgl.
> +        */
> +       if (!buf->pfnmap) {
> +               unsigned int i = 0;
> +
> +               /* set all pages to sg list. */
> +               while (i < npages) {
> +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
> +                       sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
> +                       i++;
> +                       sgl = sg_next(sgl);
> +               }
> +       }
> +
> +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
> +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
> +
> +       return 0;
> +
> +err_release_userptr:
> +       get_npages--;
> +       while (get_npages >= 0)
> +               put_page(buf->pages[get_npages--]);
> +       kfree(buf->pages);
> +       buf->pages = NULL;
> +err_free_table:
> +       sg_free_table(buf->sgt);
> +err_free_sgt:
> +       kfree(buf->sgt);
> +       buf->sgt = NULL;
> +err_release_gem:
> +       drm_gem_object_release(&exynos_gem_obj->base);
> +       kfree(exynos_gem_obj);
> +       exynos_gem_obj = NULL;
> +err_free_buffer:
> +       exynos_drm_free_buf(dev, 0, buf);
> +       return ret;
> +}
> +
>  int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
>                                      struct drm_file *file_priv)
>  {      struct exynos_drm_gem_obj *exynos_gem_obj;
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> index 3334c9f..72bd993 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> @@ -29,27 +29,35 @@
>  #define to_exynos_gem_obj(x)   container_of(x,\
>                        struct exynos_drm_gem_obj, base)
>
> -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
> +#define IS_NONCONTIG_BUFFER(f) ((f & EXYNOS_BO_NONCONTIG) ||\
> +                                       (f & EXYNOS_BO_USERPTR))
>
>  /*
>  * exynos drm gem buffer structure.
>  *
>  * @kvaddr: kernel virtual address to allocated memory region.
> + * @userptr: user space address.
>  * @dma_addr: bus address(accessed by dma) to allocated memory region.
>  *     - this address could be physical address without IOMMU and
>  *     device address with IOMMU.
> + * @write: whether pages will be written to by the caller.
>  * @sgt: sg table to transfer page data.
>  * @pages: contain all pages to allocated memory region.
>  * @page_size: could be 4K, 64K or 1MB.
>  * @size: size of allocated memory region.
> + * @pfnmap: indicate whether memory region from userptr is mmaped with
> + *     VM_PFNMAP or not.
>  */
>  struct exynos_drm_gem_buf {
>        void __iomem            *kvaddr;
> +       unsigned long           userptr;
>        dma_addr_t              dma_addr;
> +       unsigned int            write;
>        struct sg_table         *sgt;
>        struct page             **pages;
>        unsigned long           page_size;
>        unsigned long           size;
> +       bool                    pfnmap;
>  };
>
>  /*
> @@ -64,6 +72,8 @@ struct exynos_drm_gem_buf {
>  *     continuous memory region allocated by user request
>  *     or at framebuffer creation.
>  * @size: total memory size to physically non-continuous memory region.
> + * @vma: a pointer to the vma to user address space and used to release
> + *     the pages to user space.
>  * @flags: indicate memory type to allocated buffer and cache attruibute.
>  *
>  * P.S. this object would be transfered to user as kms_bo.handle so
> @@ -73,6 +83,7 @@ struct exynos_drm_gem_obj {
>        struct drm_gem_object           base;
>        struct exynos_drm_gem_buf       *buffer;
>        unsigned long                   size;
> +       struct vm_area_struct           *vma;
>        unsigned int                    flags;
>  };
>
> @@ -130,6 +141,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
>  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>                              struct drm_file *file_priv);
>
> +/* map user space allocated by malloc to pages. */
> +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> +                                     struct drm_file *file_priv);
> +
>  /* get buffer information to memory region allocated by gem. */
>  int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
>                                      struct drm_file *file_priv);
> diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
> index 52465dc..33fa1e2 100644
> --- a/include/drm/exynos_drm.h
> +++ b/include/drm/exynos_drm.h
> @@ -77,6 +77,23 @@ struct drm_exynos_gem_mmap {
>  };
>
>  /**
> + * User-requested user space importing structure
> + *
> + * @userptr: user space address allocated by malloc.
> + * @size: size to the buffer allocated by malloc.
> + * @flags: indicate user-desired cache attribute to map the allocated buffer
> + *     to kernel space.
> + * @handle: a returned handle to created gem object.
> + *     - this handle will be set by gem module of kernel side.
> + */
> +struct drm_exynos_gem_userptr {
> +       uint64_t userptr;
> +       uint64_t size;
> +       unsigned int flags;
> +       unsigned int handle;
> +};
> +
> +/**
>  * A structure to gem information.
>  *
>  * @handle: a handle to gem object created.
> @@ -135,8 +152,10 @@ enum e_drm_exynos_gem_mem_type {
>        EXYNOS_BO_CACHABLE      = 1 << 1,
>        /* write-combine mapping. */
>        EXYNOS_BO_WC            = 1 << 2,
> +       /* user space memory allocated by malloc. */
> +       EXYNOS_BO_USERPTR       = 1 << 3,
>        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
> -                                       EXYNOS_BO_WC
> +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
>  };
>
>  struct drm_exynos_g2d_get_ver {
> @@ -173,7 +192,7 @@ struct drm_exynos_g2d_exec {
>  #define DRM_EXYNOS_GEM_CREATE          0x00
>  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
>  #define DRM_EXYNOS_GEM_MMAP            0x02
> -/* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
> +#define DRM_EXYNOS_GEM_USERPTR         0x03
>  #define DRM_EXYNOS_GEM_GET             0x04
>  #define DRM_EXYNOS_USER_LIMIT          0x05
>  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
> @@ -193,6 +212,9 @@ struct drm_exynos_g2d_exec {
>  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
>                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
>
> +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
> +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
> +
>  #define DRM_IOCTL_EXYNOS_GEM_GET       DRM_IOWR(DRM_COMMAND_BASE + \
>                DRM_EXYNOS_GEM_GET,     struct drm_exynos_gem_info)
>
> --
> 1.7.4.1
>
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/dri-devel

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-09 14:45       ` Jerome Glisse
@ 2012-05-09 18:32         ` Jerome Glisse
  2012-05-10  2:44           ` Inki Dae
  2012-05-10  1:39         ` Inki Dae
  1 sibling, 1 reply; 77+ messages in thread
From: Jerome Glisse @ 2012-05-09 18:32 UTC (permalink / raw)
  To: Inki Dae; +Cc: airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm

On Wed, May 9, 2012 at 10:45 AM, Jerome Glisse <j.glisse@gmail.com> wrote:
> On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> This feature imports a user space region, allocated by malloc() or
>> mmaped, into a gem. To guarantee that the user space pages are not
>> swapped out, the VMAs covering the region are locked, then unlocked
>> when the pages are released.
>>
>> But this lock might significantly degrade system performance because
>> the pages cannot be swapped out, so we limit the user-requested userptr
>> size to a pre-defined value.
>>
>> Signed-off-by: Inki Dae <inki.dae@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>
>
> Again, I would like feedback from the mm people (adding cc). I am not sure
> locking the vma is the right answer. As I said in my previous mail,
> userspace can munlock it behind your back; maybe VM_RESERVED is better.
> Even setting that aside, you don't check at all that the process
> stays under its limit of locked pages; see RLIMIT_MEMLOCK in mm/mlock.c
> for how it's done. Also, you mlock the complete vma, but the userptr you get
> might sit inside, say, a 16M vma while you only care about 1M of it. If
> you mark the whole vma as locked, then any time a new page is faulted in
> somewhere in the vma outside the buffer you are interested in, it stays
> allocated until the gem buffer is destroyed. I am not sure
> what happens to the vma on the next malloc, whether it grows or not (I would
> think it won't grow, as it would have different flags than new
> anonymous memory).
>
> The whole business of directly using malloced memory for the gpu is fishy,
> and I would really like to get it right rather than rely on never
> hitting strange things like page migration or vma merging, or worse
> things like over-locking pages and stealing memory.
>
> Cheers,
> Jerome

I had a lengthy discussion with the mm people (thanks a lot for that). I
think we should separate two different use cases. The first is the
zero-copy upload case, i.e.:
app:
    ptr = malloc()
    ...
    glTex/VBO/UBO/...(ptr)
    free(ptr) or reuse it for other things
Here I guess you want to avoid having to do a memcpy inside the
gl library (it could be anything other than gl that has the same usage
pattern).

I.e. after the upload happens you don't care about those pages; they can be
removed from the vma or marked as COW so that anything touching
those pages after the upload won't change what you uploaded. Of course,
this assumes that the TLB cost of doing so is smaller than
the cost of memcpy'ing the data.

There are two ways to do that. Either you assume the app cannot read back
the data after the gl call and you do an unmap_mapping_range (make sure you
only unmap fully covered pages and that you copy partially covered pages), or
you want to allow userspace to still read the data or possibly overwrite
it.
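
The "only unmap fully covered pages, copy the rest" rule can be sketched as a small range split in plain C. This is an illustrative model only: the 4 KiB page size is an assumption, and `struct upload_split` and `split_upload` are hypothetical names that do not exist in the kernel.

```c
#define MODEL_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* How a user buffer [addr, addr + len) divides into copy and unmap parts. */
struct upload_split {
	unsigned long head_bytes;	/* leading partial page: must be copied */
	unsigned long first_full;	/* address of the first fully covered page */
	unsigned long full_pages;	/* fully covered pages: unmap candidates */
	unsigned long tail_bytes;	/* trailing partial page: must be copied */
};

static struct upload_split split_upload(unsigned long addr, unsigned long len)
{
	struct upload_split s = { 0, 0, 0, 0 };
	unsigned long end = addr + len;
	/* round the start up and the end down to page boundaries */
	unsigned long up = (addr + MODEL_PAGE_SIZE - 1) & ~(MODEL_PAGE_SIZE - 1);
	unsigned long down = end & ~(MODEL_PAGE_SIZE - 1);

	if (up >= down) {
		/* no page is fully covered: the whole buffer has to be copied */
		s.head_bytes = len;
		return s;
	}

	s.head_bytes = up - addr;
	s.first_full = up;
	s.full_pages = (down - up) / MODEL_PAGE_SIZE;
	s.tail_bytes = end - down;
	return s;
}
```

For a buffer at 0x1800 of length 0x2900, this yields 0x800 head bytes to copy, two full pages starting at 0x2000, and 0x100 tail bytes to copy.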

The second use case is more like the opencl case of
CL_MEM_USE_HOST_PTR, in which you want to use the same pages on the gpu
and keep the userspace vma pointing to those pages. I think the
agreement on this case is that there is no way right now to do it
sanely inside the linux kernel. mlocking will need proper accounting
against the rlimit (RLIMIT_MEMLOCK), but this limit might be low. Also,
the fork case might be problematic.

In the fork case the memory is anonymous, so it should be COWed in the
forked child. Relative to the cl context, that means the child could not
use the cl context with that memory, or at least that if the child writes to
this memory the cl will not see those changes. I guess the answer to
that one is that you really need to use the cl api to read the object
or to get a proper ptr to read it.
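
The CPU-side half of the fork behaviour described above can be observed from userspace. The sketch below is only an analogy: the parent stands in for whoever holds the original pages (the cl context in the discussion), and `parent_view_after_child_write` is a hypothetical name. It shows that a child's write to malloced (private anonymous) memory lands in the child's own copy and is not seen by the parent.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the first byte the parent sees after the child has overwritten
 * its view of the buffer, or -1 on error.
 */
static int parent_view_after_child_write(void)
{
	size_t len = 4096;
	char *buf = malloc(len);
	pid_t pid;
	int status, seen;

	if (!buf)
		return -1;
	memset(buf, 'A', len);

	pid = fork();
	if (pid < 0) {
		free(buf);
		return -1;
	}
	if (pid == 0) {
		/* child: the write breaks COW and goes to a private copy */
		memset(buf, 'B', len);
		_exit(buf[0] == 'B' ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) < 0 ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
		free(buf);
		return -1;
	}

	/* parent: still sees the data it wrote before the fork */
	seen = buf[0];
	free(buf);
	return seen;
}
```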

Anyway, in all cases, implementing this userptr thing needs a lot more
code. You have to check that the vma you are trying to use is
anonymous, handle only that case, and fall back to allocating new pages and
copying otherwise.
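
The decision described above can be sketched as follows. Everything here is a simplified stand-in, not kernel code: `struct vma_model` mimics only the bits consulted, "anonymous" is assumed to mean "no backing file and not a PFN mapping", and `choose_import_path` is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the few vma properties consulted here. */
struct vma_model {
	const void *vm_file;	/* non-NULL for file-backed mappings */
	bool pfnmap;		/* stands in for a VM_PFNMAP mapping */
};

enum import_path {
	IMPORT_PIN_PAGES,	/* anonymous memory: use the user pages directly */
	IMPORT_ALLOC_AND_COPY	/* anything else: allocate new pages and copy */
};

static enum import_path choose_import_path(const struct vma_model *vma)
{
	/* only plain anonymous memory takes the zero-copy path */
	if (vma && !vma->vm_file && !vma->pfnmap)
		return IMPORT_PIN_PAGES;

	/* no vma, file-backed, or PFN-mapped: fall back to a copy */
	return IMPORT_ALLOC_AND_COPY;
}
```

Keeping the fallback as the default branch means any mapping type that was not explicitly considered ends up on the safe copy path.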

Cheers,
Jerome



* RE: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-09 14:45       ` Jerome Glisse
  2012-05-09 18:32         ` Jerome Glisse
@ 2012-05-10  1:39         ` Inki Dae
  2012-05-10  4:58           ` Minchan Kim
  1 sibling, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-10  1:39 UTC (permalink / raw)
  To: 'Jerome Glisse'
  Cc: airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm

Hi Jerome,

> -----Original Message-----
> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> Sent: Wednesday, May 09, 2012 11:46 PM
> To: Inki Dae
> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
> 
> On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> > this feature is used to import user space region allocated by malloc() or
> > mmaped into a gem. and to guarantee the pages to user space not to be
> > swapped out, the VMAs within the user space would be locked and then unlocked
> > when the pages are released.
> >
> > but this lock might result in significant degradation of system performance
> > because the pages couldn't be swapped out so we limit user-desired userptr
> > size to pre-defined.
> >
> > Signed-off-by: Inki Dae <inki.dae@samsung.com>
> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> 
> 
> Again i would like feedback from mm people (adding cc). I am not sure

Thank you, I missed adding mm as cc.

> locking the vma is the right anwser as i said in my previous mail,
> userspace can munlock it in your back, maybe VM_RESERVED is better.

I know that the VM_RESERVED flag would also keep the pages from being
swapped out. But we should be able to unlock these pages whenever we want,
because otherwise we could allocate all the pages on the system and lock
them, which in turn may result in significant deterioration of system
performance (other processes requesting free memory might be blocked). So I
used the VM_LOCKED flag instead, but I'm not sure this way is best either.

> Anyway even not considering that you don't check at all that process
> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK

Thank you for your advice.

> for how it's done. Also you mlock complete vma but the userptr you get
> might be inside say 16M vma and you only care about 1M of userptr, if
> you mark the whole vma as locked than anytime a new page is fault in
> the vma else where than in the buffer you are interested then it got
> allocated for ever until the gem buffer is destroy, i am not sure of
> what happen to the vma on next malloc if it grows or not (i would
> think it won't grow at it would have different flags than new
> anonymous memory).
> 
> The whole business of directly using malloced memory for gpu is fishy
> and i would really like to get it right rather than relying on never
> hitting strange things like page migration, vma merging, or worse
> things like over locking pages and stealing memory.
> 

Your comments are very helpful to me, and for the next patch I will consider
the cases you pointed out that I missed.

Thanks,
Inki Dae

> Cheers,
> Jerome
> 
> 
> > ---
> >  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
> >  drivers/gpu/drm/exynos/exynos_drm_gem.c |  334 +++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/exynos/exynos_drm_gem.h |   17 ++-
> >  include/drm/exynos_drm.h                |   26 +++-
> >  4 files changed, 376 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> > index 1e68ec2..e8ae3f1 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> > @@ -220,6 +220,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MAP_OFFSET,
> >                        exynos_drm_gem_map_offset_ioctl, DRM_UNLOCKED |
> >                        DRM_AUTH),
> > +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
> > +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
> >                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_GET,
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> > index e6abb66..ccc6e3d 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> > @@ -68,6 +68,83 @@ static int check_gem_flags(unsigned int flags)
> >        return 0;
> >  }
> >
> > +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
> > +{
> > +       struct vm_area_struct *vma_copy;
> > +
> > +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
> > +       if (!vma_copy)
> > +               return NULL;
> > +
> > +       if (vma->vm_ops && vma->vm_ops->open)
> > +               vma->vm_ops->open(vma);
> > +
> > +       if (vma->vm_file)
> > +               get_file(vma->vm_file);
> > +
> > +       memcpy(vma_copy, vma, sizeof(*vma));
> > +
> > +       vma_copy->vm_mm = NULL;
> > +       vma_copy->vm_next = NULL;
> > +       vma_copy->vm_prev = NULL;
> > +
> > +       return vma_copy;
> > +}
> > +
> > +
> > +static void put_vma(struct vm_area_struct *vma)
> > +{
> > +       if (!vma)
> > +               return;
> > +
> > +       if (vma->vm_ops && vma->vm_ops->close)
> > +               vma->vm_ops->close(vma);
> > +
> > +       if (vma->vm_file)
> > +               fput(vma->vm_file);
> > +
> > +       kfree(vma);
> > +}
> > +
> > +/*
> > + * lock_userptr_vma - lock VMAs within user address space
> > + *
> > + * this function locks vma within user address space to avoid pages
> > + * to the userspace from being swapped out.
> > + * if this vma isn't locked, the pages to the userspace could be swapped out
> > + * so unprivileged user might access different pages and dma of any device
> > + * could access physical memory region not intended once swap-in.
> > + */
> > +static int lock_userptr_vma(struct exynos_drm_gem_buf *buf, unsigned int lock)
> > +{
> > +       struct vm_area_struct *vma;
> > +       unsigned long start, end;
> > +
> > +       start = buf->userptr;
> > +       end = buf->userptr + buf->size - 1;
> > +
> > +       down_write(&current->mm->mmap_sem);
> > +
> > +       do {
> > +               vma = find_vma(current->mm, start);
> > +               if (!vma) {
> > +                       up_write(&current->mm->mmap_sem);
> > +                       return -EFAULT;
> > +               }
> > +
> > +               if (lock)
> > +                       vma->vm_flags |= VM_LOCKED;
> > +               else
> > +                       vma->vm_flags &= ~VM_LOCKED;
> > +
> > +               start = vma->vm_end + 1;
> > +       } while (vma->vm_end < end);
> > +
> > +       up_write(&current->mm->mmap_sem);
> > +
> > +       return 0;
> > +}
> > +
> >  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
> >                                        struct vm_area_struct *vma)
> >  {
> > @@ -256,6 +333,44 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
> >        /* add some codes for UNCACHED type here. TODO */
> >  }
> >
> > +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
> > +{
> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
> > +       struct exynos_drm_gem_buf *buf;
> > +       struct vm_area_struct *vma;
> > +       int npages;
> > +
> > +       exynos_gem_obj = to_exynos_gem_obj(obj);
> > +       buf = exynos_gem_obj->buffer;
> > +       vma = exynos_gem_obj->vma;
> > +
> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> > +               put_vma(exynos_gem_obj->vma);
> > +               goto out;
> > +       }
> > +
> > +       npages = buf->size >> PAGE_SHIFT;
> > +
> > +       if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR && !buf->pfnmap)
> > +               lock_userptr_vma(buf, 0);
> > +
> > +       npages--;
> > +       while (npages >= 0) {
> > +               if (buf->write)
> > +                       set_page_dirty_lock(buf->pages[npages]);
> > +
> > +               put_page(buf->pages[npages]);
> > +               npages--;
> > +       }
> > +
> > +out:
> > +       kfree(buf->pages);
> > +       buf->pages = NULL;
> > +
> > +       kfree(buf->sgt);
> > +       buf->sgt = NULL;
> > +}
> > +
> >  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
> >                                        struct drm_file *file_priv,
> >                                        unsigned int *handle)
> > @@ -295,6 +410,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
> >
> >        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
> >                exynos_drm_gem_put_pages(obj);
> > +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
> > +               exynos_drm_put_userptr(obj);
> >        else
> >                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
> >
> > @@ -606,6 +723,223 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >        return 0;
> >  }
> >
> > +static int exynos_drm_get_userptr(struct drm_device *dev,
> > +                               struct exynos_drm_gem_obj *obj,
> > +                               unsigned long userptr,
> > +                               unsigned int write)
> > +{
> > +       unsigned int get_npages;
> > +       unsigned long npages = 0;
> > +       struct vm_area_struct *vma;
> > +       struct exynos_drm_gem_buf *buf = obj->buffer;
> > +       int ret;
> > +
> > +       down_read(&current->mm->mmap_sem);
> > +       vma = find_vma(current->mm, userptr);
> > +
> > +       /* the memory region mmaped with VM_PFNMAP. */
> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> > +               unsigned long this_pfn, prev_pfn, pa;
> > +               unsigned long start, end, offset;
> > +               struct scatterlist *sgl;
> > +               int ret;
> > +
> > +               start = userptr;
> > +               offset = userptr & ~PAGE_MASK;
> > +               end = start + buf->size;
> > +               sgl = buf->sgt->sgl;
> > +
> > +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
> > +                       ret = follow_pfn(vma, start, &this_pfn);
> > +                       if (ret)
> > +                               goto err;
> > +
> > +                       if (prev_pfn == 0) {
> > +                               pa = this_pfn << PAGE_SHIFT;
> > +                               buf->dma_addr = pa + offset;
> > +                       } else if (this_pfn != prev_pfn + 1) {
> > +                               ret = -EINVAL;
> > +                               goto err;
> > +                       }
> > +
> > +                       sg_dma_address(sgl) = (pa + offset);
> > +                       sg_dma_len(sgl) = PAGE_SIZE;
> > +                       prev_pfn = this_pfn;
> > +                       pa += PAGE_SIZE;
> > +                       npages++;
> > +                       sgl = sg_next(sgl);
> > +               }
> > +
> > +               obj->vma = get_vma(vma);
> > +               if (!obj->vma) {
> > +                       ret = -ENOMEM;
> > +                       goto err;
> > +               }
> > +
> > +               up_read(&current->mm->mmap_sem);
> > +               buf->pfnmap = true;
> > +
> > +               return npages;
> > +err:
> > +               buf->dma_addr = 0;
> > +               up_read(&current->mm->mmap_sem);
> > +
> > +               return ret;
> > +       }
> > +
> > +       up_read(&current->mm->mmap_sem);
> > +
> > +       /*
> > +        * lock the vma within userptr to avoid userspace buffer
> > +        * from being swapped out.
> > +        */
> > +       ret = lock_userptr_vma(buf, 1);
> > +       if (ret < 0) {
> > +               DRM_ERROR("failed to lock vma for userptr.\n");
> > +               lock_userptr_vma(buf, 0);
> > +               return 0;
> > +       }
> > +
> > +       buf->write = write;
> > +       npages = buf->size >> PAGE_SHIFT;
> > +
> > +       down_read(&current->mm->mmap_sem);
> > +       get_npages = get_user_pages(current, current->mm, userptr,
> > +                                       npages, write, 1, buf->pages, NULL);
> > +       up_read(&current->mm->mmap_sem);
> > +       if (get_npages != npages)
> > +               DRM_ERROR("failed to get user_pages.\n");
> > +
> > +       buf->userptr = userptr;
> > +       buf->pfnmap = false;
> > +
> > +       return get_npages;
> > +}
> > +
> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> > +                                     struct drm_file *file_priv)
> > +{
> > +       struct exynos_drm_private *priv = dev->dev_private;
> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
> > +       struct drm_exynos_gem_userptr *args = data;
> > +       struct exynos_drm_gem_buf *buf;
> > +       struct scatterlist *sgl;
> > +       unsigned long size, userptr;
> > +       unsigned int npages;
> > +       int ret, get_npages;
> > +
> > +       DRM_DEBUG_KMS("%s\n", __FILE__);
> > +
> > +       if (!args->size) {
> > +               DRM_ERROR("invalid size.\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       ret = check_gem_flags(args->flags);
> > +       if (ret)
> > +               return ret;
> > +
> > +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
> > +
> > +       if (size > priv->userptr_limit) {
> > +               DRM_ERROR("excessed maximum size of userptr.\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       userptr = args->userptr;
> > +
> > +       buf = exynos_drm_init_buf(dev, size);
> > +       if (!buf)
> > +               return -ENOMEM;
> > +
> > +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
> > +       if (!exynos_gem_obj) {
> > +               ret = -ENOMEM;
> > +               goto err_free_buffer;
> > +       }
> > +
> > +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
> > +       if (!buf->sgt) {
> > +               DRM_ERROR("failed to allocate buf->sgt.\n");
> > +               ret = -ENOMEM;
> > +               goto err_release_gem;
> > +       }
> > +
> > +       npages = size >> PAGE_SHIFT;
> > +
> > +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
> > +       if (ret < 0) {
> > +               DRM_ERROR("failed to initailize sg table.\n");
> > +               goto err_free_sgt;
> > +       }
> > +
> > +       buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
> > +       if (!buf->pages) {
> > +               DRM_ERROR("failed to allocate buf->pages\n");
> > +               ret = -ENOMEM;
> > +               goto err_free_table;
> > +       }
> > +
> > +       exynos_gem_obj->buffer = buf;
> > +
> > +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
> > +       if (get_npages != npages) {
> > +               DRM_ERROR("failed to get user_pages.\n");
> > +               ret = get_npages;
> > +               goto err_release_userptr;
> > +       }
> > +
> > +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
> > +                                               &args->handle);
> > +       if (ret < 0) {
> > +               DRM_ERROR("failed to create gem handle.\n");
> > +               goto err_release_userptr;
> > +       }
> > +
> > +       sgl = buf->sgt->sgl;
> > +
> > +       /*
> > +        * if buf->pfnmap is true then update sgl of sgt with pages but
> > +        * if buf->pfnmap is false then it means the sgl was updated already
> > +        * so it doesn't need to update the sgl.
> > +        */
> > +       if (!buf->pfnmap) {
> > +               unsigned int i = 0;
> > +
> > +               /* set all pages to sg list. */
> > +               while (i < npages) {
> > +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
> > +                       sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
> > +                       i++;
> > +                       sgl = sg_next(sgl);
> > +               }
> > +       }
> > +
> > +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
> > +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
> > +
> > +       return 0;
> > +
> > +err_release_userptr:
> > +       get_npages--;
> > +       while (get_npages >= 0)
> > +               put_page(buf->pages[get_npages--]);
> > +       kfree(buf->pages);
> > +       buf->pages = NULL;
> > +err_free_table:
> > +       sg_free_table(buf->sgt);
> > +err_free_sgt:
> > +       kfree(buf->sgt);
> > +       buf->sgt = NULL;
> > +err_release_gem:
> > +       drm_gem_object_release(&exynos_gem_obj->base);
> > +       kfree(exynos_gem_obj);
> > +       exynos_gem_obj = NULL;
> > +err_free_buffer:
> > +       exynos_drm_free_buf(dev, 0, buf);
> > +       return ret;
> > +}
> > +
> >  int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
> >                                      struct drm_file *file_priv)
> >  {      struct exynos_drm_gem_obj *exynos_gem_obj;
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> > index 3334c9f..72bd993 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> > @@ -29,27 +29,35 @@
> >  #define to_exynos_gem_obj(x)   container_of(x,\
> >                        struct exynos_drm_gem_obj, base)
> >
> > -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
> > +#define IS_NONCONTIG_BUFFER(f) ((f & EXYNOS_BO_NONCONTIG) ||\
> > +                                       (f & EXYNOS_BO_USERPTR))
> >
> >  /*
> >  * exynos drm gem buffer structure.
> >  *
> >  * @kvaddr: kernel virtual address to allocated memory region.
> > + * @userptr: user space address.
> >  * @dma_addr: bus address(accessed by dma) to allocated memory region.
> >  *     - this address could be physical address without IOMMU and
> >  *     device address with IOMMU.
> > + * @write: whether pages will be written to by the caller.
> >  * @sgt: sg table to transfer page data.
> >  * @pages: contain all pages to allocated memory region.
> >  * @page_size: could be 4K, 64K or 1MB.
> >  * @size: size of allocated memory region.
> > + * @pfnmap: indicate whether memory region from userptr is mmaped with
> > + *     VM_PFNMAP or not.
> >  */
> >  struct exynos_drm_gem_buf {
> >        void __iomem            *kvaddr;
> > +       unsigned long           userptr;
> >        dma_addr_t              dma_addr;
> > +       unsigned int            write;
> >        struct sg_table         *sgt;
> >        struct page             **pages;
> >        unsigned long           page_size;
> >        unsigned long           size;
> > +       bool                    pfnmap;
> >  };
> >
> >  /*
> > @@ -64,6 +72,8 @@ struct exynos_drm_gem_buf {
> >  *     continuous memory region allocated by user request
> >  *     or at framebuffer creation.
> >  * @size: total memory size to physically non-continuous memory region.
> > + * @vma: a pointer to the vma to user address space and used to release
> > + *     the pages to user space.
> >  * @flags: indicate memory type to allocated buffer and cache attruibute.
> >  *
> >  * P.S. this object would be transfered to user as kms_bo.handle so
> > @@ -73,6 +83,7 @@ struct exynos_drm_gem_obj {
> >        struct drm_gem_object           base;
> >        struct exynos_drm_gem_buf       *buffer;
> >        unsigned long                   size;
> > +       struct vm_area_struct           *vma;
> >        unsigned int                    flags;
> >  };
> >
> > @@ -130,6 +141,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
> >  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >                              struct drm_file *file_priv);
> >
> > +/* map user space allocated by malloc to pages. */
> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> > +                                     struct drm_file *file_priv);
> > +
> >  /* get buffer information to memory region allocated by gem. */
> >  int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
> >                                      struct drm_file *file_priv);
> > diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
> > index 52465dc..33fa1e2 100644
> > --- a/include/drm/exynos_drm.h
> > +++ b/include/drm/exynos_drm.h
> > @@ -77,6 +77,23 @@ struct drm_exynos_gem_mmap {
> >  };
> >
> >  /**
> > + * User-requested user space importing structure
> > + *
> > + * @userptr: user space address allocated by malloc.
> > + * @size: size to the buffer allocated by malloc.
> > + * @flags: indicate user-desired cache attribute to map the allocated buffer
> > + *     to kernel space.
> > + * @handle: a returned handle to created gem object.
> > + *     - this handle will be set by gem module of kernel side.
> > + */
> > +struct drm_exynos_gem_userptr {
> > +       uint64_t userptr;
> > +       uint64_t size;
> > +       unsigned int flags;
> > +       unsigned int handle;
> > +};
> > +
> > +/**
> >  * A structure to gem information.
> >  *
> >  * @handle: a handle to gem object created.
> > @@ -135,8 +152,10 @@ enum e_drm_exynos_gem_mem_type {
> >        EXYNOS_BO_CACHABLE      = 1 << 1,
> >        /* write-combine mapping. */
> >        EXYNOS_BO_WC            = 1 << 2,
> > +       /* user space memory allocated by malloc. */
> > +       EXYNOS_BO_USERPTR       = 1 << 3,
> >        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
> > -                                       EXYNOS_BO_WC
> > +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
> >  };
> >
> >  struct drm_exynos_g2d_get_ver {
> > @@ -173,7 +192,7 @@ struct drm_exynos_g2d_exec {
> >  #define DRM_EXYNOS_GEM_CREATE          0x00
> >  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
> >  #define DRM_EXYNOS_GEM_MMAP            0x02
> > -/* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
> > +#define DRM_EXYNOS_GEM_USERPTR         0x03
> >  #define DRM_EXYNOS_GEM_GET             0x04
> >  #define DRM_EXYNOS_USER_LIMIT          0x05
> >  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
> > @@ -193,6 +212,9 @@ struct drm_exynos_g2d_exec {
> >  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
> >                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
> >
> > +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
> > +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
> > +
> >  #define DRM_IOCTL_EXYNOS_GEM_GET       DRM_IOWR(DRM_COMMAND_BASE + \
> >                DRM_EXYNOS_GEM_GET,     struct drm_exynos_gem_info)
> >
> > --
> > 1.7.4.1
> >
> > _______________________________________________
> > dri-devel mailing list
> > dri-devel@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/dri-devel
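
A note on the range walk in lock_userptr_vma() quoted above: find_vma()
returns the first VMA whose vm_end is above the address, so the result
may start above the address (a hole), and vm_end is exclusive, so the
next lookup should start at vm_end rather than vm_end + 1. The
following userspace sketch models that walk; struct range,
find_range() and range_fully_mapped() are hypothetical stand-ins, not
kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a VMA: covers [start, end), end exclusive. */
struct range {
    unsigned long start;
    unsigned long end;
};

/*
 * Mimics the find_vma() contract: return the first range whose end is
 * above addr, or NULL. Ranges must be sorted and non-overlapping.
 * The result may start above addr, i.e. addr can sit in a hole.
 */
const struct range *find_range(const struct range *r, size_t n,
                               unsigned long addr)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (addr < r[i].end)
            return &r[i];
    return NULL;
}

/* True if [start, start + size) is covered by the ranges with no hole. */
bool range_fully_mapped(const struct range *r, size_t n,
                        unsigned long start, unsigned long size)
{
    unsigned long end = start + size;

    assert(size != 0 && end > start);

    while (start < end) {
        const struct range *cur = find_range(r, n, start);

        if (!cur || cur->start > start)
            return false;   /* unmapped hole inside the request */
        start = cur->end;   /* end is exclusive, so no "+ 1" */
    }
    return true;
}
```

A walk written this way rejects a request that crosses an unmapped gap
instead of silently touching whichever VMA happens to come next.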


^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-09 18:32         ` Jerome Glisse
@ 2012-05-10  2:44           ` Inki Dae
  2012-05-10 15:05             ` Jerome Glisse
  0 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-10  2:44 UTC (permalink / raw)
  To: 'Jerome Glisse'
  Cc: airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm

Hi Jerome,

Thank you again.

> -----Original Message-----
> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> Sent: Thursday, May 10, 2012 3:33 AM
> To: Inki Dae
> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
> 
> On Wed, May 9, 2012 at 10:45 AM, Jerome Glisse <j.glisse@gmail.com> wrote:
> > On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >> this feature is used to import user space region allocated by malloc() or
> >> mmaped into a gem. and to guarantee the pages to user space not to be
> >> swapped out, the VMAs within the user space would be locked and then unlocked
> >> when the pages are released.
> >>
> >> but this lock might result in significant degradation of system performance
> >> because the pages couldn't be swapped out so we limit user-desired userptr
> >> size to pre-defined.
> >>
> >> Signed-off-by: Inki Dae <inki.dae@samsung.com>
> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >
> >
> > Again i would like feedback from mm people (adding cc). I am not sure
> > locking the vma is the right anwser as i said in my previous mail,
> > userspace can munlock it in your back, maybe VM_RESERVED is better.
> > Anyway even not considering that you don't check at all that process
> > don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
> > for how it's done. Also you mlock complete vma but the userptr you get
> > might be inside say 16M vma and you only care about 1M of userptr, if
> > you mark the whole vma as locked than anytime a new page is fault in
> > the vma else where than in the buffer you are interested then it got
> > allocated for ever until the gem buffer is destroy, i am not sure of
> > what happen to the vma on next malloc if it grows or not (i would
> > think it won't grow at it would have different flags than new
> > anonymous memory).
> >
> > The whole business of directly using malloced memory for gpu is fishy
> > and i would really like to get it right rather than relying on never
> > hitting strange things like page migration, vma merging, or worse
> > things like over locking pages and stealing memory.
> >
> > Cheers,
> > Jerome
> 
> I had a lengthy discussion with mm people (thx a lot for that). I
> think we should split 2 different use case. The zero-copy upload case
> ie :
> app:
>     ptr = malloc()
>     ...
>     glTex/VBO/UBO/...(ptr)
>     free(ptr) or reuse it for other things
> For which i guess you want to avoid having to do a memcpy inside the
> gl library (could be anything else than gl that have same useage
> pattern).
> 

Right. In this case, we are using the userptr feature as a pixman and evas
backend to use the 2d accelerator.

> ie after the upload happen you don't care about those page they can
> removed from the vma or marked as cow so that anything messing with
> those page after the upload won't change what you uploaded. Of course

I'm not sure I understood your point, but could the pages be removed
from a vma that has VM_LOCKED or VM_RESERVED set? Once glTex/VBO/UBO/... is
called, the VMAs of the user space region would be locked. If the cpu has
accessed a significant part of the pages in user mode, the pages for that
part would already have been allocated by the page fault handler; after
that, through userptr, the VMAs of the user address space would be locked
(at this time, the remaining pages would also be allocated by
get_user_pages calling the page fault handler). I'd be glad to get any
comments and advice if I am missing something.

> this is assuming that the tlb cost of doing such thing is smaller than
> the cost of memcpy the data.
> 

Yes, in our test case the tlb cost (incurred by tlb misses) was smaller than
the cost of memcpy, and cpu usage was lower as well. Of course, this would
depend on gpu performance.

> Two way to do that, either you assume app can't not read back data
> after gl can and you do an unmap_mapping_range (make sure you only
> unmap fully covered page and that you copy non fully covered page) or
> you want to allow userspace to still read data or possibly overwrite
> them
> 
> Second use case is something more like for the opencl case of
> CL_MEM_USE_HOST_PTR, in which you want to use the same page in the gpu
> and keep the userspace vma pointing to those page. I think the
> agreement on this case is that there is no way right now to do it
> sanely inside linux kernel. mlocking will need proper accounting
> against rtlimit but this limit might be low. Also the fork case might
> be problematic.
> 
> For the fork case the memory is anonymous so it should be COWed in the
> fork child but relative to cl context that means the child could not
> use the cl context with that memory or at least if the child write to
> this memory the cl will not see those change. I guess the answer to
> that one is that you really need to use the cl api to read the object
> or get proper ptr to read it.
> 
> Anyway in all case, implementing this userptr thing need a lot more
> code. You have to check to that the vma you are trying to use is
> anonymous and only handle this case and fallback to alloc new page and
> copy otherwise..
> 

Thank you again for giving me such detailed comments and advice.
I may be missing some points, but I will check again.

Thanks,
Inki Dae

> Cheers,
> Jerome


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  1:39         ` Inki Dae
@ 2012-05-10  4:58           ` Minchan Kim
  2012-05-10  6:53             ` KOSAKI Motohiro
  2012-05-10  6:57             ` Inki Dae
  0 siblings, 2 replies; 77+ messages in thread
From: Minchan Kim @ 2012-05-10  4:58 UTC (permalink / raw)
  To: Inki Dae
  Cc: 'Jerome Glisse',
	airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm

On 05/10/2012 10:39 AM, Inki Dae wrote:

> Hi Jerome,
> 
>> -----Original Message-----
>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>> Sent: Wednesday, May 09, 2012 11:46 PM
>> To: Inki Dae
>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>
>> On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>>> this feature is used to import user space region allocated by malloc() or
>>> mmaped into a gem. and to guarantee the pages to user space not to be
>>> swapped out, the VMAs within the user space would be locked and then unlocked
>>> when the pages are released.
>>>
>>> but this lock might result in significant degradation of system performance
>>> because the pages couldn't be swapped out so we limit user-desired userptr
>>> size to pre-defined.
>>>
>>> Signed-off-by: Inki Dae <inki.dae@samsung.com>
>>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>
>>
>> Again i would like feedback from mm people (adding cc). I am not sure
> 
> Thank you, I missed adding mm as cc.
> 
>> locking the vma is the right anwser as i said in my previous mail,
>> userspace can munlock it in your back, maybe VM_RESERVED is better.
> 
> I know that with VM_RESERVED flag, also we can avoid the pages from being
> swapped out. but these pages should be unlocked anytime we want because we
> could allocate all pages on system and lock them, which in turn, it may
> result in significant deterioration of system performance.(maybe other
> processes requesting free memory would be blocked) so I used VM_LOCKED flags
> instead. but I'm not sure this way is best also.
> 
>> Anyway even not considering that you don't check at all that process
>> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
> 
> Thank you for your advices.
> 
>> for how it's done. Also you mlock complete vma but the userptr you get
>> might be inside say 16M vma and you only care about 1M of userptr, if
>> you mark the whole vma as locked than anytime a new page is fault in
>> the vma else where than in the buffer you are interested then it got
>> allocated for ever until the gem buffer is destroy, i am not sure of
>> what happen to the vma on next malloc if it grows or not (i would
>> think it won't grow at it would have different flags than new
>> anonymous memory).


I don't know the history in detail because you didn't send the full patches to linux-mm, and
I didn't read the code below, either.
I only read your description and Jerome's reply, so apparently there is something I missed.

Your goal is to keep some user pages from being swapped out while the kernel is using them, right?
Let's use get_user_pages(). Is there any reason you can't use it?
It raises the page reference count, so the reclaimer can't swap the pages out.
Isn't that enough?
Marking the whole VMA as mlocked is overkill.
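
To put rough numbers on the "whole VMA" concern, here is a minimal userspace
sketch. It is not code from the patch: span_pages() is a hypothetical helper
name, the 4 KiB page size in the note below is an assumption, and overflow of
addr + size is not handled. It only counts how many pages a byte range
touches, which is the amount that would need pinning if just the buffer were
pinned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: number of pages touched by
 * the byte range [addr, addr + size). page_size must be a power of two.
 */
size_t span_pages(uintptr_t addr, size_t size, size_t page_size)
{
	uintptr_t mask, first, last;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	if (size == 0)
		return 0;

	mask = ~(uintptr_t)(page_size - 1);
	first = addr & mask;			/* page holding the first byte */
	last = (addr + size - 1) & mask;	/* page holding the last byte */
	return (size_t)((last - first) / page_size) + 1;
}
```

With 4 KiB pages, a page-aligned 1 MiB userptr spans 256 pages, while the
16 MiB VMA from Jerome's example spans 4096, so locking the VMA would cover
16 times more memory than the buffer needs.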

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  4:58           ` Minchan Kim
@ 2012-05-10  6:53             ` KOSAKI Motohiro
  2012-05-10  7:27               ` Minchan Kim
  2012-05-10  6:57             ` Inki Dae
  1 sibling, 1 reply; 77+ messages in thread
From: KOSAKI Motohiro @ 2012-05-10  6:53 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Inki Dae, 'Jerome Glisse',
	airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm,
	kosaki.motohiro

(5/10/12 12:58 AM), Minchan Kim wrote:
> On 05/10/2012 10:39 AM, Inki Dae wrote:
>
>> Hi Jerome,
>>
>>> -----Original Message-----
>>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>>> Sent: Wednesday, May 09, 2012 11:46 PM
>>> To: Inki Dae
>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>>
>>> On Wed, May 9, 2012 at 2:17 AM, Inki Dae<inki.dae@samsung.com>  wrote:
>>>> this feature is used to import user space region allocated by malloc()
>>> or
>>>> mmaped into a gem. and to guarantee the pages to user space not to be
>>>> swapped out, the VMAs within the user space would be locked and then
>>> unlocked
>>>> when the pages are released.
>>>>
>>>> but this lock might result in significant degradation of system
>>> performance
>>>> because the pages couldn't be swapped out so we limit user-desired
>>> userptr
>>>> size to pre-defined.
>>>>
>>>> Signed-off-by: Inki Dae<inki.dae@samsung.com>
>>>> Signed-off-by: Kyungmin Park<kyungmin.park@samsung.com>
>>>
>>>
>>> Again i would like feedback from mm people (adding cc). I am not sure
>>
>> Thank you, I missed adding mm as cc.
>>
>>> locking the vma is the right anwser as i said in my previous mail,
>>> userspace can munlock it in your back, maybe VM_RESERVED is better.
>>
>> I know that with VM_RESERVED flag, also we can avoid the pages from being
>> swapped out. but these pages should be unlocked anytime we want because we
>> could allocate all pages on system and lock them, which in turn, it may
>> result in significant deterioration of system performance.(maybe other
>> processes requesting free memory would be blocked) so I used VM_LOCKED flags
>> instead. but I'm not sure this way is best also.
>>
>>> Anyway even not considering that you don't check at all that process
>>> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
>>
>> Thank you for your advices.
>>
>>> for how it's done. Also you mlock complete vma but the userptr you get
>>> might be inside say 16M vma and you only care about 1M of userptr, if
>>> you mark the whole vma as locked than anytime a new page is fault in
>>> the vma else where than in the buffer you are interested then it got
>>> allocated for ever until the gem buffer is destroy, i am not sure of
>>> what happen to the vma on next malloc if it grows or not (i would
>>> think it won't grow at it would have different flags than new
>>> anonymous memory).
>
>
> I don't know history in detail because you didn't have sent full patches to linux-mm and
> I didn't read the below code, either.
> Just read your description and reply of Jerome. Apparently, there is something I missed.
>
> Your goal is to avoid swap out some user pages which is used in kernel at the same time. Right?
> Let's use get_user_pages. Is there any issue you can't use it?

Maybe because get_user_pages() is fork unsafe? dunno.



> It increases page count so reclaimer can't swap out page.
> Isn't it enough?
> Marking whole VMA into MLCOKED is overkill.
>


* RE: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  4:58           ` Minchan Kim
  2012-05-10  6:53             ` KOSAKI Motohiro
@ 2012-05-10  6:57             ` Inki Dae
  2012-05-10  7:05               ` Minchan Kim
  1 sibling, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-10  6:57 UTC (permalink / raw)
  To: 'Minchan Kim'
  Cc: 'Jerome Glisse',
	airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm



> -----Original Message-----
> From: Minchan Kim [mailto:minchan@kernel.org]
> Sent: Thursday, May 10, 2012 1:58 PM
> To: Inki Dae
> Cc: 'Jerome Glisse'; airlied@linux.ie; dri-devel@lists.freedesktop.org;
> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
> 
> On 05/10/2012 10:39 AM, Inki Dae wrote:
> 
> > Hi Jerome,
> >
> >> -----Original Message-----
> >> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> >> Sent: Wednesday, May 09, 2012 11:46 PM
> >> To: Inki Dae
> >> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
> >> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
> >> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
> >>
> >> On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >>> this feature is used to import user space region allocated by malloc()
> >> or
> >>> mmaped into a gem. and to guarantee the pages to user space not to be
> >>> swapped out, the VMAs within the user space would be locked and then
> >> unlocked
> >>> when the pages are released.
> >>>
> >>> but this lock might result in significant degradation of system
> >> performance
> >>> because the pages couldn't be swapped out so we limit user-desired
> >> userptr
> >>> size to pre-defined.
> >>>
> >>> Signed-off-by: Inki Dae <inki.dae@samsung.com>
> >>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >>
> >>
> >> Again i would like feedback from mm people (adding cc). I am not sure
> >
> > Thank you, I missed adding mm as cc.
> >
> >> locking the vma is the right anwser as i said in my previous mail,
> >> userspace can munlock it in your back, maybe VM_RESERVED is better.
> >
> > I know that with VM_RESERVED flag, also we can avoid the pages from
> being
> > swapped out. but these pages should be unlocked anytime we want because
> we
> > could allocate all pages on system and lock them, which in turn, it may
> > result in significant deterioration of system performance.(maybe other
> > processes requesting free memory would be blocked) so I used VM_LOCKED
> flags
> > instead. but I'm not sure this way is best also.
> >
> >> Anyway even not considering that you don't check at all that process
> >> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
> >
> > Thank you for your advices.
> >
> >> for how it's done. Also you mlock complete vma but the userptr you get
> >> might be inside say 16M vma and you only care about 1M of userptr, if
> >> you mark the whole vma as locked than anytime a new page is fault in
> >> the vma else where than in the buffer you are interested then it got
> >> allocated for ever until the gem buffer is destroy, i am not sure of
> >> what happen to the vma on next malloc if it grows or not (i would
> >> think it won't grow at it would have different flags than new
> >> anonymous memory).
> 
> 
> I don't know history in detail because you didn't have sent full patches
> to linux-mm and
> I didn't read the below code, either.
> Just read your description and reply of Jerome. Apparently, there is
> something I missed.
> 
> Your goal is to avoid swap out some user pages which is used in kernel at
> the same time. Right?
> Let's use get_user_pages. Is there any issue you can't use it?
> It increases page count so reclaimer can't swap out page.
> Isn't it enough?
> Marking whole VMA into MLCOKED is overkill.
> 

As I mentioned, we are already using get_user_pages(). As you said, this
function increases the page count, but only for the parts of the user address
space that the CPU has already accessed. The other pages would be allocated by
the page fault handler once get_user_pages() is called. If so, then after that
call the refcount (page->_count) would be 2 for the pages the user has already
accessed and just 1 for all the other pages. So we may have to lock only the
pages never accessed by the CPU to keep them from being swapped out.

Thanks,
Inki Dae

> --
> Kind regards,
> Minchan Kim


* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  6:57             ` Inki Dae
@ 2012-05-10  7:05               ` Minchan Kim
  2012-05-10  7:59                 ` InKi Dae
  0 siblings, 1 reply; 77+ messages in thread
From: Minchan Kim @ 2012-05-10  7:05 UTC (permalink / raw)
  To: Inki Dae
  Cc: 'Jerome Glisse',
	airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm

On 05/10/2012 03:57 PM, Inki Dae wrote:

> 
> 
>> -----Original Message-----
>> From: Minchan Kim [mailto:minchan@kernel.org]
>> Sent: Thursday, May 10, 2012 1:58 PM
>> To: Inki Dae
>> Cc: 'Jerome Glisse'; airlied@linux.ie; dri-devel@lists.freedesktop.org;
>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>
>> On 05/10/2012 10:39 AM, Inki Dae wrote:
>>
>>> Hi Jerome,
>>>
>>>> -----Original Message-----
>>>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>>>> Sent: Wednesday, May 09, 2012 11:46 PM
>>>> To: Inki Dae
>>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>>>
>>>> On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>>>>> this feature is used to import user space region allocated by malloc()
>>>> or
>>>>> mmaped into a gem. and to guarantee the pages to user space not to be
>>>>> swapped out, the VMAs within the user space would be locked and then
>>>> unlocked
>>>>> when the pages are released.
>>>>>
>>>>> but this lock might result in significant degradation of system
>>>> performance
>>>>> because the pages couldn't be swapped out so we limit user-desired
>>>> userptr
>>>>> size to pre-defined.
>>>>>
>>>>> Signed-off-by: Inki Dae <inki.dae@samsung.com>
>>>>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>>>
>>>>
>>>> Again i would like feedback from mm people (adding cc). I am not sure
>>>
>>> Thank you, I missed adding mm as cc.
>>>
>>>> locking the vma is the right anwser as i said in my previous mail,
>>>> userspace can munlock it in your back, maybe VM_RESERVED is better.
>>>
>>> I know that with VM_RESERVED flag, also we can avoid the pages from
>> being
>>> swapped out. but these pages should be unlocked anytime we want because
>> we
>>> could allocate all pages on system and lock them, which in turn, it may
>>> result in significant deterioration of system performance.(maybe other
>>> processes requesting free memory would be blocked) so I used VM_LOCKED
>> flags
>>> instead. but I'm not sure this way is best also.
>>>
>>>> Anyway even not considering that you don't check at all that process
>>>> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
>>>
>>> Thank you for your advices.
>>>
>>>> for how it's done. Also you mlock complete vma but the userptr you get
>>>> might be inside say 16M vma and you only care about 1M of userptr, if
>>>> you mark the whole vma as locked than anytime a new page is fault in
>>>> the vma else where than in the buffer you are interested then it got
>>>> allocated for ever until the gem buffer is destroy, i am not sure of
>>>> what happen to the vma on next malloc if it grows or not (i would
>>>> think it won't grow at it would have different flags than new
>>>> anonymous memory).
>>
>>
>> I don't know history in detail because you didn't have sent full patches
>> to linux-mm and
>> I didn't read the below code, either.
>> Just read your description and reply of Jerome. Apparently, there is
>> something I missed.
>>
>> Your goal is to avoid swap out some user pages which is used in kernel at
>> the same time. Right?
>> Let's use get_user_pages. Is there any issue you can't use it?
>> It increases page count so reclaimer can't swap out page.
>> Isn't it enough?
>> Marking whole VMA into MLCOKED is overkill.
>>
> 
> As I mentioned, we are already using get_user_pages. as you said, this
> function increases page count but just only things to the user address space
> cpu already accessed. other would be allocated by page fault hander once
> get_user_pages call. if so... ok, after that refcount(page->_count) of the


Not true. Look at __get_user_pages().
It handles the case you mentioned by calling handle_mm_fault().
Am I missing something?

> pages user already accessed would have 2 and just 1 for other all pages. so
> we may have to consider only pages never accessed by cpu to be locked to
> avoid from swapped out.
> 
> Thanks,
> Inki Dae
> 
>> --
>> Kind regards,
>> Minchan Kim
> 



-- 
Kind regards,
Minchan Kim


* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  6:53             ` KOSAKI Motohiro
@ 2012-05-10  7:27               ` Minchan Kim
  2012-05-10  7:31                 ` Kyungmin Park
  0 siblings, 1 reply; 77+ messages in thread
From: Minchan Kim @ 2012-05-10  7:27 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Inki Dae, 'Jerome Glisse',
	airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm

On 05/10/2012 03:53 PM, KOSAKI Motohiro wrote:

> (5/10/12 12:58 AM), Minchan Kim wrote:
>> On 05/10/2012 10:39 AM, Inki Dae wrote:
>>
>>> Hi Jerome,
>>>
>>>> -----Original Message-----
>>>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>>>> Sent: Wednesday, May 09, 2012 11:46 PM
>>>> To: Inki Dae
>>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>>>
>>>> On Wed, May 9, 2012 at 2:17 AM, Inki Dae<inki.dae@samsung.com>  wrote:
>>>>> this feature is used to import user space region allocated by malloc()
>>>> or
>>>>> mmaped into a gem. and to guarantee the pages to user space not to be
>>>>> swapped out, the VMAs within the user space would be locked and then
>>>> unlocked
>>>>> when the pages are released.
>>>>>
>>>>> but this lock might result in significant degradation of system
>>>> performance
>>>>> because the pages couldn't be swapped out so we limit user-desired
>>>> userptr
>>>>> size to pre-defined.
>>>>>
>>>>> Signed-off-by: Inki Dae<inki.dae@samsung.com>
>>>>> Signed-off-by: Kyungmin Park<kyungmin.park@samsung.com>
>>>>
>>>>
>>>> Again i would like feedback from mm people (adding cc). I am not sure
>>>
>>> Thank you, I missed adding mm as cc.
>>>
>>>> locking the vma is the right anwser as i said in my previous mail,
>>>> userspace can munlock it in your back, maybe VM_RESERVED is better.
>>>
>>> I know that with VM_RESERVED flag, also we can avoid the pages from
>>> being
>>> swapped out. but these pages should be unlocked anytime we want
>>> because we
>>> could allocate all pages on system and lock them, which in turn, it may
>>> result in significant deterioration of system performance.(maybe other
>>> processes requesting free memory would be blocked) so I used
>>> VM_LOCKED flags
>>> instead. but I'm not sure this way is best also.
>>>
>>>> Anyway even not considering that you don't check at all that process
>>>> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
>>>
>>> Thank you for your advices.
>>>
>>>> for how it's done. Also you mlock complete vma but the userptr you get
>>>> might be inside say 16M vma and you only care about 1M of userptr, if
>>>> you mark the whole vma as locked than anytime a new page is fault in
>>>> the vma else where than in the buffer you are interested then it got
>>>> allocated for ever until the gem buffer is destroy, i am not sure of
>>>> what happen to the vma on next malloc if it grows or not (i would
>>>> think it won't grow at it would have different flags than new
>>>> anonymous memory).
>>
>>
>> I don't know history in detail because you didn't have sent full
>> patches to linux-mm and
>> I didn't read the below code, either.
>> Just read your description and reply of Jerome. Apparently, there is
>> something I missed.
>>
>> Your goal is to avoid swap out some user pages which is used in kernel
>> at the same time. Right?
>> Let's use get_user_pages. Is there any issue you can't use it?
> 
> Maybe because get_user_pages() is fork unsafe? dunno.


If there is such a problem, I think the user program should handle it with MADV_DONTFORK
and allow only the parent process to write to the buffer.
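
For reference, a minimal userspace sketch of that suggestion. It assumes Linux
with glibc; alloc_dontfork_buffer() is a made-up helper name and is not part
of the patch or of any library. It allocates a page-aligned anonymous buffer
and marks it with madvise(MADV_DONTFORK) before the buffer would be handed to
a driver.

```c
#define _DEFAULT_SOURCE		/* MAP_ANONYMOUS and MADV_DONTFORK on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: allocate a page-aligned anonymous buffer and ask the
 * kernel not to map it into children created by fork().
 * Returns NULL on failure.
 */
void *alloc_dontfork_buffer(size_t len)
{
	void *buf;

	assert(len > 0);
	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return NULL;

	if (madvise(buf, len, MADV_DONTFORK) != 0) {
		munmap(buf, len);	/* do not leak the mapping */
		return NULL;
	}
	return buf;
}
```

After a fork() the child has no mapping for this range at all, so it can
neither trigger copy-on-write on the pinned pages nor touch the buffer.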

-- 
Kind regards,
Minchan Kim


* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  7:27               ` Minchan Kim
@ 2012-05-10  7:31                 ` Kyungmin Park
  2012-05-10  7:56                   ` Minchan Kim
  0 siblings, 1 reply; 77+ messages in thread
From: Kyungmin Park @ 2012-05-10  7:31 UTC (permalink / raw)
  To: Minchan Kim
  Cc: KOSAKI Motohiro, Inki Dae, Jerome Glisse, airlied, dri-devel,
	sw0312.kim, linux-mm

On 5/10/12, Minchan Kim <minchan@kernel.org> wrote:
> On 05/10/2012 03:53 PM, KOSAKI Motohiro wrote:
>
>> (5/10/12 12:58 AM), Minchan Kim wrote:
>>> On 05/10/2012 10:39 AM, Inki Dae wrote:
>>>
>>>> Hi Jerome,
>>>>
>>>>> -----Original Message-----
>>>>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>>>>> Sent: Wednesday, May 09, 2012 11:46 PM
>>>>> To: Inki Dae
>>>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>>>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>>>>
>>>>> On Wed, May 9, 2012 at 2:17 AM, Inki Dae<inki.dae@samsung.com>  wrote:
>>>>>> this feature is used to import user space region allocated by
>>>>>> malloc()
>>>>> or
>>>>>> mmaped into a gem. and to guarantee the pages to user space not to be
>>>>>> swapped out, the VMAs within the user space would be locked and then
>>>>> unlocked
>>>>>> when the pages are released.
>>>>>>
>>>>>> but this lock might result in significant degradation of system
>>>>> performance
>>>>>> because the pages couldn't be swapped out so we limit user-desired
>>>>> userptr
>>>>>> size to pre-defined.
>>>>>>
>>>>>> Signed-off-by: Inki Dae<inki.dae@samsung.com>
>>>>>> Signed-off-by: Kyungmin Park<kyungmin.park@samsung.com>
>>>>>
>>>>>
>>>>> Again i would like feedback from mm people (adding cc). I am not sure
>>>>
>>>> Thank you, I missed adding mm as cc.
>>>>
>>>>> locking the vma is the right anwser as i said in my previous mail,
>>>>> userspace can munlock it in your back, maybe VM_RESERVED is better.
>>>>
>>>> I know that with VM_RESERVED flag, also we can avoid the pages from
>>>> being
>>>> swapped out. but these pages should be unlocked anytime we want
>>>> because we
>>>> could allocate all pages on system and lock them, which in turn, it may
>>>> result in significant deterioration of system performance.(maybe other
>>>> processes requesting free memory would be blocked) so I used
>>>> VM_LOCKED flags
>>>> instead. but I'm not sure this way is best also.
>>>>
>>>>> Anyway even not considering that you don't check at all that process
>>>>> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
>>>>
>>>> Thank you for your advices.
>>>>
>>>>> for how it's done. Also you mlock complete vma but the userptr you get
>>>>> might be inside say 16M vma and you only care about 1M of userptr, if
>>>>> you mark the whole vma as locked than anytime a new page is fault in
>>>>> the vma else where than in the buffer you are interested then it got
>>>>> allocated for ever until the gem buffer is destroy, i am not sure of
>>>>> what happen to the vma on next malloc if it grows or not (i would
>>>>> think it won't grow at it would have different flags than new
>>>>> anonymous memory).
>>>
>>>
>>> I don't know history in detail because you didn't have sent full
>>> patches to linux-mm and
>>> I didn't read the below code, either.
>>> Just read your description and reply of Jerome. Apparently, there is
>>> something I missed.
>>>
>>> Your goal is to avoid swap out some user pages which is used in kernel
>>> at the same time. Right?
>>> Let's use get_user_pages. Is there any issue you can't use it?
>>
>> Maybe because get_user_pages() is fork unsafe? dunno.
>
>
> If there is such problem, I think user program should handle it by
> MADV_DONTFORK
> and make to allow write by only parent process.
Please read the original patches and discuss the root cause. Is it
harmful to pass user space memory to kernel space, and how can we make it
possible in DRM?

Thank you,
Kyungmin Park
>
> --
> Kind regards,
> Minchan Kim
>


* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  7:31                 ` Kyungmin Park
@ 2012-05-10  7:56                   ` Minchan Kim
  2012-05-10  7:58                     ` Minchan Kim
  0 siblings, 1 reply; 77+ messages in thread
From: Minchan Kim @ 2012-05-10  7:56 UTC (permalink / raw)
  To: Kyungmin Park
  Cc: KOSAKI Motohiro, Inki Dae, Jerome Glisse, airlied, dri-devel,
	sw0312.kim, linux-mm

On 05/10/2012 04:31 PM, Kyungmin Park wrote:

> On 5/10/12, Minchan Kim <minchan@kernel.org> wrote:
>> On 05/10/2012 03:53 PM, KOSAKI Motohiro wrote:
>>
>>> (5/10/12 12:58 AM), Minchan Kim wrote:
>>>> On 05/10/2012 10:39 AM, Inki Dae wrote:
>>>>
>>>>> Hi Jerome,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>>>>>> Sent: Wednesday, May 09, 2012 11:46 PM
>>>>>> To: Inki Dae
>>>>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>>>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>>>>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>>>>>
>>>>>> On Wed, May 9, 2012 at 2:17 AM, Inki Dae<inki.dae@samsung.com>  wrote:
>>>>>>> this feature is used to import user space region allocated by
>>>>>>> malloc()
>>>>>> or
>>>>>>> mmaped into a gem. and to guarantee the pages to user space not to be
>>>>>>> swapped out, the VMAs within the user space would be locked and then
>>>>>> unlocked
>>>>>>> when the pages are released.
>>>>>>>
>>>>>>> but this lock might result in significant degradation of system
>>>>>> performance
>>>>>>> because the pages couldn't be swapped out so we limit user-desired
>>>>>> userptr
>>>>>>> size to pre-defined.
>>>>>>>
>>>>>>> Signed-off-by: Inki Dae<inki.dae@samsung.com>
>>>>>>> Signed-off-by: Kyungmin Park<kyungmin.park@samsung.com>
>>>>>>
>>>>>>
>>>>>> Again i would like feedback from mm people (adding cc). I am not sure
>>>>>
>>>>> Thank you, I missed adding mm as cc.
>>>>>
>>>>>> locking the vma is the right anwser as i said in my previous mail,
>>>>>> userspace can munlock it in your back, maybe VM_RESERVED is better.
>>>>>
>>>>> I know that with VM_RESERVED flag, also we can avoid the pages from
>>>>> being
>>>>> swapped out. but these pages should be unlocked anytime we want
>>>>> because we
>>>>> could allocate all pages on system and lock them, which in turn, it may
>>>>> result in significant deterioration of system performance.(maybe other
>>>>> processes requesting free memory would be blocked) so I used
>>>>> VM_LOCKED flags
>>>>> instead. but I'm not sure this way is best also.
>>>>>
>>>>>> Anyway even not considering that you don't check at all that process
>>>>>> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
>>>>>
>>>>> Thank you for your advices.
>>>>>
>>>>>> for how it's done. Also you mlock complete vma but the userptr you get
>>>>>> might be inside say 16M vma and you only care about 1M of userptr, if
>>>>>> you mark the whole vma as locked than anytime a new page is fault in
>>>>>> the vma else where than in the buffer you are interested then it got
>>>>>> allocated for ever until the gem buffer is destroy, i am not sure of
>>>>>> what happen to the vma on next malloc if it grows or not (i would
>>>>>> think it won't grow at it would have different flags than new
>>>>>> anonymous memory).
>>>>
>>>>
>>>> I don't know history in detail because you didn't have sent full
>>>> patches to linux-mm and
>>>> I didn't read the below code, either.
>>>> Just read your description and reply of Jerome. Apparently, there is
>>>> something I missed.
>>>>
>>>> Your goal is to avoid swap out some user pages which is used in kernel
>>>> at the same time. Right?
>>>> Let's use get_user_pages. Is there any issue you can't use it?
>>>
>>> Maybe because get_user_pages() is fork unsafe? dunno.
>>
>>
>> If there is such problem, I think user program should handle it by
>> MADV_DONTFORK
>> and make to allow write by only parent process.
> Please read the original patches and discuss the root cause. Does it
> harm to pass user space memory to kernel space and how to make is
> possible at DRM?


Where can I read the original discussion history?
I am not an expert on DRAM, so I can answer only the mm questions, and that's why Jerome cc'd the mm list.
On the mm side, I think there is no harm in the kernel using user space memory as long as it does so carefully.
If you are talking about permission, the DRM code can at least check it with can_do_mlock() and by checking lock_limit.
If you are talking about some other security concern, I'm not the right person to discuss it.
Please Cc security@kernel.org.
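
As a rough userspace analogue of the lock_limit check, here is a minimal
sketch. The helper names are made up, and it is not the kernel code: the
in-kernel check in mm/mlock.c works on page counts and also honors
CAP_IPC_LOCK, which this sketch ignores. It only shows the arithmetic of
comparing "already locked plus requested" against RLIMIT_MEMLOCK.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: would locking `request` more bytes keep a process
 * that already has `locked` bytes locked within `limit`?
 */
int fits_memlock_limit(rlim_t limit, size_t locked, size_t request)
{
	if (limit == RLIM_INFINITY)
		return 1;
	if ((rlim_t)locked > limit)
		return 0;
	/* Subtract instead of adding so the comparison cannot overflow. */
	return (rlim_t)request <= limit - (rlim_t)locked;
}

/* Same check against the calling process's current RLIMIT_MEMLOCK. */
int may_lock_more(size_t locked, size_t request)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return 0;
	return fits_memlock_limit(rl.rlim_cur, locked, request);
}

/* Self-check with made-up numbers: a 64 KiB limit and 4 KiB pages. */
void memlock_selfcheck(void)
{
	assert(fits_memlock_limit(65536, 0, 65536));
	assert(!fits_memlock_limit(65536, 4096, 65536));
	assert(fits_memlock_limit(RLIM_INFINITY, 0, (size_t)1 << 30));
}
```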

Thanks.

> 
> Thank you,
> Kyungmin Park
>>
>> --
>> Kind regards,
>> Minchan Kim



-- 
Kind regards,
Minchan Kim


* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  7:56                   ` Minchan Kim
@ 2012-05-10  7:58                     ` Minchan Kim
  0 siblings, 0 replies; 77+ messages in thread
From: Minchan Kim @ 2012-05-10  7:58 UTC (permalink / raw)
  Cc: Kyungmin Park, KOSAKI Motohiro, Inki Dae, Jerome Glisse, airlied,
	dri-devel, sw0312.kim, linux-mm

On 05/10/2012 04:56 PM, Minchan Kim wrote:

> On 05/10/2012 04:31 PM, Kyungmin Park wrote:
> 
>> On 5/10/12, Minchan Kim <minchan@kernel.org> wrote:
>>> On 05/10/2012 03:53 PM, KOSAKI Motohiro wrote:
>>>
>>>> (5/10/12 12:58 AM), Minchan Kim wrote:
>>>>> On 05/10/2012 10:39 AM, Inki Dae wrote:
>>>>>
>>>>>> Hi Jerome,
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>>>>>>> Sent: Wednesday, May 09, 2012 11:46 PM
>>>>>>> To: Inki Dae
>>>>>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>>>>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>>>>>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>>>>>>
>>>>>>> On Wed, May 9, 2012 at 2:17 AM, Inki Dae<inki.dae@samsung.com>  wrote:
>>>>>>>> this feature is used to import user space region allocated by
>>>>>>>> malloc()
>>>>>>> or
>>>>>>>> mmaped into a gem. and to guarantee the pages to user space not to be
>>>>>>>> swapped out, the VMAs within the user space would be locked and then
>>>>>>> unlocked
>>>>>>>> when the pages are released.
>>>>>>>>
>>>>>>>> but this lock might result in significant degradation of system
>>>>>>> performance
>>>>>>>> because the pages couldn't be swapped out so we limit user-desired
>>>>>>> userptr
>>>>>>>> size to pre-defined.
>>>>>>>>
>>>>>>>> Signed-off-by: Inki Dae<inki.dae@samsung.com>
>>>>>>>> Signed-off-by: Kyungmin Park<kyungmin.park@samsung.com>
>>>>>>>
>>>>>>>
>>>>>>> Again i would like feedback from mm people (adding cc). I am not sure
>>>>>>
>>>>>> Thank you, I missed adding mm as cc.
>>>>>>
>>>>>>> locking the vma is the right anwser as i said in my previous mail,
>>>>>>> userspace can munlock it in your back, maybe VM_RESERVED is better.
>>>>>>
>>>>>> I know that with VM_RESERVED flag, also we can avoid the pages from
>>>>>> being
>>>>>> swapped out. but these pages should be unlocked anytime we want
>>>>>> because we
>>>>>> could allocate all pages on system and lock them, which in turn, it may
>>>>>> result in significant deterioration of system performance.(maybe other
>>>>>> processes requesting free memory would be blocked) so I used
>>>>>> VM_LOCKED flags
>>>>>> instead. but I'm not sure this way is best also.
>>>>>>
>>>>>>> Anyway even not considering that you don't check at all that process
>>>>>>> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
>>>>>>
>>>>>> Thank you for your advices.
>>>>>>
>>>>>>> for how it's done. Also you mlock complete vma but the userptr you get
>>>>>>> might be inside say 16M vma and you only care about 1M of userptr, if
>>>>>>> you mark the whole vma as locked than anytime a new page is fault in
>>>>>>> the vma else where than in the buffer you are interested then it got
>>>>>>> allocated for ever until the gem buffer is destroy, i am not sure of
>>>>>>> what happen to the vma on next malloc if it grows or not (i would
>>>>>>> think it won't grow at it would have different flags than new
>>>>>>> anonymous memory).
>>>>>
>>>>>
>>>>> I don't know history in detail because you didn't have sent full
>>>>> patches to linux-mm and
>>>>> I didn't read the below code, either.
>>>>> Just read your description and reply of Jerome. Apparently, there is
>>>>> something I missed.
>>>>>
>>>>> Your goal is to avoid swap out some user pages which is used in kernel
>>>>> at the same time. Right?
>>>>> Let's use get_user_pages. Is there any issue you can't use it?
>>>>
>>>> Maybe because get_user_pages() is fork unsafe? dunno.
>>>
>>>
>>> If there is such problem, I think user program should handle it by
>>> MADV_DONTFORK
>>> and make to allow write by only parent process.
>> Please read the original patches and discuss the root cause. Does it
>> harm to pass user space memory to kernel space and how to make is
>> possible at DRM?
> 
> 
> Where can I read original discussion history?
> I am not expert of DRAM so I can answer only mm stuff and it's why Jerome ccing mm-list.

                     ^^^ silly typo
                     DRM

-- 
Kind regards,
Minchan Kim

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  7:05               ` Minchan Kim
@ 2012-05-10  7:59                 ` InKi Dae
  2012-05-10  8:11                   ` Minchan Kim
  0 siblings, 1 reply; 77+ messages in thread
From: InKi Dae @ 2012-05-10  7:59 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Inki Dae, Jerome Glisse, airlied, dri-devel, kyungmin.park,
	sw0312.kim, linux-mm

2012/5/10, Minchan Kim <minchan@kernel.org>:
> On 05/10/2012 03:57 PM, Inki Dae wrote:
>
>>
>>
>>> -----Original Message-----
>>> From: Minchan Kim [mailto:minchan@kernel.org]
>>> Sent: Thursday, May 10, 2012 1:58 PM
>>> To: Inki Dae
>>> Cc: 'Jerome Glisse'; airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>>
>>> On 05/10/2012 10:39 AM, Inki Dae wrote:
>>>
>>>> Hi Jerome,
>>>>
>>>>> -----Original Message-----
>>>>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>>>>> Sent: Wednesday, May 09, 2012 11:46 PM
>>>>> To: Inki Dae
>>>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>>>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>>>>
>>>>> On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>>>>>> this feature is used to import user space region allocated by
>>>>>> malloc()
>>>>> or
>>>>>> mmaped into a gem. and to guarantee the pages to user space not to be
>>>>>> swapped out, the VMAs within the user space would be locked and then
>>>>> unlocked
>>>>>> when the pages are released.
>>>>>>
>>>>>> but this lock might result in significant degradation of system
>>>>> performance
>>>>>> because the pages couldn't be swapped out so we limit user-desired
>>>>> userptr
>>>>>> size to pre-defined.
>>>>>>
>>>>>> Signed-off-by: Inki Dae <inki.dae@samsung.com>
>>>>>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>>>>
>>>>>
>>>>> Again i would like feedback from mm people (adding cc). I am not sure
>>>>
>>>> Thank you, I missed adding mm as cc.
>>>>
>>>>> locking the vma is the right anwser as i said in my previous mail,
>>>>> userspace can munlock it in your back, maybe VM_RESERVED is better.
>>>>
>>>> I know that with VM_RESERVED flag, also we can avoid the pages from
>>> being
>>>> swapped out. but these pages should be unlocked anytime we want because
>>> we
>>>> could allocate all pages on system and lock them, which in turn, it may
>>>> result in significant deterioration of system performance.(maybe other
>>>> processes requesting free memory would be blocked) so I used VM_LOCKED
>>> flags
>>>> instead. but I'm not sure this way is best also.
>>>>
>>>>> Anyway even not considering that you don't check at all that process
>>>>> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
>>>>
>>>> Thank you for your advices.
>>>>
>>>>> for how it's done. Also you mlock complete vma but the userptr you get
>>>>> might be inside say 16M vma and you only care about 1M of userptr, if
>>>>> you mark the whole vma as locked than anytime a new page is fault in
>>>>> the vma else where than in the buffer you are interested then it got
>>>>> allocated for ever until the gem buffer is destroy, i am not sure of
>>>>> what happen to the vma on next malloc if it grows or not (i would
>>>>> think it won't grow at it would have different flags than new
>>>>> anonymous memory).
>>>
>>>
>>> I don't know history in detail because you didn't have sent full patches
>>> to linux-mm and
>>> I didn't read the below code, either.
>>> Just read your description and reply of Jerome. Apparently, there is
>>> something I missed.
>>>
>>> Your goal is to avoid swap out some user pages which is used in kernel
>>> at
>>> the same time. Right?
>>> Let's use get_user_pages. Is there any issue you can't use it?
>>> It increases page count so reclaimer can't swap out page.
>>> Isn't it enough?
>>> Marking whole VMA into MLCOKED is overkill.
>>>
>>
>> As I mentioned, we are already using get_user_pages. as you said, this
>> function increases page count but just only things to the user address
>> space
>> cpu already accessed. other would be allocated by page fault hander once
>> get_user_pages call. if so... ok, after that refcount(page->_count) of
>> the
>
>
> Not true. Look __get_user_pages.
> It handles case you mentioned by handle_mm_fault.
> Do I miss something?
>

Let's assume that an application allocates a user space memory
region using malloc() and then writes something to the region. As you
may know, a user space buffer doesn't have real physical pages right
after the malloc() call, so when the user first accesses the region the
page fault handler is triggered, and it in turn does the next step
(swap-in, for example) to fill the physical frame number into the page
table entry of the faulting page. Of course, if the user never accessed
the buffer before requesting userptr, then handle_mm_fault would be
called by __get_user_pages. Please let me know if I am missing
something.
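
The demand-paging behaviour described above can be observed from user space. The following is only an illustrative sketch, not code from the exynos patch: it maps an anonymous region (roughly what a large malloc() does), asks mincore() how many pages are resident, touches every page, and asks again. The helper names and DEMO_PAGES are made up for this example, and the "before" count depends on the kernel and environment, so treat it as something to observe rather than a guaranteed value.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define DEMO_PAGES 16 /* hypothetical buffer size, in pages */

/* Count the resident pages in a DEMO_PAGES-long, page-aligned region. */
static long resident_pages(void *addr, size_t page_size)
{
    unsigned char vec[DEMO_PAGES];
    long n = 0;

    if (mincore(addr, DEMO_PAGES * page_size, vec) != 0)
        return -1;
    for (size_t i = 0; i < DEMO_PAGES; i++)
        n += vec[i] & 1; /* bit 0 set means the page is resident */
    return n;
}

/* Report residency before and after the CPU writes to every page. */
static int demand_paging_demo(long *before, long *after)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = DEMO_PAGES * page_size;
    unsigned char *buf;

    assert(before != NULL && after != NULL);

    buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    *before = resident_pages(buf, page_size);
    memset(buf, 0xA5, len); /* write faults make the kernel allocate pages */
    *after = resident_pages(buf, page_size);

    munmap(buf, len);
    return 0;
}
```

Calling demand_paging_demo(&before, &after) from a small main() and printing both counts shows the effect; in the kernel, get_user_pages would fault in any pages the CPU had not touched yet, as discussed in this thread.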

Thanks,
Inki Dae


>> pages user already accessed would have 2 and just 1 for other all pages.
>> so
>> we may have to consider only pages never accessed by cpu to be locked to
>> avoid from swapped out.
>>
>> Thanks,
>> Inki Dae
>>
>>> --
>>> Kind regards,
>>> Minchan Kim
>>
>>
>
>
>
> --
> Kind regards,
> Minchan Kim
>
>


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  7:59                 ` InKi Dae
@ 2012-05-10  8:11                   ` Minchan Kim
  2012-05-10  8:44                     ` Inki Dae
  0 siblings, 1 reply; 77+ messages in thread
From: Minchan Kim @ 2012-05-10  8:11 UTC (permalink / raw)
  To: InKi Dae
  Cc: Inki Dae, Jerome Glisse, airlied, dri-devel, kyungmin.park,
	sw0312.kim, linux-mm

On 05/10/2012 04:59 PM, InKi Dae wrote:

> 2012/5/10, Minchan Kim <minchan@kernel.org>:
>> On 05/10/2012 03:57 PM, Inki Dae wrote:
>>
>>>
>>>
>>>> -----Original Message-----
>>>> From: Minchan Kim [mailto:minchan@kernel.org]
>>>> Sent: Thursday, May 10, 2012 1:58 PM
>>>> To: Inki Dae
>>>> Cc: 'Jerome Glisse'; airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>>>
>>>> On 05/10/2012 10:39 AM, Inki Dae wrote:
>>>>
>>>>> Hi Jerome,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>>>>>> Sent: Wednesday, May 09, 2012 11:46 PM
>>>>>> To: Inki Dae
>>>>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>>>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>>>>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>>>>>
>>>>>> On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>>>>>>> this feature is used to import user space region allocated by
>>>>>>> malloc()
>>>>>> or
>>>>>>> mmaped into a gem. and to guarantee the pages to user space not to be
>>>>>>> swapped out, the VMAs within the user space would be locked and then
>>>>>> unlocked
>>>>>>> when the pages are released.
>>>>>>>
>>>>>>> but this lock might result in significant degradation of system
>>>>>> performance
>>>>>>> because the pages couldn't be swapped out so we limit user-desired
>>>>>> userptr
>>>>>>> size to pre-defined.
>>>>>>>
>>>>>>> Signed-off-by: Inki Dae <inki.dae@samsung.com>
>>>>>>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>>>>>
>>>>>>
>>>>>> Again i would like feedback from mm people (adding cc). I am not sure
>>>>>
>>>>> Thank you, I missed adding mm as cc.
>>>>>
>>>>>> locking the vma is the right anwser as i said in my previous mail,
>>>>>> userspace can munlock it in your back, maybe VM_RESERVED is better.
>>>>>
>>>>> I know that with VM_RESERVED flag, also we can avoid the pages from
>>>> being
>>>>> swapped out. but these pages should be unlocked anytime we want because
>>>> we
>>>>> could allocate all pages on system and lock them, which in turn, it may
>>>>> result in significant deterioration of system performance.(maybe other
>>>>> processes requesting free memory would be blocked) so I used VM_LOCKED
>>>> flags
>>>>> instead. but I'm not sure this way is best also.
>>>>>
>>>>>> Anyway even not considering that you don't check at all that process
>>>>>> don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
>>>>>
>>>>> Thank you for your advices.
>>>>>
>>>>>> for how it's done. Also you mlock complete vma but the userptr you get
>>>>>> might be inside say 16M vma and you only care about 1M of userptr, if
>>>>>> you mark the whole vma as locked than anytime a new page is fault in
>>>>>> the vma else where than in the buffer you are interested then it got
>>>>>> allocated for ever until the gem buffer is destroy, i am not sure of
>>>>>> what happen to the vma on next malloc if it grows or not (i would
>>>>>> think it won't grow at it would have different flags than new
>>>>>> anonymous memory).
>>>>
>>>>
>>>> I don't know history in detail because you didn't have sent full patches
>>>> to linux-mm and
>>>> I didn't read the below code, either.
>>>> Just read your description and reply of Jerome. Apparently, there is
>>>> something I missed.
>>>>
>>>> Your goal is to avoid swap out some user pages which is used in kernel
>>>> at
>>>> the same time. Right?
>>>> Let's use get_user_pages. Is there any issue you can't use it?
>>>> It increases page count so reclaimer can't swap out page.
>>>> Isn't it enough?
>>>> Marking whole VMA into MLCOKED is overkill.
>>>>
>>>
>>> As I mentioned, we are already using get_user_pages. as you said, this
>>> function increases page count but just only things to the user address
>>> space
>>> cpu already accessed. other would be allocated by page fault hander once
>>> get_user_pages call. if so... ok, after that refcount(page->_count) of
>>> the
>>
>>
>> Not true. Look __get_user_pages.
>> It handles case you mentioned by handle_mm_fault.
>> Do I miss something?
>>
> 
> let's assume that one application want to allocate user space memory
> region using malloc() and then write something on the region. as you
> may know, user space buffer doen't have real physical pages once
> malloc() call so if user tries to access the region then page fault
> handler would be triggered


Understood.

> and then in turn next process like swap in to fill physical frame number into entry of the page faulted.


Sorry, I can't understand your point due to my poor English.
Could you rephrase it more simply? :)

Thanks.

> of course,if user never access the buffer and requested userptr then

> handle_mm_fault would be called by __get_user_pages. please give me
> any comments if there is my missing point.
> 
> Thanks,
> Inki Dae
> 
> 
>>> pages user already accessed would have 2 and just 1 for other all pages.
>>> so
>>> we may have to consider only pages never accessed by cpu to be locked to
>>> avoid from swapped out.
>>>
>>> Thanks,
>>> Inki Dae
>>>
>>>> --
>>>> Kind regards,
>>>> Minchan Kim
>>>
>>>
>>
>>
>>
>> --
>> Kind regards,
>> Minchan Kim
>>
>>
> 
> 



-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  8:11                   ` Minchan Kim
@ 2012-05-10  8:44                     ` Inki Dae
  2012-05-10 17:53                       ` KOSAKI Motohiro
  0 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-10  8:44 UTC (permalink / raw)
  To: 'Minchan Kim', 'InKi Dae'
  Cc: 'Jerome Glisse',
	airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm


> -----Original Message-----
> From: Minchan Kim [mailto:minchan@kernel.org]
> Sent: Thursday, May 10, 2012 5:11 PM
> To: InKi Dae
> Cc: Inki Dae; Jerome Glisse; airlied@linux.ie; dri-
> devel@lists.freedesktop.org; kyungmin.park@samsung.com;
> sw0312.kim@samsung.com; linux-mm@kvack.org
> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
> 
> On 05/10/2012 04:59 PM, InKi Dae wrote:
> 
> > 2012/5/10, Minchan Kim <minchan@kernel.org>:
> >> On 05/10/2012 03:57 PM, Inki Dae wrote:
> >>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Minchan Kim [mailto:minchan@kernel.org]
> >>>> Sent: Thursday, May 10, 2012 1:58 PM
> >>>> To: Inki Dae
> >>>> Cc: 'Jerome Glisse'; airlied@linux.ie; dri-
> devel@lists.freedesktop.org;
> >>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
> >>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
> >>>>
> >>>> On 05/10/2012 10:39 AM, Inki Dae wrote:
> >>>>
> >>>>> Hi Jerome,
> >>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> >>>>>> Sent: Wednesday, May 09, 2012 11:46 PM
> >>>>>> To: Inki Dae
> >>>>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
> >>>>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-
> mm@kvack.org
> >>>>>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
> >>>>>>
> >>>>>> On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com>
> wrote:
> >>>>>>> this feature is used to import user space region allocated by
> >>>>>>> malloc()
> >>>>>> or
> >>>>>>> mmaped into a gem. and to guarantee the pages to user space not to
> be
> >>>>>>> swapped out, the VMAs within the user space would be locked and
> then
> >>>>>> unlocked
> >>>>>>> when the pages are released.
> >>>>>>>
> >>>>>>> but this lock might result in significant degradation of system
> >>>>>> performance
> >>>>>>> because the pages couldn't be swapped out so we limit user-desired
> >>>>>> userptr
> >>>>>>> size to pre-defined.
> >>>>>>>
> >>>>>>> Signed-off-by: Inki Dae <inki.dae@samsung.com>
> >>>>>>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >>>>>>
> >>>>>>
> >>>>>> Again i would like feedback from mm people (adding cc). I am not
> sure
> >>>>>
> >>>>> Thank you, I missed adding mm as cc.
> >>>>>
> >>>>>> locking the vma is the right anwser as i said in my previous mail,
> >>>>>> userspace can munlock it in your back, maybe VM_RESERVED is better.
> >>>>>
> >>>>> I know that with VM_RESERVED flag, also we can avoid the pages from
> >>>> being
> >>>>> swapped out. but these pages should be unlocked anytime we want
> because
> >>>> we
> >>>>> could allocate all pages on system and lock them, which in turn, it
> may
> >>>>> result in significant deterioration of system performance.(maybe
> other
> >>>>> processes requesting free memory would be blocked) so I used
> VM_LOCKED
> >>>> flags
> >>>>> instead. but I'm not sure this way is best also.
> >>>>>
> >>>>>> Anyway even not considering that you don't check at all that
> process
> >>>>>> don't go over the limit of locked page see mm/mlock.c
> RLIMIT_MEMLOCK
> >>>>>
> >>>>> Thank you for your advices.
> >>>>>
> >>>>>> for how it's done. Also you mlock complete vma but the userptr you
> get
> >>>>>> might be inside say 16M vma and you only care about 1M of userptr,
> if
> >>>>>> you mark the whole vma as locked than anytime a new page is fault
> in
> >>>>>> the vma else where than in the buffer you are interested then it
> got
> >>>>>> allocated for ever until the gem buffer is destroy, i am not sure
> of
> >>>>>> what happen to the vma on next malloc if it grows or not (i would
> >>>>>> think it won't grow at it would have different flags than new
> >>>>>> anonymous memory).
> >>>>
> >>>>
> >>>> I don't know history in detail because you didn't have sent full
> patches
> >>>> to linux-mm and
> >>>> I didn't read the below code, either.
> >>>> Just read your description and reply of Jerome. Apparently, there is
> >>>> something I missed.
> >>>>
> >>>> Your goal is to avoid swap out some user pages which is used in
> kernel
> >>>> at
> >>>> the same time. Right?
> >>>> Let's use get_user_pages. Is there any issue you can't use it?
> >>>> It increases page count so reclaimer can't swap out page.
> >>>> Isn't it enough?
> >>>> Marking whole VMA into MLCOKED is overkill.
> >>>>
> >>>
> >>> As I mentioned, we are already using get_user_pages. as you said, this
> >>> function increases page count but just only things to the user address
> >>> space
> >>> cpu already accessed. other would be allocated by page fault hander
> once
> >>> get_user_pages call. if so... ok, after that refcount(page->_count) of
> >>> the
> >>
> >>
> >> Not true. Look __get_user_pages.
> >> It handles case you mentioned by handle_mm_fault.
> >> Do I miss something?
> >>
> >
> > let's assume that one application want to allocate user space memory
> > region using malloc() and then write something on the region. as you
> > may know, user space buffer doen't have real physical pages once
> > malloc() call so if user tries to access the region then page fault
> > handler would be triggered
> 
> 
> Understood.
> 
> > and then in turn next process like swap in to fill physical frame number
> into entry of the page faulted.
> 
> 
> Sorry, I can't understand your point due to my poor English.
> Could you rewrite it easiliy? :)
> 

Simply put, handle_mm_fault would be called to update the pte after finding
the vma and checking access rights. And as you know, a page fault can be
handled in many ways, such as COW or demand paging.

Thanks,
Inki Dae

> Thanks.
> 
> > of course,if user never access the buffer and requested userptr then
> 
> > handle_mm_fault would be called by __get_user_pages. please give me
> > any comments if there is my missing point.
> >
> > Thanks,
> > Inki Dae
> >
> >
> >>> pages user already accessed would have 2 and just 1 for other all
> pages.
> >>> so
> >>> we may have to consider only pages never accessed by cpu to be locked
> to
> >>> avoid from swapped out.
> >>>
> >>> Thanks,
> >>> Inki Dae
> >>>
> >>>> --
> >>>> Kind regards,
> >>>> Minchan Kim
> >>>
> >>>
> >>
> >>
> >>
> >> --
> >> Kind regards,
> >> Minchan Kim
> >>
> >>
> >
> >
> 
> 
> 
> --
> Kind regards,
> Minchan Kim


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  2:44           ` Inki Dae
@ 2012-05-10 15:05             ` Jerome Glisse
  2012-05-10 15:31                 ` Daniel Vetter
  0 siblings, 1 reply; 77+ messages in thread
From: Jerome Glisse @ 2012-05-10 15:05 UTC (permalink / raw)
  To: Inki Dae; +Cc: linux-mm, kyungmin.park, sw0312.kim, dri-devel

On Wed, May 9, 2012 at 10:44 PM, Inki Dae <inki.dae@samsung.com> wrote:
> Hi Jerome,
>
> Thank you again.
>
>> -----Original Message-----
>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>> Sent: Thursday, May 10, 2012 3:33 AM
>> To: Inki Dae
>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>>
>> On Wed, May 9, 2012 at 10:45 AM, Jerome Glisse <j.glisse@gmail.com> wrote:
>> > On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> >> this feature is used to import user space region allocated by malloc()
>> or
>> >> mmaped into a gem. and to guarantee the pages to user space not to be
>> >> swapped out, the VMAs within the user space would be locked and then
>> unlocked
>> >> when the pages are released.
>> >>
>> >> but this lock might result in significant degradation of system
>> performance
>> >> because the pages couldn't be swapped out so we limit user-desired
>> userptr
>> >> size to pre-defined.
>> >>
>> >> Signed-off-by: Inki Dae <inki.dae@samsung.com>
>> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>> >
>> >
>> > Again i would like feedback from mm people (adding cc). I am not sure
>> > locking the vma is the right anwser as i said in my previous mail,
>> > userspace can munlock it in your back, maybe VM_RESERVED is better.
>> > Anyway even not considering that you don't check at all that process
>> > don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
>> > for how it's done. Also you mlock complete vma but the userptr you get
>> > might be inside say 16M vma and you only care about 1M of userptr, if
>> > you mark the whole vma as locked than anytime a new page is fault in
>> > the vma else where than in the buffer you are interested then it got
>> > allocated for ever until the gem buffer is destroy, i am not sure of
>> > what happen to the vma on next malloc if it grows or not (i would
>> > think it won't grow at it would have different flags than new
>> > anonymous memory).
>> >
>> > The whole business of directly using malloced memory for gpu is fishy
>> > and i would really like to get it right rather than relying on never
>> > hitting strange things like page migration, vma merging, or worse
>> > things like over locking pages and stealing memory.
>> >
>> > Cheers,
>> > Jerome
>>
>> I had a lengthy discussion with mm people (thx a lot for that). I
>> think we should split 2 different use case. The zero-copy upload case
>> ie :
>> app:
>>     ptr = malloc()
>>     ...
>>     glTex/VBO/UBO/...(ptr)
>>     free(ptr) or reuse it for other things
>> For which i guess you want to avoid having to do a memcpy inside the
>> gl library (could be anything else than gl that have same useage
>> pattern).
>>
>
> Right, in this case, we are using the userptr feature as pixman and evas
> backend to use 2d accelerator.
>
>> ie after the upload happen you don't care about those page they can
>> removed from the vma or marked as cow so that anything messing with
>> those page after the upload won't change what you uploaded. Of course
>
> I'm not sure that I understood your mentions but could the pages be removed
> from vma with VM_LOCKED or VM_RESERVED? once glTex/VBO/UBO/..., the VMAs to
> user space would be locked. if cpu accessed significant part of all the
> pages in user mode then pages to the part would be allocated by page fault
> handler, after that, through userptr, the VMAs to user address space would
> be locked(at this time, the remaining pages would be allocated also by
> get_user_pages by calling page fault handler) I'd be glad to give me any
> comments and advices if there is my missing point.
>
>> this is assuming that the tlb cost of doing such thing is smaller than
>> the cost of memcpy the data.
>>
>
> yes, in our test case, the tlb cost(incurred by tlb miss) was smaller than
> the cost of memcpy also cpu usage. of course, this would be depended on gpu
> performance.
>
>> Two way to do that, either you assume app can't not read back data
>> after gl can and you do an unmap_mapping_range (make sure you only
>> unmap fully covered page and that you copy non fully covered page) or
>> you want to allow userspace to still read data or possibly overwrite
>> them
>>
>> Second use case is something more like for the opencl case of
>> CL_MEM_USE_HOST_PTR, in which you want to use the same page in the gpu
>> and keep the userspace vma pointing to those page. I think the
>> agreement on this case is that there is no way right now to do it
>> sanely inside linux kernel. mlocking will need proper accounting
>> against rtlimit but this limit might be low. Also the fork case might
>> be problematic.
>>
>> For the fork case the memory is anonymous so it should be COWed in the
>> fork child but relative to cl context that means the child could not
>> use the cl context with that memory or at least if the child write to
>> this memory the cl will not see those change. I guess the answer to
>> that one is that you really need to use the cl api to read the object
>> or get proper ptr to read it.
>>
>> Anyway in all case, implementing this userptr thing need a lot more
>> code. You have to check to that the vma you are trying to use is
>> anonymous and only handle this case and fallback to alloc new page and
>> copy otherwise..
>>
>
> I'd like to say thank you again you gave me comments and advices in detail.
> there may be my missing points but I will check it again.
>
> Thanks,
> Inki Dae
>
>> Cheers,
>> Jerome
>

I think to sum things up there are 2 use cases:
1: we want to steal anonymous pages and move them to a gem/gpu object.
So calling get_user_pages on the range and then unmap_mapping_range on
the range whose pages we want to steal are the functions you would want.
That assumes of course that the api case you are trying to accelerate is
ok with the user process losing the content of this buffer. If it's not,
the other solution is to mark the range as COW and steal the pages; if
userspace tries to write something new to the range it will get new
pages (copied from the old ones).

Let's call it drm_gem_from_userptr.
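
As noted earlier in this thread, only pages fully covered by the user buffer can be unmapped or stolen; partially covered pages at either end have to be copied instead. A rough sketch of that page arithmetic follows. It is plain user-space C for illustration, not DRM code: DEMO_PAGE_SIZE, struct page_span and both helpers are hypothetical names, a 4 KiB page is assumed, and address overflow is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096UL /* assumed page size, for illustration only */

struct page_span {
    uintptr_t first; /* index of the first page in the span */
    uintptr_t count; /* number of pages in the span */
};

/* Every page the buffer [addr, addr + size) touches, even partially. */
static struct page_span touched_pages(uintptr_t addr, uintptr_t size)
{
    struct page_span s = { 0, 0 };

    if (size == 0)
        return s;
    s.first = addr / DEMO_PAGE_SIZE;
    s.count = (addr + size - 1) / DEMO_PAGE_SIZE - s.first + 1;
    return s;
}

/* Only the pages lying entirely inside [addr, addr + size). */
static struct page_span fully_covered_pages(uintptr_t addr, uintptr_t size)
{
    struct page_span s = { 0, 0 };
    uintptr_t start = (addr + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE; /* round up */
    uintptr_t end = (addr + size) / DEMO_PAGE_SIZE; /* round down, exclusive */

    s.first = start;
    if (end > start)
        s.count = end - start;
    return s;
}
```

For a buffer starting in the middle of a page, touched_pages() reports more pages than fully_covered_pages(); the difference is the head and tail pages that would need a copy rather than being taken from the process.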

2: you want to be able to use a malloced area over a long period of time
(several minutes) with the gpu and you want the userspace vma to still
point to those pages. It's the most problematic case; as i said, just
mlocking the vma is troublesome as malicious userspace can munlock it
or try to abuse you to mlock too much. I believe right now there is no
good way to handle this case. It means the pages can't be swapped out
or moved like regular anonymous pages, but on fork or exec this area
still needs to behave like an anonymous vma.

Let's call it drm_gem_backed_by_userptr.
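
The "abuse you to mlock too much" concern is what the RLIMIT_MEMLOCK accounting mentioned earlier in the thread guards against. Below is a minimal user-space sketch of that kind of check, under stated assumptions: pin_within_limit() and pin_allowed_for_self() are hypothetical helpers, the caller is assumed to track how many pages it has already pinned, and the kernel's real check in mm/mlock.c also takes CAP_IPC_LOCK into account, as far as I understand it, which this sketch ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/resource.h>

/* Would pinning request_bytes more stay within limit_bytes? */
static bool pin_within_limit(size_t already_pinned_pages, size_t request_bytes,
                             rlim_t limit_bytes, size_t page_size)
{
    size_t request_pages;
    rlim_t limit_pages;

    if (limit_bytes == RLIM_INFINITY)
        return true;

    /* A partially used page still counts as one whole pinned page. */
    request_pages = (request_bytes + page_size - 1) / page_size;
    limit_pages = limit_bytes / page_size;
    return already_pinned_pages + request_pages <= limit_pages;
}

/* Same check against the calling process's own soft RLIMIT_MEMLOCK.
 * Returns 1 if allowed, 0 if not, -1 if the limit could not be read. */
static int pin_allowed_for_self(size_t already_pinned_pages,
                                size_t request_bytes, size_t page_size)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    return pin_within_limit(already_pinned_pages, request_bytes,
                            rl.rlim_cur, page_size) ? 1 : 0;
}
```

A driver-side version would refuse the userptr request (or fall back to allocate-and-copy) when the check fails, instead of relying on a fixed pre-defined size limit alone.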

Inki, i really think you should split these 2 use cases, and do only
drm_gem_from_userptr if it's enough for what you are trying to do, as
in the second case it looks like too many things can go wrong.

Cheers,
Jerome


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10 15:05             ` Jerome Glisse
@ 2012-05-10 15:31                 ` Daniel Vetter
  0 siblings, 0 replies; 77+ messages in thread
From: Daniel Vetter @ 2012-05-10 15:31 UTC (permalink / raw)
  To: Jerome Glisse; +Cc: Inki Dae, linux-mm, kyungmin.park, sw0312.kim, dri-devel

On Thu, May 10, 2012 at 11:05:07AM -0400, Jerome Glisse wrote:
> On Wed, May 9, 2012 at 10:44 PM, Inki Dae <inki.dae@samsung.com> wrote:
> > Hi Jerome,
> >
> > Thank you again.
> >
> >> -----Original Message-----
> >> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> >> Sent: Thursday, May 10, 2012 3:33 AM
> >> To: Inki Dae
> >> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
> >> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
> >> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
> >>
> >> On Wed, May 9, 2012 at 10:45 AM, Jerome Glisse <j.glisse@gmail.com> wrote:
> >> > On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >> >> this feature is used to import user space region allocated by malloc()
> >> or
> >> >> mmaped into a gem. and to guarantee the pages to user space not to be
> >> >> swapped out, the VMAs within the user space would be locked and then
> >> unlocked
> >> >> when the pages are released.
> >> >>
> >> >> but this lock might result in significant degradation of system
> >> performance
> >> >> because the pages couldn't be swapped out so we limit user-desired
> >> userptr
> >> >> size to pre-defined.
> >> >>
> >> >> Signed-off-by: Inki Dae <inki.dae@samsung.com>
> >> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >> >
> >> >
> >> > Again i would like feedback from mm people (adding cc). I am not sure
> >> > locking the vma is the right anwser as i said in my previous mail,
> >> > userspace can munlock it in your back, maybe VM_RESERVED is better.
> >> > Anyway even not considering that you don't check at all that process
> >> > don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
> >> > for how it's done. Also you mlock complete vma but the userptr you get
> >> > might be inside say 16M vma and you only care about 1M of userptr, if
> >> > you mark the whole vma as locked than anytime a new page is fault in
> >> > the vma else where than in the buffer you are interested then it got
> >> > allocated for ever until the gem buffer is destroy, i am not sure of
> >> > what happen to the vma on next malloc if it grows or not (i would
> >> > think it won't grow at it would have different flags than new
> >> > anonymous memory).
> >> >
> >> > The whole business of directly using malloced memory for gpu is fishy
> >> > and i would really like to get it right rather than relying on never
> >> > hitting strange things like page migration, vma merging, or worse
> >> > things like over locking pages and stealing memory.
> >> >
> >> > Cheers,
> >> > Jerome
> >>
> >> I had a lengthy discussion with mm people (thx a lot for that). I
> >> think we should split 2 different use case. The zero-copy upload case
> >> ie :
> >> app:
> >>     ptr = malloc()
> >>     ...
> >>     glTex/VBO/UBO/...(ptr)
> >>     free(ptr) or reuse it for other things
> >> For which i guess you want to avoid having to do a memcpy inside the
> >> gl library (could be anything else than gl that have same useage
> >> pattern).
> >>
> >
> > Right, in this case, we are using the userptr feature as pixman and evas
> > backend to use 2d accelerator.
> >
> >> ie after the upload happen you don't care about those page they can
> >> removed from the vma or marked as cow so that anything messing with
> >> those page after the upload won't change what you uploaded. Of course
> >
> > I'm not sure that I understood your mentions but could the pages be removed
> > from vma with VM_LOCKED or VM_RESERVED? once glTex/VBO/UBO/..., the VMAs to
> > user space would be locked. if cpu accessed significant part of all the
> > pages in user mode then pages to the part would be allocated by page fault
> > handler, after that, through userptr, the VMAs to user address space would
> > be locked(at this time, the remaining pages would be allocated also by
> > get_user_pages by calling page fault handler) I'd be glad to give me any
> > comments and advices if there is my missing point.
> >
> >> this is assuming that the tlb cost of doing such thing is smaller than
> >> the cost of memcpy the data.
> >>
> >
> > yes, in our test case, the tlb cost(incurred by tlb miss) was smaller than
> > the cost of memcpy also cpu usage. of course, this would be depended on gpu
> > performance.
> >
> >> Two way to do that, either you assume app can't not read back data
> >> after gl can and you do an unmap_mapping_range (make sure you only
> >> unmap fully covered page and that you copy non fully covered page) or
> >> you want to allow userspace to still read data or possibly overwrite
> >> them
> >>
> >> Second use case is something more like for the opencl case of
> >> CL_MEM_USE_HOST_PTR, in which you want to use the same page in the gpu
> >> and keep the userspace vma pointing to those page. I think the
> >> agreement on this case is that there is no way right now to do it
> >> sanely inside linux kernel. mlocking will need proper accounting
> >> against rtlimit but this limit might be low. Also the fork case might
> >> be problematic.
> >>
> >> For the fork case the memory is anonymous so it should be COWed in the
> >> fork child but relative to cl context that means the child could not
> >> use the cl context with that memory or at least if the child write to
> >> this memory the cl will not see those change. I guess the answer to
> >> that one is that you really need to use the cl api to read the object
> >> or get proper ptr to read it.
> >>
> >> Anyway in all case, implementing this userptr thing need a lot more
> >> code. You have to check to that the vma you are trying to use is
> >> anonymous and only handle this case and fallback to alloc new page and
> >> copy otherwise..
> >>
> >
> > I'd like to say thank you again you gave me comments and advices in detail.
> > there may be my missing points but I will check it again.
> >
> > Thanks,
> > Inki Dae
> >
> >> Cheers,
> >> Jerome
> >
> 
> I think to sum things up there are 2 use cases:
> 1: we want to steal anonymous pages and move them to a gem/gpu object.
> So calling get_user_pages on the range and then unmap_mapping_range on
> the range whose pages we want to steal are the functions you would want.
> That assumes of course that the api case you are trying to accelerate is
> ok with having the user process lose the content of this buffer. If
> it's not, the other solution is to mark the range as COW and steal the
> pages; if userspace tries to write something new to the range, it will
> get a new page (a copy of the old one).
>
> Let's call it drm_gem_from_userptr
>
> 2: you want to be able to use a malloced area over a long period of time
> (several minutes) with the gpu, and you want the userspace vma to still
> point to those pages. It is the most problematic case: as i said, just
> mlocking the vma is troublesome, as malicious userspace can munlock it
> or try to abuse you to mlock too much. I believe right now there is no
> good way to handle this case. It means that the pages can't be swapped
> out or moved like regular anonymous pages, but on fork or exec this area
> still needs to behave like an anonymous vma.
>
> Let's call it drm_gem_backed_by_userptr
>
> Inki, i really think you should split these 2 use cases, and do only
> drm_gem_from_userptr if it's enough for what you are trying to do, as
> too many things can go wrong in the second case.

Jumping into the discussion late: Chris Wilson stitched together a userptr
feature for i915. Iirc he started with your 1st usecase but quickly
noticed that doing all this setup stuff (get_user_pages alone, he didn't
include your proposed cow trick) is too expensive and it's cheaper to just
upload things with the cpu.

So he needs to keep around these mappings for essentially forever to
amortize the setup cost, which boils down to your 2nd use-case. I've
refused to merge that code since, like you point out, too much stuff can
go wrong when we pin arbitrary ranges of userspace.

One example which would only affect ARM platforms is that this would
horribly break CMA: Userspace malloc memory is allocated as movable
(__GFP_MOVABLE), and CMA relies on migrating movable pages out of the CMA
region to handle large contiguous allocations. Currently the page migrate
code simply backs down and waits a bit if it encounters a page pinned by
get_user_pages, assuming that this is due to io and will complete shortly.
If you hold onto such pages for a long time, that retry will eventually
fail and break CMA.
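
The back-off-and-retry behaviour can be illustrated with a toy model in
C. This is not the kernel's migration code: struct toy_page, toy_migrate
and the retry budget SKETCH_MAX_PASSES are all assumptions made up for
this sketch, which only shows why a short io pin is harmless while a
long-lived userptr pin makes the migration give up.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_PASSES 10 /* assumed retry budget for this toy model */

/* Toy page: the pin is dropped once the pass counter reaches this value. */
struct toy_page {
    int pinned_until_pass;
};

/*
 * Try to "migrate" the page, backing off while it is still pinned.
 * Returns true on success; *passes_used reports how many passes ran.
 */
static bool toy_migrate(const struct toy_page *page, int *passes_used)
{
    for (int pass = 0; pass < SKETCH_MAX_PASSES; pass++) {
        if (pass >= page->pinned_until_pass) {
            *passes_used = pass + 1;
            return true; /* pin is gone, the page can be moved */
        }
        /* Still pinned: assume it is short-lived io and retry. */
    }

    *passes_used = SKETCH_MAX_PASSES;
    return false; /* the pin outlived the retry budget */
}
```

A page pinned for two passes migrates on the third attempt; a page pinned
for the lifetime of a gem object never does, so the contiguous allocation
that needed it fails.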

There are other problems affecting also desktop machines, but I've figured
I'll pick something that really hurts on arm ;-)

Yours, Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10 15:31                 ` Daniel Vetter
  (?)
@ 2012-05-10 15:52                 ` Jerome Glisse
  2012-05-11  1:47                   ` Inki Dae
  -1 siblings, 1 reply; 77+ messages in thread
From: Jerome Glisse @ 2012-05-10 15:52 UTC (permalink / raw)
  To: Jerome Glisse, Inki Dae, linux-mm, kyungmin.park, sw0312.kim, dri-devel

On Thu, May 10, 2012 at 11:31 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Thu, May 10, 2012 at 11:05:07AM -0400, Jerome Glisse wrote:
>> On Wed, May 9, 2012 at 10:44 PM, Inki Dae <inki.dae@samsung.com> wrote:
>> > Hi Jerome,
>> >
>> > Thank you again.
>> >
>> >> -----Original Message-----
>> >> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>> >> Sent: Thursday, May 10, 2012 3:33 AM
>> >> To: Inki Dae
>> >> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>> >> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
>> >> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
>> >>
>> >> On Wed, May 9, 2012 at 10:45 AM, Jerome Glisse <j.glisse@gmail.com> wrote:
>> >> > On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> >> >> this feature is used to import user space region allocated by malloc()
>> >> or
>> >> >> mmaped into a gem. and to guarantee the pages to user space not to be
>> >> >> swapped out, the VMAs within the user space would be locked and then
>> >> unlocked
>> >> >> when the pages are released.
>> >> >>
>> >> >> but this lock might result in significant degradation of system
>> >> performance
>> >> >> because the pages couldn't be swapped out so we limit user-desired
>> >> userptr
>> >> >> size to pre-defined.
>> >> >>
>> >> >> Signed-off-by: Inki Dae <inki.dae@samsung.com>
>> >> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>> >> >
>> >> >
>> >> > Again i would like feedback from mm people (adding cc). I am not sure
>> >> > locking the vma is the right anwser as i said in my previous mail,
>> >> > userspace can munlock it in your back, maybe VM_RESERVED is better.
>> >> > Anyway even not considering that you don't check at all that process
>> >> > don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
>> >> > for how it's done. Also you mlock complete vma but the userptr you get
>> >> > might be inside say 16M vma and you only care about 1M of userptr, if
>> >> > you mark the whole vma as locked than anytime a new page is fault in
>> >> > the vma else where than in the buffer you are interested then it got
>> >> > allocated for ever until the gem buffer is destroy, i am not sure of
>> >> > what happen to the vma on next malloc if it grows or not (i would
>> >> > think it won't grow at it would have different flags than new
>> >> > anonymous memory).
>> >> >
>> >> > The whole business of directly using malloced memory for gpu is fishy
>> >> > and i would really like to get it right rather than relying on never
>> >> > hitting strange things like page migration, vma merging, or worse
>> >> > things like over locking pages and stealing memory.
>> >> >
>> >> > Cheers,
>> >> > Jerome
>> >>
>> >> I had a lengthy discussion with mm people (thx a lot for that). I
>> >> think we should split 2 different use case. The zero-copy upload case
>> >> ie :
>> >> app:
>> >>     ptr = malloc()
>> >>     ...
>> >>     glTex/VBO/UBO/...(ptr)
>> >>     free(ptr) or reuse it for other things
>> >> For which i guess you want to avoid having to do a memcpy inside the
>> >> gl library (could be anything else than gl that have same useage
>> >> pattern).
>> >>
>> >
>> > Right, in this case, we are using the userptr feature as pixman and evas
>> > backend to use 2d accelerator.
>> >
>> >> ie after the upload happen you don't care about those page they can
>> >> removed from the vma or marked as cow so that anything messing with
>> >> those page after the upload won't change what you uploaded. Of course
>> >
>> > I'm not sure that I understood your mentions but could the pages be removed
>> > from vma with VM_LOCKED or VM_RESERVED? once glTex/VBO/UBO/..., the VMAs to
>> > user space would be locked. if cpu accessed significant part of all the
>> > pages in user mode then pages to the part would be allocated by page fault
>> > handler, after that, through userptr, the VMAs to user address space would
>> > be locked(at this time, the remaining pages would be allocated also by
>> > get_user_pages by calling page fault handler) I'd be glad to give me any
>> > comments and advices if there is my missing point.
>> >
>> >> this is assuming that the tlb cost of doing such thing is smaller than
>> >> the cost of memcpy the data.
>> >>
>> >
>> > yes, in our test case, the tlb cost(incurred by tlb miss) was smaller than
>> > the cost of memcpy also cpu usage. of course, this would be depended on gpu
>> > performance.
>> >
>> >> Two way to do that, either you assume app can't not read back data
>> >> after gl can and you do an unmap_mapping_range (make sure you only
>> >> unmap fully covered page and that you copy non fully covered page) or
>> >> you want to allow userspace to still read data or possibly overwrite
>> >> them
>> >>
>> >> Second use case is something more like for the opencl case of
>> >> CL_MEM_USE_HOST_PTR, in which you want to use the same page in the gpu
>> >> and keep the userspace vma pointing to those page. I think the
>> >> agreement on this case is that there is no way right now to do it
>> >> sanely inside linux kernel. mlocking will need proper accounting
>> >> against rtlimit but this limit might be low. Also the fork case might
>> >> be problematic.
>> >>
>> >> For the fork case the memory is anonymous so it should be COWed in the
>> >> fork child but relative to cl context that means the child could not
>> >> use the cl context with that memory or at least if the child write to
>> >> this memory the cl will not see those change. I guess the answer to
>> >> that one is that you really need to use the cl api to read the object
>> >> or get proper ptr to read it.
>> >>
>> >> Anyway in all case, implementing this userptr thing need a lot more
>> >> code. You have to check to that the vma you are trying to use is
>> >> anonymous and only handle this case and fallback to alloc new page and
>> >> copy otherwise..
>> >>
>> >
>> > I'd like to say thank you again you gave me comments and advices in detail.
>> > there may be my missing points but I will check it again.
>> >
>> > Thanks,
>> > Inki Dae
>> >
>> >> Cheers,
>> >> Jerome
>> >
>>
>> I think to sum things up there are 2 use cases:
>> 1: we want to steal anonymous pages and move them to a gem/gpu object.
>> So calling get_user_pages on the range and then unmap_mapping_range on
>> the range whose pages we want to steal are the functions you would want.
>> That assumes of course that the api case you are trying to accelerate is
>> ok with having the user process lose the content of this buffer. If
>> it's not, the other solution is to mark the range as COW and steal the
>> pages; if userspace tries to write something new to the range, it will
>> get a new page (a copy of the old one).
>>
>> Let's call it drm_gem_from_userptr
>>
>> 2: you want to be able to use a malloced area over a long period of time
>> (several minutes) with the gpu, and you want the userspace vma to still
>> point to those pages. It is the most problematic case: as i said, just
>> mlocking the vma is troublesome, as malicious userspace can munlock it
>> or try to abuse you to mlock too much. I believe right now there is no
>> good way to handle this case. It means that the pages can't be swapped
>> out or moved like regular anonymous pages, but on fork or exec this area
>> still needs to behave like an anonymous vma.
>>
>> Let's call it drm_gem_backed_by_userptr
>>
>> Inki, i really think you should split these 2 use cases, and do only
>> drm_gem_from_userptr if it's enough for what you are trying to do, as
>> too many things can go wrong in the second case.
>
> Jumping into the discussion late: Chris Wilson stitched together a userptr
> feature for i915. Iirc he started with your 1st usecase but quickly
> noticed that doing all this setup stuff (get_user_pages alone, he didn't
> include your proposed cow trick) is too expensive and it's cheaper to just
> upload things with the cpu.

I think use case 1 can still be useful on desktop x86, but the object
needs to be big, something like bigger than 16M or probably even bigger,
so that the cost of memcpy is bigger than the cost of tlb thrashing. I am
sure that once we are seeing 1G datasets in opencl we will want to use
the stealing code path rather than memcpy things. Sadly, APIs like CL
and GL make no provision saying that the data ptr the user supplied
might not have the content anymore, so it makes the cow trick kind of
needed, but it kills use cases such as:

    scratch = malloc()
    for (i = 0; i < numtex; i++) {
        readtexture(scratch, texfilename[i])
        glteximage(scratch)
    }
    free(scratch)

Or anything with similar access; here obviously the pages backing the
scratch area can be stolen at each glteximage call.

Anyway, if you can define your api and stipulate that after the call the
data you provided is no longer available, then use case 1 sounds doable
and worth it to me.
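
To show why the cow trick hurts the scratch loop above while plain
stealing does not, here is a toy model in C that just counts what each
policy costs. Everything in it (enum toy_policy, struct toy_scratch,
toy_write, toy_upload, toy_run_loop) is hypothetical and models a single
page; it is a sketch of the argument, not of any driver code.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_policy { TOY_COW, TOY_STEAL };

struct toy_scratch {
    bool shared_with_gpu; /* page is referenced by an uploaded object */
    int page_copies;      /* full-page copies paid on write faults (cow) */
    int fresh_pages;      /* new empty pages faulted in after a steal */
};

/* The app rewrites the scratch buffer (readtexture in the loop above). */
static void toy_write(struct toy_scratch *s, enum toy_policy policy)
{
    if (!s->shared_with_gpu)
        return; /* nothing uploaded yet, plain write */

    if (policy == TOY_COW)
        s->page_copies++; /* old content is copied, then overwritten */
    else
        s->fresh_pages++; /* old page is gone, a new one is faulted in */

    s->shared_with_gpu = false;
}

/* The upload call takes over the page (glteximage in the loop above). */
static void toy_upload(struct toy_scratch *s)
{
    s->shared_with_gpu = true;
}

static struct toy_scratch toy_run_loop(int numtex, enum toy_policy policy)
{
    struct toy_scratch s = { false, 0, 0 };

    for (int i = 0; i < numtex; i++) {
        toy_write(&s, policy);
        toy_upload(&s);
    }
    return s;
}
```

In this model cow pays numtex - 1 page copies for content that is
immediately overwritten, so the memcpy it was meant to avoid comes back,
whereas stealing pays no copies at the price of the app losing the old
content.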

> So he needs to keep around these mappings for essentially forever to
> amortize the setup cost, which boils down to your 2nd use-case. I've
> refused to merge that code since, like you point out, too much stuff can
> go wrong when we pin arbitrary ranges of userspace.
>
> One example which would only affect ARM platforms is that this would
> horribly break CMA: Userspace malloc memory is allocated as movable
> (__GFP_MOVABLE), and CMA relies on migrating movable pages out of the CMA
> region to handle large contiguous allocations. Currently the page migrate
> code simply backs down and waits a bit if it encounters a page pinned by
> get_user_pages, assuming that this is due to io and will complete shortly.
> If you hold onto such pages for a long time, that retry will eventually
> fail and break CMA.
>
> There are other problems affecting also desktop machines, but I've figured
> I'll pick something that really hurts on arm ;-)
>
> Yours, Daniel
> --
> Daniel Vetter
> Mail: daniel@ffwll.ch
> Mobile: +41 (0)79 365 57 48

Yes, what i also wanted to stress is that get_user_pages is not
enough: we are not doing something over the period of an ioctl, in
which case everything is fine and doable, but are stealing pages and
expecting to use them at random points in the future with no kind of
synchronization with userspace. Anyway, i think we agree that this
second use case is way too complex and many things can go wrong. i still
think that for opencl we might want to be able to do it, but by then i am
expecting we will take advantage of the iommu being able to pagefault from
the process pagetable, like the next AMD iommu can.

Cheers,
Jerome


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10  8:44                     ` Inki Dae
@ 2012-05-10 17:53                       ` KOSAKI Motohiro
  2012-05-11  0:50                         ` Minchan Kim
  0 siblings, 1 reply; 77+ messages in thread
From: KOSAKI Motohiro @ 2012-05-10 17:53 UTC (permalink / raw)
  To: Inki Dae
  Cc: 'Minchan Kim', 'InKi Dae',
	'Jerome Glisse',
	airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm,
	kosaki.motohiro

>>> let's assume that one application want to allocate user space memory
>>> region using malloc() and then write something on the region. as you
>>> may know, user space buffer doen't have real physical pages once
>>> malloc() call so if user tries to access the region then page fault
>>> handler would be triggered
>>
>>
>> Understood.
>>
>>> and then in turn next process like swap in to fill physical frame number
>> into entry of the page faulted.
>>
>>
>> Sorry, I can't understand your point due to my poor English.
>> Could you rewrite it easiliy? :)
>>
>
> Simply saying, handle_mm_fault would be called to update pte after finding
> vma and checking access right. and as you know, there are many cases to
> process page fault such as COW or demand paging.

Hmm. If I understand correctly, you guys misunderstand mlock. It doesn't pin pages
nor prevent pfn changes. It only guarantees that pages are not swapped out. E.g. the memory compaction
feature may automatically change a page's physical address.



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10 17:53                       ` KOSAKI Motohiro
@ 2012-05-11  0:50                         ` Minchan Kim
  2012-05-11  2:51                           ` KOSAKI Motohiro
  0 siblings, 1 reply; 77+ messages in thread
From: Minchan Kim @ 2012-05-11  0:50 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Inki Dae, 'InKi Dae', 'Jerome Glisse',
	airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm

Hi KOSAKI,

On 05/11/2012 02:53 AM, KOSAKI Motohiro wrote:

>>>> let's assume that one application want to allocate user space memory
>>>> region using malloc() and then write something on the region. as you
>>>> may know, user space buffer doen't have real physical pages once
>>>> malloc() call so if user tries to access the region then page fault
>>>> handler would be triggered
>>>
>>>
>>> Understood.
>>>
>>>> and then in turn next process like swap in to fill physical frame
>>>> number
>>> into entry of the page faulted.
>>>
>>>
>>> Sorry, I can't understand your point due to my poor English.
>>> Could you rewrite it easiliy? :)
>>>
>>
>> Simply saying, handle_mm_fault would be called to update pte after
>> finding
>> vma and checking access right. and as you know, there are many cases to
>> process page fault such as COW or demand paging.
> 
> Hmm. If I understand correctly, you guys misunderstand mlock. it doesn't
> page pinning
> nor prevent pfn change. It only guarantee to don't make swap out. e.g.


From a semantic point of view you're right, but the implementation does make sure the page is pinned.

> memory campaction
> feature may automatically change page physical address.


I tried it last year but decided to drop it because of a realtime issue.
https://lkml.org/lkml/2011/8/29/295

So I think mlock is a kind of page pinning. If some place I'm not aware of is moving mlocked pages, that place should be fixed.
Or my patch above should go ahead.




-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-10 15:52                 ` Jerome Glisse
@ 2012-05-11  1:47                   ` Inki Dae
  2012-05-11  2:08                     ` Minchan Kim
  0 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-11  1:47 UTC (permalink / raw)
  To: 'Jerome Glisse', linux-mm, kyungmin.park, sw0312.kim, dri-devel


> -----Original Message-----
> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> Sent: Friday, May 11, 2012 12:53 AM
> To: Jerome Glisse; Inki Dae; linux-mm@kvack.org; kyungmin.park@samsung.com;
> sw0312.kim@samsung.com; dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
> 
> On Thu, May 10, 2012 at 11:31 AM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Thu, May 10, 2012 at 11:05:07AM -0400, Jerome Glisse wrote:
> >> On Wed, May 9, 2012 at 10:44 PM, Inki Dae <inki.dae@samsung.com> wrote:
> >> > Hi Jerome,
> >> >
> >> > Thank you again.
> >> >
> >> >> -----Original Message-----
> >> >> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> >> >> Sent: Thursday, May 10, 2012 3:33 AM
> >> >> To: Inki Dae
> >> >> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
> >> >> kyungmin.park@samsung.com; sw0312.kim@samsung.com; linux-mm@kvack.org
> >> >> Subject: Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
> >> >>
> >> >> On Wed, May 9, 2012 at 10:45 AM, Jerome Glisse <j.glisse@gmail.com> wrote:
> >> >> > On Wed, May 9, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >> >> >> this feature is used to import user space region allocated by malloc() or
> >> >> >> mmaped into a gem. and to guarantee the pages to user space not to be
> >> >> >> swapped out, the VMAs within the user space would be locked and then unlocked
> >> >> >> when the pages are released.
> >> >> >>
> >> >> >> but this lock might result in significant degradation of system performance
> >> >> >> because the pages couldn't be swapped out so we limit user-desired userptr
> >> >> >> size to pre-defined.
> >> >> >>
> >> >> >> Signed-off-by: Inki Dae <inki.dae@samsung.com>
> >> >> >> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >> >> >
> >> >> >
> >> >> > Again i would like feedback from mm people (adding cc). I am not sure
> >> >> > locking the vma is the right anwser as i said in my previous mail,
> >> >> > userspace can munlock it in your back, maybe VM_RESERVED is better.
> >> >> > Anyway even not considering that you don't check at all that process
> >> >> > don't go over the limit of locked page see mm/mlock.c RLIMIT_MEMLOCK
> >> >> > for how it's done. Also you mlock complete vma but the userptr you get
> >> >> > might be inside say 16M vma and you only care about 1M of userptr, if
> >> >> > you mark the whole vma as locked than anytime a new page is fault in
> >> >> > the vma else where than in the buffer you are interested then it got
> >> >> > allocated for ever until the gem buffer is destroy, i am not sure of
> >> >> > what happen to the vma on next malloc if it grows or not (i would
> >> >> > think it won't grow at it would have different flags than new
> >> >> > anonymous memory).
> >> >> >
> >> >> > The whole business of directly using malloced memory for gpu is fishy
> >> >> > and i would really like to get it right rather than relying on never
> >> >> > hitting strange things like page migration, vma merging, or worse
> >> >> > things like over locking pages and stealing memory.
> >> >> >
> >> >> > Cheers,
> >> >> > Jerome
> >> >>
> >> >> I had a lengthy discussion with mm people (thx a lot for that). I
> >> >> think we should split 2 different use case. The zero-copy upload case
> >> >> ie :
> >> >> app:
> >> >>     ptr = malloc()
> >> >>     ...
> >> >>     glTex/VBO/UBO/...(ptr)
> >> >>     free(ptr) or reuse it for other things
> >> >> For which i guess you want to avoid having to do a memcpy inside the
> >> >> gl library (could be anything else than gl that have same useage
> >> >> pattern).
> >> >>
> >> >
> >> > Right, in this case, we are using the userptr feature as pixman and evas
> >> > backend to use 2d accelerator.
> >> >
> >> >> ie after the upload happen you don't care about those page they can
> >> >> removed from the vma or marked as cow so that anything messing with
> >> >> those page after the upload won't change what you uploaded. Of course
> >> >
> >> > I'm not sure that I understood your mentions but could the pages be removed
> >> > from vma with VM_LOCKED or VM_RESERVED? once glTex/VBO/UBO/..., the VMAs to
> >> > user space would be locked. if cpu accessed significant part of all the
> >> > pages in user mode then pages to the part would be allocated by page fault
> >> > handler, after that, through userptr, the VMAs to user address space would
> >> > be locked(at this time, the remaining pages would be allocated also by
> >> > get_user_pages by calling page fault handler) I'd be glad to give me any
> >> > comments and advices if there is my missing point.
> >> >
> >> >> this is assuming that the tlb cost of doing such thing is smaller than
> >> >> the cost of memcpy the data.
> >> >>
> >> >
> >> > yes, in our test case, the tlb cost(incurred by tlb miss) was smaller than
> >> > the cost of memcpy also cpu usage. of course, this would be depended on gpu
> >> > performance.
> >> >
> >> >> Two way to do that, either you assume app can't not read back data
> >> >> after gl can and you do an unmap_mapping_range (make sure you only
> >> >> unmap fully covered page and that you copy non fully covered page) or
> >> >> you want to allow userspace to still read data or possibly overwrite
> >> >> them
> >> >>
> >> >> Second use case is something more like for the opencl case of
> >> >> CL_MEM_USE_HOST_PTR, in which you want to use the same page in the gpu
> >> >> and keep the userspace vma pointing to those page. I think the
> >> >> agreement on this case is that there is no way right now to do it
> >> >> sanely inside linux kernel. mlocking will need proper accounting
> >> >> against rtlimit but this limit might be low. Also the fork case might
> >> >> be problematic.
> >> >>
> >> >> For the fork case the memory is anonymous so it should be COWed in the
> >> >> fork child but relative to cl context that means the child could not
> >> >> use the cl context with that memory or at least if the child write to
> >> >> this memory the cl will not see those change. I guess the answer to
> >> >> that one is that you really need to use the cl api to read the object
> >> >> or get proper ptr to read it.
> >> >>
> >> >> Anyway in all case, implementing this userptr thing need a lot more
> >> >> code. You have to check to that the vma you are trying to use is
> >> >> anonymous and only handle this case and fallback to alloc new page and
> >> >> copy otherwise..
> >> >>
> >> >
> >> > I'd like to say thank you again you gave me comments and advices in detail.
> >> > there may be my missing points but I will check it again.
> >> >
> >> > Thanks,
> >> > Inki Dae
> >> >
> >> >> Cheers,
> >> >> Jerome
> >> >
> >>
> >> I think to sumup things there is 2 use case:
> >> 1: we want to steal anonymous page and move them to a gem/gpu object.
> >> So call get_user_pages on the rand and then unmap_mapping_range on the
> >> range we want to still page are the function you would want. That
> >> assume of course that the api case your are trying to accelerate is ok
> >> with having the user process loosing the content of this buffer. If
> >> it's not the other solution is to mark the range as COW and steal the
> >> page, if userspace try to write something new to the range it will get
> >> new page (from copy of old one).
> >>
> >> Let call it  drm_gem_from_userptr
> >>
> >> 2: you want to be able to use malloced area over long period of time
> >> (several minute) with the gpu and you want that userspace vma still
> >> point to those page. It the most problematic case, as i said just
> >> mlocking the vma is troublesome as malicious userspace can munlock it
> >> or try to abuse you to mlock too much. I believe right now there is no
> >> good way to handle this case. That means that page can't be swaped out
> >> or moved as regular anonymous page but on fork or exec this area still
> >> need to behave like an anonymous vma.
> >>
> >> Let call drm_gem_backed_by_userptr
> >>
> >> Inki i really think you should split this 2 usecase, and do only the
> >> drm_gem_from_userptr if it's enough for what you are trying to do. As
> >> the second case look that too many things can go wrong.
> >
> > Jumping into the discussion late: Chris Wilson stitched together a userptr
> > feature for i915. Iirc he started with your 1st usecase but quickly
> > noticed that doing all this setup stuff (get_user_pages alone, he didn't
> > include your proposed cow trick) is too expensive and it's cheaper to just
> > upload things with the cpu.
> 
> I think use case 1 can still be usefull on desktop x86 but the object
> need to be big something like bigger 16M or probably even bigger, so
> that the cost of memcpy is bigger than the cost of tlb trashing. I am
> sure than once we are seeing 1G dataset of opencl we will want to use
> the stealing code path rather than memcpy things. Sadly API like CL
> and GL don't make provision saying that the data ptr user supplied
> might not have the content anymore so it makes the cow trick kind of
> needed but it kills usecase such as :
> scratch = malloc()
> for (i =0; i<numtex; i++){
>    readtexture(scratch, texfilename[i])
>    glteximage(scratch)
> }
> free(scratch)
> 
> Or anything with similar access, here obviously the page backing the
> scratch area can be stole at each glteximage call.
> 
> Anyway if you can define your api and provision that after call the
> data you provided is no longer available then use case 1 sounds doable
> and worth it to me.
> 

How about forcing VM_DONTCOPY so that the vma is not copied on fork? This flag
may prevent COW. Also, I was told that it is the get_user_pages call, not
mlock, that keeps the pages from being swapped out. If all the pages from
get_user_pages are MOVABLE, then CMA would try to migrate them once a device
driver tries an allocation from the space reserved for DMA through the dma api.
So if we could prevent the pages from being moved by CMA, and we limit the
maximum size for userptr (accessible only by the root user), then I guess we
could avoid these issues. There may be many things I have not considered, so
please give me any comments and advice.
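
For reference, the kind of size accounting discussed in this thread
(RLIMIT_MEMLOCK, see mm/mlock.c) can be sketched in user space as below. This
is only an illustration under stated assumptions: the helper names
userptr_fits_limit and current_memlock_limit are hypothetical and are not part
of the patch, and the kernel's own check additionally lets a process with
CAP_IPC_LOCK bypass the limit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: a request of request_bytes fits if the bytes
 * already locked plus the request stay within limit_bytes.
 * The subtraction form avoids overflow in the addition.
 */
bool userptr_fits_limit(size_t locked_bytes, size_t request_bytes,
                        size_t limit_bytes)
{
    if (locked_bytes > limit_bytes)
        return false;
    return request_bytes <= limit_bytes - locked_bytes;
}

/*
 * Read the soft RLIMIT_MEMLOCK of the calling process.
 * An unlimited value is reported as SIZE_MAX.
 * Returns false if getrlimit() fails.
 */
bool current_memlock_limit(size_t *limit_bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return false;
    if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)SIZE_MAX)
        *limit_bytes = SIZE_MAX;
    else
        *limit_bytes = (size_t)rl.rlim_cur;
    return true;
}
```

A driver-side limit would have to track locked_bytes per process itself; the
sketch only shows the comparison.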

Thanks,
Inki Dae


> > So he needs to keep around these mappings for essentially forever to
> > amortize the setup cost, which boils down to your 2nd use-case. I've
> > refused to merge that code since, like you point out, too much stuff can
> > go wrong when we pin arbitrary ranges of userspace.
> >
> > One example which would only affect ARM platforms is that this would
> > horribly break CMA: Userspace malloc is allocated as GFP_MOVEABLE, and CMA
> > relies on migrating moveable pages out of the CMA region to handle large
> > contigious allocations. Currently the page migrate code simply backs down
> > and waits a bit if it encounters a page locked by get_user_pages, assuming
> > that this is due to io and will complete shortly. If you hold onto such
> > pages for a long time, that retry will eventually fail and break CMA.
> >
> > There are other problems affecting also desktop machines, but I've figured
> > I'll pick something that really hurts on arm ;-)
> >
> > Yours, Daniel
> > --
> > Daniel Vetter
> > Mail: daniel@ffwll.ch
> > Mobile: +41 (0)79 365 57 48
> 
> Yes, what i also wanted to stress is that get_user_pages is not
> enough, we are not doing something over the period of an ioctl in
> which case everything is fine and doable, but are stealing page and
> expect to use them at random point in the future with no kind of
> synchronization with userspace. Anyway i think we agree that this
> second use case is way complex and many things can go wrong, i still
> think that for opencl we might want to able to do it but by them i am
> expecting we will take advantage of iommu being able to pagefault from
> process pagetable like the next AMD iommu can.
> 
> Cheers,
> Jerome


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-11  1:47                   ` Inki Dae
@ 2012-05-11  2:08                     ` Minchan Kim
  0 siblings, 0 replies; 77+ messages in thread
From: Minchan Kim @ 2012-05-11  2:08 UTC (permalink / raw)
  To: Inki Dae
  Cc: 'Jerome Glisse', linux-mm, kyungmin.park, sw0312.kim, dri-devel

On 05/11/2012 10:47 AM, Inki Dae wrote:

> 
> How about forcing VM_DONTCOPY not to copy the vma on fork? this flag may
> prevent doing COW. and I was talked that get_user_pages call avoid the pages


That was why we introduced MADV_DONTFORK.
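
A minimal user-space sketch of that advice, assuming Linux (the helper name
alloc_dontfork_buffer is hypothetical): the range is marked so that a child
created by fork() does not inherit it, which sidesteps COW on pages a device
may still be using.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map npages anonymous pages and mark them
 * MADV_DONTFORK so the range is not copied into a forked child.
 * On success returns the mapping and stores its length in *len_out;
 * returns NULL on failure.
 */
void *alloc_dontfork_buffer(size_t npages, size_t *len_out)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t len;
    void *p;

    if (page <= 0 || npages == 0 || npages > SIZE_MAX / (size_t)page)
        return NULL;
    len = npages * (size_t)page;

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    /* Ask the kernel not to make this range available to children. */
    if (madvise(p, len, MADV_DONTFORK) != 0) {
        munmap(p, len);
        return NULL;
    }
    *len_out = len;
    return p;
}
```

The caller releases the buffer with munmap(buf, len) when it is done.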

> from being swapped out, not using mlock. if all the pages from


True.

> get_user_pages are MOVABLE then CMA would try to migrate movable pages into
> reserved space for DMA once device driver tries allocation through dma api
> so if we could prevent the pages from being moved by CMA and we limit


CMA can't migrate pinned pages whose page count has been raised by get_user_pages.
So don't worry about it.

> maximum size for userptr(accessed by only root user) then I guess that we
> could avoid these issues. there may be many things I don't care so please
> give me any comments and advices.
> 
> Thanks,
> Inki Dae
> 
> 



-- 
Kind regards,
Minchan Kim


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-11  0:50                         ` Minchan Kim
@ 2012-05-11  2:51                           ` KOSAKI Motohiro
  2012-05-11  3:01                             ` Jerome Glisse
  0 siblings, 1 reply; 77+ messages in thread
From: KOSAKI Motohiro @ 2012-05-11  2:51 UTC (permalink / raw)
  To: Minchan Kim
  Cc: KOSAKI Motohiro, Inki Dae, 'InKi Dae',
	'Jerome Glisse',
	airlied, dri-devel, kyungmin.park, sw0312.kim, linux-mm

(5/10/12 8:50 PM), Minchan Kim wrote:
> Hi KOSAKI,
>
> On 05/11/2012 02:53 AM, KOSAKI Motohiro wrote:
>
>>>>> let's assume that one application want to allocate user space memory
>>>>> region using malloc() and then write something on the region. as you
>>>>> may know, user space buffer doen't have real physical pages once
>>>>> malloc() call so if user tries to access the region then page fault
>>>>> handler would be triggered
>>>>
>>>>
>>>> Understood.
>>>>
>>>>> and then in turn next process like swap in to fill physical frame
>>>>> number
>>>> into entry of the page faulted.
>>>>
>>>>
>>>> Sorry, I can't understand your point due to my poor English.
>>>> Could you rewrite it easiliy? :)
>>>>
>>>
>>> Simply saying, handle_mm_fault would be called to update pte after
>>> finding
>>> vma and checking access right. and as you know, there are many cases to
>>> process page fault such as COW or demand paging.
>>
>> Hmm. If I understand correctly, you guys misunderstand mlock. it doesn't
>> page pinning
>> nor prevent pfn change. It only guarantee to don't make swap out. e.g.
>
>
> Symantic point of view, you're right but the implementation makes sure page pinning.
>
>> memory campaction
>> feature may automatically change page physical address.
>
>
> I tried it last year but decided drop by realtime issue.
> https://lkml.org/lkml/2011/8/29/295
>
> so I think mlock is a kind of page pinning. If elsewhere I don't realized is doing, that place should be fixed.
> Or my above patch should go ahead.

Thanks for pointing that out. I didn't realize your patch had not been merged. I think it
should go ahead. Think about the autonuma case: if mlock disables autonuma migration, that's
a bug. I don't think we can promise that mlock doesn't change the physical page.
I wonder whether any realtime guys think page migration is a free lunch. They should
disable both auto migration and compaction.

And think about an application that explicitly uses migrate_pages(2), or admins who use
cpusets. Driver code can't assume such a scenario doesn't occur, yes?



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-11  2:51                           ` KOSAKI Motohiro
@ 2012-05-11  3:01                             ` Jerome Glisse
  2012-05-11 21:20                               ` KOSAKI Motohiro
  0 siblings, 1 reply; 77+ messages in thread
From: Jerome Glisse @ 2012-05-11  3:01 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Inki Dae, InKi Dae, airlied, dri-devel,
	kyungmin.park, sw0312.kim, linux-mm

On Thu, May 10, 2012 at 10:51 PM, KOSAKI Motohiro
<kosaki.motohiro@gmail.com> wrote:
> (5/10/12 8:50 PM), Minchan Kim wrote:
>>
>> Hi KOSAKI,
>>
>> On 05/11/2012 02:53 AM, KOSAKI Motohiro wrote:
>>
>>>>>> let's assume that one application want to allocate user space memory
>>>>>> region using malloc() and then write something on the region. as you
>>>>>> may know, user space buffer doen't have real physical pages once
>>>>>> malloc() call so if user tries to access the region then page fault
>>>>>> handler would be triggered
>>>>>
>>>>>
>>>>>
>>>>> Understood.
>>>>>
>>>>>> and then in turn next process like swap in to fill physical frame
>>>>>> number
>>>>>
>>>>> into entry of the page faulted.
>>>>>
>>>>>
>>>>> Sorry, I can't understand your point due to my poor English.
>>>>> Could you rewrite it easiliy? :)
>>>>>
>>>>
>>>> Simply saying, handle_mm_fault would be called to update pte after
>>>> finding
>>>> vma and checking access right. and as you know, there are many cases to
>>>> process page fault such as COW or demand paging.
>>>
>>>
>>> Hmm. If I understand correctly, you guys misunderstand mlock. it doesn't
>>> page pinning
>>> nor prevent pfn change. It only guarantee to don't make swap out. e.g.
>>
>>
>>
>> Symantic point of view, you're right but the implementation makes sure
>> page pinning.
>>
>>> memory campaction
>>> feature may automatically change page physical address.
>>
>>
>>
>> I tried it last year but decided drop by realtime issue.
>> https://lkml.org/lkml/2011/8/29/295
>>
>> so I think mlock is a kind of page pinning. If elsewhere I don't realized
>> is doing, that place should be fixed.
>> Or my above patch should go ahead.
>
>
> Thanks pointing out. I didn't realized your patch didn't merged. I think it
> should go ahead. think autonuma case,
> if mlock disable autonuma migration, that's bug.  I don't think we can
> promise mlock don't change physical page.
> I wonder if any realtime guys page migration is free lunch. they should
> disable both auto migration and compaction.
>
> And, think if application explictly use migrate_pages(2) or admins uses
> cpusets. driver code can't assume such scenario
> doesn't occur, yes?
>
>

I am OK with the patch being merged as is, provided you restrict the
ioctl to root only and add a big comment stating that the userptr thing
is just abusing the kernel API, and that other drivers should not
replicate it unless they fully understand that all hell might break
loose with it.

If you know that only the DDX will use it and that there won't be any
fork, then it is better not to worry about it, but again, state that in
the comment on the ioctl.

I really wish there were some magical VM_DRIVER_MAPPED flag that would
add the proper restrictions to the rest of the memory code while keeping
fork behavior consistent (i.e. COW). But such a thing would need massive
surgery on the Linux mm code.

Cheers,
Jerome

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-11  3:01                             ` Jerome Glisse
@ 2012-05-11 21:20                               ` KOSAKI Motohiro
  2012-05-11 22:22                                 ` Jerome Glisse
  0 siblings, 1 reply; 77+ messages in thread
From: KOSAKI Motohiro @ 2012-05-11 21:20 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: KOSAKI Motohiro, Minchan Kim, Inki Dae, InKi Dae, airlied,
	dri-devel, kyungmin.park, sw0312.kim, linux-mm

(5/10/12 11:01 PM), Jerome Glisse wrote:
> On Thu, May 10, 2012 at 10:51 PM, KOSAKI Motohiro
> <kosaki.motohiro@gmail.com>  wrote:
>> (5/10/12 8:50 PM), Minchan Kim wrote:
>>>
>>> Hi KOSAKI,
>>>
>>> On 05/11/2012 02:53 AM, KOSAKI Motohiro wrote:
>>>
>>>>>>> let's assume that one application want to allocate user space memory
>>>>>>> region using malloc() and then write something on the region. as you
>>>>>>> may know, user space buffer doen't have real physical pages once
>>>>>>> malloc() call so if user tries to access the region then page fault
>>>>>>> handler would be triggered
>>>>>>
>>>>>>
>>>>>>
>>>>>> Understood.
>>>>>>
>>>>>>> and then in turn next process like swap in to fill physical frame
>>>>>>> number
>>>>>>
>>>>>> into entry of the page faulted.
>>>>>>
>>>>>>
>>>>>> Sorry, I can't understand your point due to my poor English.
>>>>>> Could you rewrite it easiliy? :)
>>>>>>
>>>>>
>>>>> Simply saying, handle_mm_fault would be called to update pte after
>>>>> finding
>>>>> vma and checking access right. and as you know, there are many cases to
>>>>> process page fault such as COW or demand paging.
>>>>
>>>>
>>>> Hmm. If I understand correctly, you guys misunderstand mlock. it doesn't
>>>> page pinning
>>>> nor prevent pfn change. It only guarantee to don't make swap out. e.g.
>>>
>>>
>>>
>>> Symantic point of view, you're right but the implementation makes sure
>>> page pinning.
>>>
>>>> memory campaction
>>>> feature may automatically change page physical address.
>>>
>>>
>>>
>>> I tried it last year but decided drop by realtime issue.
>>> https://lkml.org/lkml/2011/8/29/295
>>>
>>> so I think mlock is a kind of page pinning. If elsewhere I don't realized
>>> is doing, that place should be fixed.
>>> Or my above patch should go ahead.
>>
>>
>> Thanks pointing out. I didn't realized your patch didn't merged. I think it
>> should go ahead. think autonuma case,
>> if mlock disable autonuma migration, that's bug.  I don't think we can
>> promise mlock don't change physical page.
>> I wonder if any realtime guys page migration is free lunch. they should
>> disable both auto migration and compaction.
>>
>> And, think if application explictly use migrate_pages(2) or admins uses
>> cpusets. driver code can't assume such scenario
>> doesn't occur, yes?
>>
>>
>
> I am ok with patch being merge as is if you add restriction for the
> ioctl to be root only and a big comment stating that user ptr thing is
> just abusing the kernel API and that it should not be replicated by
> other driver except if fully understanding that all hell might break
> loose with it.

Oh, apologies. I didn't intend to support merging it as is. Basically I
agree with Minchan: it should be replaced with get_user_pages(). I only
intended to clarify the pros and cons and what the original author's
intention was. If I understand correctly, MADV_DONTFORK is the best
solution for this case.
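
For reference, a minimal user-space sketch of that approach. In
madvise(2) the advice is spelled MADV_DONTFORK, and it sets VM_DONTCOPY
on the affected VMAs. The helper name alloc_dontfork_buffer() is
hypothetical and is not part of the exynos patch; it only shows how an
application could mark a buffer before handing it to a userptr-style
ioctl.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS and MADV_DONTFORK */
#endif
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not from the patch): map len bytes of anonymous
 * memory and mark the range MADV_DONTFORK, so that a child created by
 * fork() does not inherit the mapping and the pages never become COW
 * shared with a child. Returns the mapping, or NULL on failure.
 */
void *alloc_dontfork_buffer(size_t len)
{
	void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return NULL;

	if (madvise(buf, len, MADV_DONTFORK) != 0) {
		munmap(buf, len);
		return NULL;
	}
	return buf;
}
```

The buffer is released with munmap(buf, len). Note that this only
addresses the fork/COW concern; it does not pin the pages.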




> If you know it's only the ddx that will use it and that their wont be
> fork that better to not worry about but again state it in the comment
> about the ioctl.
>
> I really wish there was some magical VM_DRIVER_MAPPED flags that would
> add the proper restriction to other memory code while keeping fork
> behavior consistant (ie cow). But such things would need massive
> chirurgy of the linux mm code.
>
> Cheers,
> Jerome

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-11 21:20                               ` KOSAKI Motohiro
@ 2012-05-11 22:22                                 ` Jerome Glisse
  2012-05-11 22:59                                     ` KOSAKI Motohiro
  0 siblings, 1 reply; 77+ messages in thread
From: Jerome Glisse @ 2012-05-11 22:22 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Inki Dae, InKi Dae, airlied, dri-devel,
	kyungmin.park, sw0312.kim, linux-mm

On Fri, May 11, 2012 at 5:20 PM, KOSAKI Motohiro
<kosaki.motohiro@gmail.com> wrote:
> (5/10/12 11:01 PM), Jerome Glisse wrote:
>>
>> On Thu, May 10, 2012 at 10:51 PM, KOSAKI Motohiro
>> <kosaki.motohiro@gmail.com>  wrote:
>>>
>>> (5/10/12 8:50 PM), Minchan Kim wrote:
>>>>
>>>>
>>>> Hi KOSAKI,
>>>>
>>>> On 05/11/2012 02:53 AM, KOSAKI Motohiro wrote:
>>>>
>>>>>>>> let's assume that one application want to allocate user space memory
>>>>>>>> region using malloc() and then write something on the region. as you
>>>>>>>> may know, user space buffer doen't have real physical pages once
>>>>>>>> malloc() call so if user tries to access the region then page fault
>>>>>>>> handler would be triggered
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Understood.
>>>>>>>
>>>>>>>> and then in turn next process like swap in to fill physical frame
>>>>>>>> number
>>>>>>>
>>>>>>>
>>>>>>> into entry of the page faulted.
>>>>>>>
>>>>>>>
>>>>>>> Sorry, I can't understand your point due to my poor English.
>>>>>>> Could you rewrite it easiliy? :)
>>>>>>>
>>>>>>
>>>>>> Simply saying, handle_mm_fault would be called to update pte after
>>>>>> finding
>>>>>> vma and checking access right. and as you know, there are many cases
>>>>>> to
>>>>>> process page fault such as COW or demand paging.
>>>>>
>>>>>
>>>>>
>>>>> Hmm. If I understand correctly, you guys misunderstand mlock. it
>>>>> doesn't
>>>>> page pinning
>>>>> nor prevent pfn change. It only guarantee to don't make swap out. e.g.
>>>>
>>>>
>>>>
>>>>
>>>> Symantic point of view, you're right but the implementation makes sure
>>>> page pinning.
>>>>
>>>>> memory campaction
>>>>> feature may automatically change page physical address.
>>>>
>>>>
>>>>
>>>>
>>>> I tried it last year but decided drop by realtime issue.
>>>> https://lkml.org/lkml/2011/8/29/295
>>>>
>>>> so I think mlock is a kind of page pinning. If elsewhere I don't
>>>> realized
>>>> is doing, that place should be fixed.
>>>> Or my above patch should go ahead.
>>>
>>>
>>>
>>> Thanks pointing out. I didn't realized your patch didn't merged. I think
>>> it
>>> should go ahead. think autonuma case,
>>> if mlock disable autonuma migration, that's bug.  I don't think we can
>>> promise mlock don't change physical page.
>>> I wonder if any realtime guys page migration is free lunch. they should
>>> disable both auto migration and compaction.
>>>
>>> And, think if application explictly use migrate_pages(2) or admins uses
>>> cpusets. driver code can't assume such scenario
>>> doesn't occur, yes?
>>>
>>>
>>
>> I am ok with patch being merge as is if you add restriction for the
>> ioctl to be root only and a big comment stating that user ptr thing is
>> just abusing the kernel API and that it should not be replicated by
>> other driver except if fully understanding that all hell might break
>> loose with it.
>
>
> Oh, apology. I didn't intend to assist as is merge. Basically I agree with
> minchan. Is should be replaced get_user_pages(). I only intended to clarify
> pros/cons and where is original author's intention. If I understand
> correctly,
> MADV_DONT_FORK is best solution for this case.
>

My point is that this ioctl will be restricted to one user (the X server,
if I understand correctly) and only that user. There is no fork in it, so
there is no need to worry about fork; just setting the vma as locked will
be enough.

But I don't want people reading this driver to suddenly think that what
it's doing is OK. It's not: it's a hack and can never be made to work
properly in the general case. That's why it needs a big comment stating
and stressing that. I just wanted to make sure Inki and Kyungmin
understand that this kind of ioctl should be restricted to carefully
selected users, and that there is no way to make it general or reliable
outside of that.

Cheers,
Jerome

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-11 22:22                                 ` Jerome Glisse
@ 2012-05-11 22:59                                     ` KOSAKI Motohiro
  0 siblings, 0 replies; 77+ messages in thread
From: KOSAKI Motohiro @ 2012-05-11 22:59 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: KOSAKI Motohiro, Minchan Kim, Inki Dae, InKi Dae, airlied,
	dri-devel, kyungmin.park, sw0312.kim, linux-mm

> My point is this ioctl will be restricted to one user (Xserver if i
> understand) and only this user, there is no fork in it so no need to
> worry about fork, just setting the vma as locked will be enough.
>
> But i don't want people reading this driver suddenly think that what
> it's doing is ok, it's not, it's hack and can never make to work
> properly on a general case, that's why it needs a big comment stating,
> stressing that. I just wanted to make sure Inki and Kyungmin
> understood that this kind of ioctl should be restricted to carefully
> selected user and that there is no way to make it general or reliable
> outside that.

First off, I'm not a DRM guy, so I don't intend to insist. But if the
application doesn't use fork, get_user_pages() has no downside, so I
guess we don't need the VM_LOCKED hack.

But again, it's up to the DRM folks.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-11 22:59                                     ` KOSAKI Motohiro
  (?)
@ 2012-05-11 23:29                                     ` Jerome Glisse
  2012-05-11 23:39                                       ` KOSAKI Motohiro
  -1 siblings, 1 reply; 77+ messages in thread
From: Jerome Glisse @ 2012-05-11 23:29 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Inki Dae, InKi Dae, airlied, dri-devel,
	kyungmin.park, sw0312.kim, linux-mm

On Fri, May 11, 2012 at 6:59 PM, KOSAKI Motohiro
<kosaki.motohiro@gmail.com> wrote:
>> My point is this ioctl will be restricted to one user (Xserver if i
>> understand) and only this user, there is no fork in it so no need to
>> worry about fork, just setting the vma as locked will be enough.
>>
>> But i don't want people reading this driver suddenly think that what
>> it's doing is ok, it's not, it's hack and can never make to work
>> properly on a general case, that's why it needs a big comment stating,
>> stressing that. I just wanted to make sure Inki and Kyungmin
>> understood that this kind of ioctl should be restricted to carefully
>> selected user and that there is no way to make it general or reliable
>> outside that.
>
>
> first off, I'm not drm guy and then I don't intend to insist you. but if
> application don't use fork, get_user_pages() has no downside. I guess we
> don't need VM_LOCKED hack.
>
> but again, up to drm folks.

You need the VM_LOCKED hack to make sure that the Xorg vma still points
to the same pages. AFAICT, with get_user_pages the pages can be migrated
out of the anonymous vma, so the vma might point to new pages while the
old pages are still in use by the GPU and are not recycled until their
refcount drops to 0.

Cheers,
Jerome

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-11 23:29                                     ` Jerome Glisse
@ 2012-05-11 23:39                                       ` KOSAKI Motohiro
  2012-05-12  4:48                                         ` InKi Dae
  0 siblings, 1 reply; 77+ messages in thread
From: KOSAKI Motohiro @ 2012-05-11 23:39 UTC (permalink / raw)
  To: Jerome Glisse
  Cc: Minchan Kim, Inki Dae, InKi Dae, airlied, dri-devel,
	kyungmin.park, sw0312.kim, linux-mm

On Fri, May 11, 2012 at 7:29 PM, Jerome Glisse <j.glisse@gmail.com> wrote:
> On Fri, May 11, 2012 at 6:59 PM, KOSAKI Motohiro
> <kosaki.motohiro@gmail.com> wrote:
>>> My point is this ioctl will be restricted to one user (Xserver if i
>>> understand) and only this user, there is no fork in it so no need to
>>> worry about fork, just setting the vma as locked will be enough.
>>>
>>> But i don't want people reading this driver suddenly think that what
>>> it's doing is ok, it's not, it's hack and can never make to work
>>> properly on a general case, that's why it needs a big comment stating,
>>> stressing that. I just wanted to make sure Inki and Kyungmin
>>> understood that this kind of ioctl should be restricted to carefully
>>> selected user and that there is no way to make it general or reliable
>>> outside that.
>>
>>
>> first off, I'm not drm guy and then I don't intend to insist you. but if
>> application don't use fork, get_user_pages() has no downside. I guess we
>> don't need VM_LOCKED hack.
>>
>> but again, up to drm folks.
>
> You need the VM_LOCKED hack to mare sure that the xorg vma still point
> to the same page, afaict with get_user_pages pages can be migrated out
> of the anonymous vma so the vma might point to new page, while old
> page are still in use by the gpu and not recycle until their refcount
> drop to 0.

AFAIK, get_user_pages() prevents page migration (see
migrate_page_move_mapping), but mlock doesn't.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-11 23:39                                       ` KOSAKI Motohiro
@ 2012-05-12  4:48                                         ` InKi Dae
  2012-05-14  4:29                                             ` Minchan Kim
  0 siblings, 1 reply; 77+ messages in thread
From: InKi Dae @ 2012-05-12  4:48 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Jerome Glisse, Minchan Kim, Inki Dae, airlied, dri-devel,
	kyungmin.park, sw0312.kim, linux-mm

2012/5/12 KOSAKI Motohiro <kosaki.motohiro@gmail.com>:
> On Fri, May 11, 2012 at 7:29 PM, Jerome Glisse <j.glisse@gmail.com> wrote:
>> On Fri, May 11, 2012 at 6:59 PM, KOSAKI Motohiro
>> <kosaki.motohiro@gmail.com> wrote:
>>>> My point is this ioctl will be restricted to one user (Xserver if i
>>>> understand) and only this user, there is no fork in it so no need to
>>>> worry about fork, just setting the vma as locked will be enough.
>>>>
>>>> But i don't want people reading this driver suddenly think that what
>>>> it's doing is ok, it's not, it's hack and can never make to work
>>>> properly on a general case, that's why it needs a big comment stating,
>>>> stressing that. I just wanted to make sure Inki and Kyungmin
>>>> understood that this kind of ioctl should be restricted to carefully
>>>> selected user and that there is no way to make it general or reliable
>>>> outside that.
>>>
>>>
>>> first off, I'm not drm guy and then I don't intend to insist you. but if
>>> application don't use fork, get_user_pages() has no downside. I guess we
>>> don't need VM_LOCKED hack.
>>>
>>> but again, up to drm folks.
>>
>> You need the VM_LOCKED hack to mare sure that the xorg vma still point
>> to the same page, afaict with get_user_pages pages can be migrated out
>> of the anonymous vma so the vma might point to new page, while old
>> page are still in use by the gpu and not recycle until their refcount
>> drop to 0.
>
> afaik, get_user_pages() prevent page migration. (see
> migrate_page_move_mapping). but mlock doesn't.


I'd like to confirm a few points before preparing the next patch for the
userptr feature.

In the case that the userptr ioctl can be accessed by ordinary users:
1. Pages obtained from get_user_pages can't be migrated by CMA, so the
VM_LOCKED or VM_RESERVED flag isn't needed.

2. If VM_DONTCOPY is set in vma->vm_flags, then all the pages of this vma
are safe from COW.

3. The userptr ioctl has a size limit, and only the root user can change
that limit. This prevents malicious software from degrading system
performance.

With the above actions taken, is there anything we have overlooked? If
so, we will prepare the next patch so that the userptr ioctl can be
accessed only by the root user. That means the feature would be used
only by the X server and not by ordinary users, and we would wait until
the remaining issues are fully resolved. Of course, as Jerome said, we
will add big comments describing this feature thoroughly in the next
patch.
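
To make point 3 concrete, here is a minimal user-space model of the
policy, assuming a single global limit. All names (struct
userptr_policy, userptr_set_limit(), userptr_check_size()) and the
16 MiB default are hypothetical and are not taken from the patch; in the
driver the privilege check would be done by the ioctl layer rather than
passed in as a flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical default cap; the real value is whatever the driver defines. */
#define USERPTR_LIMIT_DEFAULT ((size_t)16 << 20)

struct userptr_policy {
	size_t limit;	/* largest userptr region one request may cover, in bytes */
};

/* Only a privileged caller may change the limit (models the root-only ioctl). */
int userptr_set_limit(struct userptr_policy *p, size_t new_limit, bool is_root)
{
	if (!is_root)
		return -EPERM;
	if (new_limit == 0)
		return -EINVAL;
	p->limit = new_limit;
	return 0;
}

/* Reject empty or over-limit requests before any page would be pinned. */
int userptr_check_size(const struct userptr_policy *p, size_t size)
{
	if (size == 0 || size > p->limit)
		return -EINVAL;
	return 0;
}
```

Checking the size before pinning means an oversized request fails
without holding any pages, which is the point of the limit.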

Thanks,
Inki Dae

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v3] drm/exynos: added userptr feature.
  2012-05-12  4:48                                         ` InKi Dae
@ 2012-05-14  4:29                                             ` Minchan Kim
  0 siblings, 0 replies; 77+ messages in thread
From: Minchan Kim @ 2012-05-14  4:29 UTC (permalink / raw)
  To: InKi Dae
  Cc: KOSAKI Motohiro, Jerome Glisse, Inki Dae, airlied, dri-devel,
	kyungmin.park, sw0312.kim, linux-mm

On 05/12/2012 01:48 PM, InKi Dae wrote:

> 2012/5/12 KOSAKI Motohiro <kosaki.motohiro@gmail.com>:
>> On Fri, May 11, 2012 at 7:29 PM, Jerome Glisse <j.glisse@gmail.com> wrote:
>>> On Fri, May 11, 2012 at 6:59 PM, KOSAKI Motohiro
>>> <kosaki.motohiro@gmail.com> wrote:
>>>>> My point is this ioctl will be restricted to one user (Xserver if i
>>>>> understand) and only this user, there is no fork in it so no need to
>>>>> worry about fork, just setting the vma as locked will be enough.
>>>>>
>>>>> But i don't want people reading this driver suddenly think that what
>>>>> it's doing is ok, it's not, it's hack and can never make to work
>>>>> properly on a general case, that's why it needs a big comment stating,
>>>>> stressing that. I just wanted to make sure Inki and Kyungmin
>>>>> understood that this kind of ioctl should be restricted to carefully
>>>>> selected user and that there is no way to make it general or reliable
>>>>> outside that.
>>>>
>>>>
>>>> first off, I'm not drm guy and then I don't intend to insist you. but if
>>>> application don't use fork, get_user_pages() has no downside. I guess we
>>>> don't need VM_LOCKED hack.
>>>>
>>>> but again, up to drm folks.
>>>
>>> You need the VM_LOCKED hack to mare sure that the xorg vma still point
>>> to the same page, afaict with get_user_pages pages can be migrated out
>>> of the anonymous vma so the vma might point to new page, while old
>>> page are still in use by the gpu and not recycle until their refcount
>>> drop to 0.
>>
>> afaik, get_user_pages() prevent page migration. (see
>> migrate_page_move_mapping). but mlock doesn't.
> 
> 
> I'd like to make sure some points before preparing next patch to
> userptr feature.
> 
> in case that userptr ioctl can be accessed by user.
> 1. all the pages from get_user_pages can't be migrated by CMA and also


Yes. I already mentioned it.

> it doesn't need VM_LOCKED or VM_RESERVED flag.


Yes, if you just use that flag to prevent migration.

> 
> 2. if VM_DONTCOPY is set to vma->flags then all the pages to this vma
> are safe from being COW.


Yes.

> 
> 3. userptr ioctl  has limited size and the limited size can be changed
> by only root user. this is for preventing from dropping system
> performance by malicious software.


IMHO it looks good to me, but this question needs an answer from the DRM folks.

> 
> with above actions taken, are there something we didn't care? if so,
> we will preparing next path for the userptr ioctl to be accessed by
> only root user. this means that this feature is used by only X Server
> but isn't used by any users. so we are going to wait something
> resolved fully. of course, as Jerome said, we wil add big comments
> describing this feature enough to next patch.
> 
> Thanks,
> Inki Dae
> 



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 77+ messages in thread

* [PATCH 0/2 v4] drm/exynos: added userptr feature
  2012-05-09  6:17   ` [PATCH 0/2 v3] " Inki Dae
  2012-05-09  6:17     ` [PATCH 1/2 v3] drm/exynos: added userptr limit ioctl Inki Dae
  2012-05-09  6:17     ` [PATCH 2/2 v3] drm/exynos: added userptr feature Inki Dae
@ 2012-05-14  6:17     ` Inki Dae
  2012-05-14  6:17       ` [PATCH 1/2 v4] drm/exynos: added userptr limit ioctl Inki Dae
                         ` (2 more replies)
  2 siblings, 3 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-14  6:17 UTC (permalink / raw)
  To: airlied, dri-devel
  Cc: sw0312.kim, Inki Dae, minchan, kyungmin.park, kosaki.motohiro

This feature allows the driver to use memory regions allocated by
malloc() in user mode, as well as mmapped memory regions allocated by
other memory allocators. The userptr interface identifies the memory type
from the vm_flags value and gets the pages or page frame numbers backing
the user space region accordingly.

Changelog v4:
we have discussed some issues to userptr feature with drm and mm guys and
below are the issues.

The migration issue.
- Pages reserved by CMA for some device using DMA could be used by
kernel and if the device driver wants to use those pages
while being used by kernel then the pages are copied into
other ones allocated to migrate them and then finally,
the device driver can use the pages for itself.
Thus, migrated, the pages being accessed by DMA could be changed
to other so this situation may incur that DMA accesses any pages
it doesn't want.

The COW issue.
- while DMA of a device is using the pages to VMAs, if current
process was forked then the pages being accessed by the DMA
would be copied into child's pages.(Copy On Write) so
these pages may not have coherrency with parent's ones if
child process wrote something on those pages so we need to
flag VM_DONTCOPY to prevent pages from being COWed.

But the use of get_user_pages is safe from such magration issue
because all the pages from get_user_pages CAN NOT BE not only
migrated but also swapped out. this true has been confirmed
by mm guys, Minchan Kim and KOSAKI Motohiro.
However below issue could be incurred.

The deterioration issue of system performance by malicious process.
- any malicious process can request all the pages of entire system memory
through this userptr ioctl and as the result, all other processes would be
blocked and this would incur the deterioration of system performance
because the pages couldn't be swapped out.

So we limit the userptr size a user may request to a predefined value, and
the ioctl that changes this limit can be accessed only by the root user.
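
As an illustrative aside on the COW issue described above: user space can
ask for the same VM_DONTCOPY behaviour on its own buffer with
madvise(MADV_DONTFORK), which sets VM_DONTCOPY on the covered VMAs. The sketch
below is not part of this patch set (the patches set the flag from the kernel
side), and the helper name alloc_nofork_region is made up for the example.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of anonymous memory and mark the mapping so that it is
 * not inherited by children created with fork(). Returns the mapping, or
 * NULL on failure. (Hypothetical helper, for illustration only.)
 */
void *alloc_nofork_region(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return NULL;

	/* MADV_DONTFORK makes the kernel set VM_DONTCOPY on this range. */
	if (madvise(p, len, MADV_DONTFORK) != 0) {
		munmap(p, len);
		return NULL;
	}

	return p;
}
```

A buffer obtained this way is simply absent from a child's address space
after fork(), so there is no second copy that could diverge from the pages
the device is using.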

Changelog v3:
Mitigated the issues pointed out by Dave and Jerome.

To do so, added code that guarantees the user space pages are not swapped
out: the VMAs in the user address space are locked, and then unlocked when
the pages are released.

However, this locking may significantly degrade system performance because
the pages cannot be swapped out, so one more feature was added: the userptr
size a user may request can be limited to a predefined value through the
userptr limit ioctl, which can be accessed only by the root user.

Changelog v2:
A memory region mmapped with VM_PFNMAP is physically contiguous, so the
start address of the region should be stored in buf->dma_addr. The previous
patch stored the end address in buf->dma_addr instead; v2 fixes that problem.
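
The v2 rule above (the region must be physically contiguous, and the DMA
address is its start) can be sketched in plain C as follows. This is only a
user space model of the check, not the driver code: the function name, the
4 KiB page size and the array-of-PFNs input are assumptions made for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Check that `pfns` holds consecutive page frame numbers and, if so, store
 * the physical start address (first frame plus the offset into that page)
 * in *start_addr. Returns 0 on success, -1 if the range is empty or not
 * contiguous.
 */
int sketch_contiguous_start(const uint64_t *pfns, size_t n,
			    uint64_t page_offset, uint64_t *start_addr)
{
	size_t i;

	if (n == 0)
		return -1;

	for (i = 1; i < n; i++)
		if (pfns[i] != pfns[i - 1] + 1)
			return -1;

	/* The start address, not the end address, is what DMA needs. */
	*start_addr = (pfns[0] << SKETCH_PAGE_SHIFT) + page_offset;
	return 0;
}
```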

Inki Dae (2):
  drm/exynos: added userptr limit ioctl.
  drm/exynos: added userptr feature.

 drivers/gpu/drm/exynos/exynos_drm_drv.c |    8 +
 drivers/gpu/drm/exynos/exynos_drm_drv.h |    6 +
 drivers/gpu/drm/exynos/exynos_drm_gem.c |  413 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.h |   20 ++-
 include/drm/exynos_drm.h                |   43 +++-
 5 files changed, 487 insertions(+), 3 deletions(-)

-- 
1.7.4.1

* [PATCH 1/2 v4] drm/exynos: added userptr limit ioctl.
  2012-05-14  6:17     ` [PATCH 0/2 v4] " Inki Dae
@ 2012-05-14  6:17       ` Inki Dae
  2012-05-14  8:12         ` Inki Dae
  2012-05-14  6:17       ` [PATCH 2/2 v4] drm/exynos: added userptr feature Inki Dae
  2012-05-14  8:12       ` [PATCH 0/2 " Inki Dae
  2 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-14  6:17 UTC (permalink / raw)
  To: airlied, dri-devel
  Cc: sw0312.kim, Inki Dae, minchan, kyungmin.park, kosaki.motohiro

This ioctl limits the userptr size a user may request to a predefined
value, and it can be accessed only by the root user.

With the userptr feature, an unprivileged user could allocate all the pages
on the system, leaving very few free physical pages. Once the VMAs within
the user address space are pinned, their pages cannot be swapped out, which
may significantly degrade system performance. This ioctl is intended to
avoid that situation.
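
As a rough model of the range check this ioctl performs, here is a
stand-alone C sketch. It mirrors the bounds used in the patch (at least one
page, at most 64MB, 16MB by default) but is not the driver code: the struct
and function names are invented for the example, a 4 KiB page size is
assumed, and the root-only restriction (enforced in the patch through the
DRM_ROOT_ONLY ioctl flag) is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096u			/* assumption: 4 KiB pages */
#define SKETCH_USERPTR_MAX	(64u * 1024u * 1024u)	/* mirrors USERPTR_MAX_SIZE */
#define SKETCH_USERPTR_DEFAULT	(16u * 1024u * 1024u)	/* mirrors the SZ_16M default */

/* Hypothetical stand-in for the per-device private data. */
struct sketch_limit_state {
	uint32_t userptr_limit;
};

/*
 * Accept a new limit only if it lies in [one page, 64MB]; leave the old
 * limit untouched otherwise. Returns 0 on success, -EINVAL on a bad value.
 */
int sketch_set_userptr_limit(struct sketch_limit_state *st, uint32_t new_limit)
{
	if (new_limit < SKETCH_PAGE_SIZE || new_limit > SKETCH_USERPTR_MAX)
		return -EINVAL;

	st->userptr_limit = new_limit;
	return 0;
}
```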

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/gpu/drm/exynos/exynos_drm_drv.c |    6 ++++++
 drivers/gpu/drm/exynos/exynos_drm_drv.h |    6 ++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.c |   22 ++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.h |    3 +++
 include/drm/exynos_drm.h                |   17 +++++++++++++++++
 5 files changed, 54 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index 9d3204c..1e68ec2 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -64,6 +64,9 @@ static int exynos_drm_load(struct drm_device *dev, unsigned long flags)
 		return -ENOMEM;
 	}
 
+	/* maximum size of userptr is limited to 16MB as default. */
+	private->userptr_limit = SZ_16M;
+
 	INIT_LIST_HEAD(&private->pageflip_event_list);
 	dev->dev_private = (void *)private;
 
@@ -221,6 +224,9 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
 			exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_GET,
 			exynos_drm_gem_get_ioctl, DRM_UNLOCKED),
+	DRM_IOCTL_DEF_DRV(EXYNOS_USER_LIMIT,
+			exynos_drm_gem_user_limit_ioctl, DRM_MASTER |
+			DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS, exynos_plane_set_zpos_ioctl,
 			DRM_UNLOCKED | DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h b/drivers/gpu/drm/exynos/exynos_drm_drv.h
index c82c90c..b38ed6f 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
@@ -235,6 +235,12 @@ struct exynos_drm_private {
 	 * this array is used to be aware of which crtc did it request vblank.
 	 */
 	struct drm_crtc *crtc[MAX_CRTC];
+
+	/*
+	 * maximum size of allocation by userptr feature.
+	 * - as default, this has 16MB and only root user can change it.
+	 */
+	unsigned long userptr_limit;
 };
 
 /*
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index fc91293..e6abb66 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -33,6 +33,8 @@
 #include "exynos_drm_gem.h"
 #include "exynos_drm_buf.h"
 
+#define USERPTR_MAX_SIZE		SZ_64M
+
 static unsigned int convert_to_vm_err_msg(int msg)
 {
 	unsigned int out_msg;
@@ -630,6 +632,26 @@ int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+int exynos_drm_gem_user_limit_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *filp)
+{
+	struct exynos_drm_private *priv = dev->dev_private;
+	struct drm_exynos_user_limit *limit = data;
+
+	if (limit->userptr_limit < PAGE_SIZE ||
+			limit->userptr_limit > USERPTR_MAX_SIZE) {
+		DRM_DEBUG_KMS("invalid userptr_limit size.\n");
+		return -EINVAL;
+	}
+
+	if (priv->userptr_limit == limit->userptr_limit)
+		return 0;
+
+	priv->userptr_limit = limit->userptr_limit;
+
+	return 0;
+}
+
 int exynos_drm_gem_init_object(struct drm_gem_object *obj)
 {
 	DRM_DEBUG_KMS("%s\n", __FILE__);
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
index 14d038b..3334c9f 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
@@ -78,6 +78,9 @@ struct exynos_drm_gem_obj {
 
 struct page **exynos_gem_get_pages(struct drm_gem_object *obj, gfp_t gfpmask);
 
+int exynos_drm_gem_user_limit_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *filp);
+
 /* destroy a buffer with gem object */
 void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj);
 
diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
index 54c97e8..52465dc 100644
--- a/include/drm/exynos_drm.h
+++ b/include/drm/exynos_drm.h
@@ -92,6 +92,19 @@ struct drm_exynos_gem_info {
 };
 
 /**
+ * A structure to userptr limited information.
+ *
+ * @userptr_limit: maximum size to userptr buffer.
+ *	the buffer could be allocated by unprivileged user using malloc()
+ *	and the size of the buffer would be limited as userptr_limit value.
+ * @pad: just padding to be 64-bit aligned.
+ */
+struct drm_exynos_user_limit {
+	unsigned int userptr_limit;
+	unsigned int pad;
+};
+
+/**
  * A structure for user connection request of virtual display.
  *
  * @connection: indicate whether doing connetion or not by user.
@@ -162,6 +175,7 @@ struct drm_exynos_g2d_exec {
 #define DRM_EXYNOS_GEM_MMAP		0x02
 /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
 #define DRM_EXYNOS_GEM_GET		0x04
+#define DRM_EXYNOS_USER_LIMIT		0x05
 #define DRM_EXYNOS_PLANE_SET_ZPOS	0x06
 #define DRM_EXYNOS_VIDI_CONNECTION	0x07
 
@@ -182,6 +196,9 @@ struct drm_exynos_g2d_exec {
 #define DRM_IOCTL_EXYNOS_GEM_GET	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_GEM_GET,	struct drm_exynos_gem_info)
 
+#define DRM_IOCTL_EXYNOS_USER_LIMIT	DRM_IOWR(DRM_COMMAND_BASE + \
+		DRM_EXYNOS_USER_LIMIT,	struct drm_exynos_user_limit)
+
 #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
 
-- 
1.7.4.1

* [PATCH 2/2 v4] drm/exynos: added userptr feature.
  2012-05-14  6:17     ` [PATCH 0/2 v4] " Inki Dae
  2012-05-14  6:17       ` [PATCH 1/2 v4] drm/exynos: added userptr limit ioctl Inki Dae
@ 2012-05-14  6:17       ` Inki Dae
  2012-05-14  6:33         ` KOSAKI Motohiro
  2012-05-14 19:26         ` Jerome Glisse
  2012-05-14  8:12       ` [PATCH 0/2 " Inki Dae
  2 siblings, 2 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-14  6:17 UTC (permalink / raw)
  To: airlied, dri-devel
  Cc: sw0312.kim, Inki Dae, minchan, kyungmin.park, kosaki.motohiro

This feature imports a user space region, either allocated by malloc() or
mmapped, into a gem object. To do so, we use get_user_pages() to get all the
pages backing the VMAs within the user address space. However, the following
issues need attention when using the userptr feature.

The migration issue.
- Pages reserved by CMA for a device that uses DMA may be used by the
kernel in the meantime. If the device driver wants those pages while the
kernel is using them, their contents are copied to newly allocated pages
(that is, they are migrated), and only then can the device driver use the
pages itself. After such a migration, the pages that DMA is accessing
are no longer the ones backing the buffer, so DMA may end up accessing
pages it was never meant to touch.

The COW issue.
- While the DMA of a device is using the pages behind the VMAs, if the
current process forks, the pages being accessed by the DMA would be
copied into the child's pages (copy-on-write). These pages may then
lose coherency with the parent's if the child process writes to them,
so we need to flag the VMAs with VM_DONTCOPY to prevent the pages from
being COWed.

However, using get_user_pages() is safe from the migration issue, because
pages obtained from get_user_pages() can be neither migrated nor swapped
out. The following issue can still occur, though.

System performance degradation caused by a malicious process.
- A malicious process can request all the pages of system memory through
the userptr ioctl. Because those pages cannot be swapped out, all other
processes would be blocked and system performance would degrade.

So this feature limits the userptr size a user may request to a predefined
value, and that value can be changed only by the root user.
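
The size limit described above works as a budget: an import takes its size
out of the remaining allowance and gives it back when the buffer is
released. The sketch below models that accounting in plain C under stated
assumptions. It is not the driver code: the names are invented, and the
locking the patch does with dev->struct_mutex is omitted because the model
is single-threaded.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the remaining userptr allowance, in bytes. */
struct sketch_budget {
	size_t remaining;
};

/* Take `size` bytes out of the budget; fail if that would overdraw it. */
int sketch_reserve(struct sketch_budget *b, size_t size)
{
	if (size == 0 || size > b->remaining)
		return -EINVAL;

	b->remaining -= size;
	return 0;
}

/* Return `size` bytes to the budget when the buffer goes away. */
void sketch_release(struct sketch_budget *b, size_t size)
{
	b->remaining += size;
}
```

With this kind of scheme, any path that fails after a successful reserve
also has to call the release helper; otherwise the budget shrinks a little
with every failed import.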

Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
---
 drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
 drivers/gpu/drm/exynos/exynos_drm_gem.c |  391 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/exynos/exynos_drm_gem.h |   17 ++-
 include/drm/exynos_drm.h                |   26 ++-
 4 files changed, 433 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index 1e68ec2..e8ae3f1 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -220,6 +220,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MAP_OFFSET,
 			exynos_drm_gem_map_offset_ioctl, DRM_UNLOCKED |
 			DRM_AUTH),
+	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
+			exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
 			exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
 	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_GET,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index e6abb66..3c8a5f3 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -68,6 +68,80 @@ static int check_gem_flags(unsigned int flags)
 	return 0;
 }
 
+static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
+{
+	struct vm_area_struct *vma_copy;
+
+	vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
+	if (!vma_copy)
+		return NULL;
+
+	if (vma->vm_ops && vma->vm_ops->open)
+		vma->vm_ops->open(vma);
+
+	if (vma->vm_file)
+		get_file(vma->vm_file);
+
+	memcpy(vma_copy, vma, sizeof(*vma));
+
+	vma_copy->vm_mm = NULL;
+	vma_copy->vm_next = NULL;
+	vma_copy->vm_prev = NULL;
+
+	return vma_copy;
+}
+
+
+static void put_vma(struct vm_area_struct *vma)
+{
+	if (!vma)
+		return;
+
+	if (vma->vm_ops && vma->vm_ops->close)
+		vma->vm_ops->close(vma);
+
+	if (vma->vm_file)
+		fput(vma->vm_file);
+
+	kfree(vma);
+}
+
+/*
+ * cow_userptr_vma - flag VM_DONTCOPY to VMAs to user address space.
+ *
+ * this function flags VM_DONTCOPY to VMAs to user address space to prevent
+ * pages to VMAs from being COWed.
+ */
+static int cow_userptr_vma(struct exynos_drm_gem_buf *buf, unsigned int no_cow)
+{
+	struct vm_area_struct *vma;
+	unsigned long start, end;
+
+	start = buf->userptr;
+	end = buf->userptr + buf->size - 1;
+
+	down_write(&current->mm->mmap_sem);
+
+	do {
+		vma = find_vma(current->mm, start);
+		if (!vma) {
+			up_write(&current->mm->mmap_sem);
+			return -EFAULT;
+		}
+
+		if (no_cow)
+			vma->vm_flags |= VM_DONTCOPY;
+		else
+			vma->vm_flags &= ~VM_DONTCOPY;
+
+		start = vma->vm_end + 1;
+	} while (vma->vm_end < end);
+
+	up_write(&current->mm->mmap_sem);
+
+	return 0;
+}
+
 static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
 					struct vm_area_struct *vma)
 {
@@ -256,6 +330,50 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
 	/* add some codes for UNCACHED type here. TODO */
 }
 
+static void exynos_drm_put_userptr(struct drm_gem_object *obj)
+{
+	struct drm_device *dev = obj->dev;
+	struct exynos_drm_private *priv = dev->dev_private;
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct exynos_drm_gem_buf *buf;
+	struct vm_area_struct *vma;
+	int npages;
+
+	exynos_gem_obj = to_exynos_gem_obj(obj);
+	buf = exynos_gem_obj->buffer;
+	vma = exynos_gem_obj->vma;
+
+	if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
+		put_vma(exynos_gem_obj->vma);
+		goto out;
+	}
+
+	npages = buf->size >> PAGE_SHIFT;
+
+	if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR && !buf->pfnmap)
+		cow_userptr_vma(buf, 0);
+
+	npages--;
+	while (npages >= 0) {
+		if (buf->write)
+			set_page_dirty_lock(buf->pages[npages]);
+
+		put_page(buf->pages[npages]);
+		npages--;
+	}
+
+out:
+	mutex_lock(&dev->struct_mutex);
+	priv->userptr_limit += buf->size;
+	mutex_unlock(&dev->struct_mutex);
+
+	kfree(buf->pages);
+	buf->pages = NULL;
+
+	kfree(buf->sgt);
+	buf->sgt = NULL;
+}
+
 static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
 					struct drm_file *file_priv,
 					unsigned int *handle)
@@ -295,6 +413,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
 
 	if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
 		exynos_drm_gem_put_pages(obj);
+	else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
+		exynos_drm_put_userptr(obj);
 	else
 		exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
 
@@ -606,6 +726,277 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+/*
+ * exynos_drm_get_userptr - get pages to VMAs to user address space.
+ *
+ * this function is used for gpu driver to access user space directly
+ * for performance enhancement to avoid memcpy.
+ */
+static int exynos_drm_get_userptr(struct drm_device *dev,
+				struct exynos_drm_gem_obj *obj,
+				unsigned long userptr,
+				unsigned int write)
+{
+	unsigned int get_npages;
+	unsigned long npages = 0;
+	struct vm_area_struct *vma;
+	struct exynos_drm_gem_buf *buf = obj->buffer;
+	int ret;
+
+	down_read(&current->mm->mmap_sem);
+	vma = find_vma(current->mm, userptr);
+
+	/* the memory region mmaped with VM_PFNMAP. */
+	if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
+		unsigned long this_pfn, prev_pfn, pa;
+		unsigned long start, end, offset;
+		struct scatterlist *sgl;
+		int ret;
+
+		start = userptr;
+		offset = userptr & ~PAGE_MASK;
+		end = start + buf->size;
+		sgl = buf->sgt->sgl;
+
+		for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
+			ret = follow_pfn(vma, start, &this_pfn);
+			if (ret)
+				goto err;
+
+			if (prev_pfn == 0) {
+				pa = this_pfn << PAGE_SHIFT;
+				buf->dma_addr = pa + offset;
+			} else if (this_pfn != prev_pfn + 1) {
+				ret = -EINVAL;
+				goto err;
+			}
+
+			sg_dma_address(sgl) = (pa + offset);
+			sg_dma_len(sgl) = PAGE_SIZE;
+			prev_pfn = this_pfn;
+			pa += PAGE_SIZE;
+			npages++;
+			sgl = sg_next(sgl);
+		}
+
+		obj->vma = get_vma(vma);
+		if (!obj->vma) {
+			ret = -ENOMEM;
+			goto err;
+		}
+
+		up_read(&current->mm->mmap_sem);
+		buf->pfnmap = true;
+
+		return npages;
+err:
+		buf->dma_addr = 0;
+		up_read(&current->mm->mmap_sem);
+
+		return ret;
+	}
+
+	up_read(&current->mm->mmap_sem);
+
+	/*
+	 * flag VM_DONTCOPY to VMAs to user address space to prevent
+	 * pages to VMAs from being COWed.
+	 *
+	 * The COW issue.
+	 * - while DMA of a device is using the pages to VMAs, if current
+	 * process was forked then the pages being accessed by the DMA
+	 * would be copied into child's pages.(Copy On Write) so
+	 * these pages may not have coherency with parent's ones if
+	 * child process wrote something on those pages so we need to
+	 * flag VM_DONTCOPY to prevent pages from being COWed.
+	 */
+	ret = cow_userptr_vma(buf, 1);
+	if (ret < 0) {
+		DRM_ERROR("failed to set VM_DONTCOPY to vma.\n");
+		cow_userptr_vma(buf, 0);
+		return 0;
+	}
+
+	buf->write = write;
+	npages = buf->size >> PAGE_SHIFT;
+
+	down_read(&current->mm->mmap_sem);
+
+	/*
+	 * Basically, all the pages from get_user_pages() can not be not only
+	 * migrated by CMA but also swapped out.
+	 *
+	 * The migration issue.
+	 * - Pages reserved by CMA for some device using DMA could be used by
+	 * kernel and if the device driver wants to use those pages
+	 * while being used by kernel then the pages are copied into
+	 * other ones allocated to migrate them and then finally,
+	 * the device driver can use the pages for itself.
+	 * Thus, migrated, the pages being accessed by DMA could be changed
+	 * to other so this situation may incur that DMA accesses any pages
+	 * it doesn't want.
+	 *
+	 * But the use of get_user_pages is safe from such migration issue
+	 * because all the pages from get_user_pages CAN NOT be not only
+	 * migrated, but also swapped out.
+	 */
+	get_npages = get_user_pages(current, current->mm, userptr,
+					npages, write, 1, buf->pages, NULL);
+	up_read(&current->mm->mmap_sem);
+	if (get_npages != npages)
+		DRM_ERROR("failed to get user_pages.\n");
+
+	buf->userptr = userptr;
+	buf->pfnmap = false;
+
+	return get_npages;
+}
+
+int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file_priv)
+{
+	struct exynos_drm_private *priv = dev->dev_private;
+	struct exynos_drm_gem_obj *exynos_gem_obj;
+	struct drm_exynos_gem_userptr *args = data;
+	struct exynos_drm_gem_buf *buf;
+	struct scatterlist *sgl;
+	unsigned long size, userptr;
+	unsigned int npages;
+	int ret, get_npages;
+
+	DRM_DEBUG_KMS("%s\n", __FILE__);
+
+	if (!args->size) {
+		DRM_ERROR("invalid size.\n");
+		return -EINVAL;
+	}
+
+	ret = check_gem_flags(args->flags);
+	if (ret)
+		return ret;
+
+	size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
+
+	if (size > priv->userptr_limit) {
+		DRM_ERROR("exceeded maximum size of userptr.\n");
+		return -EINVAL;
+	}
+
+	mutex_lock(&dev->struct_mutex);
+
+	/*
+	 * Limited userptr size
+	 * - User request with userptr SHOULD BE limited as userptr_limit size
+	 * Malicious process is possible to make all the processes to be
+	 * blocked so this would incur a deterioration of system
+	 * performance. so the size that user can request is limited as
+	 * userptr_limit value and also the value CAN BE changed by only root
+	 * user.
+	 */
+	if (priv->userptr_limit >= size)
+		priv->userptr_limit -= size;
+	else {
+		DRM_DEBUG_KMS("insufficient userptr size.\n");
+		mutex_unlock(&dev->struct_mutex);
+		return -EINVAL;
+	}
+
+	mutex_unlock(&dev->struct_mutex);
+
+	userptr = args->userptr;
+
+	buf = exynos_drm_init_buf(dev, size);
+	if (!buf)
+		return -ENOMEM;
+
+	exynos_gem_obj = exynos_drm_gem_init(dev, size);
+	if (!exynos_gem_obj) {
+		ret = -ENOMEM;
+		goto err_free_buffer;
+	}
+
+	buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
+	if (!buf->sgt) {
+		DRM_ERROR("failed to allocate buf->sgt.\n");
+		ret = -ENOMEM;
+		goto err_release_gem;
+	}
+
+	npages = size >> PAGE_SHIFT;
+
+	ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
+	if (ret < 0) {
+		DRM_ERROR("failed to initialize sg table.\n");
+		goto err_free_sgt;
+	}
+
+	buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
+	if (!buf->pages) {
+		DRM_ERROR("failed to allocate buf->pages\n");
+		ret = -ENOMEM;
+		goto err_free_table;
+	}
+
+	exynos_gem_obj->buffer = buf;
+
+	get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
+	if (get_npages != npages) {
+		DRM_ERROR("failed to get user_pages.\n");
+		ret = get_npages;
+		goto err_release_userptr;
+	}
+
+	ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
+						&args->handle);
+	if (ret < 0) {
+		DRM_ERROR("failed to create gem handle.\n");
+		goto err_release_userptr;
+	}
+
+	sgl = buf->sgt->sgl;
+
+	/*
+	 * if buf->pfnmap is true then update sgl of sgt with pages else
+	 * then it means the sgl was updated already so it doesn't need
+	 * to update the sgl.
+	 */
+	if (!buf->pfnmap) {
+		unsigned int i = 0;
+
+		/* set all pages to sg list. */
+		while (i < npages) {
+			sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
+			sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
+			i++;
+			sgl = sg_next(sgl);
+		}
+	}
+
+	/* always use EXYNOS_BO_USERPTR as memory type for userptr. */
+	exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
+
+	return 0;
+
+err_release_userptr:
+	get_npages--;
+	while (get_npages >= 0)
+		put_page(buf->pages[get_npages--]);
+	kfree(buf->pages);
+	buf->pages = NULL;
+err_free_table:
+	sg_free_table(buf->sgt);
+err_free_sgt:
+	kfree(buf->sgt);
+	buf->sgt = NULL;
+err_release_gem:
+	drm_gem_object_release(&exynos_gem_obj->base);
+	kfree(exynos_gem_obj);
+	exynos_gem_obj = NULL;
+err_free_buffer:
+	exynos_drm_free_buf(dev, 0, buf);
+	return ret;
+}
+
 int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
 				      struct drm_file *file_priv)
 {	struct exynos_drm_gem_obj *exynos_gem_obj;
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
index 3334c9f..72bd993 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
@@ -29,27 +29,35 @@
 #define to_exynos_gem_obj(x)	container_of(x,\
 			struct exynos_drm_gem_obj, base)
 
-#define IS_NONCONTIG_BUFFER(f)		(f & EXYNOS_BO_NONCONTIG)
+#define IS_NONCONTIG_BUFFER(f)	((f & EXYNOS_BO_NONCONTIG) ||\
+					(f & EXYNOS_BO_USERPTR))
 
 /*
  * exynos drm gem buffer structure.
  *
  * @kvaddr: kernel virtual address to allocated memory region.
+ * @userptr: user space address.
  * @dma_addr: bus address(accessed by dma) to allocated memory region.
  *	- this address could be physical address without IOMMU and
  *	device address with IOMMU.
+ * @write: whether pages will be written to by the caller.
  * @sgt: sg table to transfer page data.
  * @pages: contain all pages to allocated memory region.
  * @page_size: could be 4K, 64K or 1MB.
  * @size: size of allocated memory region.
+ * @pfnmap: indicate whether memory region from userptr is mmaped with
+ *	VM_PFNMAP or not.
  */
 struct exynos_drm_gem_buf {
 	void __iomem		*kvaddr;
+	unsigned long		userptr;
 	dma_addr_t		dma_addr;
+	unsigned int		write;
 	struct sg_table		*sgt;
 	struct page		**pages;
 	unsigned long		page_size;
 	unsigned long		size;
+	bool			pfnmap;
 };
 
 /*
@@ -64,6 +72,8 @@ struct exynos_drm_gem_buf {
  *	continuous memory region allocated by user request
  *	or at framebuffer creation.
  * @size: total memory size to physically non-continuous memory region.
+ * @vma: a pointer to the vma to user address space and used to release
+ *	the pages to user space.
  * @flags: indicate memory type to allocated buffer and cache attruibute.
  *
  * P.S. this object would be transfered to user as kms_bo.handle so
@@ -73,6 +83,7 @@ struct exynos_drm_gem_obj {
 	struct drm_gem_object		base;
 	struct exynos_drm_gem_buf	*buffer;
 	unsigned long			size;
+	struct vm_area_struct		*vma;
 	unsigned int			flags;
 };
 
@@ -130,6 +141,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
 int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
 			      struct drm_file *file_priv);
 
+/* map user space allocated by malloc to pages. */
+int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
+				      struct drm_file *file_priv);
+
 /* get buffer information to memory region allocated by gem. */
 int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
 				      struct drm_file *file_priv);
diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
index 52465dc..33fa1e2 100644
--- a/include/drm/exynos_drm.h
+++ b/include/drm/exynos_drm.h
@@ -77,6 +77,23 @@ struct drm_exynos_gem_mmap {
 };
 
 /**
+ * User-requested user space importing structure
+ *
+ * @userptr: user space address allocated by malloc.
+ * @size: size to the buffer allocated by malloc.
+ * @flags: indicate user-desired cache attribute to map the allocated buffer
+ *	to kernel space.
+ * @handle: a returned handle to created gem object.
+ *	- this handle will be set by gem module of kernel side.
+ */
+struct drm_exynos_gem_userptr {
+	uint64_t userptr;
+	uint64_t size;
+	unsigned int flags;
+	unsigned int handle;
+};
+
+/**
  * A structure to gem information.
  *
  * @handle: a handle to gem object created.
@@ -135,8 +152,10 @@ enum e_drm_exynos_gem_mem_type {
 	EXYNOS_BO_CACHABLE	= 1 << 1,
 	/* write-combine mapping. */
 	EXYNOS_BO_WC		= 1 << 2,
+	/* user space memory allocated by malloc. */
+	EXYNOS_BO_USERPTR	= 1 << 3,
 	EXYNOS_BO_MASK		= EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
-					EXYNOS_BO_WC
+					EXYNOS_BO_WC | EXYNOS_BO_USERPTR
 };
 
 struct drm_exynos_g2d_get_ver {
@@ -173,7 +192,7 @@ struct drm_exynos_g2d_exec {
 #define DRM_EXYNOS_GEM_CREATE		0x00
 #define DRM_EXYNOS_GEM_MAP_OFFSET	0x01
 #define DRM_EXYNOS_GEM_MMAP		0x02
-/* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
+#define DRM_EXYNOS_GEM_USERPTR		0x03
 #define DRM_EXYNOS_GEM_GET		0x04
 #define DRM_EXYNOS_USER_LIMIT		0x05
 #define DRM_EXYNOS_PLANE_SET_ZPOS	0x06
@@ -193,6 +212,9 @@ struct drm_exynos_g2d_exec {
 #define DRM_IOCTL_EXYNOS_GEM_MMAP	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
 
+#define DRM_IOCTL_EXYNOS_GEM_USERPTR	DRM_IOWR(DRM_COMMAND_BASE + \
+		DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
+
 #define DRM_IOCTL_EXYNOS_GEM_GET	DRM_IOWR(DRM_COMMAND_BASE + \
 		DRM_EXYNOS_GEM_GET,	struct drm_exynos_gem_info)
 
-- 
1.7.4.1

* Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
  2012-05-14  6:17       ` [PATCH 2/2 v4] drm/exynos: added userptr feature Inki Dae
@ 2012-05-14  6:33         ` KOSAKI Motohiro
  2012-05-14  6:52           ` Inki Dae
  2012-05-14 19:26         ` Jerome Glisse
  1 sibling, 1 reply; 77+ messages in thread
From: KOSAKI Motohiro @ 2012-05-14  6:33 UTC (permalink / raw)
  To: Inki Dae; +Cc: sw0312.kim, dri-devel, minchan, kyungmin.park

> +       npages = buf->size >> PAGE_SHIFT;

Why round down? Usually we round up.


> +       down_read(&current->mm->mmap_sem);
> +
> +       /*
> +        * Basically, all the pages from get_user_pages() can not be not only
> +        * migrated by CMA but also swapped out.
> +        *
> +        * The migration issue.
> +        * - Pages reserved by CMA for some device using DMA could be used by
> +        * kernel and if the device driver wants to use those pages
> +        * while being used by kernel then the pages are copied into
> +        * other ones allocated to migrate them and then finally,
> +        * the device driver can use the pages for itself.
> +        * Thus, migrated, the pages being accessed by DMA could be changed
> +        * to other so this situation may incur that DMA accesses any pages
> +        * it doesn't want.
> +        *
> +        * But the use of get_user_pages is safe from such migration issue
> +        * because all the pages from get_user_pages CAN NOT be not only
> +        * migrated, but also swapped out.
> +        */
> +       get_npages = get_user_pages(current, current->mm, userptr,
> +                                       npages, write, 1, buf->pages, NULL);

Why force=1? It is almost exclusively a core-dump specific option. Why don't
you return EFAULT when the page has no write permission? IOW, why doesn't
your Xorg module map the memory w/ PROT_WRITE?


> +       up_read(&current->mm->mmap_sem);
> +       if (get_npages != npages)
> +               DRM_ERROR("failed to get user_pages.\n");

* RE: [PATCH 2/2 v4] drm/exynos: added userptr feature.
  2012-05-14  6:33         ` KOSAKI Motohiro
@ 2012-05-14  6:52           ` Inki Dae
  2012-05-14  7:04             ` KOSAKI Motohiro
  0 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-14  6:52 UTC (permalink / raw)
  To: 'KOSAKI Motohiro'; +Cc: sw0312.kim, dri-devel, minchan, kyungmin.park



> -----Original Message-----
> From: KOSAKI Motohiro [mailto:kosaki.motohiro@gmail.com]
> Sent: Monday, May 14, 2012 3:33 PM
> To: Inki Dae
> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org; j.glisse@gmail.com;
> minchan@kernel.org; kyungmin.park@samsung.com; sw0312.kim@samsung.com;
> jy0922.shim@samsung.com
> Subject: Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
> 
> > +       npages = buf->size >> PAGE_SHIFT;
> 
> Why round down? usually we use round up.
> 

The size was already rounded up in exynos_drm_gem_userptr_ioctl(), so this
shift just computes the page count.
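
To illustrate the point: once a size has been rounded up to a page
boundary, shifting it right by PAGE_SHIFT gives the exact page count, with
nothing lost to rounding down. A minimal stand-alone sketch, assuming 4 KiB
pages and using made-up helper names (this is not roundup_gem_size() itself):

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT	12	/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE	((size_t)1 << SKETCH_PAGE_SHIFT)

/* Round `size` up to the next multiple of the page size. */
size_t sketch_round_up_to_page(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Page count for `size` bytes: round up first, then shift. */
size_t sketch_page_count(size_t size)
{
	return sketch_round_up_to_page(size) >> SKETCH_PAGE_SHIFT;
}
```

For example, 4097 bytes shifted directly would give 1 page, while rounding
up first gives the 2 pages the buffer really spans.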

> 
> > +       down_read(&current->mm->mmap_sem);
> > +
> > +       /*
> > +        * Basically, all the pages from get_user_pages() can not be not only
> > +        * migrated by CMA but also swapped out.
> > +        *
> > +        * The migration issue.
> > +        * - Pages reserved by CMA for some device using DMA could be used by
> > +        * kernel and if the device driver wants to use those pages
> > +        * while being used by kernel then the pages are copied into
> > +        * other ones allocated to migrate them and then finally,
> > +        * the device driver can use the pages for itself.
> > +        * Thus, migrated, the pages being accessed by DMA could be changed
> > +        * to other so this situation may incur that DMA accesses any pages
> > +        * it doesn't want.
> > +        *
> > +        * But the use of get_user_pages is safe from such migration issue
> > +        * because all the pages from get_user_pages CAN NOT be not only
> > +        * migrated, but also swapped out.
> > +        */
> > +       get_npages = get_user_pages(current, current->mm, userptr,
> > +                                       npages, write, 1, buf->pages, NULL);
> 
> Why force=1? It is almostly core-dump specific option. Why don't you
> return

I know that force indicates whether to force write access even if the user
mapping is read-only. We just want to use the pages from get_user_pages()
with read/write permission.

> EFAULT when the page has write permission. IOW, Why your Xorg module
> don't map memory w/ PROT_WRITE?

No, Xorg can map memory w/ PROT_WRITE. Can't Xorg map w/ PROT_WRITE if
force = 1? Please let me know if I am missing something.

Thanks,
Inki Dae

> 
> 
> > +       up_read(&current->mm->mmap_sem);
> > +       if (get_npages != npages)
> > +               DRM_ERROR("failed to get user_pages.\n");

* Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
  2012-05-14  6:52           ` Inki Dae
@ 2012-05-14  7:04             ` KOSAKI Motohiro
  2012-05-14  7:21               ` Inki Dae
  2012-05-14  8:13               ` Inki Dae
  0 siblings, 2 replies; 77+ messages in thread
From: KOSAKI Motohiro @ 2012-05-14  7:04 UTC (permalink / raw)
  To: Inki Dae
  Cc: sw0312.kim, dri-devel, minchan, kyungmin.park, 'KOSAKI Motohiro'

(5/14/12 2:52 AM), Inki Dae wrote:
>
>
>> -----Original Message-----
>> From: KOSAKI Motohiro [mailto:kosaki.motohiro@gmail.com]
>> Sent: Monday, May 14, 2012 3:33 PM
>> To: Inki Dae
>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org; j.glisse@gmail.com;
>> minchan@kernel.org; kyungmin.park@samsung.com; sw0312.kim@samsung.com;
>> jy0922.shim@samsung.com
>> Subject: Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
>>
>>> +       npages = buf->size >> PAGE_SHIFT;
>>
>> Why round down? usually we use round up.
>>
>
> The size was already rounded up by exynos_drm_gem_userptr_ioctl so this is
> just used to get page count.

got it.
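
The page-count point above can be sketched in plain user-space C. This is
only an illustration: the SKETCH_* constants and the helper names are
hypothetical stand-ins (4 KiB pages assumed), not code from the patch. It
shows why a plain right shift is exact once the size has been rounded up to
a page boundary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's PAGE_SHIFT/PAGE_SIZE (4 KiB pages assumed). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Round a byte count up to the next page boundary, as the ioctl is said to do. */
static size_t sketch_round_up_to_page(size_t size)
{
	return (size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
}

/* Once the size is page aligned, a plain shift yields the exact page count. */
static size_t sketch_page_count(size_t aligned_size)
{
	assert((aligned_size & (SKETCH_PAGE_SIZE - 1)) == 0);
	return aligned_size >> SKETCH_PAGE_SHIFT;
}
```

Shifting an unaligned size would silently drop the last partial page, which
is the concern behind the "why round down?" question.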



>>> +       down_read(&current->mm->mmap_sem);
>>> +
>>> +       /*
>>> +        * Basically, all the pages from get_user_pages() can not be not only
>>> +        * migrated by CMA but also swapped out.
>>> +        *
>>> +        * The migration issue.
>>> +        * - Pages reserved by CMA for some device using DMA could be used by
>>> +        * kernel and if the device driver wants to use those pages
>>> +        * while being used by kernel then the pages are copied into
>>> +        * other ones allocated to migrate them and then finally,
>>> +        * the device driver can use the pages for itself.
>>> +        * Thus, migrated, the pages being accessed by DMA could be changed
>>> +        * to other so this situation may incur that DMA accesses any pages
>>> +        * it doesn't want.
>>> +        *
>>> +        * But the use of get_user_pages is safe from such magration issue
>>> +        * because all the pages from get_user_pages CAN NOT be not only
>>> +        * migrated, but also swapped out.
>>> +        */
>>> +       get_npages = get_user_pages(current, current->mm, userptr,
>>> +                                       npages, write, 1, buf->pages, NULL);
>>
>> Why force=1? It is almostly core-dump specific option. Why don't you
>> return
>
> I know that force indicates whether to force write access even  if user
> mapping is readonly.

Right. And usually we don't want to ignore access permissions. But note,
I'm only talking about the generic case. I have no knowledge of the drm area.



> so we just want to use pages from get_user_pages as
> read/write permission.

>> EFAULT when the page has write permission. IOW, Why your Xorg module
>> don't map memory w/ PROT_WRITE?
>
> No, Xorg can map memory w/ PROT_WRITE. Couldn't the Xorg map w/ PROT_WRITE
> if force = 1? plz, let me know if there is my missing point.

I meant, if Xorg always uses PROT_WRITE, you don't need force=1.
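
The force discussion above can be illustrated with a simplified user-space
model. It assumes that the access check in get_user_pages() of kernels from
this period works roughly as follows: without force the current protection
bits are tested, with force only the "may" bits are tested. The SK_* flag
values and the function name are hypothetical and are not the kernel's
definitions.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical flag values for this sketch; they are NOT the kernel's definitions. */
#define SK_VM_READ      0x1u
#define SK_VM_WRITE     0x2u
#define SK_VM_MAYREAD   0x10u
#define SK_VM_MAYWRITE  0x20u

static_assert((SK_VM_WRITE & SK_VM_MAYWRITE) == 0, "flag bits must be distinct");

/*
 * Simplified model of the access check: returns 0 if the pages of a VMA with
 * the given flags could be obtained, or -EFAULT otherwise.
 */
static int sk_gup_permission(unsigned int vma_flags, int write, int force)
{
	unsigned int wanted = write ? (SK_VM_WRITE | SK_VM_MAYWRITE)
				    : (SK_VM_READ | SK_VM_MAYREAD);

	/* force=1 tests the "may" bits only; force=0 tests the current protection. */
	wanted &= force ? (SK_VM_MAYREAD | SK_VM_MAYWRITE)
			: (SK_VM_READ | SK_VM_WRITE);

	return (vma_flags & wanted) ? 0 : -EFAULT;
}
```

In this model a mapping created with PROT_WRITE passes the write check with
force=0, so force=1 only matters for mappings that are currently read-only,
which is the case a driver normally should not override.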

^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 2/2 v4] drm/exynos: added userptr feature.
  2012-05-14  7:04             ` KOSAKI Motohiro
@ 2012-05-14  7:21               ` Inki Dae
  2012-05-14  8:13               ` Inki Dae
  1 sibling, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-14  7:21 UTC (permalink / raw)
  To: 'KOSAKI Motohiro'; +Cc: sw0312.kim, dri-devel, minchan, kyungmin.park



> -----Original Message-----
> From: KOSAKI Motohiro [mailto:kosaki.motohiro@gmail.com]
> Sent: Monday, May 14, 2012 4:05 PM
> To: Inki Dae
> Cc: 'KOSAKI Motohiro'; airlied@linux.ie; dri-devel@lists.freedesktop.org;
> j.glisse@gmail.com; minchan@kernel.org; kyungmin.park@samsung.com;
> sw0312.kim@samsung.com; jy0922.shim@samsung.com
> Subject: Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
> 
> (5/14/12 2:52 AM), Inki Dae wrote:
> >
> >
> >> -----Original Message-----
> >> From: KOSAKI Motohiro [mailto:kosaki.motohiro@gmail.com]
> >> Sent: Monday, May 14, 2012 3:33 PM
> >> To: Inki Dae
> >> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org; j.glisse@gmail.com;
> >> minchan@kernel.org; kyungmin.park@samsung.com; sw0312.kim@samsung.com;
> >> jy0922.shim@samsung.com
> >> Subject: Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
> >>
> >>> +       npages = buf->size >> PAGE_SHIFT;
> >>
> >> Why round down? usually we use round up.
> >>
> >
> > The size was already rounded up by exynos_drm_gem_userptr_ioctl so this is
> > just used to get page count.
> 
> got it.
> 
> 
> 
> >>> +       down_read(&current->mm->mmap_sem);
> >>> +
> >>> +       /*
> >>> +        * Basically, all the pages from get_user_pages() can not be not only
> >>> +        * migrated by CMA but also swapped out.
> >>> +        *
> >>> +        * The migration issue.
> >>> +        * - Pages reserved by CMA for some device using DMA could be used by
> >>> +        * kernel and if the device driver wants to use those pages
> >>> +        * while being used by kernel then the pages are copied into
> >>> +        * other ones allocated to migrate them and then finally,
> >>> +        * the device driver can use the pages for itself.
> >>> +        * Thus, migrated, the pages being accessed by DMA could be changed
> >>> +        * to other so this situation may incur that DMA accesses any pages
> >>> +        * it doesn't want.
> >>> +        *
> >>> +        * But the use of get_user_pages is safe from such magration issue
> >>> +        * because all the pages from get_user_pages CAN NOT be not only
> >>> +        * migrated, but also swapped out.
> >>> +        */
> >>> +       get_npages = get_user_pages(current, current->mm, userptr,
> >>> +                                       npages, write, 1, buf->pages, NULL);
> >>
> >> Why force=1? It is almostly core-dump specific option. Why don't you
> >> return
> >
> > I know that force indicates whether to force write access even  if user
> > mapping is readonly.
> 
> right. and then, usually we don't want to ignore access permission. but note,
> I'm only talk about generic thing. I have no knowledge drm area.
> 
> 
> 
> > so we just want to use pages from get_user_pages as
> > read/write permission.
> 
> >> EFAULT when the page has write permission. IOW, Why your Xorg module
> >> don't map memory w/ PROT_WRITE?
> >
> > No, Xorg can map memory w/ PROT_WRITE. Couldn't the Xorg map w/ PROT_WRITE
> > if force = 1? plz, let me know if there is my missing point.
> 
> I meant, if Xorg always use PROT_WRITE, you don't need force=1.
> 

Ah, right. We don't need force=1. Got it. I will change it to 0.

Thanks,
Inki Dae

^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 0/2 v4] drm/exynos: added userptr feature
  2012-05-14  6:17     ` [PATCH 0/2 v4] " Inki Dae
  2012-05-14  6:17       ` [PATCH 1/2 v4] drm/exynos: added userptr limit ioctl Inki Dae
  2012-05-14  6:17       ` [PATCH 2/2 v4] drm/exynos: added userptr feature Inki Dae
@ 2012-05-14  8:12       ` Inki Dae
  2 siblings, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-14  8:12 UTC (permalink / raw)
  To: 'Inki Dae', airlied, dri-devel
  Cc: linux-mm, j.glisse, minchan, kosaki.motohiro, kyungmin.park,
	sw0312.kim, jy0922.shim

ccing linux-mm

> -----Original Message-----
> From: Inki Dae [mailto:inki.dae@samsung.com]
> Sent: Monday, May 14, 2012 3:18 PM
> To: airlied@linux.ie; dri-devel@lists.freedesktop.org
> Cc: j.glisse@gmail.com; minchan@kernel.org; kosaki.motohiro@gmail.com;
> kyungmin.park@samsung.com; sw0312.kim@samsung.com; jy0922.shim@samsung.com;
> Inki Dae
> Subject: [PATCH 0/2 v4] drm/exynos: added userptr feature
> 
> this feature could be used to import a memory region allocated by malloc()
> in user mode, or an mmaped memory region allocated by other memory
> allocators. the userptr interface can identify the memory type through the
> vm_flags value and would get the pages or page frame numbers of the user
> space region accordingly.
> 
> Changelog v4:
> we have discussed some issues to userptr feature with drm and mm guys and
> below are the issues.
> 
> The migration issue.
> - Pages reserved by CMA for some device using DMA could be used by
> kernel and if the device driver wants to use those pages
> while being used by kernel then the pages are copied into
> other ones allocated to migrate them and then finally,
> the device driver can use the pages for itself.
> Thus, migrated, the pages being accessed by DMA could be changed
> to other so this situation may incur that DMA accesses any pages
> it doesn't want.
> 
> The COW issue.
> - while DMA of a device is using the pages of the VMAs, if the current
> process is forked then the pages being accessed by the DMA
> would be copied into the child's pages (Copy On Write), so
> these pages may lose coherency with the parent's ones if the
> child process writes something to those pages. so we need to
> flag VM_DONTCOPY to prevent the pages from being COWed.
> 
> But the use of get_user_pages is safe from such a migration issue
> because the pages from get_user_pages can be neither
> migrated nor swapped out. this has been confirmed
> by the mm guys, Minchan Kim and KOSAKI Motohiro.
> However, the issue below could be incurred.
> 
> The deterioration issue of system performance by malicious process.
> - any malicious process can request all the pages of entire system memory
> through this userptr ioctl and as a result, all other processes would be
> blocked and system performance would degrade
> because the pages couldn't be swapped out.
> 
> So we limit the user-desired userptr size to a pre-defined value, and this
> ioctl command can be accessed only by the root user.
> 
> Changelog v3:
> Mitigated the issues pointed out by Dave and Jerome.
> 
> for this, added some code to guarantee that the user space pages are not
> swapped out: the VMAs within the user space are locked and
> then unlocked when the pages are released.
> 
> but this lock might result in significant degradation of system performance
> because the pages couldn't be swapped out, so added one more feature:
> we can limit the user-desired userptr size to a pre-defined value using the
> userptr limit ioctl, which can be accessed only by the root user.
> 
> Changelog v2:
> the memory region mmaped with VM_PFNMAP type is physically contiguous and
> the start address of the memory region should be set into buf->dma_addr, but
> the previous patch had a problem that the end address was set into
> buf->dma_addr, so v2 fixes that problem.
> 
> Inki Dae (2):
>   drm/exynos: added userptr limit ioctl.
>   drm/exynos: added userptr feature.
> 
>  drivers/gpu/drm/exynos/exynos_drm_drv.c |    8 +
>  drivers/gpu/drm/exynos/exynos_drm_drv.h |    6 +
>  drivers/gpu/drm/exynos/exynos_drm_gem.c |  413 +++++++++++++++++++++++++++++++
>  drivers/gpu/drm/exynos/exynos_drm_gem.h |   20 ++-
>  include/drm/exynos_drm.h                |   43 +++-
>  5 files changed, 487 insertions(+), 3 deletions(-)
> 
> --
> 1.7.4.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 1/2 v4] drm/exynos: added userptr limit ioctl.
  2012-05-14  6:17       ` [PATCH 1/2 v4] drm/exynos: added userptr limit ioctl Inki Dae
@ 2012-05-14  8:12         ` Inki Dae
  0 siblings, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-14  8:12 UTC (permalink / raw)
  To: 'Inki Dae', airlied, dri-devel
  Cc: linux-mm, j.glisse, minchan, kosaki.motohiro, kyungmin.park,
	sw0312.kim, jy0922.shim

ccing linux-mm

> -----Original Message-----
> From: Inki Dae [mailto:inki.dae@samsung.com]
> Sent: Monday, May 14, 2012 3:18 PM
> To: airlied@linux.ie; dri-devel@lists.freedesktop.org
> Cc: j.glisse@gmail.com; minchan@kernel.org; kosaki.motohiro@gmail.com;
> kyungmin.park@samsung.com; sw0312.kim@samsung.com; jy0922.shim@samsung.com;
> Inki Dae
> Subject: [PATCH 1/2 v4] drm/exynos: added userptr limit ioctl.
> 
> this ioctl is used to limit the user-desired userptr size to a pre-defined
> value and can be accessed only by the root user.
> 
> with the userptr feature, an unprivileged user can allocate all the pages on
> the system, so the amount of free physical pages will be very limited. if the
> VMAs within the user address space are pinned, the pages can't be swapped out,
> which may result in significant degradation of system performance. so this
> feature would be used to avoid such a situation.
> 
> Signed-off-by: Inki Dae <inki.dae@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> ---
>  drivers/gpu/drm/exynos/exynos_drm_drv.c |    6 ++++++
>  drivers/gpu/drm/exynos/exynos_drm_drv.h |    6 ++++++
>  drivers/gpu/drm/exynos/exynos_drm_gem.c |   22 ++++++++++++++++++++++
>  drivers/gpu/drm/exynos/exynos_drm_gem.h |    3 +++
>  include/drm/exynos_drm.h                |   17 +++++++++++++++++
>  5 files changed, 54 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> index 9d3204c..1e68ec2 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> @@ -64,6 +64,9 @@ static int exynos_drm_load(struct drm_device *dev, unsigned long flags)
>  		return -ENOMEM;
>  	}
> 
> +	/* maximum size of userptr is limited to 16MB as default. */
> +	private->userptr_limit = SZ_16M;
> +
>  	INIT_LIST_HEAD(&private->pageflip_event_list);
>  	dev->dev_private = (void *)private;
> 
> @@ -221,6 +224,9 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
>  			exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
>  	DRM_IOCTL_DEF_DRV(EXYNOS_GEM_GET,
>  			exynos_drm_gem_get_ioctl, DRM_UNLOCKED),
> +	DRM_IOCTL_DEF_DRV(EXYNOS_USER_LIMIT,
> +			exynos_drm_gem_user_limit_ioctl, DRM_MASTER |
> +			DRM_ROOT_ONLY),
>  	DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS, exynos_plane_set_zpos_ioctl,
>  			DRM_UNLOCKED | DRM_AUTH),
>  	DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h b/drivers/gpu/drm/exynos/exynos_drm_drv.h
> index c82c90c..b38ed6f 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
> @@ -235,6 +235,12 @@ struct exynos_drm_private {
>  	 * this array is used to be aware of which crtc did it request vblank.
>  	 */
>  	struct drm_crtc *crtc[MAX_CRTC];
> +
> +	/*
> +	 * maximum size of allocation by userptr feature.
> +	 * - as default, this has 16MB and only root user can change it.
> +	 */
> +	unsigned long userptr_limit;
>  };
> 
>  /*
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> index fc91293..e6abb66 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> @@ -33,6 +33,8 @@
>  #include "exynos_drm_gem.h"
>  #include "exynos_drm_buf.h"
> 
> +#define USERPTR_MAX_SIZE		SZ_64M
> +
>  static unsigned int convert_to_vm_err_msg(int msg)
>  {
>  	unsigned int out_msg;
> @@ -630,6 +632,26 @@ int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
>  	return 0;
>  }
> 
> +int exynos_drm_gem_user_limit_ioctl(struct drm_device *dev, void *data,
> +				      struct drm_file *filp)
> +{
> +	struct exynos_drm_private *priv = dev->dev_private;
> +	struct drm_exynos_user_limit *limit = data;
> +
> +	if (limit->userptr_limit < PAGE_SIZE ||
> +			limit->userptr_limit > USERPTR_MAX_SIZE) {
> +		DRM_DEBUG_KMS("invalid userptr_limit size.\n");
> +		return -EINVAL;
> +	}
> +
> +	if (priv->userptr_limit == limit->userptr_limit)
> +		return 0;
> +
> +	priv->userptr_limit = limit->userptr_limit;
> +
> +	return 0;
> +}
> +
>  int exynos_drm_gem_init_object(struct drm_gem_object *obj)
>  {
>  	DRM_DEBUG_KMS("%s\n", __FILE__);
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> index 14d038b..3334c9f 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> @@ -78,6 +78,9 @@ struct exynos_drm_gem_obj {
> 
>  struct page **exynos_gem_get_pages(struct drm_gem_object *obj, gfp_t gfpmask);
> 
> +int exynos_drm_gem_user_limit_ioctl(struct drm_device *dev, void *data,
> +				      struct drm_file *filp);
> +
>  /* destroy a buffer with gem object */
>  void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj);
> 
> diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
> index 54c97e8..52465dc 100644
> --- a/include/drm/exynos_drm.h
> +++ b/include/drm/exynos_drm.h
> @@ -92,6 +92,19 @@ struct drm_exynos_gem_info {
>  };
> 
>  /**
> + * A structure to userptr limited information.
> + *
> + * @userptr_limit: maximum size to userptr buffer.
> + *	the buffer could be allocated by unprivileged user using malloc()
> + *	and the size of the buffer would be limited as userptr_limit value.
> + * @pad: just padding to be 64-bit aligned.
> + */
> +struct drm_exynos_user_limit {
> +	unsigned int userptr_limit;
> +	unsigned int pad;
> +};
> +
> +/**
>   * A structure for user connection request of virtual display.
>   *
>   * @connection: indicate whether doing connetion or not by user.
> @@ -162,6 +175,7 @@ struct drm_exynos_g2d_exec {
>  #define DRM_EXYNOS_GEM_MMAP		0x02
>  /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
>  #define DRM_EXYNOS_GEM_GET		0x04
> +#define DRM_EXYNOS_USER_LIMIT		0x05
>  #define DRM_EXYNOS_PLANE_SET_ZPOS	0x06
>  #define DRM_EXYNOS_VIDI_CONNECTION	0x07
> 
> @@ -182,6 +196,9 @@ struct drm_exynos_g2d_exec {
>  #define DRM_IOCTL_EXYNOS_GEM_GET	DRM_IOWR(DRM_COMMAND_BASE + \
>  		DRM_EXYNOS_GEM_GET,	struct drm_exynos_gem_info)
> 
> +#define DRM_IOCTL_EXYNOS_USER_LIMIT	DRM_IOWR(DRM_COMMAND_BASE + \
> +		DRM_EXYNOS_USER_LIMIT,	struct drm_exynos_user_limit)
> +
>  #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS	DRM_IOWR(DRM_COMMAND_BASE + \
>  		DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
> 
> --
> 1.7.4.1
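
The range check in the quoted limit ioctl can be modeled in user space as a
minimal sketch. The SK_* names, the struct and both helpers are hypothetical;
only the 16MB default, the 64MB ceiling and the one-page floor come from the
patch, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SK_PAGE_SIZE        4096u                  /* assumed page size */
#define SK_USERPTR_DEFAULT  (16u * 1024u * 1024u)  /* SZ_16M, the default in the patch */
#define SK_USERPTR_MAX      (64u * 1024u * 1024u)  /* SZ_64M, USERPTR_MAX_SIZE in the patch */

/* Hypothetical stand-in for the driver-private state holding the limit. */
struct sk_private {
	unsigned long userptr_limit;
};

static void sk_private_init(struct sk_private *priv)
{
	assert(priv != NULL);
	priv->userptr_limit = SK_USERPTR_DEFAULT;
}

/* Accept a new limit only if it lies in [one page, 64MB]; otherwise -EINVAL. */
static int sk_set_userptr_limit(struct sk_private *priv, unsigned int new_limit)
{
	assert(priv != NULL);

	if (new_limit < SK_PAGE_SIZE || new_limit > SK_USERPTR_MAX)
		return -EINVAL;

	priv->userptr_limit = new_limit;
	return 0;
}
```

The sketch leaves out the privilege check: in the patch that part is
expressed through the DRM_MASTER | DRM_ROOT_ONLY flags on the ioctl table
entry rather than inside the handler.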


^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 2/2 v4] drm/exynos: added userptr feature.
  2012-05-14  7:04             ` KOSAKI Motohiro
  2012-05-14  7:21               ` Inki Dae
@ 2012-05-14  8:13               ` Inki Dae
  1 sibling, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-14  8:13 UTC (permalink / raw)
  To: 'KOSAKI Motohiro'
  Cc: linux-mm, airlied, dri-devel, j.glisse, minchan, kyungmin.park,
	sw0312.kim, jy0922.shim

ccing linux-mm

> -----Original Message-----
> From: KOSAKI Motohiro [mailto:kosaki.motohiro@gmail.com]
> Sent: Monday, May 14, 2012 4:05 PM
> To: Inki Dae
> Cc: 'KOSAKI Motohiro'; airlied@linux.ie; dri-devel@lists.freedesktop.org;
> j.glisse@gmail.com; minchan@kernel.org; kyungmin.park@samsung.com;
> sw0312.kim@samsung.com; jy0922.shim@samsung.com
> Subject: Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
> 
> (5/14/12 2:52 AM), Inki Dae wrote:
> >
> >
> >> -----Original Message-----
> >> From: KOSAKI Motohiro [mailto:kosaki.motohiro@gmail.com]
> >> Sent: Monday, May 14, 2012 3:33 PM
> >> To: Inki Dae
> >> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org; j.glisse@gmail.com;
> >> minchan@kernel.org; kyungmin.park@samsung.com; sw0312.kim@samsung.com;
> >> jy0922.shim@samsung.com
> >> Subject: Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
> >>
> >>> +       npages = buf->size >> PAGE_SHIFT;
> >>
> >> Why round down? usually we use round up.
> >>
> >
> > The size was already rounded up by exynos_drm_gem_userptr_ioctl so this is
> > just used to get page count.
> 
> got it.
> 
> 
> 
> >>> +       down_read(&current->mm->mmap_sem);
> >>> +
> >>> +       /*
> >>> +        * Basically, all the pages from get_user_pages() can not be not only
> >>> +        * migrated by CMA but also swapped out.
> >>> +        *
> >>> +        * The migration issue.
> >>> +        * - Pages reserved by CMA for some device using DMA could be used by
> >>> +        * kernel and if the device driver wants to use those pages
> >>> +        * while being used by kernel then the pages are copied into
> >>> +        * other ones allocated to migrate them and then finally,
> >>> +        * the device driver can use the pages for itself.
> >>> +        * Thus, migrated, the pages being accessed by DMA could be changed
> >>> +        * to other so this situation may incur that DMA accesses any pages
> >>> +        * it doesn't want.
> >>> +        *
> >>> +        * But the use of get_user_pages is safe from such magration issue
> >>> +        * because all the pages from get_user_pages CAN NOT be not only
> >>> +        * migrated, but also swapped out.
> >>> +        */
> >>> +       get_npages = get_user_pages(current, current->mm, userptr,
> >>> +                                       npages, write, 1, buf->pages, NULL);
> >>
> >> Why force=1? It is almostly core-dump specific option. Why don't you
> >> return
> >
> > I know that force indicates whether to force write access even  if user
> > mapping is readonly.
> 
> right. and then, usually we don't want to ignore access permission. but note,
> I'm only talk about generic thing. I have no knowledge drm area.
> 
> 
> 
> > so we just want to use pages from get_user_pages as
> > read/write permission.
> 
> >> EFAULT when the page has write permission. IOW, Why your Xorg module
> >> don't map memory w/ PROT_WRITE?
> >
> > No, Xorg can map memory w/ PROT_WRITE. Couldn't the Xorg map w/ PROT_WRITE
> > if force = 1? plz, let me know if there is my missing point.
> 
> I meant, if Xorg always use PROT_WRITE, you don't need force=1.
> 



^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
  2012-05-14  6:17       ` [PATCH 2/2 v4] drm/exynos: added userptr feature Inki Dae
  2012-05-14  6:33         ` KOSAKI Motohiro
@ 2012-05-14 19:26         ` Jerome Glisse
  2012-05-15  4:33           ` Inki Dae
  1 sibling, 1 reply; 77+ messages in thread
From: Jerome Glisse @ 2012-05-14 19:26 UTC (permalink / raw)
  To: Inki Dae; +Cc: sw0312.kim, dri-devel, minchan, kyungmin.park, kosaki.motohiro

On Mon, May 14, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> this feature is used to import a user space region, allocated by malloc() or
> mmaped, into a gem. for this, we use get_user_pages() to get all the pages
> of the VMAs within the user address space. However, we should pay attention
> to the issues below when using this userptr feature.
>
> The migration issue.
> - Pages reserved by CMA for some device using DMA could be used by
> kernel and if the device driver wants to use those pages
> while being used by kernel then the pages are copied into
> other ones allocated to migrate them and then finally,
> the device driver can use the pages for itself.
> Thus, migrated, the pages being accessed by DMA could be changed
> to other so this situation may incur that DMA accesses any pages
> it doesn't want.
>
> The COW issue.
> - while DMA of a device is using the pages of the VMAs, if the current
> process is forked then the pages being accessed by the DMA
> would be copied into the child's pages (Copy On Write), so
> these pages may lose coherency with the parent's ones if the
> child process writes something to those pages. so we need to
> flag VM_DONTCOPY to prevent the pages from being COWed.

Note that this is a massive change in the behavior of anonymous mappings;
it effectively changes the Linux API from the application's point of view
on your platform. Any application that has memory mapped by your ioctl
will have different fork behavior than other applications. I think this
should be stressed; it's one of the things I am really uncomfortable
with. I would rather not have the don't-copy flag, have the pages COWed,
and have the child not work with the 3d/2d/drm driver. That would mean
that your driver (an OpenGL implementation, for instance) would have to
detect fork and work around it; the nvidia closed source driver does that.

Anyway, this is a big red flag for me, as apps on your platform using
your drm driver might start relying on completely non-conformant
fork behavior.
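
The fork behavior change described above can be observed from user space
with a small sketch. It does not use the exynos ioctl; it assumes that
madvise(MADV_DONTFORK), which sets VM_DONTCOPY on a range, is a fair
stand-in for what the patch does to the VMA. The function name is
hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if a forked child still has the anonymous mapping, 0 if the
 * mapping is absent in the child, and -1 on any setup error.
 */
static int sk_child_sees_mapping(int dontfork)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char vec[1];
	int status, result = -1;
	size_t len;
	pid_t pid;
	char *p;

	if (page <= 0)
		return -1;
	len = (size_t)page;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	p[0] = 1; /* touch the page so it is actually populated */

	/* MADV_DONTFORK sets VM_DONTCOPY on the range. */
	if (dontfork && madvise(p, len, MADV_DONTFORK) != 0)
		goto out;

	pid = fork();
	if (pid < 0)
		goto out;
	if (pid == 0) {
		/* mincore() fails with ENOMEM when the range is not mapped. */
		int mapped = (mincore(p, len, vec) == 0);
		_exit(mapped ? 1 : 0);
	}

	if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
		result = WEXITSTATUS(status);
out:
	munmap(p, len);
	return result;
}
```

With dontfork set, the child has no mapping at that address at all, so a
child that touches the buffer would fault instead of getting a COWed copy.
That is the deviation from normal fork semantics being objected to here.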

> But the use of get_user_pages is safe from such a migration issue
> because the pages from get_user_pages can be neither
> migrated nor swapped out. However, the issue below could be incurred.
>
> The deterioration issue of system performance by malicious process.
> - any malicious process can request all the pages of entire system memory
> through this userptr ioctl. as a result, all other processes would be
> blocked and system performance would degrade because the pages
> couldn't be swapped out.
>
> So this feature limits the user-desired userptr size to a pre-defined value,
> and this value can be changed only by the root user.
>
> Signed-off-by: Inki Dae <inki.dae@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> ---
>  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
>  drivers/gpu/drm/exynos/exynos_drm_gem.c |  391 +++++++++++++++++++++++++++++++
>  drivers/gpu/drm/exynos/exynos_drm_gem.h |   17 ++-
>  include/drm/exynos_drm.h                |   26 ++-
>  4 files changed, 433 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> index 1e68ec2..e8ae3f1 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> @@ -220,6 +220,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
>        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MAP_OFFSET,
>                        exynos_drm_gem_map_offset_ioctl, DRM_UNLOCKED |
>                        DRM_AUTH),
> +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
> +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
>        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
>                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
>        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_GET,
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> index e6abb66..3c8a5f3 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> @@ -68,6 +68,80 @@ static int check_gem_flags(unsigned int flags)
>        return 0;
>  }
>
> +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
> +{
> +       struct vm_area_struct *vma_copy;
> +
> +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
> +       if (!vma_copy)
> +               return NULL;
> +
> +       if (vma->vm_ops && vma->vm_ops->open)
> +               vma->vm_ops->open(vma);
> +
> +       if (vma->vm_file)
> +               get_file(vma->vm_file);
> +
> +       memcpy(vma_copy, vma, sizeof(*vma));
> +
> +       vma_copy->vm_mm = NULL;
> +       vma_copy->vm_next = NULL;
> +       vma_copy->vm_prev = NULL;
> +
> +       return vma_copy;
> +}
> +
> +
> +static void put_vma(struct vm_area_struct *vma)
> +{
> +       if (!vma)
> +               return;
> +
> +       if (vma->vm_ops && vma->vm_ops->close)
> +               vma->vm_ops->close(vma);
> +
> +       if (vma->vm_file)
> +               fput(vma->vm_file);
> +
> +       kfree(vma);
> +}
> +
> +/*
> + * cow_userptr_vma - flag VM_DONTCOPY to VMAs to user address space.
> + *
> + * this function flags VM_DONTCOPY to VMAs to user address space to prevent
> + * pages to VMAs from being COWed.
> + */
> +static int cow_userptr_vma(struct exynos_drm_gem_buf *buf, unsigned int no_cow)
> +{
> +       struct vm_area_struct *vma;
> +       unsigned long start, end;
> +
> +       start = buf->userptr;
> +       end = buf->userptr + buf->size - 1;
> +
> +       down_write(&current->mm->mmap_sem);
> +
> +       do {
> +               vma = find_vma(current->mm, start);
> +               if (!vma) {
> +                       up_write(&current->mm->mmap_sem);
> +                       return -EFAULT;
> +               }
> +
> +               if (no_cow)
> +                       vma->vm_flags |= VM_DONTCOPY;
> +               else
> +                       vma->vm_flags &= ~VM_DONTCOPY;
> +
> +               start = vma->vm_end + 1;
> +       } while (vma->vm_end < end);
> +
> +       up_write(&current->mm->mmap_sem);
> +
> +       return 0;
> +}
> +
>  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
>                                        struct vm_area_struct *vma)
>  {
> @@ -256,6 +330,50 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
>        /* add some codes for UNCACHED type here. TODO */
>  }
>
> +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
> +{
> +       struct drm_device *dev = obj->dev;
> +       struct exynos_drm_private *priv = dev->dev_private;
> +       struct exynos_drm_gem_obj *exynos_gem_obj;
> +       struct exynos_drm_gem_buf *buf;
> +       struct vm_area_struct *vma;
> +       int npages;
> +
> +       exynos_gem_obj = to_exynos_gem_obj(obj);
> +       buf = exynos_gem_obj->buffer;
> +       vma = exynos_gem_obj->vma;
> +
> +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> +               put_vma(exynos_gem_obj->vma);
> +               goto out;
> +       }
> +
> +       npages = buf->size >> PAGE_SHIFT;
> +
> +       if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR && !buf->pfnmap)
> +               cow_userptr_vma(buf, 0);
> +
> +       npages--;
> +       while (npages >= 0) {
> +               if (buf->write)
> +                       set_page_dirty_lock(buf->pages[npages]);
> +
> +               put_page(buf->pages[npages]);
> +               npages--;
> +       }
> +
> +out:
> +       mutex_lock(&dev->struct_mutex);
> +       priv->userptr_limit += buf->size;
> +       mutex_unlock(&dev->struct_mutex);
> +
> +       kfree(buf->pages);
> +       buf->pages = NULL;
> +
> +       kfree(buf->sgt);
> +       buf->sgt = NULL;
> +}
> +
>  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
>                                        struct drm_file *file_priv,
>                                        unsigned int *handle)
> @@ -295,6 +413,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
>
>        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
>                exynos_drm_gem_put_pages(obj);
> +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
> +               exynos_drm_put_userptr(obj);
>        else
>                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
>
> @@ -606,6 +726,277 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>        return 0;
>  }
>
> +/*
> + * exynos_drm_get_userptr - get pages to VMSs to user address space.
> + *
> + * this function is used for gpu driver to access user space directly
> + * for performance enhancement to avoid memcpy.
> + */
> +static int exynos_drm_get_userptr(struct drm_device *dev,
> +                               struct exynos_drm_gem_obj *obj,
> +                               unsigned long userptr,
> +                               unsigned int write)
> +{
> +       unsigned int get_npages;
> +       unsigned long npages = 0;
> +       struct vm_area_struct *vma;
> +       struct exynos_drm_gem_buf *buf = obj->buffer;
> +       int ret;
> +
> +       down_read(&current->mm->mmap_sem);
> +       vma = find_vma(current->mm, userptr);
> +
> +       /* the memory region mmaped with VM_PFNMAP. */
> +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> +               unsigned long this_pfn, prev_pfn, pa;
> +               unsigned long start, end, offset;
> +               struct scatterlist *sgl;
> +               int ret;
> +
> +               start = userptr;
> +               offset = userptr & ~PAGE_MASK;
> +               end = start + buf->size;
> +               sgl = buf->sgt->sgl;
> +
> +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
> +                       ret = follow_pfn(vma, start, &this_pfn);
> +                       if (ret)
> +                               goto err;
> +
> +                       if (prev_pfn == 0) {
> +                               pa = this_pfn << PAGE_SHIFT;
> +                               buf->dma_addr = pa + offset;
> +                       } else if (this_pfn != prev_pfn + 1) {
> +                               ret = -EINVAL;
> +                               goto err;
> +                       }
> +
> +                       sg_dma_address(sgl) = (pa + offset);
> +                       sg_dma_len(sgl) = PAGE_SIZE;
> +                       prev_pfn = this_pfn;
> +                       pa += PAGE_SIZE;
> +                       npages++;
> +                       sgl = sg_next(sgl);
> +               }
> +
> +               obj->vma = get_vma(vma);
> +               if (!obj->vma) {
> +                       ret = -ENOMEM;
> +                       goto err;
> +               }
> +
> +               up_read(&current->mm->mmap_sem);
> +               buf->pfnmap = true;
> +
> +               return npages;
> +err:
> +               buf->dma_addr = 0;
> +               up_read(&current->mm->mmap_sem);
> +
> +               return ret;
> +       }
> +
> +       up_read(&current->mm->mmap_sem);
> +
> +       /*
> +        * flag VM_DONTCOPY to VMAs to user address space to prevent
> +        * pages to VMAs from being COWed.
> +        *
> +        * The COW issue.
> +        * - while DMA of a device is using the pages to VMAs, if current
> +        * process was forked then the pages being accessed by the DMA
> +        * would be copied into child's pages.(Copy On Write) so
> +        * these pages may not have coherrency with parent's ones if
> +        * child process writed something on those pages so we need to
> +        * flag VM_DONTCOPY to prevent pages from being COWed.
> +        */
> +       ret = cow_userptr_vma(buf, 1);
> +       if (ret < 0) {
> +               DRM_ERROR("failed to set VM_DONTCOPY to  vma.\n");
> +               cow_userptr_vma(buf, 0);
> +               return 0;
> +       }
> +
> +       buf->write = write;
> +       npages = buf->size >> PAGE_SHIFT;
> +
> +       down_read(&current->mm->mmap_sem);
> +
> +       /*
> +        * Basically, all the pages from get_user_pages() can not be not only
> +        * migrated by CMA but also swapped out.
> +        *
> +        * The migration issue.
> +        * - Pages reserved by CMA for some device using DMA could be used by
> +        * kernel and if the device driver wants to use those pages
> +        * while being used by kernel then the pages are copied into
> +        * other ones allocated to migrate them and then finally,
> +        * the device driver can use the pages for itself.
> +        * Thus, migrated, the pages being accessed by DMA could be changed
> +        * to other so this situation may incur that DMA accesses any pages
> +        * it doesn't want.
> +        *
> +        * But the use of get_user_pages is safe from such magration issue
> +        * because all the pages from get_user_pages CAN NOT be not only
> +        * migrated, but also swapped out.
> +        */
> +       get_npages = get_user_pages(current, current->mm, userptr,
> +                                       npages, write, 1, buf->pages, NULL);
> +       up_read(&current->mm->mmap_sem);
> +       if (get_npages != npages)
> +               DRM_ERROR("failed to get user_pages.\n");
> +
> +       buf->userptr = userptr;
> +       buf->pfnmap = false;
> +
> +       return get_npages;
> +}
> +
> +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> +                                     struct drm_file *file_priv)
> +{
> +       struct exynos_drm_private *priv = dev->dev_private;
> +       struct exynos_drm_gem_obj *exynos_gem_obj;
> +       struct drm_exynos_gem_userptr *args = data;
> +       struct exynos_drm_gem_buf *buf;
> +       struct scatterlist *sgl;
> +       unsigned long size, userptr;
> +       unsigned int npages;
> +       int ret, get_npages;
> +
> +       DRM_DEBUG_KMS("%s\n", __FILE__);
> +
> +       if (!args->size) {
> +               DRM_ERROR("invalid size.\n");
> +               return -EINVAL;
> +       }
> +
> +       ret = check_gem_flags(args->flags);
> +       if (ret)
> +               return ret;
> +
> +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
> +
> +       if (size > priv->userptr_limit) {
> +               DRM_ERROR("excessed maximum size of userptr.\n");
> +               return -EINVAL;
> +       }
> +
> +       mutex_lock(&dev->struct_mutex);
> +
> +       /*
> +        * Limited userptr size
> +        * - User request with userptr SHOULD BE limited as userptr_limit size
> +        * Malicious process is possible to make all the processes to be
> +        * blocked so this would incur a deterioation of system
> +        * performance. so the size that user can request is limited as
> +        * userptr_limit value and also the value CAN BE changed by only root
> +        * user.
> +        */
> +       if (priv->userptr_limit >= size)
> +               priv->userptr_limit -= size;
> +       else {
> +               DRM_DEBUG_KMS("insufficient userptr size.\n");
> +               mutex_unlock(&dev->struct_mutex);
> +               return -EINVAL;
> +       }
> +
> +       mutex_unlock(&dev->struct_mutex);
> +
> +       userptr = args->userptr;
> +
> +       buf = exynos_drm_init_buf(dev, size);
> +       if (!buf)
> +               return -ENOMEM;
> +
> +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
> +       if (!exynos_gem_obj) {
> +               ret = -ENOMEM;
> +               goto err_free_buffer;
> +       }
> +
> +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
> +       if (!buf->sgt) {
> +               DRM_ERROR("failed to allocate buf->sgt.\n");
> +               ret = -ENOMEM;
> +               goto err_release_gem;
> +       }
> +
> +       npages = size >> PAGE_SHIFT;
> +
> +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
> +       if (ret < 0) {
> +               DRM_ERROR("failed to initailize sg table.\n");
> +               goto err_free_sgt;
> +       }
> +
> +       buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
> +       if (!buf->pages) {
> +               DRM_ERROR("failed to allocate buf->pages\n");
> +               ret = -ENOMEM;
> +               goto err_free_table;
> +       }
> +
> +       exynos_gem_obj->buffer = buf;
> +
> +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
> +       if (get_npages != npages) {
> +               DRM_ERROR("failed to get user_pages.\n");
> +               ret = get_npages;
> +               goto err_release_userptr;
> +       }
> +
> +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
> +                                               &args->handle);
> +       if (ret < 0) {
> +               DRM_ERROR("failed to create gem handle.\n");
> +               goto err_release_userptr;
> +       }
> +
> +       sgl = buf->sgt->sgl;
> +
> +       /*
> +        * if buf->pfnmap is true then update sgl of sgt with pages else
> +        * then it means the sgl was updated already so it doesn't need
> +        * to update the sgl.
> +        */
> +       if (!buf->pfnmap) {
> +               unsigned int i = 0;
> +
> +               /* set all pages to sg list. */
> +               while (i < npages) {
> +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
> +                       sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
> +                       i++;
> +                       sgl = sg_next(sgl);
> +               }
> +       }
> +
> +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
> +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
> +
> +       return 0;
> +
> +err_release_userptr:
> +       get_npages--;
> +       while (get_npages >= 0)
> +               put_page(buf->pages[get_npages--]);
> +       kfree(buf->pages);
> +       buf->pages = NULL;
> +err_free_table:
> +       sg_free_table(buf->sgt);
> +err_free_sgt:
> +       kfree(buf->sgt);
> +       buf->sgt = NULL;
> +err_release_gem:
> +       drm_gem_object_release(&exynos_gem_obj->base);
> +       kfree(exynos_gem_obj);
> +       exynos_gem_obj = NULL;
> +err_free_buffer:
> +       exynos_drm_free_buf(dev, 0, buf);
> +       return ret;
> +}
> +
>  int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
>                                      struct drm_file *file_priv)
>  {      struct exynos_drm_gem_obj *exynos_gem_obj;
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> index 3334c9f..72bd993 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> @@ -29,27 +29,35 @@
>  #define to_exynos_gem_obj(x)   container_of(x,\
>                        struct exynos_drm_gem_obj, base)
>
> -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
> +#define IS_NONCONTIG_BUFFER(f) ((f & EXYNOS_BO_NONCONTIG) ||\
> +                                       (f & EXYNOS_BO_USERPTR))
>
>  /*
>  * exynos drm gem buffer structure.
>  *
>  * @kvaddr: kernel virtual address to allocated memory region.
> + * @userptr: user space address.
>  * @dma_addr: bus address(accessed by dma) to allocated memory region.
>  *     - this address could be physical address without IOMMU and
>  *     device address with IOMMU.
> + * @write: whether pages will be written to by the caller.
>  * @sgt: sg table to transfer page data.
>  * @pages: contain all pages to allocated memory region.
>  * @page_size: could be 4K, 64K or 1MB.
>  * @size: size of allocated memory region.
> + * @pfnmap: indicate whether memory region from userptr is mmaped with
> + *     VM_PFNMAP or not.
>  */
>  struct exynos_drm_gem_buf {
>        void __iomem            *kvaddr;
> +       unsigned long           userptr;
>        dma_addr_t              dma_addr;
> +       unsigned int            write;
>        struct sg_table         *sgt;
>        struct page             **pages;
>        unsigned long           page_size;
>        unsigned long           size;
> +       bool                    pfnmap;
>  };
>
>  /*
> @@ -64,6 +72,8 @@ struct exynos_drm_gem_buf {
>  *     continuous memory region allocated by user request
>  *     or at framebuffer creation.
>  * @size: total memory size to physically non-continuous memory region.
> + * @vma: a pointer to the vma to user address space and used to release
> + *     the pages to user space.
>  * @flags: indicate memory type to allocated buffer and cache attruibute.
>  *
>  * P.S. this object would be transfered to user as kms_bo.handle so
> @@ -73,6 +83,7 @@ struct exynos_drm_gem_obj {
>        struct drm_gem_object           base;
>        struct exynos_drm_gem_buf       *buffer;
>        unsigned long                   size;
> +       struct vm_area_struct           *vma;
>        unsigned int                    flags;
>  };
>
> @@ -130,6 +141,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
>  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>                              struct drm_file *file_priv);
>
> +/* map user space allocated by malloc to pages. */
> +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> +                                     struct drm_file *file_priv);
> +
>  /* get buffer information to memory region allocated by gem. */
>  int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
>                                      struct drm_file *file_priv);
> diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
> index 52465dc..33fa1e2 100644
> --- a/include/drm/exynos_drm.h
> +++ b/include/drm/exynos_drm.h
> @@ -77,6 +77,23 @@ struct drm_exynos_gem_mmap {
>  };
>
>  /**
> + * User-requested user space importing structure
> + *
> + * @userptr: user space address allocated by malloc.
> + * @size: size to the buffer allocated by malloc.
> + * @flags: indicate user-desired cache attribute to map the allocated buffer
> + *     to kernel space.
> + * @handle: a returned handle to created gem object.
> + *     - this handle will be set by gem module of kernel side.
> + */
> +struct drm_exynos_gem_userptr {
> +       uint64_t userptr;
> +       uint64_t size;
> +       unsigned int flags;
> +       unsigned int handle;
> +};
> +
> +/**
>  * A structure to gem information.
>  *
>  * @handle: a handle to gem object created.
> @@ -135,8 +152,10 @@ enum e_drm_exynos_gem_mem_type {
>        EXYNOS_BO_CACHABLE      = 1 << 1,
>        /* write-combine mapping. */
>        EXYNOS_BO_WC            = 1 << 2,
> +       /* user space memory allocated by malloc. */
> +       EXYNOS_BO_USERPTR       = 1 << 3,
>        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
> -                                       EXYNOS_BO_WC
> +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
>  };
>
>  struct drm_exynos_g2d_get_ver {
> @@ -173,7 +192,7 @@ struct drm_exynos_g2d_exec {
>  #define DRM_EXYNOS_GEM_CREATE          0x00
>  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
>  #define DRM_EXYNOS_GEM_MMAP            0x02
> -/* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
> +#define DRM_EXYNOS_GEM_USERPTR         0x03
>  #define DRM_EXYNOS_GEM_GET             0x04
>  #define DRM_EXYNOS_USER_LIMIT          0x05
>  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
> @@ -193,6 +212,9 @@ struct drm_exynos_g2d_exec {
>  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
>                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
>
> +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
> +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
> +
>  #define DRM_IOCTL_EXYNOS_GEM_GET       DRM_IOWR(DRM_COMMAND_BASE + \
>                DRM_EXYNOS_GEM_GET,     struct drm_exynos_gem_info)
>
> --
> 1.7.4.1
>
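
To make the proposed user-space interface in the quoted patch concrete, here is a minimal sketch of how the request structure and ioctl number would be encoded. It is only an illustration under stated assumptions: the structure is copied from the patch (it is not in any installed header), the SKETCH_* macros and both helper functions are hypothetical names, the 'd' / 0x40 values mirror DRM_IOCTL_BASE and DRM_COMMAND_BASE from the DRM core header, and flags is left at 0 as a placeholder because the values accepted by check_gem_flags() are not shown here. No ioctl is actually issued.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/ioctl.h>

/* Local copy of the structure proposed in the patch (assumption: unchanged). */
struct drm_exynos_gem_userptr {
	uint64_t userptr;
	uint64_t size;
	unsigned int flags;
	unsigned int handle;
};

/* Local stand-ins for DRM_IOCTL_BASE and DRM_COMMAND_BASE. */
#define SKETCH_DRM_IOCTL_BASE		'd'
#define SKETCH_DRM_COMMAND_BASE		0x40
#define SKETCH_EXYNOS_GEM_USERPTR	0x03	/* DRM_EXYNOS_GEM_USERPTR in the patch */

/* Same shape as DRM_IOWR(DRM_COMMAND_BASE + nr, type). */
#define SKETCH_IOCTL_EXYNOS_GEM_USERPTR \
	_IOWR(SKETCH_DRM_IOCTL_BASE, \
	      SKETCH_DRM_COMMAND_BASE + SKETCH_EXYNOS_GEM_USERPTR, \
	      struct drm_exynos_gem_userptr)

/* Hypothetical helper: fill a request the way a caller would. */
static struct drm_exynos_gem_userptr
make_userptr_request(void *addr, uint64_t size)
{
	struct drm_exynos_gem_userptr req = {
		.userptr = (uint64_t)(uintptr_t)addr,
		.size = size,
		.flags = 0,	/* placeholder, see lead-in */
		.handle = 0,	/* filled in by the kernel on success */
	};

	/* The ioctl in the patch rejects a zero size with -EINVAL. */
	assert(size != 0);
	return req;
}

/* Hypothetical helpers: decode the command number and payload size. */
static unsigned int userptr_ioctl_nr(void)
{
	return _IOC_NR(SKETCH_IOCTL_EXYNOS_GEM_USERPTR);
}

static unsigned int userptr_ioctl_size(void)
{
	return _IOC_SIZE(SKETCH_IOCTL_EXYNOS_GEM_USERPTR);
}
```

A caller would pass the filled structure to ioctl() on the DRM device file descriptor and read the GEM handle back from the handle field.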

* RE: [PATCH 2/2 v4] drm/exynos: added userptr feature.
  2012-05-14 19:26         ` Jerome Glisse
@ 2012-05-15  4:33           ` Inki Dae
  2012-05-15 14:31             ` Jerome Glisse
  0 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-15  4:33 UTC (permalink / raw)
  To: 'Jerome Glisse'
  Cc: airlied, dri-devel, linux-mm, minchan, kosaki.motohiro,
	kyungmin.park, sw0312.kim, jy0922.shim

Hi Jerome,

> -----Original Message-----
> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> Sent: Tuesday, May 15, 2012 4:27 AM
> To: Inki Dae
> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org; minchan@kernel.org;
> kosaki.motohiro@gmail.com; kyungmin.park@samsung.com;
> sw0312.kim@samsung.com; jy0922.shim@samsung.com
> Subject: Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
> 
> On Mon, May 14, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> > this feature is used to import user space region allocated by malloc() or
> > mmaped into a gem. for this, we uses get_user_pages() to get all the pages
> > to VMAs within user address space. However we should pay attention to use
> > this userptr feature like below.
> >
> > The migration issue.
> > - Pages reserved by CMA for some device using DMA could be used by
> > kernel and if the device driver wants to use those pages
> > while being used by kernel then the pages are copied into
> > other ones allocated to migrate them and then finally,
> > the device driver can use the pages for itself.
> > Thus, migrated, the pages being accessed by DMA could be changed
> > to other so this situation may incur that DMA accesses any pages
> > it doesn't want.
> >
> > The COW issue.
> > - while DMA of a device is using the pages to VMAs, if current
> > process was forked then the pages being accessed by the DMA
> > would be copied into child's pages.(Copy On Write) so
> > these pages may not have coherrency with parent's ones if
> > child process wrote something on those pages so we need to
> > flag VM_DONTCOPY to prevent pages from being COWed.
> 
> Note that this is a massive change in the behavior of anonymous mappings;
> it effectively changes the Linux API completely from the application's
> point of view on your platform. Any application that has memory
> mapped by your ioctl will have different fork behavior than other
> applications. I think this should be stressed. It's one of the things I
> am really uncomfortable with; I would rather not have the don't-copy
> flag, have the pages COWed, and have the child not work with the
> 3d/2d/drm driver. That would mean that your driver (an OpenGL
> implementation, for instance) would have to detect fork and work around
> it; the NVIDIA closed-source driver does that.
> 

First of all, thank you for your comments.

Right, the VM_DONTCOPY flag would change the fork behavior that user space
expects. Do you think this approach has no functional problem but is simply
not a generic way? Anyway, our issue was that the pages of the VMAs are
copied into the child's pages (COW), so we prevented those pages from being
COWed by using the VM_DONTCOPY flag.

On this, I have three questions:

1. Without the VM_DONTCOPY flag, you think that an application using our
userptr feature has the COW issue: the parent's pages being accessed by the
DMA of some device would be copied into the child's pages if the child wrote
something to them, and after that the DMA of the device could access pages
the user doesn't want. I'm not sure, but I think such behavior is not a
problem; it is the generic behavior, and it is up to the user whether to
fork or not. Do you think such COW behavior could create some issue I'm not
aware of, so that we have to prevent it somehow?

2. So we added the VM_DONTCOPY flag to prevent the pages from being COWed,
but this changes the behavior user space expects. Do you think this is not a
generic way, or that it could also create some issue?

3. Last one: what is the difference between setting VM_DONTCOPY and
detecting fork? I mean, the device driver would have to do something after
detecting a fork, and I'm not sure, but I think that something may also
change the behavior user space expects.
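
To make the default fork behavior discussed in these questions concrete, here is a minimal user-space sketch. It does not use the userptr ioctl or any DMA; it only shows ordinary copy-on-write on a private anonymous mapping, which is the baseline that VM_DONTCOPY would change. The function name is hypothetical. For comparison, madvise() with MADV_DONTFORK is the existing user-space way to set VM_DONTCOPY on a range, in which case the child would not inherit the mapping at all.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns the first byte the parent sees after a forked child has
 * overwritten its own view of a private anonymous mapping, or -1 on error.
 */
static int parent_byte_after_child_write(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *buf;
	pid_t pid;
	int status, seen;

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (buf == MAP_FAILED)
		return -1;

	/* Parent fills the page before forking. */
	memset(buf, 0xAA, len);

	pid = fork();
	if (pid < 0) {
		munmap(buf, len);
		return -1;
	}

	if (pid == 0) {
		/* Child: the write faults and the child gets its own copy. */
		memset(buf, 0x55, len);
		_exit(buf[0] == 0x55 ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) < 0 ||
	    !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
		munmap(buf, len);
		return -1;
	}

	/* Parent: its page is untouched by the child's write. */
	seen = buf[0];
	munmap(buf, len);
	return seen;
}
```

With the default behavior the parent still reads the value it wrote before the fork, because the child's write went to a private copy of the page.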

Please let me know if I am missing something.

Thanks,
Inki Dae

> Anyway, this is a big red flag for me, as apps on your platform using
> your drm driver might start relying on completely non-conformant
> fork behavior.
> 
> > But the use of get_user_pages is safe from such magration issue
> > because all the pages from get_user_pages CAN NOT BE not only
> > migrated, but also swapped out. However below issue could be incurred.
> >
> > The deterioration issue of system performance by malicious process.
> > - any malicious process can request all the pages of entire system memory
> > through this userptr ioctl. which in turn, all other processes would be
> > blocked and incur the deterioration of system performance because the pages
> > couldn't be swapped out.
> >
> > So this feature limit user-desired userptr size to pre-defined and this value
> > CAN BE changed by only root user.
> >
> > Signed-off-by: Inki Dae <inki.dae@samsung.com>
> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > ---
> >  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
> >  drivers/gpu/drm/exynos/exynos_drm_gem.c |  391 +++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/exynos/exynos_drm_gem.h |   17 ++-
> >  include/drm/exynos_drm.h                |   26 ++-
> >  4 files changed, 433 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> > index 1e68ec2..e8ae3f1 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> > @@ -220,6 +220,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MAP_OFFSET,
> >                        exynos_drm_gem_map_offset_ioctl, DRM_UNLOCKED |
> >                        DRM_AUTH),
> > +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
> > +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
> >                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_GET,
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> > index e6abb66..3c8a5f3 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> > @@ -68,6 +68,80 @@ static int check_gem_flags(unsigned int flags)
> >        return 0;
> >  }
> >
> > +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
> > +{
> > +       struct vm_area_struct *vma_copy;
> > +
> > +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
> > +       if (!vma_copy)
> > +               return NULL;
> > +
> > +       if (vma->vm_ops && vma->vm_ops->open)
> > +               vma->vm_ops->open(vma);
> > +
> > +       if (vma->vm_file)
> > +               get_file(vma->vm_file);
> > +
> > +       memcpy(vma_copy, vma, sizeof(*vma));
> > +
> > +       vma_copy->vm_mm = NULL;
> > +       vma_copy->vm_next = NULL;
> > +       vma_copy->vm_prev = NULL;
> > +
> > +       return vma_copy;
> > +}
> > +
> > +
> > +static void put_vma(struct vm_area_struct *vma)
> > +{
> > +       if (!vma)
> > +               return;
> > +
> > +       if (vma->vm_ops && vma->vm_ops->close)
> > +               vma->vm_ops->close(vma);
> > +
> > +       if (vma->vm_file)
> > +               fput(vma->vm_file);
> > +
> > +       kfree(vma);
> > +}
> > +
> > +/*
> > + * cow_userptr_vma - flag VM_DONTCOPY to VMAs to user address space.
> > + *
> > + * this function flags VM_DONTCOPY to VMAs to user address space to prevent
> > + * pages to VMAs from being COWed.
> > + */
> > +static int cow_userptr_vma(struct exynos_drm_gem_buf *buf, unsigned int no_cow)
> > +{
> > +       struct vm_area_struct *vma;
> > +       unsigned long start, end;
> > +
> > +       start = buf->userptr;
> > +       end = buf->userptr + buf->size - 1;
> > +
> > +       down_write(&current->mm->mmap_sem);
> > +
> > +       do {
> > +               vma = find_vma(current->mm, start);
> > +               if (!vma) {
> > +                       up_write(&current->mm->mmap_sem);
> > +                       return -EFAULT;
> > +               }
> > +
> > +               if (no_cow)
> > +                       vma->vm_flags |= VM_DONTCOPY;
> > +               else
> > +                       vma->vm_flags &= ~VM_DONTCOPY;
> > +
> > +               start = vma->vm_end + 1;
> > +       } while (vma->vm_end < end);
> > +
> > +       up_write(&current->mm->mmap_sem);
> > +
> > +       return 0;
> > +}
> > +
> >  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
> >                                        struct vm_area_struct *vma)
> >  {
> > @@ -256,6 +330,50 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
> >        /* add some codes for UNCACHED type here. TODO */
> >  }
> >
> > +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
> > +{
> > +       struct drm_device *dev = obj->dev;
> > +       struct exynos_drm_private *priv = dev->dev_private;
> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
> > +       struct exynos_drm_gem_buf *buf;
> > +       struct vm_area_struct *vma;
> > +       int npages;
> > +
> > +       exynos_gem_obj = to_exynos_gem_obj(obj);
> > +       buf = exynos_gem_obj->buffer;
> > +       vma = exynos_gem_obj->vma;
> > +
> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> > +               put_vma(exynos_gem_obj->vma);
> > +               goto out;
> > +       }
> > +
> > +       npages = buf->size >> PAGE_SHIFT;
> > +
> > +       if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR && !buf->pfnmap)
> > +               cow_userptr_vma(buf, 0);
> > +
> > +       npages--;
> > +       while (npages >= 0) {
> > +               if (buf->write)
> > +                       set_page_dirty_lock(buf->pages[npages]);
> > +
> > +               put_page(buf->pages[npages]);
> > +               npages--;
> > +       }
> > +
> > +out:
> > +       mutex_lock(&dev->struct_mutex);
> > +       priv->userptr_limit += buf->size;
> > +       mutex_unlock(&dev->struct_mutex);
> > +
> > +       kfree(buf->pages);
> > +       buf->pages = NULL;
> > +
> > +       kfree(buf->sgt);
> > +       buf->sgt = NULL;
> > +}
> > +
> >  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
> >                                        struct drm_file *file_priv,
> >                                        unsigned int *handle)
> > @@ -295,6 +413,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
> >
> >        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
> >                exynos_drm_gem_put_pages(obj);
> > +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
> > +               exynos_drm_put_userptr(obj);
> >        else
> >                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
> >
> > @@ -606,6 +726,277 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >        return 0;
> >  }
> >
> > +/*
> > + * exynos_drm_get_userptr - get pages to VMSs to user address space.
> > + *
> > + * this function is used for gpu driver to access user space directly
> > + * for performance enhancement to avoid memcpy.
> > + */
> > +static int exynos_drm_get_userptr(struct drm_device *dev,
> > +                               struct exynos_drm_gem_obj *obj,
> > +                               unsigned long userptr,
> > +                               unsigned int write)
> > +{
> > +       unsigned int get_npages;
> > +       unsigned long npages = 0;
> > +       struct vm_area_struct *vma;
> > +       struct exynos_drm_gem_buf *buf = obj->buffer;
> > +       int ret;
> > +
> > +       down_read(&current->mm->mmap_sem);
> > +       vma = find_vma(current->mm, userptr);
> > +
> > +       /* the memory region mmaped with VM_PFNMAP. */
> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> > +               unsigned long this_pfn, prev_pfn, pa;
> > +               unsigned long start, end, offset;
> > +               struct scatterlist *sgl;
> > +               int ret;
> > +
> > +               start = userptr;
> > +               offset = userptr & ~PAGE_MASK;
> > +               end = start + buf->size;
> > +               sgl = buf->sgt->sgl;
> > +
> > +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
> > +                       ret = follow_pfn(vma, start, &this_pfn);
> > +                       if (ret)
> > +                               goto err;
> > +
> > +                       if (prev_pfn == 0) {
> > +                               pa = this_pfn << PAGE_SHIFT;
> > +                               buf->dma_addr = pa + offset;
> > +                       } else if (this_pfn != prev_pfn + 1) {
> > +                               ret = -EINVAL;
> > +                               goto err;
> > +                       }
> > +
> > +                       sg_dma_address(sgl) = (pa + offset);
> > +                       sg_dma_len(sgl) = PAGE_SIZE;
> > +                       prev_pfn = this_pfn;
> > +                       pa += PAGE_SIZE;
> > +                       npages++;
> > +                       sgl = sg_next(sgl);
> > +               }
> > +
> > +               obj->vma = get_vma(vma);
> > +               if (!obj->vma) {
> > +                       ret = -ENOMEM;
> > +                       goto err;
> > +               }
> > +
> > +               up_read(&current->mm->mmap_sem);
> > +               buf->pfnmap = true;
> > +
> > +               return npages;
> > +err:
> > +               buf->dma_addr = 0;
> > +               up_read(&current->mm->mmap_sem);
> > +
> > +               return ret;
> > +       }
> > +
> > +       up_read(&current->mm->mmap_sem);
> > +
> > +       /*
> > +        * flag VM_DONTCOPY to VMAs to user address space to prevent
> > +        * pages to VMAs from being COWed.
> > +        *
> > +        * The COW issue.
> > +        * - while DMA of a device is using the pages to VMAs, if current
> > +        * process was forked then the pages being accessed by the DMA
> > +        * would be copied into child's pages.(Copy On Write) so
> > +        * these pages may not have coherrency with parent's ones if
> > +        * child process wrote something on those pages so we need to
> > +        * flag VM_DONTCOPY to prevent pages from being COWed.
> > +        */
> > +       ret = cow_userptr_vma(buf, 1);
> > +       if (ret < 0) {
> > +               DRM_ERROR("failed to set VM_DONTCOPY to  vma.\n");
> > +               cow_userptr_vma(buf, 0);
> > +               return 0;
> > +       }
> > +
> > +       buf->write = write;
> > +       npages = buf->size >> PAGE_SHIFT;
> > +
> > +       down_read(&current->mm->mmap_sem);
> > +
> > +       /*
> > +        * Basically, all the pages from get_user_pages() can not be not only
> > +        * migrated by CMA but also swapped out.
> > +        *
> > +        * The migration issue.
> > +        * - Pages reserved by CMA for some device using DMA could be used by
> > +        * kernel and if the device driver wants to use those pages
> > +        * while being used by kernel then the pages are copied into
> > +        * other ones allocated to migrate them and then finally,
> > +        * the device driver can use the pages for itself.
> > +        * Thus, migrated, the pages being accessed by DMA could be changed
> > +        * to other so this situation may incur that DMA accesses any pages
> > +        * it doesn't want.
> > +        *
> > +        * But the use of get_user_pages is safe from such magration issue
> > +        * because all the pages from get_user_pages CAN NOT be not only
> > +        * migrated, but also swapped out.
> > +        */
> > +       get_npages = get_user_pages(current, current->mm, userptr,
> > +                                       npages, write, 1, buf->pages, NULL);
> > +       up_read(&current->mm->mmap_sem);
> > +       if (get_npages != npages)
> > +               DRM_ERROR("failed to get user_pages.\n");
> > +
> > +       buf->userptr = userptr;
> > +       buf->pfnmap = false;
> > +
> > +       return get_npages;
> > +}
> > +
> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> > +                                     struct drm_file *file_priv)
> > +{
> > +       struct exynos_drm_private *priv = dev->dev_private;
> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
> > +       struct drm_exynos_gem_userptr *args = data;
> > +       struct exynos_drm_gem_buf *buf;
> > +       struct scatterlist *sgl;
> > +       unsigned long size, userptr;
> > +       unsigned int npages;
> > +       int ret, get_npages;
> > +
> > +       DRM_DEBUG_KMS("%s\n", __FILE__);
> > +
> > +       if (!args->size) {
> > +               DRM_ERROR("invalid size.\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       ret = check_gem_flags(args->flags);
> > +       if (ret)
> > +               return ret;
> > +
> > +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
> > +
> > +       if (size > priv->userptr_limit) {
> > +               DRM_ERROR("excessed maximum size of userptr.\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       mutex_lock(&dev->struct_mutex);
> > +
> > +       /*
> > +        * Limited userptr size
> > +        * - User request with userptr SHOULD BE limited as userptr_limit size
> > +        * Malicious process is possible to make all the processes to be
> > +        * blocked so this would incur a deterioation of system
> > +        * performance. so the size that user can request is limited as
> > +        * userptr_limit value and also the value CAN BE changed by only root
> > +        * user.
> > +        */
> > +       if (priv->userptr_limit >= size)
> > +               priv->userptr_limit -= size;
> > +       else {
> > +               DRM_DEBUG_KMS("insufficient userptr size.\n");
> > +               mutex_unlock(&dev->struct_mutex);
> > +               return -EINVAL;
> > +       }
> > +
> > +       mutex_unlock(&dev->struct_mutex);
> > +
> > +       userptr = args->userptr;
> > +
> > +       buf = exynos_drm_init_buf(dev, size);
> > +       if (!buf)
> > +               return -ENOMEM;
> > +
> > +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
> > +       if (!exynos_gem_obj) {
> > +               ret = -ENOMEM;
> > +               goto err_free_buffer;
> > +       }
> > +
> > +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
> > +       if (!buf->sgt) {
> > +               DRM_ERROR("failed to allocate buf->sgt.\n");
> > +               ret = -ENOMEM;
> > +               goto err_release_gem;
> > +       }
> > +
> > +       npages = size >> PAGE_SHIFT;
> > +
> > +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
> > +       if (ret < 0) {
> > +               DRM_ERROR("failed to initailize sg table.\n");
> > +               goto err_free_sgt;
> > +       }
> > +
> > +       buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
> > +       if (!buf->pages) {
> > +               DRM_ERROR("failed to allocate buf->pages\n");
> > +               ret = -ENOMEM;
> > +               goto err_free_table;
> > +       }
> > +
> > +       exynos_gem_obj->buffer = buf;
> > +
> > +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
> > +       if (get_npages != npages) {
> > +               DRM_ERROR("failed to get user_pages.\n");
> > +               ret = get_npages;
> > +               goto err_release_userptr;
> > +       }
> > +
> > +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
> > +                                               &args->handle);
> > +       if (ret < 0) {
> > +               DRM_ERROR("failed to create gem handle.\n");
> > +               goto err_release_userptr;
> > +       }
> > +
> > +       sgl = buf->sgt->sgl;
> > +
> > +       /*
> > +        * if buf->pfnmap is true then update sgl of sgt with pages else
> > +        * then it means the sgl was updated already so it doesn't need
> > +        * to update the sgl.
> > +        */
> > +       if (!buf->pfnmap) {
> > +               unsigned int i = 0;
> > +
> > +               /* set all pages to sg list. */
> > +               while (i < npages) {
> > +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
> > +                       sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
> > +                       i++;
> > +                       sgl = sg_next(sgl);
> > +               }
> > +       }
> > +
> > +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
> > +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
> > +
> > +       return 0;
> > +
> > +err_release_userptr:
> > +       get_npages--;
> > +       while (get_npages >= 0)
> > +               put_page(buf->pages[get_npages--]);
> > +       kfree(buf->pages);
> > +       buf->pages = NULL;
> > +err_free_table:
> > +       sg_free_table(buf->sgt);
> > +err_free_sgt:
> > +       kfree(buf->sgt);
> > +       buf->sgt = NULL;
> > +err_release_gem:
> > +       drm_gem_object_release(&exynos_gem_obj->base);
> > +       kfree(exynos_gem_obj);
> > +       exynos_gem_obj = NULL;
> > +err_free_buffer:
> > +       exynos_drm_free_buf(dev, 0, buf);
> > +       return ret;
> > +}
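
A note on the userptr_limit accounting above: the size is subtracted from
priv->userptr_limit under struct_mutex, but the error labels at the end of
the function do not appear to add it back. A minimal userspace sketch of
reserve-then-refund, assuming a pthread mutex in place of struct_mutex.
struct userptr_quota, quota_reserve(), quota_release(), import_with_quota()
and the two stub callbacks are hypothetical names, not part of the patch:

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for priv->userptr_limit plus the lock guarding it. */
struct userptr_quota {
	pthread_mutex_t lock;
	size_t remaining;
};

/* Take "size" bytes out of the quota, or fail without changing it. */
static int quota_reserve(struct userptr_quota *q, size_t size)
{
	int ret = 0;

	pthread_mutex_lock(&q->lock);
	if (q->remaining >= size)
		q->remaining -= size;
	else
		ret = -EINVAL;
	pthread_mutex_unlock(&q->lock);

	return ret;
}

/* Give "size" bytes back, e.g. on a failed import or on buffer destroy. */
static void quota_release(struct userptr_quota *q, size_t size)
{
	pthread_mutex_lock(&q->lock);
	q->remaining += size;
	pthread_mutex_unlock(&q->lock);
}

/* Stub import steps used only to exercise both outcomes. */
static int stub_import_ok(size_t size)
{
	(void)size;
	return 0;
}

static int stub_import_fail(size_t size)
{
	(void)size;
	return -ENOMEM;
}

/* Reserve first, then refund if any later step of the import fails. */
static int import_with_quota(struct userptr_quota *q, size_t size,
			     int (*do_import)(size_t))
{
	int ret = quota_reserve(q, size);

	if (ret)
		return ret;

	ret = do_import(size);
	if (ret)
		quota_release(q, size);

	return ret;
}
```

Doing the check and the subtraction in one locked step also makes a separate
unlocked "size > limit" test unnecessary.
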
> > +
> >  int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
> >                                      struct drm_file *file_priv)
> >  {      struct exynos_drm_gem_obj *exynos_gem_obj;
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> > index 3334c9f..72bd993 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> > @@ -29,27 +29,35 @@
> >  #define to_exynos_gem_obj(x)   container_of(x,\
> >                        struct exynos_drm_gem_obj, base)
> >
> > -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
> > +#define IS_NONCONTIG_BUFFER(f) ((f & EXYNOS_BO_NONCONTIG) ||\
> > +                                       (f & EXYNOS_BO_USERPTR))
> >
> >  /*
> >  * exynos drm gem buffer structure.
> >  *
> >  * @kvaddr: kernel virtual address to allocated memory region.
> > + * @userptr: user space address.
> >  * @dma_addr: bus address(accessed by dma) to allocated memory region.
> >  *     - this address could be physical address without IOMMU and
> >  *     device address with IOMMU.
> > + * @write: whether pages will be written to by the caller.
> >  * @sgt: sg table to transfer page data.
> >  * @pages: contain all pages to allocated memory region.
> >  * @page_size: could be 4K, 64K or 1MB.
> >  * @size: size of allocated memory region.
> > + * @pfnmap: indicate whether memory region from userptr is mmaped with
> > + *     VM_PFNMAP or not.
> >  */
> >  struct exynos_drm_gem_buf {
> >        void __iomem            *kvaddr;
> > +       unsigned long           userptr;
> >        dma_addr_t              dma_addr;
> > +       unsigned int            write;
> >        struct sg_table         *sgt;
> >        struct page             **pages;
> >        unsigned long           page_size;
> >        unsigned long           size;
> > +       bool                    pfnmap;
> >  };
> >
> >  /*
> > @@ -64,6 +72,8 @@ struct exynos_drm_gem_buf {
> >  *     continuous memory region allocated by user request
> >  *     or at framebuffer creation.
> >  * @size: total memory size to physically non-continuous memory region.
> > + * @vma: a pointer to the vma to user address space and used to release
> > + *     the pages to user space.
> >  * @flags: indicate memory type to allocated buffer and cache attruibute.
> >  *
> >  * P.S. this object would be transfered to user as kms_bo.handle so
> > @@ -73,6 +83,7 @@ struct exynos_drm_gem_obj {
> >        struct drm_gem_object           base;
> >        struct exynos_drm_gem_buf       *buffer;
> >        unsigned long                   size;
> > +       struct vm_area_struct           *vma;
> >        unsigned int                    flags;
> >  };
> >
> > @@ -130,6 +141,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
> >  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >                              struct drm_file *file_priv);
> >
> > +/* map user space allocated by malloc to pages. */
> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> > +                                     struct drm_file *file_priv);
> > +
> >  /* get buffer information to memory region allocated by gem. */
> >  int exynos_drm_gem_get_ioctl(struct drm_device *dev, void *data,
> >                                      struct drm_file *file_priv);
> > diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
> > index 52465dc..33fa1e2 100644
> > --- a/include/drm/exynos_drm.h
> > +++ b/include/drm/exynos_drm.h
> > @@ -77,6 +77,23 @@ struct drm_exynos_gem_mmap {
> >  };
> >
> >  /**
> > + * User-requested user space importing structure
> > + *
> > + * @userptr: user space address allocated by malloc.
> > + * @size: size to the buffer allocated by malloc.
> > + * @flags: indicate user-desired cache attribute to map the allocated buffer
> > + *     to kernel space.
> > + * @handle: a returned handle to created gem object.
> > + *     - this handle will be set by gem module of kernel side.
> > + */
> > +struct drm_exynos_gem_userptr {
> > +       uint64_t userptr;
> > +       uint64_t size;
> > +       unsigned int flags;
> > +       unsigned int handle;
> > +};
> > +
> > +/**
> >  * A structure to gem information.
> >  *
> >  * @handle: a handle to gem object created.
> > @@ -135,8 +152,10 @@ enum e_drm_exynos_gem_mem_type {
> >        EXYNOS_BO_CACHABLE      = 1 << 1,
> >        /* write-combine mapping. */
> >        EXYNOS_BO_WC            = 1 << 2,
> > +       /* user space memory allocated by malloc. */
> > +       EXYNOS_BO_USERPTR       = 1 << 3,
> >        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
> > -                                       EXYNOS_BO_WC
> > +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
> >  };
> >
> >  struct drm_exynos_g2d_get_ver {
> > @@ -173,7 +192,7 @@ struct drm_exynos_g2d_exec {
> >  #define DRM_EXYNOS_GEM_CREATE          0x00
> >  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
> >  #define DRM_EXYNOS_GEM_MMAP            0x02
> > -/* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
> > +#define DRM_EXYNOS_GEM_USERPTR         0x03
> >  #define DRM_EXYNOS_GEM_GET             0x04
> >  #define DRM_EXYNOS_USER_LIMIT          0x05
> >  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
> > @@ -193,6 +212,9 @@ struct drm_exynos_g2d_exec {
> >  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
> >                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
> >
> > +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
> > +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
> > +
> >  #define DRM_IOCTL_EXYNOS_GEM_GET       DRM_IOWR(DRM_COMMAND_BASE + \
> >                DRM_EXYNOS_GEM_GET,     struct drm_exynos_gem_info)
> >
> > --
> > 1.7.4.1
> >
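
For reference, a caller of the proposed ioctl might look roughly like the
sketch below. The struct layout and the 0x03 command number are copied from
the patch quoted above; DRM_IOCTL_BASE ('d'), DRM_COMMAND_BASE (0x40) and
DRM_IOWR are restated from drm.h so the block stands alone.
gem_import_userptr() is a hypothetical helper name, and the interface itself
is only a proposal under review here:

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Copied from the patch under review. */
struct drm_exynos_gem_userptr {
	uint64_t userptr;
	uint64_t size;
	unsigned int flags;
	unsigned int handle;
};

/* Restated from drm.h so the sketch needs no DRM headers. */
#define DRM_IOCTL_BASE			'd'
#define DRM_COMMAND_BASE		0x40
#define DRM_IOWR(nr, type)		_IOWR(DRM_IOCTL_BASE, nr, type)

#define DRM_EXYNOS_GEM_USERPTR		0x03
#define DRM_IOCTL_EXYNOS_GEM_USERPTR	DRM_IOWR(DRM_COMMAND_BASE + \
		DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)

/*
 * Hypothetical helper: ask the driver to wrap [ptr, ptr + size) in a gem
 * object. Returns 0 and stores the handle, or a negative errno value.
 */
static int gem_import_userptr(int fd, void *ptr, uint64_t size,
			      unsigned int flags, unsigned int *handle)
{
	struct drm_exynos_gem_userptr req;

	memset(&req, 0, sizeof(req));
	req.userptr = (uint64_t)(uintptr_t)ptr;
	req.size = size;
	req.flags = flags;	/* cache attribute bits, per the patch */

	if (ioctl(fd, DRM_IOCTL_EXYNOS_GEM_USERPTR, &req) < 0)
		return -errno;

	*handle = req.handle;	/* filled in by the kernel side */
	return 0;
}
```

The two uint64_t fields followed by two 32-bit fields leave no padding on
common ABIs, so 32-bit and 64-bit callers would see the same layout.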


^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-04-23 13:43 ` [PATCH 3/4] drm/exynos: added userptr feature Inki Dae
  2012-04-24  5:17   ` [PATCH v2 " Inki Dae
  2012-05-09  6:17   ` [PATCH 0/2 v3] " Inki Dae
@ 2012-05-15  7:34   ` Rob Clark
  2012-05-15  8:17     ` Inki Dae
  2012-05-16  9:22     ` Dave Airlie
  2 siblings, 2 replies; 77+ messages in thread
From: Rob Clark @ 2012-05-15  7:34 UTC (permalink / raw)
  To: Inki Dae; +Cc: kyungmin.park, sw0312.kim, dri-devel

On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com> wrote:
> this feature could be used to use memory region allocated by malloc() in user
> mode and mmaped memory region allocated by other memory allocators. userptr
> interface can identify memory type through vm_flags value and would get
> pages or page frame numbers to user space appropriately.

I apologize for being a little late to jump in on this thread, but...

I must confess to not being a huge fan of userptr.  It really is
opening a can of worms, and seems best avoided if at all possible.
I'm not entirely sure of the use-case for which you require this, but I
wonder if there isn't an alternative way?  I mean, the main case I
could think of for mapping userspace memory would be something like
texture upload.  But could that be handled in an alternative way,
something like a pwrite or texture_upload API, which could temporarily
pin the userspace memory, kick off a dma from that user buffer to a
proper GEM buffer, and then unpin the user buffer when the DMA
completes?
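
The pin / DMA / unpin sequence suggested above can be modelled in plain
userspace C. This is only a sketch under stated assumptions: struct
gem_buffer, pin_user_range(), unpin_user_range(), dma_copy() and
gem_pwrite() are hypothetical names, memcpy() stands in for the DMA engine,
and a real driver would pin with get_user_pages() and unpin from the DMA
completion path:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a driver-owned GEM buffer. */
struct gem_buffer {
	unsigned char *data;
	size_t size;
};

/* Outstanding pins; shows that every pin is matched by an unpin. */
static int pinned_ranges;

/* Stand-in for pinning the user pages behind [ptr, ptr + len). */
static int pin_user_range(const void *ptr, size_t len)
{
	if (!ptr || !len)
		return -EINVAL;
	pinned_ranges++;
	return 0;
}

static void unpin_user_range(const void *ptr, size_t len)
{
	(void)ptr;
	(void)len;
	pinned_ranges--;
}

/* Stand-in for the DMA transfer from user memory into the GEM buffer. */
static void dma_copy(struct gem_buffer *dst, size_t offset,
		     const void *src, size_t len)
{
	memcpy(dst->data + offset, src, len);
}

/*
 * pwrite-style upload: the user memory is pinned only for the duration
 * of the copy, so the driver never holds a long-lived reference to it.
 */
static int gem_pwrite(struct gem_buffer *bo, size_t offset,
		      const void *src, size_t len)
{
	int ret;

	if (offset > bo->size || len > bo->size - offset)
		return -EINVAL;

	ret = pin_user_range(src, len);
	if (ret)
		return ret;

	dma_copy(bo, offset, src, len);
	unpin_user_range(src, len);

	return 0;
}
```

The point of the shape is that the lifetime of the pin is bounded by one
call, which avoids most of the questions a persistent userptr mapping raises.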

BR,
-R

> Signed-off-by: Inki Dae <inki.dae@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> ---
>  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
>  drivers/gpu/drm/exynos/exynos_drm_gem.c |  258 +++++++++++++++++++++++++++++++
>  drivers/gpu/drm/exynos/exynos_drm_gem.h |   13 ++-
>  include/drm/exynos_drm.h                |   25 +++-
>  4 files changed, 296 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> index f58a487..5bb0361 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> @@ -211,6 +211,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
>                        DRM_AUTH),
>        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
>                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
> +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
> +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
>        DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS, exynos_plane_set_zpos_ioctl,
>                        DRM_UNLOCKED | DRM_AUTH),
>        DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> index afd0cd4..b68d4ea 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> @@ -66,6 +66,43 @@ static int check_gem_flags(unsigned int flags)
>        return 0;
>  }
>
> +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
> +{
> +       struct vm_area_struct *vma_copy;
> +
> +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
> +       if (!vma_copy)
> +               return NULL;
> +
> +       if (vma->vm_ops && vma->vm_ops->open)
> +               vma->vm_ops->open(vma);
> +
> +       if (vma->vm_file)
> +               get_file(vma->vm_file);
> +
> +       memcpy(vma_copy, vma, sizeof(*vma));
> +
> +       vma_copy->vm_mm = NULL;
> +       vma_copy->vm_next = NULL;
> +       vma_copy->vm_prev = NULL;
> +
> +       return vma_copy;
> +}
> +
> +static void put_vma(struct vm_area_struct *vma)
> +{
> +       if (!vma)
> +               return;
> +
> +       if (vma->vm_ops && vma->vm_ops->close)
> +               vma->vm_ops->close(vma);
> +
> +       if (vma->vm_file)
> +               fput(vma->vm_file);
> +
> +       kfree(vma);
> +}
> +
>  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
>                                        struct vm_area_struct *vma)
>  {
> @@ -254,6 +291,41 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
>        /* add some codes for UNCACHED type here. TODO */
>  }
>
> +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
> +{
> +       struct exynos_drm_gem_obj *exynos_gem_obj;
> +       struct exynos_drm_gem_buf *buf;
> +       struct vm_area_struct *vma;
> +       int npages;
> +
> +       exynos_gem_obj = to_exynos_gem_obj(obj);
> +       buf = exynos_gem_obj->buffer;
> +       vma = exynos_gem_obj->vma;
> +
> +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> +               put_vma(exynos_gem_obj->vma);
> +               goto out;
> +       }
> +
> +       npages = buf->size >> PAGE_SHIFT;
> +
> +       npages--;
> +       while (npages >= 0) {
> +               if (buf->write)
> +                       set_page_dirty_lock(buf->pages[npages]);
> +
> +               put_page(buf->pages[npages]);
> +               npages--;
> +       }
> +
> +out:
> +       kfree(buf->pages);
> +       buf->pages = NULL;
> +
> +       kfree(buf->sgt);
> +       buf->sgt = NULL;
> +}
> +
>  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
>                                        struct drm_file *file_priv,
>                                        unsigned int *handle)
> @@ -293,6 +365,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
>
>        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
>                exynos_drm_gem_put_pages(obj);
> +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
> +               exynos_drm_put_userptr(obj);
>        else
>                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
>
> @@ -606,6 +680,190 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>        return 0;
>  }
>
> +static int exynos_drm_get_userptr(struct drm_device *dev,
> +                               struct exynos_drm_gem_obj *obj,
> +                               unsigned long userptr,
> +                               unsigned int write)
> +{
> +       unsigned int get_npages;
> +       unsigned long npages = 0;
> +       struct vm_area_struct *vma;
> +       struct exynos_drm_gem_buf *buf = obj->buffer;
> +
> +       vma = find_vma(current->mm, userptr);
> +
> +       /* the memory region mmaped with VM_PFNMAP. */
> +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> +               unsigned long this_pfn, prev_pfn, pa;
> +               unsigned long start, end, offset;
> +               struct scatterlist *sgl;
> +
> +               start = userptr;
> +               offset = userptr & ~PAGE_MASK;
> +               end = start + buf->size;
> +               sgl = buf->sgt->sgl;
> +
> +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
> +                       int ret = follow_pfn(vma, start, &this_pfn);
> +                       if (ret)
> +                               return ret;
> +
> +                       if (prev_pfn == 0)
> +                               pa = this_pfn << PAGE_SHIFT;
> +                       else if (this_pfn != prev_pfn + 1)
> +                               return -EFAULT;
> +
> +                       sg_dma_address(sgl) = (pa + offset);
> +                       sg_dma_len(sgl) = PAGE_SIZE;
> +                       prev_pfn = this_pfn;
> +                       pa += PAGE_SIZE;
> +                       npages++;
> +                       sgl = sg_next(sgl);
> +               }
> +
> +               buf->dma_addr = pa + offset;
> +
> +               obj->vma = get_vma(vma);
> +               if (!obj->vma)
> +                       return -ENOMEM;
> +
> +               buf->pfnmap = true;
> +
> +               return npages;
> +       }
> +
> +       buf->write = write;
> +       npages = buf->size >> PAGE_SHIFT;
> +
> +       down_read(&current->mm->mmap_sem);
> +       get_npages = get_user_pages(current, current->mm, userptr,
> +                                       npages, write, 1, buf->pages, NULL);
> +       up_read(&current->mm->mmap_sem);
> +       if (get_npages != npages)
> +               DRM_ERROR("failed to get user_pages.\n");
> +
> +       buf->pfnmap = false;
> +
> +       return get_npages;
> +}
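
The VM_PFNMAP branch above walks the range with follow_pfn() and rejects it
unless every pfn follows the previous one. A minimal sketch of just that
check, assuming the pfns were already collected into an array and a 4K page
size; contiguous_base() and SKETCH_PAGE_SHIFT are hypothetical names. It
compares by index instead of treating prev_pfn == 0 as "first iteration",
and it reports the address of the first page of the run:

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT	12	/* assumption: 4K pages */

/*
 * Return 0 and store the physical address of the first page if
 * pfns[0..n) are consecutive; return -EFAULT on the first gap.
 */
static int contiguous_base(const unsigned long *pfns, size_t n,
			   unsigned long long *base)
{
	size_t i;

	if (n == 0)
		return -EINVAL;

	for (i = 1; i < n; i++) {
		if (pfns[i] != pfns[i - 1] + 1)
			return -EFAULT;
	}

	*base = (unsigned long long)pfns[0] << SKETCH_PAGE_SHIFT;
	return 0;
}
```

The in-page offset of the user pointer would be added to the returned base
once, by the caller.
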
> +
> +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> +                                     struct drm_file *file_priv)
> +{
> +       struct exynos_drm_gem_obj *exynos_gem_obj;
> +       struct drm_exynos_gem_userptr *args = data;
> +       struct exynos_drm_gem_buf *buf;
> +       struct scatterlist *sgl;
> +       unsigned long size, userptr;
> +       unsigned int npages;
> +       int ret, get_npages;
> +
> +       DRM_DEBUG_KMS("%s\n", __FILE__);
> +
> +       if (!args->size) {
> +               DRM_ERROR("invalid size.\n");
> +               return -EINVAL;
> +       }
> +
> +       ret = check_gem_flags(args->flags);
> +       if (ret)
> +               return ret;
> +
> +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
> +       userptr = args->userptr;
> +
> +       buf = exynos_drm_init_buf(dev, size);
> +       if (!buf)
> +               return -ENOMEM;
> +
> +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
> +       if (!exynos_gem_obj) {
> +               ret = -ENOMEM;
> +               goto err_free_buffer;
> +       }
> +
> +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
> +       if (!buf->sgt) {
> +               DRM_ERROR("failed to allocate buf->sgt.\n");
> +               ret = -ENOMEM;
> +               goto err_release_gem;
> +       }
> +
> +       npages = size >> PAGE_SHIFT;
> +
> +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
> +       if (ret < 0) {
> +               DRM_ERROR("failed to initailize sg table.\n");
> +               goto err_free_sgt;
> +       }
> +
> +       buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
> +       if (!buf->pages) {
> +               DRM_ERROR("failed to allocate buf->pages\n");
> +               ret = -ENOMEM;
> +               goto err_free_table;
> +       }
> +
> +       exynos_gem_obj->buffer = buf;
> +
> +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
> +       if (get_npages != npages) {
> +               DRM_ERROR("failed to get user_pages.\n");
> +               ret = get_npages;
> +               goto err_release_userptr;
> +       }
> +
> +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
> +                                               &args->handle);
> +       if (ret < 0) {
> +               DRM_ERROR("failed to create gem handle.\n");
> +               goto err_release_userptr;
> +       }
> +
> +       sgl = buf->sgt->sgl;
> +
> +       /*
> +        * if buf->pfnmap is true then update sgl of sgt with pages but
> +        * if buf->pfnmap is false then it means the sgl was updated already
> +        * so it doesn't need to update the sgl.
> +        */
> +       if (!buf->pfnmap) {
> +               unsigned int i = 0;
> +
> +               /* set all pages to sg list. */
> +               while (i < npages) {
> +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
> +                       sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
> +                       i++;
> +                       sgl = sg_next(sgl);
> +               }
> +       }
> +
> +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
> +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
> +
> +       return 0;
> +
> +err_release_userptr:
> +       get_npages--;
> +       while (get_npages >= 0)
> +               put_page(buf->pages[get_npages--]);
> +       kfree(buf->pages);
> +       buf->pages = NULL;
> +err_free_table:
> +       sg_free_table(buf->sgt);
> +err_free_sgt:
> +       kfree(buf->sgt);
> +       buf->sgt = NULL;
> +err_release_gem:
> +       drm_gem_object_release(&exynos_gem_obj->base);
> +       kfree(exynos_gem_obj);
> +       exynos_gem_obj = NULL;
> +err_free_buffer:
> +       exynos_drm_free_buf(dev, 0, buf);
> +       return ret;
> +}
> +
>  int exynos_drm_gem_init_object(struct drm_gem_object *obj)
>  {
>        DRM_DEBUG_KMS("%s\n", __FILE__);
> diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> index efc8252..1de2241 100644
> --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
> +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> @@ -29,7 +29,8 @@
>  #define to_exynos_gem_obj(x)   container_of(x,\
>                        struct exynos_drm_gem_obj, base)
>
> -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
> +#define IS_NONCONTIG_BUFFER(f)         ((f & EXYNOS_BO_NONCONTIG) ||\
> +                                       (f & EXYNOS_BO_USERPTR))
>
>  /*
>  * exynos drm gem buffer structure.
> @@ -38,18 +39,23 @@
>  * @dma_addr: bus address(accessed by dma) to allocated memory region.
>  *     - this address could be physical address without IOMMU and
>  *     device address with IOMMU.
> + * @write: whether pages will be written to by the caller.
>  * @sgt: sg table to transfer page data.
>  * @pages: contain all pages to allocated memory region.
>  * @page_size: could be 4K, 64K or 1MB.
>  * @size: size of allocated memory region.
> + * @pfnmap: indicate whether memory region from userptr is mmaped with
> + *     VM_PFNMAP or not.
>  */
>  struct exynos_drm_gem_buf {
>        void __iomem            *kvaddr;
>        dma_addr_t              dma_addr;
> +       unsigned int            write;
>        struct sg_table         *sgt;
>        struct page             **pages;
>        unsigned long           page_size;
>        unsigned long           size;
> +       bool                    pfnmap;
>  };
>
>  /*
> @@ -73,6 +79,7 @@ struct exynos_drm_gem_obj {
>        struct drm_gem_object           base;
>        struct exynos_drm_gem_buf       *buffer;
>        unsigned long                   size;
> +       struct vm_area_struct           *vma;
>        unsigned int                    flags;
>  };
>
> @@ -127,6 +134,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
>  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>                              struct drm_file *file_priv);
>
> +/* map user space allocated by malloc to pages. */
> +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> +                                     struct drm_file *file_priv);
> +
>  /* initialize gem object. */
>  int exynos_drm_gem_init_object(struct drm_gem_object *obj);
>
> diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
> index 2d6eb06..48eda6e 100644
> --- a/include/drm/exynos_drm.h
> +++ b/include/drm/exynos_drm.h
> @@ -75,6 +75,23 @@ struct drm_exynos_gem_mmap {
>  };
>
>  /**
> + * User-requested user space importing structure
> + *
> + * @userptr: user space address allocated by malloc.
> + * @size: size to the buffer allocated by malloc.
> + * @flags: indicate user-desired cache attribute to map the allocated buffer
> + *     to kernel space.
> + * @handle: a returned handle to created gem object.
> + *     - this handle will be set by gem module of kernel side.
> + */
> +struct drm_exynos_gem_userptr {
> +       uint64_t userptr;
> +       uint64_t size;
> +       unsigned int flags;
> +       unsigned int handle;
> +};
> +
> +/**
>  * A structure for user connection request of virtual display.
>  *
>  * @connection: indicate whether doing connetion or not by user.
> @@ -105,13 +122,16 @@ enum e_drm_exynos_gem_mem_type {
>        EXYNOS_BO_CACHABLE      = 1 << 1,
>        /* write-combine mapping. */
>        EXYNOS_BO_WC            = 1 << 2,
> +       /* user space memory allocated by malloc. */
> +       EXYNOS_BO_USERPTR       = 1 << 3,
>        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
> -                                       EXYNOS_BO_WC
> +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
>  };
>
>  #define DRM_EXYNOS_GEM_CREATE          0x00
>  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
>  #define DRM_EXYNOS_GEM_MMAP            0x02
> +#define DRM_EXYNOS_GEM_USERPTR         0x03
>  /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
>  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
>  #define DRM_EXYNOS_VIDI_CONNECTION     0x07
> @@ -125,6 +145,9 @@ enum e_drm_exynos_gem_mem_type {
>  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
>                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
>
> +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
> +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
> +
>  #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS        DRM_IOWR(DRM_COMMAND_BASE + \
>                DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
>
> --
> 1.7.4.1
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-15  7:34   ` [PATCH 3/4] " Rob Clark
@ 2012-05-15  8:17     ` Inki Dae
  2012-05-15  9:35       ` Rob Clark
  2012-05-16  9:22     ` Dave Airlie
  1 sibling, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-15  8:17 UTC (permalink / raw)
  To: 'Rob Clark'; +Cc: kyungmin.park, sw0312.kim, dri-devel

Hi Rob,

> -----Original Message-----
> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of Rob
> Clark
> Sent: Tuesday, May 15, 2012 4:35 PM
> To: Inki Dae
> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
> kyungmin.park@samsung.com; sw0312.kim@samsung.com
> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
> 
> On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com> wrote:
> > this feature could be used to use memory region allocated by malloc() in user
> > mode and mmaped memory region allocated by other memory allocators. userptr
> > interface can identify memory type through vm_flags value and would get
> > pages or page frame numbers to user space appropriately.
> 
> I apologize for being a little late to jump in on this thread, but...
> 
> I must confess to not being a huge fan of userptr.  It really is
> opening a can of worms, and seems best avoided if at all possible.
> I'm not entirely sure of the use-case for which you require this, but I
> wonder if there isn't an alternative way?  I mean, the main case I
> could think of for mapping userspace memory would be something like
> texture upload.  But could that be handled in an alternative way,
> something like a pwrite or texture_upload API, which could temporarily
> pin the userspace memory, kick off a dma from that user buffer to a
> proper GEM buffer, and then unpin the user buffer when the DMA
> completes?
> 

This feature has been discussed with the drm and linux-mm guys, and I have
posted this patch four times.
Please see the e-mail thread below:
http://www.spinics.net/lists/dri-devel/msg22729.html

Then please give me your opinion.

Thanks,
Inki Dae

> BR,
> -R
> 
> > Signed-off-by: Inki Dae <inki.dae@samsung.com>
> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> > ---
> >  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
> >  drivers/gpu/drm/exynos/exynos_drm_gem.c |  258 +++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/exynos/exynos_drm_gem.h |   13 ++-
> >  include/drm/exynos_drm.h                |   25 +++-
> >  4 files changed, 296 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> > index f58a487..5bb0361 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> > @@ -211,6 +211,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
> >                        DRM_AUTH),
> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
> >                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
> > +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
> > +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
> >        DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS, exynos_plane_set_zpos_ioctl,
> >                        DRM_UNLOCKED | DRM_AUTH),
> >        DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> > index afd0cd4..b68d4ea 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> > @@ -66,6 +66,43 @@ static int check_gem_flags(unsigned int flags)
> >        return 0;
> >  }
> >
> > +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
> > +{
> > +       struct vm_area_struct *vma_copy;
> > +
> > +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
> > +       if (!vma_copy)
> > +               return NULL;
> > +
> > +       if (vma->vm_ops && vma->vm_ops->open)
> > +               vma->vm_ops->open(vma);
> > +
> > +       if (vma->vm_file)
> > +               get_file(vma->vm_file);
> > +
> > +       memcpy(vma_copy, vma, sizeof(*vma));
> > +
> > +       vma_copy->vm_mm = NULL;
> > +       vma_copy->vm_next = NULL;
> > +       vma_copy->vm_prev = NULL;
> > +
> > +       return vma_copy;
> > +}
> > +
> > +static void put_vma(struct vm_area_struct *vma)
> > +{
> > +       if (!vma)
> > +               return;
> > +
> > +       if (vma->vm_ops && vma->vm_ops->close)
> > +               vma->vm_ops->close(vma);
> > +
> > +       if (vma->vm_file)
> > +               fput(vma->vm_file);
> > +
> > +       kfree(vma);
> > +}
> > +
> >  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
> >                                        struct vm_area_struct *vma)
> >  {
> > @@ -254,6 +291,41 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
> >        /* add some codes for UNCACHED type here. TODO */
> >  }
> >
> > +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
> > +{
> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
> > +       struct exynos_drm_gem_buf *buf;
> > +       struct vm_area_struct *vma;
> > +       int npages;
> > +
> > +       exynos_gem_obj = to_exynos_gem_obj(obj);
> > +       buf = exynos_gem_obj->buffer;
> > +       vma = exynos_gem_obj->vma;
> > +
> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> > +               put_vma(exynos_gem_obj->vma);
> > +               goto out;
> > +       }
> > +
> > +       npages = buf->size >> PAGE_SHIFT;
> > +
> > +       npages--;
> > +       while (npages >= 0) {
> > +               if (buf->write)
> > +                       set_page_dirty_lock(buf->pages[npages]);
> > +
> > +               put_page(buf->pages[npages]);
> > +               npages--;
> > +       }
> > +
> > +out:
> > +       kfree(buf->pages);
> > +       buf->pages = NULL;
> > +
> > +       kfree(buf->sgt);
> > +       buf->sgt = NULL;
> > +}
> > +
> >  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
> >                                        struct drm_file *file_priv,
> >                                        unsigned int *handle)
> > @@ -293,6 +365,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
> >
> >        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
> >                exynos_drm_gem_put_pages(obj);
> > +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
> > +               exynos_drm_put_userptr(obj);
> >        else
> >                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
> >
> > @@ -606,6 +680,190 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >        return 0;
> >  }
> >
> > +static int exynos_drm_get_userptr(struct drm_device *dev,
> > +                               struct exynos_drm_gem_obj *obj,
> > +                               unsigned long userptr,
> > +                               unsigned int write)
> > +{
> > +       unsigned int get_npages;
> > +       unsigned long npages = 0;
> > +       struct vm_area_struct *vma;
> > +       struct exynos_drm_gem_buf *buf = obj->buffer;
> > +
> > +       vma = find_vma(current->mm, userptr);
> > +
> > +       /* the memory region mmaped with VM_PFNMAP. */
> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> > +               unsigned long this_pfn, prev_pfn, pa;
> > +               unsigned long start, end, offset;
> > +               struct scatterlist *sgl;
> > +
> > +               start = userptr;
> > +               offset = userptr & ~PAGE_MASK;
> > +               end = start + buf->size;
> > +               sgl = buf->sgt->sgl;
> > +
> > +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
> > +                       int ret = follow_pfn(vma, start, &this_pfn);
> > +                       if (ret)
> > +                               return ret;
> > +
> > +                       if (prev_pfn == 0)
> > +                               pa = this_pfn << PAGE_SHIFT;
> > +                       else if (this_pfn != prev_pfn + 1)
> > +                               return -EFAULT;
> > +
> > +                       sg_dma_address(sgl) = (pa + offset);
> > +                       sg_dma_len(sgl) = PAGE_SIZE;
> > +                       prev_pfn = this_pfn;
> > +                       pa += PAGE_SIZE;
> > +                       npages++;
> > +                       sgl = sg_next(sgl);
> > +               }
> > +
> > +               buf->dma_addr = pa + offset;
> > +
> > +               obj->vma = get_vma(vma);
> > +               if (!obj->vma)
> > +                       return -ENOMEM;
> > +
> > +               buf->pfnmap = true;
> > +
> > +               return npages;
> > +       }
> > +
> > +       buf->write = write;
> > +       npages = buf->size >> PAGE_SHIFT;
> > +
> > +       down_read(&current->mm->mmap_sem);
> > +       get_npages = get_user_pages(current, current->mm, userptr,
> > +                                       npages, write, 1, buf->pages, NULL);
> > +       up_read(&current->mm->mmap_sem);
> > +       if (get_npages != npages)
> > +               DRM_ERROR("failed to get user_pages.\n");
> > +
> > +       buf->pfnmap = false;
> > +
> > +       return get_npages;
> > +}
> > +
> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> > +                                     struct drm_file *file_priv)
> > +{
> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
> > +       struct drm_exynos_gem_userptr *args = data;
> > +       struct exynos_drm_gem_buf *buf;
> > +       struct scatterlist *sgl;
> > +       unsigned long size, userptr;
> > +       unsigned int npages;
> > +       int ret, get_npages;
> > +
> > +       DRM_DEBUG_KMS("%s\n", __FILE__);
> > +
> > +       if (!args->size) {
> > +               DRM_ERROR("invalid size.\n");
> > +               return -EINVAL;
> > +       }
> > +
> > +       ret = check_gem_flags(args->flags);
> > +       if (ret)
> > +               return ret;
> > +
> > +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
> > +       userptr = args->userptr;
> > +
> > +       buf = exynos_drm_init_buf(dev, size);
> > +       if (!buf)
> > +               return -ENOMEM;
> > +
> > +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
> > +       if (!exynos_gem_obj) {
> > +               ret = -ENOMEM;
> > +               goto err_free_buffer;
> > +       }
> > +
> > +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
> > +       if (!buf->sgt) {
> > +               DRM_ERROR("failed to allocate buf->sgt.\n");
> > +               ret = -ENOMEM;
> > +               goto err_release_gem;
> > +       }
> > +
> > +       npages = size >> PAGE_SHIFT;
> > +
> > +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
> > +       if (ret < 0) {
> > +               DRM_ERROR("failed to initailize sg table.\n");
> > +               goto err_free_sgt;
> > +       }
> > +
> > +       buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
> > +       if (!buf->pages) {
> > +               DRM_ERROR("failed to allocate buf->pages\n");
> > +               ret = -ENOMEM;
> > +               goto err_free_table;
> > +       }
> > +
> > +       exynos_gem_obj->buffer = buf;
> > +
> > +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
> > +       if (get_npages != npages) {
> > +               DRM_ERROR("failed to get user_pages.\n");
> > +               ret = get_npages;
> > +               goto err_release_userptr;
> > +       }
> > +
> > +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
> > +                                               &args->handle);
> > +       if (ret < 0) {
> > +               DRM_ERROR("failed to create gem handle.\n");
> > +               goto err_release_userptr;
> > +       }
> > +
> > +       sgl = buf->sgt->sgl;
> > +
> > +       /*
> > +        * if buf->pfnmap is true then update sgl of sgt with pages but
> > +        * if buf->pfnmap is false then it means the sgl was updated already
> > +        * so it doesn't need to update the sgl.
> > +        */
> > +       if (!buf->pfnmap) {
> > +               unsigned int i = 0;
> > +
> > +               /* set all pages to sg list. */
> > +               while (i < npages) {
> > +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
> > +                       sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
> > +                       i++;
> > +                       sgl = sg_next(sgl);
> > +               }
> > +       }
> > +
> > +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
> > +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
> > +
> > +       return 0;
> > +
> > +err_release_userptr:
> > +       get_npages--;
> > +       while (get_npages >= 0)
> > +               put_page(buf->pages[get_npages--]);
> > +       kfree(buf->pages);
> > +       buf->pages = NULL;
> > +err_free_table:
> > +       sg_free_table(buf->sgt);
> > +err_free_sgt:
> > +       kfree(buf->sgt);
> > +       buf->sgt = NULL;
> > +err_release_gem:
> > +       drm_gem_object_release(&exynos_gem_obj->base);
> > +       kfree(exynos_gem_obj);
> > +       exynos_gem_obj = NULL;
> > +err_free_buffer:
> > +       exynos_drm_free_buf(dev, 0, buf);
> > +       return ret;
> > +}
> > +
> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj)
> >  {
> >        DRM_DEBUG_KMS("%s\n", __FILE__);
> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> > index efc8252..1de2241 100644
> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> > @@ -29,7 +29,8 @@
> >  #define to_exynos_gem_obj(x)   container_of(x,\
> >                        struct exynos_drm_gem_obj, base)
> >
> > -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
> > +#define IS_NONCONTIG_BUFFER(f)         ((f & EXYNOS_BO_NONCONTIG) ||\
> > +                                       (f & EXYNOS_BO_USERPTR))
> >
> >  /*
> >  * exynos drm gem buffer structure.
> > @@ -38,18 +39,23 @@
> >  * @dma_addr: bus address(accessed by dma) to allocated memory region.
> >  *     - this address could be physical address without IOMMU and
> >  *     device address with IOMMU.
> > + * @write: whether pages will be written to by the caller.
> >  * @sgt: sg table to transfer page data.
> >  * @pages: contain all pages to allocated memory region.
> >  * @page_size: could be 4K, 64K or 1MB.
> >  * @size: size of allocated memory region.
> > + * @pfnmap: indicate whether memory region from userptr is mmaped with
> > + *     VM_PFNMAP or not.
> >  */
> >  struct exynos_drm_gem_buf {
> >        void __iomem            *kvaddr;
> >        dma_addr_t              dma_addr;
> > +       unsigned int            write;
> >        struct sg_table         *sgt;
> >        struct page             **pages;
> >        unsigned long           page_size;
> >        unsigned long           size;
> > +       bool                    pfnmap;
> >  };
> >
> >  /*
> > @@ -73,6 +79,7 @@ struct exynos_drm_gem_obj {
> >        struct drm_gem_object           base;
> >        struct exynos_drm_gem_buf       *buffer;
> >        unsigned long                   size;
> > +       struct vm_area_struct           *vma;
> >        unsigned int                    flags;
> >  };
> >
> > @@ -127,6 +134,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
> >  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >                              struct drm_file *file_priv);
> >
> > +/* map user space allocated by malloc to pages. */
> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> > +                                     struct drm_file *file_priv);
> > +
> >  /* initialize gem object. */
> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj);
> >
> > diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
> > index 2d6eb06..48eda6e 100644
> > --- a/include/drm/exynos_drm.h
> > +++ b/include/drm/exynos_drm.h
> > @@ -75,6 +75,23 @@ struct drm_exynos_gem_mmap {
> >  };
> >
> >  /**
> > + * User-requested user space importing structure
> > + *
> > + * @userptr: user space address allocated by malloc.
> > + * @size: size to the buffer allocated by malloc.
> > + * @flags: indicate user-desired cache attribute to map the allocated buffer
> > + *     to kernel space.
> > + * @handle: a returned handle to created gem object.
> > + *     - this handle will be set by gem module of kernel side.
> > + */
> > +struct drm_exynos_gem_userptr {
> > +       uint64_t userptr;
> > +       uint64_t size;
> > +       unsigned int flags;
> > +       unsigned int handle;
> > +};
> > +
> > +/**
> >  * A structure for user connection request of virtual display.
> >  *
> >  * @connection: indicate whether doing connetion or not by user.
> > @@ -105,13 +122,16 @@ enum e_drm_exynos_gem_mem_type {
> >        EXYNOS_BO_CACHABLE      = 1 << 1,
> >        /* write-combine mapping. */
> >        EXYNOS_BO_WC            = 1 << 2,
> > +       /* user space memory allocated by malloc. */
> > +       EXYNOS_BO_USERPTR       = 1 << 3,
> >        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
> > -                                       EXYNOS_BO_WC
> > +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
> >  };
> >
> >  #define DRM_EXYNOS_GEM_CREATE          0x00
> >  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
> >  #define DRM_EXYNOS_GEM_MMAP            0x02
> > +#define DRM_EXYNOS_GEM_USERPTR         0x03
> >  /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
> >  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
> >  #define DRM_EXYNOS_VIDI_CONNECTION     0x07
> > @@ -125,6 +145,9 @@ enum e_drm_exynos_gem_mem_type {
> >  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
> >                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
> >
> > +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
> > +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
> > +
> >  #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS        DRM_IOWR(DRM_COMMAND_BASE + \
> >                DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
> >
> > --
> > 1.7.4.1
> >

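For readers following the quoted patch: the VM_PFNMAP branch of exynos_drm_get_userptr() accepts a user region only when its page frame numbers are consecutive, and derives one base physical address from the first of them. Below is a minimal userspace sketch of just that check. A plain array stands in for the values follow_pfn() would return; demo_contig_base() and DEMO_PAGE_SHIFT are made-up names for illustration, and 4 KiB pages are assumed.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/*
 * Check that pfns[0..npages-1] are consecutive page frame numbers and,
 * if so, store the physical address of the first page in *base.
 *
 * Returns 0 on success, -EINVAL on bad arguments and -EFAULT when the
 * range is not physically contiguous.
 */
static int demo_contig_base(const uint64_t *pfns, size_t npages,
			    uint64_t *base)
{
	size_t i;

	if (!pfns || !base || npages == 0)
		return -EINVAL;

	/* every frame must directly follow the previous one */
	for (i = 1; i < npages; i++) {
		if (pfns[i] != pfns[i - 1] + 1)
			return -EFAULT;
	}

	*base = pfns[0] << DEMO_PAGE_SHIFT;
	return 0;
}
```

The sketch identifies the first iteration by index rather than by testing prev_pfn == 0, which sidesteps the corner case of a region that starts at page frame 0.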

* Re: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-15  8:17     ` Inki Dae
@ 2012-05-15  9:35       ` Rob Clark
  2012-05-15 13:40         ` InKi Dae
  0 siblings, 1 reply; 77+ messages in thread
From: Rob Clark @ 2012-05-15  9:35 UTC (permalink / raw)
  To: Inki Dae; +Cc: kyungmin.park, sw0312.kim, dri-devel

On Tue, May 15, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> Hi Rob,
>
>> -----Original Message-----
>> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of Rob
>> Clark
>> Sent: Tuesday, May 15, 2012 4:35 PM
>> To: Inki Dae
>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>> kyungmin.park@samsung.com; sw0312.kim@samsung.com
>> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
>>
>> On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> > this feature allows a gem buffer to be backed by a memory region allocated
>> > by malloc() in user mode, or by an mmaped memory region allocated by other
>> > memory allocators. the userptr interface identifies the memory type through
>> > the vm_flags value and gets the pages or page frame numbers of the user
>> > space region accordingly.
>>
>> I apologize for being a little late to jump in on this thread, but...
>>
>> I must confess to not being a huge fan of userptr.  It really is
>> opening a can of worms, and seems best avoided if at all possible.
>> I'm not entirely sure of the use-case for which you require this, but I
>> wonder if there isn't an alternative way?   I mean, the main case I
>> could think of for mapping userspace memory would be something like
>> texture upload.  But could that be handled in an alternative way,
>> something like a pwrite or texture_upload API, which could temporarily
>> pin the userspace memory, kick off a dma from that user buffer to a
>> proper GEM buffer, and then unpin the user buffer when the DMA
>> completes?
>>
>
> This feature has been discussed with the drm and linux-mm guys, and I have
> posted this patch four times.
> So please see the e-mail thread below.
> http://www.spinics.net/lists/dri-devel/msg22729.html
>
> and then please give me your opinions.


Yeah, I read that and understand that the purpose is to support
malloc()'d memory (and btw, all the other changes like limits to the
amount of userptr buffers are good if we do decide to support userptr
buffers).  But my question was more about why do you need to support
malloc'd buffers... other than "just because v4l2 does" ;-)

I don't really like the userptr feature in v4l2 either, and view it as
a necessary evil because at the time there was nothing like dmabuf.
But now that we have dmabuf to share buffers with zero copy between
two devices, do we *really* still need userptr?

I guess if I understood better the use-case, maybe I could try to
think of some alternatives.
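
As a rough illustration of the pwrite-style alternative mentioned earlier in this thread, the sketch below copies from a caller-owned buffer into a driver-owned object, so the caller's memory is only touched for the duration of the call. It is a userspace model only: struct demo_bo and the demo_bo_*() helpers are hypothetical names, not part of any DRM driver, and memcpy() stands in for the pin / DMA / unpin sequence a real driver would perform.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a driver-owned (GEM-like) buffer object. */
struct demo_bo {
	uint8_t *backing;	/* memory owned by the "driver" */
	size_t size;		/* size of the backing store in bytes */
};

static struct demo_bo *demo_bo_create(size_t size)
{
	struct demo_bo *bo;

	if (size == 0)
		return NULL;

	bo = malloc(sizeof(*bo));
	if (!bo)
		return NULL;

	bo->backing = calloc(1, size);
	if (!bo->backing) {
		free(bo);
		return NULL;
	}

	bo->size = size;
	return bo;
}

static void demo_bo_destroy(struct demo_bo *bo)
{
	if (!bo)
		return;

	free(bo->backing);
	free(bo);
}

/*
 * Copy len bytes from a caller-owned buffer into the object at offset.
 * Nothing of the caller's memory stays referenced after the call returns.
 */
static int demo_bo_pwrite(struct demo_bo *bo, size_t offset,
			  const void *src, size_t len)
{
	if (!bo || !src)
		return -EINVAL;

	/* written this way so that offset + len cannot overflow */
	if (offset > bo->size || len > bo->size - offset)
		return -EINVAL;

	memcpy(bo->backing + offset, src, len);
	return 0;
}
```

The cost of this approach is one copy per upload; what it buys is that the object never depends on the lifetime or layout of the caller's mapping.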

BR,
-R

> Thanks,
> Inki Dae
>
>> BR,
>> -R
>>
>> > Signed-off-by: Inki Dae <inki.dae@samsung.com>
>> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>> > ---
>> >  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
>> >  drivers/gpu/drm/exynos/exynos_drm_gem.c |  258 +++++++++++++++++++++++++++++++
>> >  drivers/gpu/drm/exynos/exynos_drm_gem.h |   13 ++-
>> >  include/drm/exynos_drm.h                |   25 +++-
>> >  4 files changed, 296 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
>> > index f58a487..5bb0361 100644
>> > --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
>> > @@ -211,6 +211,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
>> >                        DRM_AUTH),
>> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
>> >                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
>> > +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
>> > +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
>> >        DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS, exynos_plane_set_zpos_ioctl,
>> >                        DRM_UNLOCKED | DRM_AUTH),
>> >        DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
>> > index afd0cd4..b68d4ea 100644
>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
>> > @@ -66,6 +66,43 @@ static int check_gem_flags(unsigned int flags)
>> >        return 0;
>> >  }
>> >
>> > +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
>> > +{
>> > +       struct vm_area_struct *vma_copy;
>> > +
>> > +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
>> > +       if (!vma_copy)
>> > +               return NULL;
>> > +
>> > +       if (vma->vm_ops && vma->vm_ops->open)
>> > +               vma->vm_ops->open(vma);
>> > +
>> > +       if (vma->vm_file)
>> > +               get_file(vma->vm_file);
>> > +
>> > +       memcpy(vma_copy, vma, sizeof(*vma));
>> > +
>> > +       vma_copy->vm_mm = NULL;
>> > +       vma_copy->vm_next = NULL;
>> > +       vma_copy->vm_prev = NULL;
>> > +
>> > +       return vma_copy;
>> > +}
>> > +
>> > +static void put_vma(struct vm_area_struct *vma)
>> > +{
>> > +       if (!vma)
>> > +               return;
>> > +
>> > +       if (vma->vm_ops && vma->vm_ops->close)
>> > +               vma->vm_ops->close(vma);
>> > +
>> > +       if (vma->vm_file)
>> > +               fput(vma->vm_file);
>> > +
>> > +       kfree(vma);
>> > +}
>> > +
>> >  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
>> >                                        struct vm_area_struct *vma)
>> >  {
>> > @@ -254,6 +291,41 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
>> >        /* add some codes for UNCACHED type here. TODO */
>> >  }
>> >
>> > +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
>> > +{
>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
>> > +       struct exynos_drm_gem_buf *buf;
>> > +       struct vm_area_struct *vma;
>> > +       int npages;
>> > +
>> > +       exynos_gem_obj = to_exynos_gem_obj(obj);
>> > +       buf = exynos_gem_obj->buffer;
>> > +       vma = exynos_gem_obj->vma;
>> > +
>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
>> > +               put_vma(exynos_gem_obj->vma);
>> > +               goto out;
>> > +       }
>> > +
>> > +       npages = buf->size >> PAGE_SHIFT;
>> > +
>> > +       npages--;
>> > +       while (npages >= 0) {
>> > +               if (buf->write)
>> > +                       set_page_dirty_lock(buf->pages[npages]);
>> > +
>> > +               put_page(buf->pages[npages]);
>> > +               npages--;
>> > +       }
>> > +
>> > +out:
>> > +       kfree(buf->pages);
>> > +       buf->pages = NULL;
>> > +
>> > +       kfree(buf->sgt);
>> > +       buf->sgt = NULL;
>> > +}
>> > +
>> >  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
>> >                                        struct drm_file *file_priv,
>> >                                        unsigned int *handle)
>> > @@ -293,6 +365,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
>> >
>> >        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
>> >                exynos_drm_gem_put_pages(obj);
>> > +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
>> > +               exynos_drm_put_userptr(obj);
>> >        else
>> >                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
>> >
>> > @@ -606,6 +680,190 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>> >        return 0;
>> >  }
>> >
>> > +static int exynos_drm_get_userptr(struct drm_device *dev,
>> > +                               struct exynos_drm_gem_obj *obj,
>> > +                               unsigned long userptr,
>> > +                               unsigned int write)
>> > +{
>> > +       unsigned int get_npages;
>> > +       unsigned long npages = 0;
>> > +       struct vm_area_struct *vma;
>> > +       struct exynos_drm_gem_buf *buf = obj->buffer;
>> > +
>> > +       vma = find_vma(current->mm, userptr);
>> > +
>> > +       /* the memory region mmaped with VM_PFNMAP. */
>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
>> > +               unsigned long this_pfn, prev_pfn, pa;
>> > +               unsigned long start, end, offset;
>> > +               struct scatterlist *sgl;
>> > +
>> > +               start = userptr;
>> > +               offset = userptr & ~PAGE_MASK;
>> > +               end = start + buf->size;
>> > +               sgl = buf->sgt->sgl;
>> > +
>> > +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
>> > +                       int ret = follow_pfn(vma, start, &this_pfn);
>> > +                       if (ret)
>> > +                               return ret;
>> > +
>> > +                       if (prev_pfn == 0)
>> > +                               pa = this_pfn << PAGE_SHIFT;
>> > +                       else if (this_pfn != prev_pfn + 1)
>> > +                               return -EFAULT;
>> > +
>> > +                       sg_dma_address(sgl) = (pa + offset);
>> > +                       sg_dma_len(sgl) = PAGE_SIZE;
>> > +                       prev_pfn = this_pfn;
>> > +                       pa += PAGE_SIZE;
>> > +                       npages++;
>> > +                       sgl = sg_next(sgl);
>> > +               }
>> > +
>> > +               buf->dma_addr = pa + offset;
>> > +
>> > +               obj->vma = get_vma(vma);
>> > +               if (!obj->vma)
>> > +                       return -ENOMEM;
>> > +
>> > +               buf->pfnmap = true;
>> > +
>> > +               return npages;
>> > +       }
>> > +
>> > +       buf->write = write;
>> > +       npages = buf->size >> PAGE_SHIFT;
>> > +
>> > +       down_read(&current->mm->mmap_sem);
>> > +       get_npages = get_user_pages(current, current->mm, userptr,
>> > +                                       npages, write, 1, buf->pages, NULL);
>> > +       up_read(&current->mm->mmap_sem);
>> > +       if (get_npages != npages)
>> > +               DRM_ERROR("failed to get user_pages.\n");
>> > +
>> > +       buf->pfnmap = false;
>> > +
>> > +       return get_npages;
>> > +}
>> > +
>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
>> > +                                     struct drm_file *file_priv)
>> > +{
>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
>> > +       struct drm_exynos_gem_userptr *args = data;
>> > +       struct exynos_drm_gem_buf *buf;
>> > +       struct scatterlist *sgl;
>> > +       unsigned long size, userptr;
>> > +       unsigned int npages;
>> > +       int ret, get_npages;
>> > +
>> > +       DRM_DEBUG_KMS("%s\n", __FILE__);
>> > +
>> > +       if (!args->size) {
>> > +               DRM_ERROR("invalid size.\n");
>> > +               return -EINVAL;
>> > +       }
>> > +
>> > +       ret = check_gem_flags(args->flags);
>> > +       if (ret)
>> > +               return ret;
>> > +
>> > +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
>> > +       userptr = args->userptr;
>> > +
>> > +       buf = exynos_drm_init_buf(dev, size);
>> > +       if (!buf)
>> > +               return -ENOMEM;
>> > +
>> > +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
>> > +       if (!exynos_gem_obj) {
>> > +               ret = -ENOMEM;
>> > +               goto err_free_buffer;
>> > +       }
>> > +
>> > +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
>> > +       if (!buf->sgt) {
>> > +               DRM_ERROR("failed to allocate buf->sgt.\n");
>> > +               ret = -ENOMEM;
>> > +               goto err_release_gem;
>> > +       }
>> > +
>> > +       npages = size >> PAGE_SHIFT;
>> > +
>> > +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
>> > +       if (ret < 0) {
>> > +               DRM_ERROR("failed to initailize sg table.\n");
>> > +               goto err_free_sgt;
>> > +       }
>> > +
>> > +       buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
>> > +       if (!buf->pages) {
>> > +               DRM_ERROR("failed to allocate buf->pages\n");
>> > +               ret = -ENOMEM;
>> > +               goto err_free_table;
>> > +       }
>> > +
>> > +       exynos_gem_obj->buffer = buf;
>> > +
>> > +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
>> > +       if (get_npages != npages) {
>> > +               DRM_ERROR("failed to get user_pages.\n");
>> > +               ret = get_npages;
>> > +               goto err_release_userptr;
>> > +       }
>> > +
>> > +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
>> > +                                               &args->handle);
>> > +       if (ret < 0) {
>> > +               DRM_ERROR("failed to create gem handle.\n");
>> > +               goto err_release_userptr;
>> > +       }
>> > +
>> > +       sgl = buf->sgt->sgl;
>> > +
>> > +       /*
>> > +        * if buf->pfnmap is true then update sgl of sgt with pages but
>> > +        * if buf->pfnmap is false then it means the sgl was updated already
>> > +        * so it doesn't need to update the sgl.
>> > +        */
>> > +       if (!buf->pfnmap) {
>> > +               unsigned int i = 0;
>> > +
>> > +               /* set all pages to sg list. */
>> > +               while (i < npages) {
>> > +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
>> > +                       sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
>> > +                       i++;
>> > +                       sgl = sg_next(sgl);
>> > +               }
>> > +       }
>> > +
>> > +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
>> > +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
>> > +
>> > +       return 0;
>> > +
>> > +err_release_userptr:
>> > +       get_npages--;
>> > +       while (get_npages >= 0)
>> > +               put_page(buf->pages[get_npages--]);
>> > +       kfree(buf->pages);
>> > +       buf->pages = NULL;
>> > +err_free_table:
>> > +       sg_free_table(buf->sgt);
>> > +err_free_sgt:
>> > +       kfree(buf->sgt);
>> > +       buf->sgt = NULL;
>> > +err_release_gem:
>> > +       drm_gem_object_release(&exynos_gem_obj->base);
>> > +       kfree(exynos_gem_obj);
>> > +       exynos_gem_obj = NULL;
>> > +err_free_buffer:
>> > +       exynos_drm_free_buf(dev, 0, buf);
>> > +       return ret;
>> > +}
>> > +
>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj)
>> >  {
>> >        DRM_DEBUG_KMS("%s\n", __FILE__);
>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
>> > index efc8252..1de2241 100644
>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
>> > @@ -29,7 +29,8 @@
>> >  #define to_exynos_gem_obj(x)   container_of(x,\
>> >                        struct exynos_drm_gem_obj, base)
>> >
>> > -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
>> > +#define IS_NONCONTIG_BUFFER(f)         ((f & EXYNOS_BO_NONCONTIG) ||\
>> > +                                       (f & EXYNOS_BO_USERPTR))
>> >
>> >  /*
>> >  * exynos drm gem buffer structure.
>> > @@ -38,18 +39,23 @@
>> >  * @dma_addr: bus address(accessed by dma) to allocated memory region.
>> >  *     - this address could be physical address without IOMMU and
>> >  *     device address with IOMMU.
>> > + * @write: whether pages will be written to by the caller.
>> >  * @sgt: sg table to transfer page data.
>> >  * @pages: contain all pages to allocated memory region.
>> >  * @page_size: could be 4K, 64K or 1MB.
>> >  * @size: size of allocated memory region.
>> > + * @pfnmap: indicate whether memory region from userptr is mmaped with
>> > + *     VM_PFNMAP or not.
>> >  */
>> >  struct exynos_drm_gem_buf {
>> >        void __iomem            *kvaddr;
>> >        dma_addr_t              dma_addr;
>> > +       unsigned int            write;
>> >        struct sg_table         *sgt;
>> >        struct page             **pages;
>> >        unsigned long           page_size;
>> >        unsigned long           size;
>> > +       bool                    pfnmap;
>> >  };
>> >
>> >  /*
>> > @@ -73,6 +79,7 @@ struct exynos_drm_gem_obj {
>> >        struct drm_gem_object           base;
>> >        struct exynos_drm_gem_buf       *buffer;
>> >        unsigned long                   size;
>> > +       struct vm_area_struct           *vma;
>> >        unsigned int                    flags;
>> >  };
>> >
>> > @@ -127,6 +134,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
>> >  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>> >                              struct drm_file *file_priv);
>> >
>> > +/* map user space allocated by malloc to pages. */
>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
>> > +                                     struct drm_file *file_priv);
>> > +
>> >  /* initialize gem object. */
>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj);
>> >
>> > diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
>> > index 2d6eb06..48eda6e 100644
>> > --- a/include/drm/exynos_drm.h
>> > +++ b/include/drm/exynos_drm.h
>> > @@ -75,6 +75,23 @@ struct drm_exynos_gem_mmap {
>> >  };
>> >
>> >  /**
>> > + * User-requested user space importing structure
>> > + *
>> > + * @userptr: user space address allocated by malloc.
>> > + * @size: size to the buffer allocated by malloc.
>> > + * @flags: indicate user-desired cache attribute to map the allocated buffer
>> > + *     to kernel space.
>> > + * @handle: a returned handle to created gem object.
>> > + *     - this handle will be set by gem module of kernel side.
>> > + */
>> > +struct drm_exynos_gem_userptr {
>> > +       uint64_t userptr;
>> > +       uint64_t size;
>> > +       unsigned int flags;
>> > +       unsigned int handle;
>> > +};
>> > +
>> > +/**
>> >  * A structure for user connection request of virtual display.
>> >  *
>> >  * @connection: indicate whether doing connetion or not by user.
>> > @@ -105,13 +122,16 @@ enum e_drm_exynos_gem_mem_type {
>> >        EXYNOS_BO_CACHABLE      = 1 << 1,
>> >        /* write-combine mapping. */
>> >        EXYNOS_BO_WC            = 1 << 2,
>> > +       /* user space memory allocated by malloc. */
>> > +       EXYNOS_BO_USERPTR       = 1 << 3,
>> >        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG |
> EXYNOS_BO_CACHABLE
>> |
>> > -                                       EXYNOS_BO_WC
>> > +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
>> >  };
>> >
>> >  #define DRM_EXYNOS_GEM_CREATE          0x00
>> >  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
>> >  #define DRM_EXYNOS_GEM_MMAP            0x02
>> > +#define DRM_EXYNOS_GEM_USERPTR         0x03
>> >  /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
>> >  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
>> >  #define DRM_EXYNOS_VIDI_CONNECTION     0x07
>> > @@ -125,6 +145,9 @@ enum e_drm_exynos_gem_mem_type {
>> >  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
>> >                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
>> >
>> > +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
>> > +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
>> > +
>> >  #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS
>  DRM_IOWR(DRM_COMMAND_BASE
>> + \
>> >                DRM_EXYNOS_PLANE_SET_ZPOS, struct
> drm_exynos_plane_set_zpos)
>> >
>> > --
>> > 1.7.4.1
>> >
>> > _______________________________________________
>> > dri-devel mailing list
>> > dri-devel@lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/dri-devel


* Re: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-15  9:35       ` Rob Clark
@ 2012-05-15 13:40         ` InKi Dae
  2012-05-15 14:28           ` Rob Clark
  0 siblings, 1 reply; 77+ messages in thread
From: InKi Dae @ 2012-05-15 13:40 UTC (permalink / raw)
  To: Rob Clark; +Cc: Inki Dae, kyungmin.park, sw0312.kim, dri-devel

2012/5/15 Rob Clark <rob.clark@linaro.org>:
> On Tue, May 15, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> Hi Rob,
>>
>>> -----Original Message-----
>>> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of Rob
>>> Clark
>>> Sent: Tuesday, May 15, 2012 4:35 PM
>>> To: Inki Dae
>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com
>>> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
>>>
>>> On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com> wrote:
>>> > this feature could be used to use memory region allocated by malloc() in
>>> user
>>> > mode and mmaped memory region allocated by other memory allocators.
>>> userptr
>>> > interface can identify memory type through vm_flags value and would get
>>> > pages or page frame numbers to user space appropriately.
>>>
>>> I apologize for being a little late to jump in on this thread, but...
>>>
>>> I must confess to not being a huge fan of userptr.  It really is
>>> opening a can of worms, and seems best avoided if at all possible.
>>> I'm not entirely sure the use-case for which you require this, but I
>>> wonder if there isn't an alternative way?   I mean, the main case I
>>> could think of for mapping userspace memory would be something like
>>> texture upload.  But could that be handled in an alternative way,
>>> something like a pwrite or texture_upload API, which could temporarily
>>> pin the userspace memory, kick off a dma from that user buffer to a
>>> proper GEM buffer, and then unpin the user buffer when the DMA
>>> completes?
>>>
>>
>> This feature has been discussed with the drm and linux-mm guys and I have
>> posted this patch four times.
>> So please see the e-mail thread below.
>> http://www.spinics.net/lists/dri-devel/msg22729.html
>>
>> and then please, give me your opinions.
>
>
> Yeah, I read that and understand that the purpose is to support
> malloc()'d memory (and btw, all the other changes like limits to the
> amount of userptr buffers are good if we do decide to support userptr
> buffers).  But my question was more about why do you need to support
> malloc'd buffers... other than "just because v4l2 does" ;-)
>
> I don't really like the userptr feature in v4l2 either, and view it as
> a necessary evil because at the time there was nothing like dmabuf.
> But now that we have dmabuf to share buffers with zero copy between
> two devices, do we *really* still need userptr?
>

Yes, we definitely do. As I mentioned in this email thread, we are using
the userptr feature for the pixman and evas backends so that they can use
gpu acceleration directly (zero-copy). As you may know, Evas is part of the
Enlightenment Foundation Libraries (EFL) and Elementary for pixman. All the
applications based on them use a user address to draw something, so the
backends (such as gpu accelerators) can refer only to the user address of
the rendered buffer. Do you think we can avoid a memory copy when using
those GPUs without such a userptr feature? I think the only way is for the
gpu to render through the user address, without a memory copy. And as
mentioned in this thread, this feature had been tried by i915 in the
desktop world. For this, you can refer to Daniel's comments.
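
For readers following the thread, here is a minimal userspace sketch of how a
client might describe a malloc()'d buffer for the proposed
DRM_IOCTL_EXYNOS_GEM_USERPTR call. The struct layout mirrors the one in the
patch under review; the helper names (sketch_*) and the 4K page size are
assumptions made for illustration and are not part of the patch. The sketch
only fills the request and does not issue the ioctl. It also shows that a
buffer with an unaligned start address can touch one more page than its size
alone suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct drm_exynos_gem_userptr from the patch under review. */
struct drm_exynos_gem_userptr {
	uint64_t userptr;
	uint64_t size;
	unsigned int flags;
	unsigned int handle;
};

/* Assumption for this sketch: 4K pages. */
#define SKETCH_PAGE_SIZE 4096ULL

/* Number of pages touched by the byte range [addr, addr + size). */
uint64_t sketch_pages_spanned(uint64_t addr, uint64_t size)
{
	uint64_t first, last;

	if (size == 0)
		return 0;

	first = addr / SKETCH_PAGE_SIZE;
	last = (addr + size - 1) / SKETCH_PAGE_SIZE;

	return last - first + 1;
}

/*
 * Fill a userptr request for a buffer owned by the caller.
 * Returns 0 on success, -1 if the buffer description is invalid.
 * The handle field is an output of the ioctl, so it is cleared here.
 */
int sketch_fill_userptr_req(struct drm_exynos_gem_userptr *req,
			    const void *buf, size_t len, unsigned int flags)
{
	assert(req != NULL); /* passing no request is a programming error */

	if (!buf || len == 0)
		return -1;

	req->userptr = (uint64_t)(uintptr_t)buf;
	req->size = len;
	req->flags = flags;
	req->handle = 0;

	return 0;
}
```

With these assumptions, a 4096-byte buffer starting at 0x1000 spans one page,
while the same size starting at 0x1800 spans two.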

And as you know, we are already using dmabuf to share a buffer between
device drivers, or between the v4l2 and drm frameworks. For this, we also
tested the drm prime feature with the v4l2 world, and we applied the umm
concept to ump for the 3d gpu (known as mali) as well.

Anyway, this userptr feature is different from what you have in mind, and
maybe you also need such a feature for performance enhancement. Please let
me know if you have a good alternative to this userptr feature. There may
be a good idea that I don't know of.

Thanks,
Inki Dae

> I guess if I understood better the use-case, maybe I could try to
> think of some alternatives.
>
> BR,
> -R
>
>> Thanks,
>> Inki Dae
>>
>>> BR,
>>> -R
>>>
>>> > Signed-off-by: Inki Dae <inki.dae@samsung.com>
>>> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>> > ---
>>> >  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
>>> >  drivers/gpu/drm/exynos/exynos_drm_gem.c |  258
>>> +++++++++++++++++++++++++++++++
>>> >  drivers/gpu/drm/exynos/exynos_drm_gem.h |   13 ++-
>>> >  include/drm/exynos_drm.h                |   25 +++-
>>> >  4 files changed, 296 insertions(+), 2 deletions(-)
>>> >
>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c
>>> b/drivers/gpu/drm/exynos/exynos_drm_drv.c
>>> > index f58a487..5bb0361 100644
>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
>>> > @@ -211,6 +211,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
>>> >                        DRM_AUTH),
>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
>>> >                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED |
>> DRM_AUTH),
>>> > +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
>>> > +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS,
>>> exynos_plane_set_zpos_ioctl,
>>> >                        DRM_UNLOCKED | DRM_AUTH),
>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c
>>> b/drivers/gpu/drm/exynos/exynos_drm_gem.c
>>> > index afd0cd4..b68d4ea 100644
>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
>>> > @@ -66,6 +66,43 @@ static int check_gem_flags(unsigned int flags)
>>> >        return 0;
>>> >  }
>>> >
>>> > +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
>>> > +{
>>> > +       struct vm_area_struct *vma_copy;
>>> > +
>>> > +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
>>> > +       if (!vma_copy)
>>> > +               return NULL;
>>> > +
>>> > +       if (vma->vm_ops && vma->vm_ops->open)
>>> > +               vma->vm_ops->open(vma);
>>> > +
>>> > +       if (vma->vm_file)
>>> > +               get_file(vma->vm_file);
>>> > +
>>> > +       memcpy(vma_copy, vma, sizeof(*vma));
>>> > +
>>> > +       vma_copy->vm_mm = NULL;
>>> > +       vma_copy->vm_next = NULL;
>>> > +       vma_copy->vm_prev = NULL;
>>> > +
>>> > +       return vma_copy;
>>> > +}
>>> > +
>>> > +static void put_vma(struct vm_area_struct *vma)
>>> > +{
>>> > +       if (!vma)
>>> > +               return;
>>> > +
>>> > +       if (vma->vm_ops && vma->vm_ops->close)
>>> > +               vma->vm_ops->close(vma);
>>> > +
>>> > +       if (vma->vm_file)
>>> > +               fput(vma->vm_file);
>>> > +
>>> > +       kfree(vma);
>>> > +}
>>> > +
>>> >  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
>>> >                                        struct vm_area_struct *vma)
>>> >  {
>>> > @@ -254,6 +291,41 @@ static void exynos_drm_gem_put_pages(struct
>>> drm_gem_object *obj)
>>> >        /* add some codes for UNCACHED type here. TODO */
>>> >  }
>>> >
>>> > +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
>>> > +{
>>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
>>> > +       struct exynos_drm_gem_buf *buf;
>>> > +       struct vm_area_struct *vma;
>>> > +       int npages;
>>> > +
>>> > +       exynos_gem_obj = to_exynos_gem_obj(obj);
>>> > +       buf = exynos_gem_obj->buffer;
>>> > +       vma = exynos_gem_obj->vma;
>>> > +
>>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
>>> > +               put_vma(exynos_gem_obj->vma);
>>> > +               goto out;
>>> > +       }
>>> > +
>>> > +       npages = buf->size >> PAGE_SHIFT;
>>> > +
>>> > +       npages--;
>>> > +       while (npages >= 0) {
>>> > +               if (buf->write)
>>> > +                       set_page_dirty_lock(buf->pages[npages]);
>>> > +
>>> > +               put_page(buf->pages[npages]);
>>> > +               npages--;
>>> > +       }
>>> > +
>>> > +out:
>>> > +       kfree(buf->pages);
>>> > +       buf->pages = NULL;
>>> > +
>>> > +       kfree(buf->sgt);
>>> > +       buf->sgt = NULL;
>>> > +}
>>> > +
>>> >  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
>>> >                                        struct drm_file *file_priv,
>>> >                                        unsigned int *handle)
>>> > @@ -293,6 +365,8 @@ void exynos_drm_gem_destroy(struct
>>> exynos_drm_gem_obj *exynos_gem_obj)
>>> >
>>> >        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
>>> >                exynos_drm_gem_put_pages(obj);
>>> > +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
>>> > +               exynos_drm_put_userptr(obj);
>>> >        else
>>> >                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags,
>> buf);
>>> >
>>> > @@ -606,6 +680,190 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device
>>> *dev, void *data,
>>> >        return 0;
>>> >  }
>>> >
>>> > +static int exynos_drm_get_userptr(struct drm_device *dev,
>>> > +                               struct exynos_drm_gem_obj *obj,
>>> > +                               unsigned long userptr,
>>> > +                               unsigned int write)
>>> > +{
>>> > +       unsigned int get_npages;
>>> > +       unsigned long npages = 0;
>>> > +       struct vm_area_struct *vma;
>>> > +       struct exynos_drm_gem_buf *buf = obj->buffer;
>>> > +
>>> > +       vma = find_vma(current->mm, userptr);
>>> > +
>>> > +       /* the memory region mmaped with VM_PFNMAP. */
>>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
>>> > +               unsigned long this_pfn, prev_pfn, pa;
>>> > +               unsigned long start, end, offset;
>>> > +               struct scatterlist *sgl;
>>> > +
>>> > +               start = userptr;
>>> > +               offset = userptr & ~PAGE_MASK;
>>> > +               end = start + buf->size;
>>> > +               sgl = buf->sgt->sgl;
>>> > +
>>> > +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
>>> > +                       int ret = follow_pfn(vma, start, &this_pfn);
>>> > +                       if (ret)
>>> > +                               return ret;
>>> > +
>>> > +                       if (prev_pfn == 0)
>>> > +                               pa = this_pfn << PAGE_SHIFT;
>>> > +                       else if (this_pfn != prev_pfn + 1)
>>> > +                               return -EFAULT;
>>> > +
>>> > +                       sg_dma_address(sgl) = (pa + offset);
>>> > +                       sg_dma_len(sgl) = PAGE_SIZE;
>>> > +                       prev_pfn = this_pfn;
>>> > +                       pa += PAGE_SIZE;
>>> > +                       npages++;
>>> > +                       sgl = sg_next(sgl);
>>> > +               }
>>> > +
>>> > +               buf->dma_addr = pa + offset;
>>> > +
>>> > +               obj->vma = get_vma(vma);
>>> > +               if (!obj->vma)
>>> > +                       return -ENOMEM;
>>> > +
>>> > +               buf->pfnmap = true;
>>> > +
>>> > +               return npages;
>>> > +       }
>>> > +
>>> > +       buf->write = write;
>>> > +       npages = buf->size >> PAGE_SHIFT;
>>> > +
>>> > +       down_read(&current->mm->mmap_sem);
>>> > +       get_npages = get_user_pages(current, current->mm, userptr,
>>> > +                                       npages, write, 1, buf->pages,
>> NULL);
>>> > +       up_read(&current->mm->mmap_sem);
>>> > +       if (get_npages != npages)
>>> > +               DRM_ERROR("failed to get user_pages.\n");
>>> > +
>>> > +       buf->pfnmap = false;
>>> > +
>>> > +       return get_npages;
>>> > +}
>>> > +
>>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
>>> > +                                     struct drm_file *file_priv)
>>> > +{
>>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
>>> > +       struct drm_exynos_gem_userptr *args = data;
>>> > +       struct exynos_drm_gem_buf *buf;
>>> > +       struct scatterlist *sgl;
>>> > +       unsigned long size, userptr;
>>> > +       unsigned int npages;
>>> > +       int ret, get_npages;
>>> > +
>>> > +       DRM_DEBUG_KMS("%s\n", __FILE__);
>>> > +
>>> > +       if (!args->size) {
>>> > +               DRM_ERROR("invalid size.\n");
>>> > +               return -EINVAL;
>>> > +       }
>>> > +
>>> > +       ret = check_gem_flags(args->flags);
>>> > +       if (ret)
>>> > +               return ret;
>>> > +
>>> > +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
>>> > +       userptr = args->userptr;
>>> > +
>>> > +       buf = exynos_drm_init_buf(dev, size);
>>> > +       if (!buf)
>>> > +               return -ENOMEM;
>>> > +
>>> > +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
>>> > +       if (!exynos_gem_obj) {
>>> > +               ret = -ENOMEM;
>>> > +               goto err_free_buffer;
>>> > +       }
>>> > +
>>> > +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
>>> > +       if (!buf->sgt) {
>>> > +               DRM_ERROR("failed to allocate buf->sgt.\n");
>>> > +               ret = -ENOMEM;
>>> > +               goto err_release_gem;
>>> > +       }
>>> > +
>>> > +       npages = size >> PAGE_SHIFT;
>>> > +
>>> > +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
>>> > +       if (ret < 0) {
>>> > +               DRM_ERROR("failed to initailize sg table.\n");
>>> > +               goto err_free_sgt;
>>> > +       }
>>> > +
>>> > +       buf->pages = kzalloc(npages * sizeof(struct page *),
>> GFP_KERNEL);
>>> > +       if (!buf->pages) {
>>> > +               DRM_ERROR("failed to allocate buf->pages\n");
>>> > +               ret = -ENOMEM;
>>> > +               goto err_free_table;
>>> > +       }
>>> > +
>>> > +       exynos_gem_obj->buffer = buf;
>>> > +
>>> > +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj,
>> userptr,
>>> 1);
>>> > +       if (get_npages != npages) {
>>> > +               DRM_ERROR("failed to get user_pages.\n");
>>> > +               ret = get_npages;
>>> > +               goto err_release_userptr;
>>> > +       }
>>> > +
>>> > +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base,
>>> file_priv,
>>> > +                                               &args->handle);
>>> > +       if (ret < 0) {
>>> > +               DRM_ERROR("failed to create gem handle.\n");
>>> > +               goto err_release_userptr;
>>> > +       }
>>> > +
>>> > +       sgl = buf->sgt->sgl;
>>> > +
>>> > +       /*
>>> > +        * if buf->pfnmap is true then update sgl of sgt with pages but
>>> > +        * if buf->pfnmap is false then it means the sgl was updated
>>> already
>>> > +        * so it doesn't need to update the sgl.
>>> > +        */
>>> > +       if (!buf->pfnmap) {
>>> > +               unsigned int i = 0;
>>> > +
>>> > +               /* set all pages to sg list. */
>>> > +               while (i < npages) {
>>> > +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
>>> > +                       sg_dma_address(sgl) =
>> page_to_phys(buf->pages[i]);
>>> > +                       i++;
>>> > +                       sgl = sg_next(sgl);
>>> > +               }
>>> > +       }
>>> > +
>>> > +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
>>> > +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
>>> > +
>>> > +       return 0;
>>> > +
>>> > +err_release_userptr:
>>> > +       get_npages--;
>>> > +       while (get_npages >= 0)
>>> > +               put_page(buf->pages[get_npages--]);
>>> > +       kfree(buf->pages);
>>> > +       buf->pages = NULL;
>>> > +err_free_table:
>>> > +       sg_free_table(buf->sgt);
>>> > +err_free_sgt:
>>> > +       kfree(buf->sgt);
>>> > +       buf->sgt = NULL;
>>> > +err_release_gem:
>>> > +       drm_gem_object_release(&exynos_gem_obj->base);
>>> > +       kfree(exynos_gem_obj);
>>> > +       exynos_gem_obj = NULL;
>>> > +err_free_buffer:
>>> > +       exynos_drm_free_buf(dev, 0, buf);
>>> > +       return ret;
>>> > +}
>>> > +
>>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj)
>>> >  {
>>> >        DRM_DEBUG_KMS("%s\n", __FILE__);
>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h
>>> b/drivers/gpu/drm/exynos/exynos_drm_gem.h
>>> > index efc8252..1de2241 100644
>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
>>> > @@ -29,7 +29,8 @@
>>> >  #define to_exynos_gem_obj(x)   container_of(x,\
>>> >                        struct exynos_drm_gem_obj, base)
>>> >
>>> > -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
>>> > +#define IS_NONCONTIG_BUFFER(f)         ((f & EXYNOS_BO_NONCONTIG) ||\
>>> > +                                       (f & EXYNOS_BO_USERPTR))
>>> >
>>> >  /*
>>> >  * exynos drm gem buffer structure.
>>> > @@ -38,18 +39,23 @@
>>> >  * @dma_addr: bus address(accessed by dma) to allocated memory region.
>>> >  *     - this address could be physical address without IOMMU and
>>> >  *     device address with IOMMU.
>>> > + * @write: whether pages will be written to by the caller.
>>> >  * @sgt: sg table to transfer page data.
>>> >  * @pages: contain all pages to allocated memory region.
>>> >  * @page_size: could be 4K, 64K or 1MB.
>>> >  * @size: size of allocated memory region.
>>> > + * @pfnmap: indicate whether memory region from userptr is mmaped with
>>> > + *     VM_PFNMAP or not.
>>> >  */
>>> >  struct exynos_drm_gem_buf {
>>> >        void __iomem            *kvaddr;
>>> >        dma_addr_t              dma_addr;
>>> > +       unsigned int            write;
>>> >        struct sg_table         *sgt;
>>> >        struct page             **pages;
>>> >        unsigned long           page_size;
>>> >        unsigned long           size;
>>> > +       bool                    pfnmap;
>>> >  };
>>> >
>>> >  /*
>>> > @@ -73,6 +79,7 @@ struct exynos_drm_gem_obj {
>>> >        struct drm_gem_object           base;
>>> >        struct exynos_drm_gem_buf       *buffer;
>>> >        unsigned long                   size;
>>> > +       struct vm_area_struct           *vma;
>>> >        unsigned int                    flags;
>>> >  };
>>> >
>>> > @@ -127,6 +134,10 @@ int exynos_drm_gem_map_offset_ioctl(struct
>>> drm_device *dev, void *data,
>>> >  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>>> >                              struct drm_file *file_priv);
>>> >
>>> > +/* map user space allocated by malloc to pages. */
>>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
>>> > +                                     struct drm_file *file_priv);
>>> > +
>>> >  /* initialize gem object. */
>>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj);
>>> >
>>> > diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
>>> > index 2d6eb06..48eda6e 100644
>>> > --- a/include/drm/exynos_drm.h
>>> > +++ b/include/drm/exynos_drm.h
>>> > @@ -75,6 +75,23 @@ struct drm_exynos_gem_mmap {
>>> >  };
>>> >
>>> >  /**
>>> > + * User-requested user space importing structure
>>> > + *
>>> > + * @userptr: user space address allocated by malloc.
>>> > + * @size: size to the buffer allocated by malloc.
>>> > + * @flags: indicate user-desired cache attribute to map the allocated
>>> buffer
>>> > + *     to kernel space.
>>> > + * @handle: a returned handle to created gem object.
>>> > + *     - this handle will be set by gem module of kernel side.
>>> > + */
>>> > +struct drm_exynos_gem_userptr {
>>> > +       uint64_t userptr;
>>> > +       uint64_t size;
>>> > +       unsigned int flags;
>>> > +       unsigned int handle;
>>> > +};
>>> > +
>>> > +/**
>>> >  * A structure for user connection request of virtual display.
>>> >  *
>>> >  * @connection: indicate whether doing connetion or not by user.
>>> > @@ -105,13 +122,16 @@ enum e_drm_exynos_gem_mem_type {
>>> >        EXYNOS_BO_CACHABLE      = 1 << 1,
>>> >        /* write-combine mapping. */
>>> >        EXYNOS_BO_WC            = 1 << 2,
>>> > +       /* user space memory allocated by malloc. */
>>> > +       EXYNOS_BO_USERPTR       = 1 << 3,
>>> >        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG |
>> EXYNOS_BO_CACHABLE
>>> |
>>> > -                                       EXYNOS_BO_WC
>>> > +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
>>> >  };
>>> >
>>> >  #define DRM_EXYNOS_GEM_CREATE          0x00
>>> >  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
>>> >  #define DRM_EXYNOS_GEM_MMAP            0x02
>>> > +#define DRM_EXYNOS_GEM_USERPTR         0x03
>>> >  /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
>>> >  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
>>> >  #define DRM_EXYNOS_VIDI_CONNECTION     0x07
>>> > @@ -125,6 +145,9 @@ enum e_drm_exynos_gem_mem_type {
>>> >  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
>>> >                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
>>> >
>>> > +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
>>> > +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
>>> > +
>>> >  #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS
>>  DRM_IOWR(DRM_COMMAND_BASE
>>> + \
>>> >                DRM_EXYNOS_PLANE_SET_ZPOS, struct
>> drm_exynos_plane_set_zpos)
>>> >
>>> > --
>>> > 1.7.4.1
>>> >


* Re: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-15 13:40         ` InKi Dae
@ 2012-05-15 14:28           ` Rob Clark
  2012-05-16  6:04             ` Inki Dae
  0 siblings, 1 reply; 77+ messages in thread
From: Rob Clark @ 2012-05-15 14:28 UTC (permalink / raw)
  To: InKi Dae; +Cc: Inki Dae, kyungmin.park, sw0312.kim, dri-devel

On Tue, May 15, 2012 at 7:40 AM, InKi Dae <daeinki@gmail.com> wrote:
> 2012/5/15 Rob Clark <rob.clark@linaro.org>:
>> On Tue, May 15, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>>> Hi Rob,
>>>
>>>> -----Original Message-----
>>>> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of Rob
>>>> Clark
>>>> Sent: Tuesday, May 15, 2012 4:35 PM
>>>> To: Inki Dae
>>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com
>>>> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
>>>>
>>>> On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com> wrote:
>>>> > this feature could be used to use memory region allocated by malloc() in
>>>> user
>>>> > mode and mmaped memory region allocated by other memory allocators.
>>>> userptr
>>>> > interface can identify memory type through vm_flags value and would get
>>>> > pages or page frame numbers to user space appropriately.
>>>>
>>>> I apologize for being a little late to jump in on this thread, but...
>>>>
>>>> I must confess to not being a huge fan of userptr.  It really is
>>>> opening a can of worms, and seems best avoided if at all possible.
>>>> I'm not entirely sure the use-case for which you require this, but I
>>>> wonder if there isn't an alternative way?   I mean, the main case I
>>>> could think of for mapping userspace memory would be something like
>>>> texture upload.  But could that be handled in an alternative way,
>>>> something like a pwrite or texture_upload API, which could temporarily
>>>> pin the userspace memory, kick off a dma from that user buffer to a
>>>> proper GEM buffer, and then unpin the user buffer when the DMA
>>>> completes?
>>>>
>>>
>>> This feature has been discussed with the drm and linux-mm guys and I have
>>> posted this patch four times.
>>> So please see the e-mail thread below.
>>> http://www.spinics.net/lists/dri-devel/msg22729.html
>>>
>>> and then please, give me your opinions.
>>
>>
>> Yeah, I read that and understand that the purpose is to support
>> malloc()'d memory (and btw, all the other changes like limits to the
>> amount of userptr buffers are good if we do decide to support userptr
>> buffers).  But my question was more about why do you need to support
>> malloc'd buffers... other than "just because v4l2 does" ;-)
>>
>> I don't really like the userptr feature in v4l2 either, and view it as
>> a necessary evil because at the time there was nothing like dmabuf.
>> But now that we have dmabuf to share buffers with zero copy between
>> two devices, do we *really* still need userptr?
>>
>
> Yes, we definitely do. As I mentioned in this email thread, we are using
> the userptr feature for the pixman and evas backends so that they can use
> gpu acceleration directly (zero-copy). As you may know, Evas is part of the
> Enlightenment Foundation Libraries (EFL) and Elementary for pixman. All the
> applications based on them use a user address to draw something, so the
> backends (such as gpu accelerators) can refer only to the user address of
> the rendered buffer. Do you think we can avoid a memory copy when using
> those GPUs without such a userptr feature? I think the only way is for the
> gpu to render through the user address, without a memory copy. And as
> mentioned in this thread, this feature had been tried by i915 in the
> desktop world. For this, you can refer to Daniel's comments.

oh, yeah, for something like pixman it is hard to accelerate.  But I
wonder if there is still some benefit.  I don't know as much about
evas, but pixman is assumed to be a synchronous API, so you must
always block until the operation completes, which loses a lot of the
potential benefit of a blitter.  And also I think you have no control
over when the client free()'s a buffer.  So you end up having to
map/unmap the buffer to the gpu/blitter/whatever every single
operation.  Not to mention possible need for cache operations, etc.  I
wonder if trying to accel it in hw really brings any benefit vs just
using NEON?
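
To make the per-call overhead argument concrete, here is a rough sketch of
the arithmetic, assuming every synchronous call has to map the buffer, do
cache maintenance, run the blit, block, and unmap again. The struct, the
function names and any costs fed into them are hypothetical and are not
measurements from real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-operation costs in microseconds; not measured values. */
struct sketch_op_cost {
	uint64_t map_us;   /* map the user buffer to the device */
	uint64_t cache_us; /* cache clean/invalidate around the access */
	uint64_t blit_us;  /* the hardware operation itself */
	uint64_t unmap_us; /* unmap once the operation has completed */
};

/* Cost of one synchronous call: nothing can be amortized across calls. */
uint64_t sketch_sync_op_us(const struct sketch_op_cost *c)
{
	assert(c != NULL);

	return c->map_us + c->cache_us + c->blit_us + c->unmap_us;
}

/* Total cost of n_ops back-to-back synchronous calls. */
uint64_t sketch_sync_total_us(const struct sketch_op_cost *c, uint64_t n_ops)
{
	return n_ops * sketch_sync_op_us(c);
}

/*
 * Returns 1 if the hardware path beats a software path (for example a
 * NEON routine) that costs sw_us_per_op per operation, 0 otherwise.
 */
int sketch_hw_wins(const struct sketch_op_cost *c, uint64_t sw_us_per_op)
{
	return sketch_sync_op_us(c) < sw_us_per_op;
}
```

The point of the model is only that the blit itself can be a small part of
the per-call total, so the hardware path wins only when the software path
is slower than the whole map/sync/blit/unmap sequence.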

I thought EFL has a GLES backend..  I could be wrong about that, but
if it does probably you'd get better performance by using the GLES
backend vs trying to accelerate what is intended as a sw path..

> And as you know, we are already using dmabuf to share a buffer between
> device drivers, or between the v4l2 and drm frameworks. For this, we also
> tested the drm prime feature with the v4l2 world, and we applied the umm
> concept to ump for the 3d gpu (known as mali) as well.
>
> Anyway, this userptr feature is different from what you have in mind, and
> maybe you also need such a feature for performance enhancement. Please let
> me know if you have a good alternative to this userptr feature. There may
> be a good idea that I don't know of.

well, I just try and point people to APIs that are actually designed
with hw acceleration in mind in the first place (x11/EXA, GLES, etc)
;-)

There really is an API missing for client side 2d accel.  Well, I guess
there is OpenVG, although I'm not sure if anyone actually ever used
it.. maybe we do need something for better supporting 2d blitters in
weston compositor, for example.

BR,
-R


> Thanks,
> Inki Dae
>
>> I guess if I understood better the use-case, maybe I could try to
>> think of some alternatives.
>>
>> BR,
>> -R
>>
>>> Thanks,
>>> Inki Dae
>>>
>>>> BR,
>>>> -R
>>>>
>>>> > Signed-off-by: Inki Dae <inki.dae@samsung.com>
>>>> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>>> > ---
>>>> >  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
>>>> >  drivers/gpu/drm/exynos/exynos_drm_gem.c |  258
>>>> +++++++++++++++++++++++++++++++
>>>> >  drivers/gpu/drm/exynos/exynos_drm_gem.h |   13 ++-
>>>> >  include/drm/exynos_drm.h                |   25 +++-
>>>> >  4 files changed, 296 insertions(+), 2 deletions(-)
>>>> >
>>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c
>>>> b/drivers/gpu/drm/exynos/exynos_drm_drv.c
>>>> > index f58a487..5bb0361 100644
>>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
>>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
>>>> > @@ -211,6 +211,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
>>>> >                        DRM_AUTH),
>>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
>>>> >                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED |
>>> DRM_AUTH),
>>>> > +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
>>>> > +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
>>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS,
>>>> exynos_plane_set_zpos_ioctl,
>>>> >                        DRM_UNLOCKED | DRM_AUTH),
>>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
>>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c
>>>> b/drivers/gpu/drm/exynos/exynos_drm_gem.c
>>>> > index afd0cd4..b68d4ea 100644
>>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
>>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
>>>> > @@ -66,6 +66,43 @@ static int check_gem_flags(unsigned int flags)
>>>> >        return 0;
>>>> >  }
>>>> >
>>>> > +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
>>>> > +{
>>>> > +       struct vm_area_struct *vma_copy;
>>>> > +
>>>> > +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
>>>> > +       if (!vma_copy)
>>>> > +               return NULL;
>>>> > +
>>>> > +       if (vma->vm_ops && vma->vm_ops->open)
>>>> > +               vma->vm_ops->open(vma);
>>>> > +
>>>> > +       if (vma->vm_file)
>>>> > +               get_file(vma->vm_file);
>>>> > +
>>>> > +       memcpy(vma_copy, vma, sizeof(*vma));
>>>> > +
>>>> > +       vma_copy->vm_mm = NULL;
>>>> > +       vma_copy->vm_next = NULL;
>>>> > +       vma_copy->vm_prev = NULL;
>>>> > +
>>>> > +       return vma_copy;
>>>> > +}
>>>> > +
>>>> > +static void put_vma(struct vm_area_struct *vma)
>>>> > +{
>>>> > +       if (!vma)
>>>> > +               return;
>>>> > +
>>>> > +       if (vma->vm_ops && vma->vm_ops->close)
>>>> > +               vma->vm_ops->close(vma);
>>>> > +
>>>> > +       if (vma->vm_file)
>>>> > +               fput(vma->vm_file);
>>>> > +
>>>> > +       kfree(vma);
>>>> > +}
>>>> > +
>>>> >  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
>>>> >                                        struct vm_area_struct *vma)
>>>> >  {
>>>> > @@ -254,6 +291,41 @@ static void exynos_drm_gem_put_pages(struct
>>>> drm_gem_object *obj)
>>>> >        /* add some codes for UNCACHED type here. TODO */
>>>> >  }
>>>> >
>>>> > +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
>>>> > +{
>>>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
>>>> > +       struct exynos_drm_gem_buf *buf;
>>>> > +       struct vm_area_struct *vma;
>>>> > +       int npages;
>>>> > +
>>>> > +       exynos_gem_obj = to_exynos_gem_obj(obj);
>>>> > +       buf = exynos_gem_obj->buffer;
>>>> > +       vma = exynos_gem_obj->vma;
>>>> > +
>>>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
>>>> > +               put_vma(exynos_gem_obj->vma);
>>>> > +               goto out;
>>>> > +       }
>>>> > +
>>>> > +       npages = buf->size >> PAGE_SHIFT;
>>>> > +
>>>> > +       npages--;
>>>> > +       while (npages >= 0) {
>>>> > +               if (buf->write)
>>>> > +                       set_page_dirty_lock(buf->pages[npages]);
>>>> > +
>>>> > +               put_page(buf->pages[npages]);
>>>> > +               npages--;
>>>> > +       }
>>>> > +
>>>> > +out:
>>>> > +       kfree(buf->pages);
>>>> > +       buf->pages = NULL;
>>>> > +
>>>> > +       kfree(buf->sgt);
>>>> > +       buf->sgt = NULL;
>>>> > +}
>>>> > +
>>>> >  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
>>>> >                                        struct drm_file *file_priv,
>>>> >                                        unsigned int *handle)
>>>> > @@ -293,6 +365,8 @@ void exynos_drm_gem_destroy(struct
>>>> exynos_drm_gem_obj *exynos_gem_obj)
>>>> >
>>>> >        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
>>>> >                exynos_drm_gem_put_pages(obj);
>>>> > +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
>>>> > +               exynos_drm_put_userptr(obj);
>>>> >        else
>>>> >                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags,
>>> buf);
>>>> >
>>>> > @@ -606,6 +680,190 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device
>>>> *dev, void *data,
>>>> >        return 0;
>>>> >  }
>>>> >
>>>> > +static int exynos_drm_get_userptr(struct drm_device *dev,
>>>> > +                               struct exynos_drm_gem_obj *obj,
>>>> > +                               unsigned long userptr,
>>>> > +                               unsigned int write)
>>>> > +{
>>>> > +       unsigned int get_npages;
>>>> > +       unsigned long npages = 0;
>>>> > +       struct vm_area_struct *vma;
>>>> > +       struct exynos_drm_gem_buf *buf = obj->buffer;
>>>> > +
>>>> > +       vma = find_vma(current->mm, userptr);
>>>> > +
>>>> > +       /* the memory region mmaped with VM_PFNMAP. */
>>>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
>>>> > +               unsigned long this_pfn, prev_pfn, pa;
>>>> > +               unsigned long start, end, offset;
>>>> > +               struct scatterlist *sgl;
>>>> > +
>>>> > +               start = userptr;
>>>> > +               offset = userptr & ~PAGE_MASK;
>>>> > +               end = start + buf->size;
>>>> > +               sgl = buf->sgt->sgl;
>>>> > +
>>>> > +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
>>>> > +                       int ret = follow_pfn(vma, start, &this_pfn);
>>>> > +                       if (ret)
>>>> > +                               return ret;
>>>> > +
>>>> > +                       if (prev_pfn == 0)
>>>> > +                               pa = this_pfn << PAGE_SHIFT;
>>>> > +                       else if (this_pfn != prev_pfn + 1)
>>>> > +                               return -EFAULT;
>>>> > +
>>>> > +                       sg_dma_address(sgl) = (pa + offset);
>>>> > +                       sg_dma_len(sgl) = PAGE_SIZE;
>>>> > +                       prev_pfn = this_pfn;
>>>> > +                       pa += PAGE_SIZE;
>>>> > +                       npages++;
>>>> > +                       sgl = sg_next(sgl);
>>>> > +               }
>>>> > +
>>>> > +               buf->dma_addr = pa + offset;
>>>> > +
>>>> > +               obj->vma = get_vma(vma);
>>>> > +               if (!obj->vma)
>>>> > +                       return -ENOMEM;
>>>> > +
>>>> > +               buf->pfnmap = true;
>>>> > +
>>>> > +               return npages;
>>>> > +       }
>>>> > +
>>>> > +       buf->write = write;
>>>> > +       npages = buf->size >> PAGE_SHIFT;
>>>> > +
>>>> > +       down_read(&current->mm->mmap_sem);
>>>> > +       get_npages = get_user_pages(current, current->mm, userptr,
>>>> > +                                       npages, write, 1, buf->pages,
>>> NULL);
>>>> > +       up_read(&current->mm->mmap_sem);
>>>> > +       if (get_npages != npages)
>>>> > +               DRM_ERROR("failed to get user_pages.\n");
>>>> > +
>>>> > +       buf->pfnmap = false;
>>>> > +
>>>> > +       return get_npages;
>>>> > +}
>>>> > +
>>>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
>>>> > +                                     struct drm_file *file_priv)
>>>> > +{
>>>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
>>>> > +       struct drm_exynos_gem_userptr *args = data;
>>>> > +       struct exynos_drm_gem_buf *buf;
>>>> > +       struct scatterlist *sgl;
>>>> > +       unsigned long size, userptr;
>>>> > +       unsigned int npages;
>>>> > +       int ret, get_npages;
>>>> > +
>>>> > +       DRM_DEBUG_KMS("%s\n", __FILE__);
>>>> > +
>>>> > +       if (!args->size) {
>>>> > +               DRM_ERROR("invalid size.\n");
>>>> > +               return -EINVAL;
>>>> > +       }
>>>> > +
>>>> > +       ret = check_gem_flags(args->flags);
>>>> > +       if (ret)
>>>> > +               return ret;
>>>> > +
>>>> > +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
>>>> > +       userptr = args->userptr;
>>>> > +
>>>> > +       buf = exynos_drm_init_buf(dev, size);
>>>> > +       if (!buf)
>>>> > +               return -ENOMEM;
>>>> > +
>>>> > +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
>>>> > +       if (!exynos_gem_obj) {
>>>> > +               ret = -ENOMEM;
>>>> > +               goto err_free_buffer;
>>>> > +       }
>>>> > +
>>>> > +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
>>>> > +       if (!buf->sgt) {
>>>> > +               DRM_ERROR("failed to allocate buf->sgt.\n");
>>>> > +               ret = -ENOMEM;
>>>> > +               goto err_release_gem;
>>>> > +       }
>>>> > +
>>>> > +       npages = size >> PAGE_SHIFT;
>>>> > +
>>>> > +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
>>>> > +       if (ret < 0) {
>>>> > +               DRM_ERROR("failed to initailize sg table.\n");
>>>> > +               goto err_free_sgt;
>>>> > +       }
>>>> > +
>>>> > +       buf->pages = kzalloc(npages * sizeof(struct page *),
>>> GFP_KERNEL);
>>>> > +       if (!buf->pages) {
>>>> > +               DRM_ERROR("failed to allocate buf->pages\n");
>>>> > +               ret = -ENOMEM;
>>>> > +               goto err_free_table;
>>>> > +       }
>>>> > +
>>>> > +       exynos_gem_obj->buffer = buf;
>>>> > +
>>>> > +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj,
>>> userptr,
>>>> 1);
>>>> > +       if (get_npages != npages) {
>>>> > +               DRM_ERROR("failed to get user_pages.\n");
>>>> > +               ret = get_npages;
>>>> > +               goto err_release_userptr;
>>>> > +       }
>>>> > +
>>>> > +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base,
>>>> file_priv,
>>>> > +                                               &args->handle);
>>>> > +       if (ret < 0) {
>>>> > +               DRM_ERROR("failed to create gem handle.\n");
>>>> > +               goto err_release_userptr;
>>>> > +       }
>>>> > +
>>>> > +       sgl = buf->sgt->sgl;
>>>> > +
>>>> > +       /*
>>>> > +        * if buf->pfnmap is true then update sgl of sgt with pages but
>>>> > +        * if buf->pfnmap is false then it means the sgl was updated
>>>> already
>>>> > +        * so it doesn't need to update the sgl.
>>>> > +        */
>>>> > +       if (!buf->pfnmap) {
>>>> > +               unsigned int i = 0;
>>>> > +
>>>> > +               /* set all pages to sg list. */
>>>> > +               while (i < npages) {
>>>> > +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
>>>> > +                       sg_dma_address(sgl) =
>>> page_to_phys(buf->pages[i]);
>>>> > +                       i++;
>>>> > +                       sgl = sg_next(sgl);
>>>> > +               }
>>>> > +       }
>>>> > +
>>>> > +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
>>>> > +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
>>>> > +
>>>> > +       return 0;
>>>> > +
>>>> > +err_release_userptr:
>>>> > +       get_npages--;
>>>> > +       while (get_npages >= 0)
>>>> > +               put_page(buf->pages[get_npages--]);
>>>> > +       kfree(buf->pages);
>>>> > +       buf->pages = NULL;
>>>> > +err_free_table:
>>>> > +       sg_free_table(buf->sgt);
>>>> > +err_free_sgt:
>>>> > +       kfree(buf->sgt);
>>>> > +       buf->sgt = NULL;
>>>> > +err_release_gem:
>>>> > +       drm_gem_object_release(&exynos_gem_obj->base);
>>>> > +       kfree(exynos_gem_obj);
>>>> > +       exynos_gem_obj = NULL;
>>>> > +err_free_buffer:
>>>> > +       exynos_drm_free_buf(dev, 0, buf);
>>>> > +       return ret;
>>>> > +}
>>>> > +
>>>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj)
>>>> >  {
>>>> >        DRM_DEBUG_KMS("%s\n", __FILE__);
>>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h
>>>> b/drivers/gpu/drm/exynos/exynos_drm_gem.h
>>>> > index efc8252..1de2241 100644
>>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
>>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
>>>> > @@ -29,7 +29,8 @@
>>>> >  #define to_exynos_gem_obj(x)   container_of(x,\
>>>> >                        struct exynos_drm_gem_obj, base)
>>>> >
>>>> > -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
>>>> > +#define IS_NONCONTIG_BUFFER(f)         ((f & EXYNOS_BO_NONCONTIG) ||\
>>>> > +                                       (f & EXYNOS_BO_USERPTR))
>>>> >
>>>> >  /*
>>>> >  * exynos drm gem buffer structure.
>>>> > @@ -38,18 +39,23 @@
>>>> >  * @dma_addr: bus address(accessed by dma) to allocated memory region.
>>>> >  *     - this address could be physical address without IOMMU and
>>>> >  *     device address with IOMMU.
>>>> > + * @write: whether pages will be written to by the caller.
>>>> >  * @sgt: sg table to transfer page data.
>>>> >  * @pages: contain all pages to allocated memory region.
>>>> >  * @page_size: could be 4K, 64K or 1MB.
>>>> >  * @size: size of allocated memory region.
>>>> > + * @pfnmap: indicate whether memory region from userptr is mmaped with
>>>> > + *     VM_PFNMAP or not.
>>>> >  */
>>>> >  struct exynos_drm_gem_buf {
>>>> >        void __iomem            *kvaddr;
>>>> >        dma_addr_t              dma_addr;
>>>> > +       unsigned int            write;
>>>> >        struct sg_table         *sgt;
>>>> >        struct page             **pages;
>>>> >        unsigned long           page_size;
>>>> >        unsigned long           size;
>>>> > +       bool                    pfnmap;
>>>> >  };
>>>> >
>>>> >  /*
>>>> > @@ -73,6 +79,7 @@ struct exynos_drm_gem_obj {
>>>> >        struct drm_gem_object           base;
>>>> >        struct exynos_drm_gem_buf       *buffer;
>>>> >        unsigned long                   size;
>>>> > +       struct vm_area_struct           *vma;
>>>> >        unsigned int                    flags;
>>>> >  };
>>>> >
>>>> > @@ -127,6 +134,10 @@ int exynos_drm_gem_map_offset_ioctl(struct
>>>> drm_device *dev, void *data,
>>>> >  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>>>> >                              struct drm_file *file_priv);
>>>> >
>>>> > +/* map user space allocated by malloc to pages. */
>>>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
>>>> > +                                     struct drm_file *file_priv);
>>>> > +
>>>> >  /* initialize gem object. */
>>>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj);
>>>> >
>>>> > diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
>>>> > index 2d6eb06..48eda6e 100644
>>>> > --- a/include/drm/exynos_drm.h
>>>> > +++ b/include/drm/exynos_drm.h
>>>> > @@ -75,6 +75,23 @@ struct drm_exynos_gem_mmap {
>>>> >  };
>>>> >
>>>> >  /**
>>>> > + * User-requested user space importing structure
>>>> > + *
>>>> > + * @userptr: user space address allocated by malloc.
>>>> > + * @size: size to the buffer allocated by malloc.
>>>> > + * @flags: indicate user-desired cache attribute to map the allocated
>>>> buffer
>>>> > + *     to kernel space.
>>>> > + * @handle: a returned handle to created gem object.
>>>> > + *     - this handle will be set by gem module of kernel side.
>>>> > + */
>>>> > +struct drm_exynos_gem_userptr {
>>>> > +       uint64_t userptr;
>>>> > +       uint64_t size;
>>>> > +       unsigned int flags;
>>>> > +       unsigned int handle;
>>>> > +};
>>>> > +
>>>> > +/**
>>>> >  * A structure for user connection request of virtual display.
>>>> >  *
>>>> >  * @connection: indicate whether doing connetion or not by user.
>>>> > @@ -105,13 +122,16 @@ enum e_drm_exynos_gem_mem_type {
>>>> >        EXYNOS_BO_CACHABLE      = 1 << 1,
>>>> >        /* write-combine mapping. */
>>>> >        EXYNOS_BO_WC            = 1 << 2,
>>>> > +       /* user space memory allocated by malloc. */
>>>> > +       EXYNOS_BO_USERPTR       = 1 << 3,
>>>> >        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG |
>>> EXYNOS_BO_CACHABLE
>>>> |
>>>> > -                                       EXYNOS_BO_WC
>>>> > +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
>>>> >  };
>>>> >
>>>> >  #define DRM_EXYNOS_GEM_CREATE          0x00
>>>> >  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
>>>> >  #define DRM_EXYNOS_GEM_MMAP            0x02
>>>> > +#define DRM_EXYNOS_GEM_USERPTR         0x03
>>>> >  /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
>>>> >  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
>>>> >  #define DRM_EXYNOS_VIDI_CONNECTION     0x07
>>>> > @@ -125,6 +145,9 @@ enum e_drm_exynos_gem_mem_type {
>>>> >  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
>>>> >                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
>>>> >
>>>> > +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
>>>> > +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
>>>> > +
>>>> >  #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS
>>>  DRM_IOWR(DRM_COMMAND_BASE
>>>> + \
>>>> >                DRM_EXYNOS_PLANE_SET_ZPOS, struct
>>> drm_exynos_plane_set_zpos)
>>>> >
>>>> > --
>>>> > 1.7.4.1
>>>> >
>>>> > _______________________________________________
>>>> > dri-devel mailing list
>>>> > dri-devel@lists.freedesktop.org
>>>> > http://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
  2012-05-15  4:33           ` Inki Dae
@ 2012-05-15 14:31             ` Jerome Glisse
  2012-05-16  8:49               ` Inki Dae
  0 siblings, 1 reply; 77+ messages in thread
From: Jerome Glisse @ 2012-05-15 14:31 UTC (permalink / raw)
  To: Inki Dae
  Cc: airlied, dri-devel, linux-mm, minchan, kosaki.motohiro,
	kyungmin.park, sw0312.kim, jy0922.shim

On Tue, May 15, 2012 at 12:33 AM, Inki Dae <inki.dae@samsung.com> wrote:
> Hi Jerome,
>
>> -----Original Message-----
>> From: Jerome Glisse [mailto:j.glisse@gmail.com]
>> Sent: Tuesday, May 15, 2012 4:27 AM
>> To: Inki Dae
>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org; minchan@kernel.org;
>> kosaki.motohiro@gmail.com; kyungmin.park@samsung.com;
>> sw0312.kim@samsung.com; jy0922.shim@samsung.com
>> Subject: Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
>>
>> On Mon, May 14, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> > this feature is used to import user space region allocated by malloc()
>> or
>> > mmaped into a gem. for this, we uses get_user_pages() to get all the
>> pages
>> > to VMAs within user address space. However we should pay attention to
>> use
>> > this userptr feature like below.
>> >
>> > The migration issue.
>> > - Pages reserved by CMA for some device using DMA could be used by
>> > kernel and if the device driver wants to use those pages
>> > while being used by kernel then the pages are copied into
>> > other ones allocated to migrate them and then finally,
>> > the device driver can use the pages for itself.
>> > Thus, migrated, the pages being accessed by DMA could be changed
>> > to other so this situation may incur that DMA accesses any pages
>> > it doesn't want.
>> >
>> > The COW issue.
>> > - while DMA of a device is using the pages to VMAs, if current
>> > process was forked then the pages being accessed by the DMA
>> > would be copied into child's pages.(Copy On Write) so
>> > these pages may not have coherrency with parent's ones if
>> > child process wrote something on those pages so we need to
>> > flag VM_DONTCOPY to prevent pages from being COWed.
>>
>> Note that this is a massive change in behavior of anonymous mapping
>> this effectively completely change the linux API from application
>> point of view on your platform. Any application that have memory
>> mapped by your ioctl will have different fork behavior that other
>> application. I think this should be stressed, it's one of the thing i
>> am really uncomfortable with i would rather not have the dont copy
>> flag and have the page cowed and have the child not working with the
>> 3d/2d/drm driver. That would means that your driver (opengl
>> implementation for instance) would have to detect fork and work around
>> it, nvidia closed source driver do that.
>>
>
> First of all, thank you for your comments.
>
> Right, VM_DONTCOPY flag would change original behavior of user. Do you think
> this way has no problem but no generic way? anyway our issue was that the
> pages to VMAs are copied into child's ones(COW) so we prevented those pages
> from being COWed with using VM_DONTCOPY flag.
>
> For this, I have three questions below
>
> 1. in case of not using VM_DONTCOPY flag, you think that the application
> using our userptr feature has COW issue; parent's pages being accessed by
> DMA of some device would be copied into child's ones if the child wrote
> something on the pages. after that, DMA of a device could access pages user
> doesn't want. I'm not sure but I think such behavior has no any problem and
> is generic behavior and it's role of user to do fork or not. Do you think
> such COW behavior could create any issue I don't aware of so we have to
> prevent that somehow?

My point is that the parent keeps the pages the GPU knows about for as
long as it does not destroy the associated object. If the child
expects to use the same GPU object and still change its content
through its own anonymous mapping, then I would consider that a bug in
the application (i.e. the application has a wrong expectation). So I am
all for only the parent keeping its memory mapped into the GPU address
space through the same GEM object.

> 2. so we added VM_DONTCOPY flag to prevent the pages from being COWed but
> this changes original behavior of user. Do you think this is not generic way
> or could create any issue also?

I would say don't add the flag, and treat applications that fork as a
special case in userspace. See below for how I would handle it.

> 3. and last one, what is the difference between to flag VM_DONTCOPY and to
> detect fork? I mean the device driver should do something to need after
> detecting fork. and I'm not sure but I think the something may also change
> original behavior of user.
>
> Please let me know if there is my missing point.

I would detect fork by storing the process id alongside the gem object.
Something like this (userspace code that could live in your pixman
library):

struct gpu_object_process {
  struct list list;
  uint32_t gem_handle;
  unsigned process_id;
};

struct gpu_object {
  struct list gpu_object_process;
  void *ptr;
  unsigned size;
  ...
};

When creating a GPU object from a userptr, you fill in the above
structures in the userspace code. Then whenever your library wants to
use this object, it calls something like:

int gpu_object_validate(struct gpu_object *bo)

which checks whether the current process id is in the
gpu_object_process list. If it is, use that gem object handle;
otherwise create a new GEM object from this userptr with the same size
and other properties.
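
The validate step described above could be sketched roughly as follows.
This is only an illustration under stated assumptions: the list is a
plain singly linked list instead of "struct list", the function takes an
extra out-parameter for the handle, and fake_userptr_import() is a
hypothetical stand-in for the real DRM_IOCTL_EXYNOS_GEM_USERPTR call
(it just hands out increasing numbers).

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* One entry per process that has imported this buffer. */
struct gpu_object_process {
	struct gpu_object_process *next;
	uint32_t gem_handle;
	pid_t process_id;
};

struct gpu_object {
	struct gpu_object_process *procs;	/* head of the per-process list */
	void *ptr;
	size_t size;
};

/*
 * Hypothetical placeholder for the userptr ioctl: a real library would
 * issue DRM_IOCTL_EXYNOS_GEM_USERPTR here and return args.handle.
 */
static uint32_t fake_userptr_import(void *ptr, size_t size)
{
	static uint32_t next_handle = 1;

	(void)ptr;
	(void)size;
	return next_handle++;
}

/*
 * Look up the GEM handle belonging to the calling process. If this
 * process has no entry yet (first use, or we are a forked child that
 * only inherited the parent's entries), import the userptr again and
 * remember the new handle. Returns 0 on success, -1 on allocation
 * failure.
 */
static int gpu_object_validate(struct gpu_object *bo, uint32_t *handle)
{
	pid_t self = getpid();
	struct gpu_object_process *p;

	for (p = bo->procs; p; p = p->next) {
		if (p->process_id == self) {
			*handle = p->gem_handle;
			return 0;
		}
	}

	p = malloc(sizeof(*p));
	if (!p)
		return -1;

	p->gem_handle = fake_userptr_import(bo->ptr, bo->size);
	p->process_id = self;
	p->next = bo->procs;
	bo->procs = p;

	*handle = p->gem_handle;
	return 0;
}
```

Calling gpu_object_validate() twice from the same process returns the
same handle. After fork() the child inherits a copy of the list, finds
no entry with its own pid, and so creates its own GEM object instead of
reusing the parent's.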

Note that you really need this only if you expect applications using
your library to fork and still use your gpu accelerated library in the
same way.

Doing this preserves proper unix fork behavior: the child's changes to
anonymous memory are not reflected in the parent's anonymous memory,
and that should be the expected behavior even for GPU objects. Of
course this means there would be a memcpy between parent and child on
write, but that's the expected behavior of fork.
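
The fork semantics referred to here can be checked with a small
userspace test. The sketch below assumes an ordinary malloc()ed buffer
(private anonymous memory) with no VM_DONTCOPY applied; the helper name
is made up for illustration.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fill a malloc()ed buffer in the parent, fork, and let the child
 * overwrite it. Returns 1 if the parent still sees its own data
 * afterwards, 0 if the child's write leaked into the parent, and -1 on
 * error.
 */
static int parent_data_survives_child_write(void)
{
	size_t len = 4096;
	char *buf = malloc(len);
	pid_t pid;
	int status, intact;

	if (!buf)
		return -1;
	memset(buf, 'P', len);

	pid = fork();
	if (pid < 0) {
		free(buf);
		return -1;
	}

	if (pid == 0) {
		/* The write triggers COW: the child gets private copies. */
		memset(buf, 'C', len);
		_exit(buf[0] == 'C' ? 0 : 1);
	}

	if (waitpid(pid, &status, 0) < 0) {
		free(buf);
		return -1;
	}

	/* The parent's pages must be untouched by the child's write. */
	intact = (buf[0] == 'P' && buf[len - 1] == 'P');
	free(buf);

	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;
	return intact;
}
```

With normal fork behavior the helper returns 1: the child wrote to its
own copy and the parent's buffer is unchanged.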

Note also that I don't expect any of your graphics applications to use
fork, so in most cases the gpu_object_process list would have only one
element.

Cheers,
Jerome


^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-15 14:28           ` Rob Clark
@ 2012-05-16  6:04             ` Inki Dae
  2012-05-16  8:42               ` Rob Clark
  0 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-16  6:04 UTC (permalink / raw)
  To: 'Rob Clark', 'InKi Dae'
  Cc: kyungmin.park, sw0312.kim, dri-devel



> -----Original Message-----
> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of Rob
> Clark
> Sent: Tuesday, May 15, 2012 11:29 PM
> To: InKi Dae
> Cc: Inki Dae; kyungmin.park@samsung.com; sw0312.kim@samsung.com; dri-
> devel@lists.freedesktop.org
> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
> 
> On Tue, May 15, 2012 at 7:40 AM, InKi Dae <daeinki@gmail.com> wrote:
> > 2012/5/15 Rob Clark <rob.clark@linaro.org>:
> >> On Tue, May 15, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >>> Hi Rob,
> >>>
> >>>> -----Original Message-----
> >>>> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of
> Rob
> >>>> Clark
> >>>> Sent: Tuesday, May 15, 2012 4:35 PM
> >>>> To: Inki Dae
> >>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
> >>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com
> >>>> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
> >>>>
> >>>> On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com>
> wrote:
> >>>> > this feature could be used to use memory region allocated by
> malloc() in
> >>>> user
> >>>> > mode and mmaped memory region allocated by other memory allocators.
> >>>> userptr
> >>>> > interface can identify memory type through vm_flags value and would
> get
> >>>> > pages or page frame numbers to user space appropriately.
> >>>>
> >>>> I apologize for being a little late to jump in on this thread, but...
> >>>>
> >>>> I must confess to not being a huge fan of userptr.  It really is
> >>>> opening a can of worms, and seems best avoided if at all possible.
> >>>> I'm not entirely sure the use-case for which you require this, but I
> >>>> wonder if there isn't an alternative way?   I mean, the main case I
> >>>> could think of for mapping userspace memory would be something like
> >>>> texture upload.  But could that be handled in an alternative way,
> >>>> something like a pwrite or texture_upload API, which could
> temporarily
> >>>> pin the userspace memory, kick off a dma from that user buffer to a
> >>>> proper GEM buffer, and then unpin the user buffer when the DMA
> >>>> completes?
> >>>>
> >>>
> >>> This feature have being discussed with drm and linux-mm guys and I
> have
> >>> posted this patch four times.
> >>> So please, see below e-mail threads.
> >>> http://www.spinics.net/lists/dri-devel/msg22729.html
> >>>
> >>> and then please, give me your opinions.
> >>
> >>
> >> Yeah, I read that and understand that the purpose is to support
> >> malloc()'d memory (and btw, all the other changes like limits to the
> >> amount of userptr buffers are good if we do decide to support userptr
> >> buffers).  But my question was more about why do you need to support
> >> malloc'd buffers... other than "just because v4l2 does" ;-)
> >>
> >> I don't really like the userptr feature in v4l2 either, and view it as
> >> a necessary evil because at the time there was nothing like dmabuf.
> >> But now that we have dmabuf to share buffers with zero copy between
> >> two devices, to we *really* still need userptr?
> >>
> >
> > Definitely no, as I mentioned on this email thread, we are using the
> > userptr feature for pixman and evas
> > backend to use gpu acceleration directly(zero-copy). as you may know,
> > Evas is part of the Enlightenment Foundation Libraries(EFL) and
> > Elementary for pixman. all the applicaions based on them uses user
> > address to draw something so the backends(such as gpu accelerators)
> > can refer to only the user address to the buffer rendered. do you
> > think we can avoid memory copy to use those GPUs without such userptr
> > feature? I think there is no way and only the way that the gpu uses
> > the user address to render without memory copy. and as mentioned on
> > this thread, this feature had been tried by i915 in the desktop world.
> > for this, you can refer to Daniel's comments.
> 
> oh, yeah, for something like pixman it is hard to accelerate.  But I
> wonder if there is still some benefit.  I don't know as much about
> evas, but pixman is assumed to be a synchronous API, so you must
> always block until the operation completes which looses a lot of the
> potential benefit of a blitter.  And also I think you have no control
> over when the client free()'s a buffer.  So you end up having to
> map/unmap the buffer to the gpu/blitter/whatever every single
> operation.  Not to mention possible need for cache operations, etc.  I
> wonder if trying to accel it in hw really brings any benefit vs just
> using NEON?
> 

Yes, actually, we also use the NEON backend of PIXMAN for best
performance: PIXMAN's NEON backend for small images and the 2D GPU
backend of EXA for large images, because NEON is faster than the 2D GPU
for small images but not for large ones.

> I thought EFL has a GLES backend..  I could be wrong about that, but
> if it does probably you'd get better performance by using the GLES
> backend vs trying to accelerate what is intended as a sw path..
> 
Right, EFL has a GLES backend and also PIXMAN for the software path. An
application using Elementary can use PIXMAN for the software path and
GLES for hardware acceleration. So the purpose of the userptr feature is
to let a buffer drawn by any user of Evas be rendered by the GPU
hardware.


Thanks,
Inki Dae

> > and as you know, note that we are already using dmabuf to share a
> > buffer between device drivers or v4l2 and drm framework. for this, we
> > also tested drm prime feature with v4l2 world and we applied umm
> > concept to ump for 3d gpu(known as mali) also.
> >
> > Anyway, this userptr feature is different from your thought. and maybe
> > you also need such feature for performance enhancement. Please give me
> > any idea if you have any good way instead of this userptr fearure.
> > there may be any good idea I don't know.
> 
> well, I just try and point people to APIs that are actually designed
> with hw acceleration in mind in the first place (x11/EXA, GLES, etc)
> ;-)
> 
> There missing really an API for client side 2d accel.  Well, I guess
> there is OpenVG, although I'm not sure if anyone actually ever used
> it.. maybe we do need something for better supporting 2d blitters in
> weston compositor, for example.
> 
> BR,
> -R
> 
> 
> > Thanks,
> > Inki Dae
> >
> >> I guess if I understood better the use-case, maybe I could try to
> >> think of some alternatives.
> >>
> >> BR,
> >> -R
> >>
> >>> Thanks,
> >>> Inki Dae
> >>>
> >>>> BR,
> >>>> -R
> >>>>
> >>>> > Signed-off-by: Inki Dae <inki.dae@samsung.com>
> >>>> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >>>> > ---
> >>>> >  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
> >>>> >  drivers/gpu/drm/exynos/exynos_drm_gem.c |  258
> >>>> +++++++++++++++++++++++++++++++
> >>>> >  drivers/gpu/drm/exynos/exynos_drm_gem.h |   13 ++-
> >>>> >  include/drm/exynos_drm.h                |   25 +++-
> >>>> >  4 files changed, 296 insertions(+), 2 deletions(-)
> >>>> >
> >>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> >>>> b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> >>>> > index f58a487..5bb0361 100644
> >>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> >>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> >>>> > @@ -211,6 +211,8 @@ static struct drm_ioctl_desc exynos_ioctls[] =
> {
> >>>> >                        DRM_AUTH),
> >>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
> >>>> >                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED |
> >>> DRM_AUTH),
> >>>> > +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
> >>>> > +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
> >>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS,
> >>>> exynos_plane_set_zpos_ioctl,
> >>>> >                        DRM_UNLOCKED | DRM_AUTH),
> >>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
> >>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> >>>> b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> >>>> > index afd0cd4..b68d4ea 100644
> >>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> >>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> >>>> > @@ -66,6 +66,43 @@ static int check_gem_flags(unsigned int flags)
> >>>> >        return 0;
> >>>> >  }
> >>>> >
> >>>> > +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
> >>>> > +{
> >>>> > +       struct vm_area_struct *vma_copy;
> >>>> > +
> >>>> > +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
> >>>> > +       if (!vma_copy)
> >>>> > +               return NULL;
> >>>> > +
> >>>> > +       if (vma->vm_ops && vma->vm_ops->open)
> >>>> > +               vma->vm_ops->open(vma);
> >>>> > +
> >>>> > +       if (vma->vm_file)
> >>>> > +               get_file(vma->vm_file);
> >>>> > +
> >>>> > +       memcpy(vma_copy, vma, sizeof(*vma));
> >>>> > +
> >>>> > +       vma_copy->vm_mm = NULL;
> >>>> > +       vma_copy->vm_next = NULL;
> >>>> > +       vma_copy->vm_prev = NULL;
> >>>> > +
> >>>> > +       return vma_copy;
> >>>> > +}
> >>>> > +
> >>>> > +static void put_vma(struct vm_area_struct *vma)
> >>>> > +{
> >>>> > +       if (!vma)
> >>>> > +               return;
> >>>> > +
> >>>> > +       if (vma->vm_ops && vma->vm_ops->close)
> >>>> > +               vma->vm_ops->close(vma);
> >>>> > +
> >>>> > +       if (vma->vm_file)
> >>>> > +               fput(vma->vm_file);
> >>>> > +
> >>>> > +       kfree(vma);
> >>>> > +}
> >>>> > +
> >>>> >  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
> >>>> >                                        struct vm_area_struct *vma)
> >>>> >  {
> >>>> > @@ -254,6 +291,41 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
> >>>> >        /* add some codes for UNCACHED type here. TODO */
> >>>> >  }
> >>>> >
> >>>> > +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
> >>>> > +{
> >>>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
> >>>> > +       struct exynos_drm_gem_buf *buf;
> >>>> > +       struct vm_area_struct *vma;
> >>>> > +       int npages;
> >>>> > +
> >>>> > +       exynos_gem_obj = to_exynos_gem_obj(obj);
> >>>> > +       buf = exynos_gem_obj->buffer;
> >>>> > +       vma = exynos_gem_obj->vma;
> >>>> > +
> >>>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> >>>> > +               put_vma(exynos_gem_obj->vma);
> >>>> > +               goto out;
> >>>> > +       }
> >>>> > +
> >>>> > +       npages = buf->size >> PAGE_SHIFT;
> >>>> > +
> >>>> > +       npages--;
> >>>> > +       while (npages >= 0) {
> >>>> > +               if (buf->write)
> >>>> > +                       set_page_dirty_lock(buf->pages[npages]);
> >>>> > +
> >>>> > +               put_page(buf->pages[npages]);
> >>>> > +               npages--;
> >>>> > +       }
> >>>> > +
> >>>> > +out:
> >>>> > +       kfree(buf->pages);
> >>>> > +       buf->pages = NULL;
> >>>> > +
> >>>> > +       kfree(buf->sgt);
> >>>> > +       buf->sgt = NULL;
> >>>> > +}
> >>>> > +
> >>>> >  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
> >>>> >                                        struct drm_file *file_priv,
> >>>> >                                        unsigned int *handle)
> >>>> > @@ -293,6 +365,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
> >>>> >
> >>>> >        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
> >>>> >                exynos_drm_gem_put_pages(obj);
> >>>> > +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
> >>>> > +               exynos_drm_put_userptr(obj);
> >>>> >        else
> >>>> >                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
> >>>> >
> >>>> > @@ -606,6 +680,190 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >>>> >        return 0;
> >>>> >  }
> >>>> >
> >>>> > +static int exynos_drm_get_userptr(struct drm_device *dev,
> >>>> > +                               struct exynos_drm_gem_obj *obj,
> >>>> > +                               unsigned long userptr,
> >>>> > +                               unsigned int write)
> >>>> > +{
> >>>> > +       unsigned int get_npages;
> >>>> > +       unsigned long npages = 0;
> >>>> > +       struct vm_area_struct *vma;
> >>>> > +       struct exynos_drm_gem_buf *buf = obj->buffer;
> >>>> > +
> >>>> > +       vma = find_vma(current->mm, userptr);
> >>>> > +
> >>>> > +       /* the memory region mmaped with VM_PFNMAP. */
> >>>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> >>>> > +               unsigned long this_pfn, prev_pfn, pa;
> >>>> > +               unsigned long start, end, offset;
> >>>> > +               struct scatterlist *sgl;
> >>>> > +
> >>>> > +               start = userptr;
> >>>> > +               offset = userptr & ~PAGE_MASK;
> >>>> > +               end = start + buf->size;
> >>>> > +               sgl = buf->sgt->sgl;
> >>>> > +
> >>>> > +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
> >>>> > +                       int ret = follow_pfn(vma, start, &this_pfn);
> >>>> > +                       if (ret)
> >>>> > +                               return ret;
> >>>> > +
> >>>> > +                       if (prev_pfn == 0)
> >>>> > +                               pa = this_pfn << PAGE_SHIFT;
> >>>> > +                       else if (this_pfn != prev_pfn + 1)
> >>>> > +                               return -EFAULT;
> >>>> > +
> >>>> > +                       sg_dma_address(sgl) = (pa + offset);
> >>>> > +                       sg_dma_len(sgl) = PAGE_SIZE;
> >>>> > +                       prev_pfn = this_pfn;
> >>>> > +                       pa += PAGE_SIZE;
> >>>> > +                       npages++;
> >>>> > +                       sgl = sg_next(sgl);
> >>>> > +               }
> >>>> > +
> >>>> > +               buf->dma_addr = pa + offset;
> >>>> > +
> >>>> > +               obj->vma = get_vma(vma);
> >>>> > +               if (!obj->vma)
> >>>> > +                       return -ENOMEM;
> >>>> > +
> >>>> > +               buf->pfnmap = true;
> >>>> > +
> >>>> > +               return npages;
> >>>> > +       }
> >>>> > +
> >>>> > +       buf->write = write;
> >>>> > +       npages = buf->size >> PAGE_SHIFT;
> >>>> > +
> >>>> > +       down_read(&current->mm->mmap_sem);
> >>>> > +       get_npages = get_user_pages(current, current->mm, userptr,
> >>>> > +                                       npages, write, 1, buf->pages, NULL);
> >>>> > +       up_read(&current->mm->mmap_sem);
> >>>> > +       if (get_npages != npages)
> >>>> > +               DRM_ERROR("failed to get user_pages.\n");
> >>>> > +
> >>>> > +       buf->pfnmap = false;
> >>>> > +
> >>>> > +       return get_npages;
> >>>> > +}
> >>>> > +
> >>>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> >>>> > +                                     struct drm_file *file_priv)
> >>>> > +{
> >>>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
> >>>> > +       struct drm_exynos_gem_userptr *args = data;
> >>>> > +       struct exynos_drm_gem_buf *buf;
> >>>> > +       struct scatterlist *sgl;
> >>>> > +       unsigned long size, userptr;
> >>>> > +       unsigned int npages;
> >>>> > +       int ret, get_npages;
> >>>> > +
> >>>> > +       DRM_DEBUG_KMS("%s\n", __FILE__);
> >>>> > +
> >>>> > +       if (!args->size) {
> >>>> > +               DRM_ERROR("invalid size.\n");
> >>>> > +               return -EINVAL;
> >>>> > +       }
> >>>> > +
> >>>> > +       ret = check_gem_flags(args->flags);
> >>>> > +       if (ret)
> >>>> > +               return ret;
> >>>> > +
> >>>> > +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
> >>>> > +       userptr = args->userptr;
> >>>> > +
> >>>> > +       buf = exynos_drm_init_buf(dev, size);
> >>>> > +       if (!buf)
> >>>> > +               return -ENOMEM;
> >>>> > +
> >>>> > +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
> >>>> > +       if (!exynos_gem_obj) {
> >>>> > +               ret = -ENOMEM;
> >>>> > +               goto err_free_buffer;
> >>>> > +       }
> >>>> > +
> >>>> > +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
> >>>> > +       if (!buf->sgt) {
> >>>> > +               DRM_ERROR("failed to allocate buf->sgt.\n");
> >>>> > +               ret = -ENOMEM;
> >>>> > +               goto err_release_gem;
> >>>> > +       }
> >>>> > +
> >>>> > +       npages = size >> PAGE_SHIFT;
> >>>> > +
> >>>> > +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
> >>>> > +       if (ret < 0) {
> >>>> > +               DRM_ERROR("failed to initailize sg table.\n");
> >>>> > +               goto err_free_sgt;
> >>>> > +       }
> >>>> > +
> >>>> > +       buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
> >>>> > +       if (!buf->pages) {
> >>>> > +               DRM_ERROR("failed to allocate buf->pages\n");
> >>>> > +               ret = -ENOMEM;
> >>>> > +               goto err_free_table;
> >>>> > +       }
> >>>> > +
> >>>> > +       exynos_gem_obj->buffer = buf;
> >>>> > +
> >>>> > +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
> >>>> > +       if (get_npages != npages) {
> >>>> > +               DRM_ERROR("failed to get user_pages.\n");
> >>>> > +               ret = get_npages;
> >>>> > +               goto err_release_userptr;
> >>>> > +       }
> >>>> > +
> >>>> > +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
> >>>> > +                                               &args->handle);
> >>>> > +       if (ret < 0) {
> >>>> > +               DRM_ERROR("failed to create gem handle.\n");
> >>>> > +               goto err_release_userptr;
> >>>> > +       }
> >>>> > +
> >>>> > +       sgl = buf->sgt->sgl;
> >>>> > +
> >>>> > +       /*
> >>>> > +        * if buf->pfnmap is true then update sgl of sgt with pages but
> >>>> > +        * if buf->pfnmap is false then it means the sgl was updated already
> >>>> > +        * so it doesn't need to update the sgl.
> >>>> > +        */
> >>>> > +       if (!buf->pfnmap) {
> >>>> > +               unsigned int i = 0;
> >>>> > +
> >>>> > +               /* set all pages to sg list. */
> >>>> > +               while (i < npages) {
> >>>> > +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
> >>>> > +                       sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
> >>>> > +                       i++;
> >>>> > +                       sgl = sg_next(sgl);
> >>>> > +               }
> >>>> > +       }
> >>>> > +
> >>>> > +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
> >>>> > +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
> >>>> > +
> >>>> > +       return 0;
> >>>> > +
> >>>> > +err_release_userptr:
> >>>> > +       get_npages--;
> >>>> > +       while (get_npages >= 0)
> >>>> > +               put_page(buf->pages[get_npages--]);
> >>>> > +       kfree(buf->pages);
> >>>> > +       buf->pages = NULL;
> >>>> > +err_free_table:
> >>>> > +       sg_free_table(buf->sgt);
> >>>> > +err_free_sgt:
> >>>> > +       kfree(buf->sgt);
> >>>> > +       buf->sgt = NULL;
> >>>> > +err_release_gem:
> >>>> > +       drm_gem_object_release(&exynos_gem_obj->base);
> >>>> > +       kfree(exynos_gem_obj);
> >>>> > +       exynos_gem_obj = NULL;
> >>>> > +err_free_buffer:
> >>>> > +       exynos_drm_free_buf(dev, 0, buf);
> >>>> > +       return ret;
> >>>> > +}
> >>>> > +
> >>>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj)
> >>>> >  {
> >>>> >        DRM_DEBUG_KMS("%s\n", __FILE__);
> >>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> >>>> > index efc8252..1de2241 100644
> >>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
> >>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> >>>> > @@ -29,7 +29,8 @@
> >>>> >  #define to_exynos_gem_obj(x)   container_of(x,\
> >>>> >                        struct exynos_drm_gem_obj, base)
> >>>> >
> >>>> > -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
> >>>> > +#define IS_NONCONTIG_BUFFER(f)         ((f & EXYNOS_BO_NONCONTIG) ||\
> >>>> > +                                       (f & EXYNOS_BO_USERPTR))
> >>>> >
> >>>> >  /*
> >>>> >  * exynos drm gem buffer structure.
> >>>> > @@ -38,18 +39,23 @@
> >>>> >  * @dma_addr: bus address(accessed by dma) to allocated memory region.
> >>>> >  *     - this address could be physical address without IOMMU and
> >>>> >  *     device address with IOMMU.
> >>>> > + * @write: whether pages will be written to by the caller.
> >>>> >  * @sgt: sg table to transfer page data.
> >>>> >  * @pages: contain all pages to allocated memory region.
> >>>> >  * @page_size: could be 4K, 64K or 1MB.
> >>>> >  * @size: size of allocated memory region.
> >>>> > + * @pfnmap: indicate whether memory region from userptr is mmaped with
> >>>> > + *     VM_PFNMAP or not.
> >>>> >  */
> >>>> >  struct exynos_drm_gem_buf {
> >>>> >        void __iomem            *kvaddr;
> >>>> >        dma_addr_t              dma_addr;
> >>>> > +       unsigned int            write;
> >>>> >        struct sg_table         *sgt;
> >>>> >        struct page             **pages;
> >>>> >        unsigned long           page_size;
> >>>> >        unsigned long           size;
> >>>> > +       bool                    pfnmap;
> >>>> >  };
> >>>> >
> >>>> >  /*
> >>>> > @@ -73,6 +79,7 @@ struct exynos_drm_gem_obj {
> >>>> >        struct drm_gem_object           base;
> >>>> >        struct exynos_drm_gem_buf       *buffer;
> >>>> >        unsigned long                   size;
> >>>> > +       struct vm_area_struct           *vma;
> >>>> >        unsigned int                    flags;
> >>>> >  };
> >>>> >
> >>>> > @@ -127,6 +134,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
> >>>> >  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >>>> >                              struct drm_file *file_priv);
> >>>> >
> >>>> > +/* map user space allocated by malloc to pages. */
> >>>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> >>>> > +                                     struct drm_file *file_priv);
> >>>> > +
> >>>> >  /* initialize gem object. */
> >>>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj);
> >>>> >
> >>>> > diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
> >>>> > index 2d6eb06..48eda6e 100644
> >>>> > --- a/include/drm/exynos_drm.h
> >>>> > +++ b/include/drm/exynos_drm.h
> >>>> > @@ -75,6 +75,23 @@ struct drm_exynos_gem_mmap {
> >>>> >  };
> >>>> >
> >>>> >  /**
> >>>> > + * User-requested user space importing structure
> >>>> > + *
> >>>> > + * @userptr: user space address allocated by malloc.
> >>>> > + * @size: size to the buffer allocated by malloc.
> >>>> > + * @flags: indicate user-desired cache attribute to map the allocated buffer
> >>>> > + *     to kernel space.
> >>>> > + * @handle: a returned handle to created gem object.
> >>>> > + *     - this handle will be set by gem module of kernel side.
> >>>> > + */
> >>>> > +struct drm_exynos_gem_userptr {
> >>>> > +       uint64_t userptr;
> >>>> > +       uint64_t size;
> >>>> > +       unsigned int flags;
> >>>> > +       unsigned int handle;
> >>>> > +};
> >>>> > +
> >>>> > +/**
> >>>> >  * A structure for user connection request of virtual display.
> >>>> >  *
> >>>> >  * @connection: indicate whether doing connetion or not by user.
> >>>> > @@ -105,13 +122,16 @@ enum e_drm_exynos_gem_mem_type {
> >>>> >        EXYNOS_BO_CACHABLE      = 1 << 1,
> >>>> >        /* write-combine mapping. */
> >>>> >        EXYNOS_BO_WC            = 1 << 2,
> >>>> > +       /* user space memory allocated by malloc. */
> >>>> > +       EXYNOS_BO_USERPTR       = 1 << 3,
> >>>> >        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
> >>>> > -                                       EXYNOS_BO_WC
> >>>> > +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
> >>>> >  };
> >>>> >
> >>>> >  #define DRM_EXYNOS_GEM_CREATE          0x00
> >>>> >  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
> >>>> >  #define DRM_EXYNOS_GEM_MMAP            0x02
> >>>> > +#define DRM_EXYNOS_GEM_USERPTR         0x03
> >>>> >  /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
> >>>> >  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
> >>>> >  #define DRM_EXYNOS_VIDI_CONNECTION     0x07
> >>>> > @@ -125,6 +145,9 @@ enum e_drm_exynos_gem_mem_type {
> >>>> >  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
> >>>> >                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
> >>>> >
> >>>> > +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
> >>>> > +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
> >>>> > +
> >>>> >  #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS DRM_IOWR(DRM_COMMAND_BASE + \
> >>>> >                DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
> >>>> >
> >>>> > --
> >>>> > 1.7.4.1
> >>>> >
> >>>> > _______________________________________________
> >>>> > dri-devel mailing list
> >>>> > dri-devel@lists.freedesktop.org
> >>>> > http://lists.freedesktop.org/mailman/listinfo/dri-devel
> >>>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-16  6:04             ` Inki Dae
@ 2012-05-16  8:42               ` Rob Clark
  2012-05-16 10:30                 ` Inki Dae
  0 siblings, 1 reply; 77+ messages in thread
From: Rob Clark @ 2012-05-16  8:42 UTC (permalink / raw)
  To: Inki Dae; +Cc: kyungmin.park, sw0312.kim, dri-devel

On Wed, May 16, 2012 at 12:04 AM, Inki Dae <inki.dae@samsung.com> wrote:
>
>
>> -----Original Message-----
>> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of Rob Clark
>> Sent: Tuesday, May 15, 2012 11:29 PM
>> To: InKi Dae
>> Cc: Inki Dae; kyungmin.park@samsung.com; sw0312.kim@samsung.com;
>> dri-devel@lists.freedesktop.org
>> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
>>
>> On Tue, May 15, 2012 at 7:40 AM, InKi Dae <daeinki@gmail.com> wrote:
>> > 2012/5/15 Rob Clark <rob.clark@linaro.org>:
>> >> On Tue, May 15, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> >>> Hi Rob,
>> >>>
>> >>>> -----Original Message-----
>> >>>> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of Rob Clark
>> >>>> Sent: Tuesday, May 15, 2012 4:35 PM
>> >>>> To: Inki Dae
>> >>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
>> >>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com
>> >>>> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
>> >>>>
>> >>>> On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> >>>> > this feature could be used to use memory region allocated by malloc() in user
>> >>>> > mode and mmaped memory region allocated by other memory allocators. userptr
>> >>>> > interface can identify memory type through vm_flags value and would get
>> >>>> > pages or page frame numbers to user space appropriately.
>> >>>>
>> >>>> I apologize for being a little late to jump in on this thread, but...
>> >>>>
>> >>>> I must confess to not being a huge fan of userptr.  It really is
>> >>>> opening a can of worms, and seems best avoided if at all possible.
>> >>>> I'm not entirely sure the use-case for which you require this, but I
>> >>>> wonder if there isn't an alternative way?   I mean, the main case I
>> >>>> could think of for mapping userspace memory would be something like
>> >>>> texture upload.  But could that be handled in an alternative way,
>> >>>> something like a pwrite or texture_upload API, which could temporarily
>> >>>> pin the userspace memory, kick off a dma from that user buffer to a
>> >>>> proper GEM buffer, and then unpin the user buffer when the DMA
>> >>>> completes?
>> >>>>
>> >>>
>> >>> This feature has been discussed with the drm and linux-mm guys, and I have
>> >>> posted this patch four times.
>> >>> So please, see below e-mail threads.
>> >>> http://www.spinics.net/lists/dri-devel/msg22729.html
>> >>>
>> >>> and then please, give me your opinions.
>> >>
>> >>
>> >> Yeah, I read that and understand that the purpose is to support
>> >> malloc()'d memory (and btw, all the other changes like limits to the
>> >> amount of userptr buffers are good if we do decide to support userptr
>> >> buffers).  But my question was more about why do you need to support
>> >> malloc'd buffers... other than "just because v4l2 does" ;-)
>> >>
>> >> I don't really like the userptr feature in v4l2 either, and view it as
>> >> a necessary evil because at the time there was nothing like dmabuf.
>> >> But now that we have dmabuf to share buffers with zero copy between
>> >> two devices, do we *really* still need userptr?
>> >>
>> >
>> > Definitely no, as I mentioned on this email thread, we are using the
>> > userptr feature for the pixman and evas backends to use gpu
>> > acceleration directly (zero-copy). As you may know, Evas is part of
>> > the Enlightenment Foundation Libraries (EFL) and Elementary for pixman.
>> > All the applications based on them use a user address to draw
>> > something, so the backends (such as gpu accelerators) can refer only
>> > to the user address of the rendered buffer. Do you think we can avoid
>> > a memory copy when using those GPUs without such a userptr feature?
>> > I think the only way is for the gpu to use the user address to render
>> > without a memory copy. And as mentioned on this thread, this feature
>> > had been tried by i915 in the desktop world. For this, you can refer
>> > to Daniel's comments.
>>
>> oh, yeah, for something like pixman it is hard to accelerate.  But I
>> wonder if there is still some benefit.  I don't know as much about
>> evas, but pixman is assumed to be a synchronous API, so you must
>> always block until the operation completes which loses a lot of the
>> potential benefit of a blitter.  And also I think you have no control
>> over when the client free()'s a buffer.  So you end up having to
>> map/unmap the buffer to the gpu/blitter/whatever every single
>> operation.  Not to mention possible need for cache operations, etc.  I
>> wonder if trying to accel it in hw really brings any benefit vs just
>> using NEON?
>>
>
> Yes, actually, we also use the NEON backend of PIXMAN for best performance:
> PIXMAN's NEON backend for small images and the 2D GPU backend of EXA for
> large images, because NEON is faster than the 2D GPU for small images but
> not for large ones.
>
>> I thought EFL has a GLES backend..  I could be wrong about that, but
>> if it does probably you'd get better performance by using the GLES
>> backend vs trying to accelerate what is intended as a sw path..
>>
> Right, EFL has a GLES backend and also PIXMAN for the software path. An
> application using Elementary can use PIXMAN for the software path and GLES
> for hardware acceleration. So the purpose of the userptr feature is to let a
> buffer drawn by any user of Evas be rendered by the gpu hardware.


hmm, I don't suppose it is primarily a one-way transfer (ie. like one
time blit from sw rendered surface to texture)?  I'd be more
comfortable w/ an API that somehow just created a temporary mapping
that unpins the pages before the syscall returns, ie. like a
EXYNOS_IOCTL_BLIT_FROM_USERPTR type interface, which does some blit
from a user ptr to a GEM bo.
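
The short-lived-pin idea above can be modelled in plain userspace C. This is
only a sketch under stated assumptions, not driver code and not part of the
original message: blit_from_userptr(), struct fake_bo and the pin_count
bookkeeping are hypothetical names invented here, memcpy() stands in for the
DMA blit, and a counter stands in for the page references a real driver would
take and drop around the transfer.

```c
#include <stddef.h>
#include <string.h>
#include <errno.h>

/* Hypothetical stand-in for a GEM buffer object (not a real exynos struct). */
struct fake_bo {
	unsigned char *vaddr;	/* backing storage of the "GEM bo" */
	size_t size;		/* size of the backing storage in bytes */
};

/*
 * Number of user ranges currently "pinned". Models the page references a
 * real driver would hold between pinning and unpinning.
 */
static int pin_count;

static void pin_user_range(const void *ptr, size_t len)
{
	(void)ptr;
	(void)len;
	pin_count++;
}

static void unpin_user_range(const void *ptr, size_t len)
{
	(void)ptr;
	(void)len;
	pin_count--;
}

/*
 * Model of a blit-from-userptr call: pin, copy, unpin, all before
 * returning, so no pin outlives the call.
 * Returns 0 on success or a negative errno value.
 */
static int blit_from_userptr(struct fake_bo *dst, size_t dst_offset,
			     const void *userptr, size_t len)
{
	if (!dst || !dst->vaddr || !userptr || !len)
		return -EINVAL;
	/* Reject ranges that would run past the end of the destination. */
	if (dst_offset > dst->size || len > dst->size - dst_offset)
		return -EINVAL;

	pin_user_range(userptr, len);
	memcpy(dst->vaddr + dst_offset, userptr, len);	/* stands in for the DMA */
	unpin_user_range(userptr, len);

	return 0;
}
```

The point of the shape is that pin_count is back to zero whenever
blit_from_userptr() returns, on both the success and the error path, so
nothing has to be tracked for a later free() in the client.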

BR,
-R

>
> Thanks,
> Inki Dae
>
>> > and as you know, note that we are already using dmabuf to share a
>> > buffer between device drivers or v4l2 and drm framework. for this, we
>> > also tested drm prime feature with v4l2 world and we applied umm
>> > concept to ump for 3d gpu(known as mali) also.
>> >
>> > Anyway, this userptr feature is different from what you thought, and
>> > maybe you also need such a feature for performance enhancement. Please
>> > give me any idea if you have a good alternative to this userptr feature.
>> > There may be a good idea I don't know of.
>>
>> well, I just try and point people to APIs that are actually designed
>> with hw acceleration in mind in the first place (x11/EXA, GLES, etc)
>> ;-)
>>
>> There is really an API missing for client side 2d accel.  Well, I guess
>> there is OpenVG, although I'm not sure if anyone actually ever used
>> it.. maybe we do need something for better supporting 2d blitters in
>> weston compositor, for example.
>>
>> BR,
>> -R
>>
>>
>> > Thanks,
>> > Inki Dae
>> >
>> >> I guess if I understood better the use-case, maybe I could try to
>> >> think of some alternatives.
>> >>
>> >> BR,
>> >> -R
>> >>
>> >>> Thanks,
>> >>> Inki Dae
>> >>>
>> >>>> BR,
>> >>>> -R
>> >>>>
>> >>>> > Signed-off-by: Inki Dae <inki.dae@samsung.com>
>> >>>> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>> >>>> > ---
>> >>>> >  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
>> >>>> >  drivers/gpu/drm/exynos/exynos_drm_gem.c |  258 +++++++++++++++++++++++++++++++
>> >>>> >  drivers/gpu/drm/exynos/exynos_drm_gem.h |   13 ++-
>> >>>> >  include/drm/exynos_drm.h                |   25 +++-
>> >>>> >  4 files changed, 296 insertions(+), 2 deletions(-)
>> >>>> >
>> >>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c
>> >>>> b/drivers/gpu/drm/exynos/exynos_drm_drv.c
>> >>>> > index f58a487..5bb0361 100644
>> >>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
>> >>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
>> >>>> > @@ -211,6 +211,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
>> >>>> >                        DRM_AUTH),
>> >>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
>> >>>> >                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED |
>> >>> DRM_AUTH),
>> >>>> > +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
>> >>>> > +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
>> >>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS,
>> >>>> exynos_plane_set_zpos_ioctl,
>> >>>> >                        DRM_UNLOCKED | DRM_AUTH),
>> >>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
>> >>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c
>> >>>> b/drivers/gpu/drm/exynos/exynos_drm_gem.c
>> >>>> > index afd0cd4..b68d4ea 100644
>> >>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
>> >>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
>> >>>> > @@ -66,6 +66,43 @@ static int check_gem_flags(unsigned int flags)
>> >>>> >        return 0;
>> >>>> >  }
>> >>>> >
>> >>>> > +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
>> >>>> > +{
>> >>>> > +       struct vm_area_struct *vma_copy;
>> >>>> > +
>> >>>> > +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
>> >>>> > +       if (!vma_copy)
>> >>>> > +               return NULL;
>> >>>> > +
>> >>>> > +       if (vma->vm_ops && vma->vm_ops->open)
>> >>>> > +               vma->vm_ops->open(vma);
>> >>>> > +
>> >>>> > +       if (vma->vm_file)
>> >>>> > +               get_file(vma->vm_file);
>> >>>> > +
>> >>>> > +       memcpy(vma_copy, vma, sizeof(*vma));
>> >>>> > +
>> >>>> > +       vma_copy->vm_mm = NULL;
>> >>>> > +       vma_copy->vm_next = NULL;
>> >>>> > +       vma_copy->vm_prev = NULL;
>> >>>> > +
>> >>>> > +       return vma_copy;
>> >>>> > +}
>> >>>> > +
>> >>>> > +static void put_vma(struct vm_area_struct *vma)
>> >>>> > +{
>> >>>> > +       if (!vma)
>> >>>> > +               return;
>> >>>> > +
>> >>>> > +       if (vma->vm_ops && vma->vm_ops->close)
>> >>>> > +               vma->vm_ops->close(vma);
>> >>>> > +
>> >>>> > +       if (vma->vm_file)
>> >>>> > +               fput(vma->vm_file);
>> >>>> > +
>> >>>> > +       kfree(vma);
>> >>>> > +}
>> >>>> > +
>> >>>> >  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
>> >>>> >                                        struct vm_area_struct *vma)
>> >>>> >  {
>> >>>> > @@ -254,6 +291,41 @@ static void exynos_drm_gem_put_pages(struct
>> >>>> drm_gem_object *obj)
>> >>>> >        /* add some codes for UNCACHED type here. TODO */
>> >>>> >  }
>> >>>> >
>> >>>> > +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
>> >>>> > +{
>> >>>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
>> >>>> > +       struct exynos_drm_gem_buf *buf;
>> >>>> > +       struct vm_area_struct *vma;
>> >>>> > +       int npages;
>> >>>> > +
>> >>>> > +       exynos_gem_obj = to_exynos_gem_obj(obj);
>> >>>> > +       buf = exynos_gem_obj->buffer;
>> >>>> > +       vma = exynos_gem_obj->vma;
>> >>>> > +
>> >>>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
>> >>>> > +               put_vma(exynos_gem_obj->vma);
>> >>>> > +               goto out;
>> >>>> > +       }
>> >>>> > +
>> >>>> > +       npages = buf->size >> PAGE_SHIFT;
>> >>>> > +
>> >>>> > +       npages--;
>> >>>> > +       while (npages >= 0) {
>> >>>> > +               if (buf->write)
>> >>>> > +                       set_page_dirty_lock(buf->pages[npages]);
>> >>>> > +
>> >>>> > +               put_page(buf->pages[npages]);
>> >>>> > +               npages--;
>> >>>> > +       }
>> >>>> > +
>> >>>> > +out:
>> >>>> > +       kfree(buf->pages);
>> >>>> > +       buf->pages = NULL;
>> >>>> > +
>> >>>> > +       kfree(buf->sgt);
>> >>>> > +       buf->sgt = NULL;
>> >>>> > +}
>> >>>> > +
>> >>>> >  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
>> >>>> >                                        struct drm_file *file_priv,
>> >>>> >                                        unsigned int *handle)
>> >>>> > @@ -293,6 +365,8 @@ void exynos_drm_gem_destroy(struct
>> >>>> exynos_drm_gem_obj *exynos_gem_obj)
>> >>>> >
>> >>>> >        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
>> >>>> >                exynos_drm_gem_put_pages(obj);
>> >>>> > +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
>> >>>> > +               exynos_drm_put_userptr(obj);
>> >>>> >        else
>> >>>> >                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags,
>> >>> buf);
>> >>>> >
>> >>>> > @@ -606,6 +680,190 @@ int exynos_drm_gem_mmap_ioctl(struct
>> drm_device
>> >>>> *dev, void *data,
>> >>>> >        return 0;
>> >>>> >  }
>> >>>> >
>> >>>> > +static int exynos_drm_get_userptr(struct drm_device *dev,
>> >>>> > +                               struct exynos_drm_gem_obj *obj,
>> >>>> > +                               unsigned long userptr,
>> >>>> > +                               unsigned int write)
>> >>>> > +{
>> >>>> > +       unsigned int get_npages;
>> >>>> > +       unsigned long npages = 0;
>> >>>> > +       struct vm_area_struct *vma;
>> >>>> > +       struct exynos_drm_gem_buf *buf = obj->buffer;
>> >>>> > +
>> >>>> > +       vma = find_vma(current->mm, userptr);
>> >>>> > +
>> >>>> > +       /* the memory region mmaped with VM_PFNMAP. */
>> >>>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
>> >>>> > +               unsigned long this_pfn, prev_pfn, pa;
>> >>>> > +               unsigned long start, end, offset;
>> >>>> > +               struct scatterlist *sgl;
>> >>>> > +
>> >>>> > +               start = userptr;
>> >>>> > +               offset = userptr & ~PAGE_MASK;
>> >>>> > +               end = start + buf->size;
>> >>>> > +               sgl = buf->sgt->sgl;
>> >>>> > +
>> >>>> > +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
>> >>>> > +                       int ret = follow_pfn(vma, start, &this_pfn);
>> >>>> > +                       if (ret)
>> >>>> > +                               return ret;
>> >>>> > +
>> >>>> > +                       if (prev_pfn == 0)
>> >>>> > +                               pa = this_pfn << PAGE_SHIFT;
>> >>>> > +                       else if (this_pfn != prev_pfn + 1)
>> >>>> > +                               return -EFAULT;
>> >>>> > +
>> >>>> > +                       sg_dma_address(sgl) = (pa + offset);
>> >>>> > +                       sg_dma_len(sgl) = PAGE_SIZE;
>> >>>> > +                       prev_pfn = this_pfn;
>> >>>> > +                       pa += PAGE_SIZE;
>> >>>> > +                       npages++;
>> >>>> > +                       sgl = sg_next(sgl);
>> >>>> > +               }
>> >>>> > +
>> >>>> > +               buf->dma_addr = pa + offset;
>> >>>> > +
>> >>>> > +               obj->vma = get_vma(vma);
>> >>>> > +               if (!obj->vma)
>> >>>> > +                       return -ENOMEM;
>> >>>> > +
>> >>>> > +               buf->pfnmap = true;
>> >>>> > +
>> >>>> > +               return npages;
>> >>>> > +       }
>> >>>> > +
>> >>>> > +       buf->write = write;
>> >>>> > +       npages = buf->size >> PAGE_SHIFT;
>> >>>> > +
>> >>>> > +       down_read(&current->mm->mmap_sem);
>> >>>> > +       get_npages = get_user_pages(current, current->mm, userptr,
>> >>>> > +                                       npages, write, 1, buf->pages, NULL);
>> >>>> > +       up_read(&current->mm->mmap_sem);
>> >>>> > +       if (get_npages != npages)
>> >>>> > +               DRM_ERROR("failed to get user_pages.\n");
>> >>>> > +
>> >>>> > +       buf->pfnmap = false;
>> >>>> > +
>> >>>> > +       return get_npages;
>> >>>> > +}
>> >>>> > +
>> >>>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
>> >>>> > +                                     struct drm_file *file_priv)
>> >>>> > +{
>> >>>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
>> >>>> > +       struct drm_exynos_gem_userptr *args = data;
>> >>>> > +       struct exynos_drm_gem_buf *buf;
>> >>>> > +       struct scatterlist *sgl;
>> >>>> > +       unsigned long size, userptr;
>> >>>> > +       unsigned int npages;
>> >>>> > +       int ret, get_npages;
>> >>>> > +
>> >>>> > +       DRM_DEBUG_KMS("%s\n", __FILE__);
>> >>>> > +
>> >>>> > +       if (!args->size) {
>> >>>> > +               DRM_ERROR("invalid size.\n");
>> >>>> > +               return -EINVAL;
>> >>>> > +       }
>> >>>> > +
>> >>>> > +       ret = check_gem_flags(args->flags);
>> >>>> > +       if (ret)
>> >>>> > +               return ret;
>> >>>> > +
>> >>>> > +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
>> >>>> > +       userptr = args->userptr;
>> >>>> > +
>> >>>> > +       buf = exynos_drm_init_buf(dev, size);
>> >>>> > +       if (!buf)
>> >>>> > +               return -ENOMEM;
>> >>>> > +
>> >>>> > +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
>> >>>> > +       if (!exynos_gem_obj) {
>> >>>> > +               ret = -ENOMEM;
>> >>>> > +               goto err_free_buffer;
>> >>>> > +       }
>> >>>> > +
>> >>>> > +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
>> >>>> > +       if (!buf->sgt) {
>> >>>> > +               DRM_ERROR("failed to allocate buf->sgt.\n");
>> >>>> > +               ret = -ENOMEM;
>> >>>> > +               goto err_release_gem;
>> >>>> > +       }
>> >>>> > +
>> >>>> > +       npages = size >> PAGE_SHIFT;
>> >>>> > +
>> >>>> > +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
>> >>>> > +       if (ret < 0) {
>> >>>> > +               DRM_ERROR("failed to initialize sg table.\n");
>> >>>> > +               goto err_free_sgt;
>> >>>> > +       }
>> >>>> > +
>> >>>> > +       buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
>> >>>> > +       if (!buf->pages) {
>> >>>> > +               DRM_ERROR("failed to allocate buf->pages\n");
>> >>>> > +               ret = -ENOMEM;
>> >>>> > +               goto err_free_table;
>> >>>> > +       }
>> >>>> > +
>> >>>> > +       exynos_gem_obj->buffer = buf;
>> >>>> > +
>> >>>> > +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
>> >>>> > +       if (get_npages != npages) {
>> >>>> > +               DRM_ERROR("failed to get user_pages.\n");
>> >>>> > +               ret = get_npages;
>> >>>> > +               goto err_release_userptr;
>> >>>> > +       }
>> >>>> > +
>> >>>> > +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
>> >>>> > +                                               &args->handle);
>> >>>> > +       if (ret < 0) {
>> >>>> > +               DRM_ERROR("failed to create gem handle.\n");
>> >>>> > +               goto err_release_userptr;
>> >>>> > +       }
>> >>>> > +
>> >>>> > +       sgl = buf->sgt->sgl;
>> >>>> > +
>> >>>> > +       /*
>> >>>> > +        * if buf->pfnmap is false then update sgl of sgt with pages but
>> >>>> > +        * if buf->pfnmap is true then it means the sgl was updated
>> >>>> > +        * already so it doesn't need to update the sgl.
>> >>>> > +        */
>> >>>> > +       if (!buf->pfnmap) {
>> >>>> > +               unsigned int i = 0;
>> >>>> > +
>> >>>> > +               /* set all pages to sg list. */
>> >>>> > +               while (i < npages) {
>> >>>> > +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
>> >>>> > +                       sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
>> >>>> > +                       i++;
>> >>>> > +                       sgl = sg_next(sgl);
>> >>>> > +               }
>> >>>> > +       }
>> >>>> > +
>> >>>> > +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
>> >>>> > +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
>> >>>> > +
>> >>>> > +       return 0;
>> >>>> > +
>> >>>> > +err_release_userptr:
>> >>>> > +       get_npages--;
>> >>>> > +       while (get_npages >= 0)
>> >>>> > +               put_page(buf->pages[get_npages--]);
>> >>>> > +       kfree(buf->pages);
>> >>>> > +       buf->pages = NULL;
>> >>>> > +err_free_table:
>> >>>> > +       sg_free_table(buf->sgt);
>> >>>> > +err_free_sgt:
>> >>>> > +       kfree(buf->sgt);
>> >>>> > +       buf->sgt = NULL;
>> >>>> > +err_release_gem:
>> >>>> > +       drm_gem_object_release(&exynos_gem_obj->base);
>> >>>> > +       kfree(exynos_gem_obj);
>> >>>> > +       exynos_gem_obj = NULL;
>> >>>> > +err_free_buffer:
>> >>>> > +       exynos_drm_free_buf(dev, 0, buf);
>> >>>> > +       return ret;
>> >>>> > +}
>> >>>> > +
>> >>>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj)
>> >>>> >  {
>> >>>> >        DRM_DEBUG_KMS("%s\n", __FILE__);
>> >>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
>> >>>> > index efc8252..1de2241 100644
>> >>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
>> >>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
>> >>>> > @@ -29,7 +29,8 @@
>> >>>> >  #define to_exynos_gem_obj(x)   container_of(x,\
>> >>>> >                        struct exynos_drm_gem_obj, base)
>> >>>> >
>> >>>> > -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
>> >>>> > +#define IS_NONCONTIG_BUFFER(f)         ((f & EXYNOS_BO_NONCONTIG) ||\
>> >>>> > +                                       (f & EXYNOS_BO_USERPTR))
>> >>>> >
>> >>>> >  /*
>> >>>> >  * exynos drm gem buffer structure.
>> >>>> > @@ -38,18 +39,23 @@
>> >>>> >  * @dma_addr: bus address(accessed by dma) to allocated memory region.
>> >>>> >  *     - this address could be physical address without IOMMU and
>> >>>> >  *     device address with IOMMU.
>> >>>> > + * @write: whether pages will be written to by the caller.
>> >>>> >  * @sgt: sg table to transfer page data.
>> >>>> >  * @pages: contain all pages to allocated memory region.
>> >>>> >  * @page_size: could be 4K, 64K or 1MB.
>> >>>> >  * @size: size of allocated memory region.
>> >>>> > + * @pfnmap: indicate whether memory region from userptr is mmaped with
>> >>>> > + *     VM_PFNMAP or not.
>> >>>> >  */
>> >>>> >  struct exynos_drm_gem_buf {
>> >>>> >        void __iomem            *kvaddr;
>> >>>> >        dma_addr_t              dma_addr;
>> >>>> > +       unsigned int            write;
>> >>>> >        struct sg_table         *sgt;
>> >>>> >        struct page             **pages;
>> >>>> >        unsigned long           page_size;
>> >>>> >        unsigned long           size;
>> >>>> > +       bool                    pfnmap;
>> >>>> >  };
>> >>>> >
>> >>>> >  /*
>> >>>> > @@ -73,6 +79,7 @@ struct exynos_drm_gem_obj {
>> >>>> >        struct drm_gem_object           base;
>> >>>> >        struct exynos_drm_gem_buf       *buffer;
>> >>>> >        unsigned long                   size;
>> >>>> > +       struct vm_area_struct           *vma;
>> >>>> >        unsigned int                    flags;
>> >>>> >  };
>> >>>> >
>> >>>> > @@ -127,6 +134,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
>> >>>> >  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
>> >>>> >                              struct drm_file *file_priv);
>> >>>> >
>> >>>> > +/* map user space allocated by malloc to pages. */
>> >>>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
>> >>>> > +                                     struct drm_file *file_priv);
>> >>>> > +
>> >>>> >  /* initialize gem object. */
>> >>>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj);
>> >>>> >
>> >>>> > diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
>> >>>> > index 2d6eb06..48eda6e 100644
>> >>>> > --- a/include/drm/exynos_drm.h
>> >>>> > +++ b/include/drm/exynos_drm.h
>> >>>> > @@ -75,6 +75,23 @@ struct drm_exynos_gem_mmap {
>> >>>> >  };
>> >>>> >
>> >>>> >  /**
>> >>>> > + * User-requested user space importing structure
>> >>>> > + *
>> >>>> > + * @userptr: user space address allocated by malloc.
>> >>>> > + * @size: size to the buffer allocated by malloc.
>> >>>> > + * @flags: indicate user-desired cache attribute to map the allocated buffer
>> >>>> > + *     to kernel space.
>> >>>> > + * @handle: a returned handle to created gem object.
>> >>>> > + *     - this handle will be set by gem module of kernel side.
>> >>>> > + */
>> >>>> > +struct drm_exynos_gem_userptr {
>> >>>> > +       uint64_t userptr;
>> >>>> > +       uint64_t size;
>> >>>> > +       unsigned int flags;
>> >>>> > +       unsigned int handle;
>> >>>> > +};
>> >>>> > +
>> >>>> > +/**
>> >>>> >  * A structure for user connection request of virtual display.
>> >>>> >  *
>> >>>> >  * @connection: indicate whether doing connetion or not by user.
>> >>>> > @@ -105,13 +122,16 @@ enum e_drm_exynos_gem_mem_type {
>> >>>> >        EXYNOS_BO_CACHABLE      = 1 << 1,
>> >>>> >        /* write-combine mapping. */
>> >>>> >        EXYNOS_BO_WC            = 1 << 2,
>> >>>> > +       /* user space memory allocated by malloc. */
>> >>>> > +       EXYNOS_BO_USERPTR       = 1 << 3,
>> >>>> >        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
>> >>>> > -                                       EXYNOS_BO_WC
>> >>>> > +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
>> >>>> >  };
>> >>>> >
>> >>>> >  #define DRM_EXYNOS_GEM_CREATE          0x00
>> >>>> >  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
>> >>>> >  #define DRM_EXYNOS_GEM_MMAP            0x02
>> >>>> > +#define DRM_EXYNOS_GEM_USERPTR         0x03
>> >>>> >  /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
>> >>>> >  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
>> >>>> >  #define DRM_EXYNOS_VIDI_CONNECTION     0x07
>> >>>> > @@ -125,6 +145,9 @@ enum e_drm_exynos_gem_mem_type {
>> >>>> >  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
>> >>>> >                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
>> >>>> >
>> >>>> > +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
>> >>>> > +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
>> >>>> > +
>> >>>> >  #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS        DRM_IOWR(DRM_COMMAND_BASE + \
>> >>>> >                DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
>> >>>> >
>> >>>> > --
>> >>>> > 1.7.4.1
>> >>>> >
>> >>>> > _______________________________________________
>> >>>> > dri-devel mailing list
>> >>>> > dri-devel@lists.freedesktop.org
>> >>>> > http://lists.freedesktop.org/mailman/listinfo/dri-devel
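
The VM_PFNMAP branch of the quoted patch walks the user range page by page
and rejects it when the page frame numbers are not consecutive. A minimal
user-space sketch of just that check is below. It is not the driver code:
pfns_to_contig_base() and SKETCH_PAGE_SHIFT are hypothetical names, a 4 KiB
page size is assumed, and the first entry is identified by index rather than
by the "prev_pfn == 0" sentinel used in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Check that pfns[0..n-1] are physically consecutive. On success store the
 * physical address of the first page in *base and return 0. Return -EINVAL
 * for an empty list and -EFAULT when a gap is found, mirroring the error
 * the quoted patch returns for a non-contiguous PFNMAP region.
 */
static int pfns_to_contig_base(const unsigned long *pfns, size_t n,
                               unsigned long *base)
{
    size_t i;

    if (n == 0)
        return -EINVAL;

    for (i = 1; i < n; i++) {
        if (pfns[i] != pfns[i - 1] + 1)
            return -EFAULT;
    }

    *base = pfns[0] << SKETCH_PAGE_SHIFT;
    return 0;
}
```

Using the loop index to detect the first entry avoids treating a frame
number of zero as "no previous page", which the sentinel approach would do.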

^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 2/2 v4] drm/exynos: added userptr feature.
  2012-05-15 14:31             ` Jerome Glisse
@ 2012-05-16  8:49               ` Inki Dae
  0 siblings, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-16  8:49 UTC (permalink / raw)
  To: 'Jerome Glisse'
  Cc: airlied, dri-devel, linux-mm, minchan, kosaki.motohiro,
	kyungmin.park, sw0312.kim, jy0922.shim



> -----Original Message-----
> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> Sent: Tuesday, May 15, 2012 11:31 PM
> To: Inki Dae
> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org; linux-mm@kvack.org;
> minchan@kernel.org; kosaki.motohiro@gmail.com; kyungmin.park@samsung.com;
> sw0312.kim@samsung.com; jy0922.shim@samsung.com
> Subject: Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
> 
> On Tue, May 15, 2012 at 12:33 AM, Inki Dae <inki.dae@samsung.com> wrote:
> > Hi Jerome,
> >
> >> -----Original Message-----
> >> From: Jerome Glisse [mailto:j.glisse@gmail.com]
> >> Sent: Tuesday, May 15, 2012 4:27 AM
> >> To: Inki Dae
> >> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org; minchan@kernel.org;
> >> kosaki.motohiro@gmail.com; kyungmin.park@samsung.com;
> >> sw0312.kim@samsung.com; jy0922.shim@samsung.com
> >> Subject: Re: [PATCH 2/2 v4] drm/exynos: added userptr feature.
> >>
> >> On Mon, May 14, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >> > this feature is used to import user space region allocated by malloc()
> >> > or mmaped into a gem. for this, we use get_user_pages() to get all the
> >> > pages to VMAs within user address space. However we should be careful
> >> > when using this userptr feature, for the reasons below.
> >> >
> >> > The migration issue.
> >> > - Pages reserved by CMA for some device using DMA could be used by
> >> > kernel and if the device driver wants to use those pages
> >> > while being used by kernel then the pages are copied into
> >> > other ones allocated to migrate them and then finally,
> >> > the device driver can use the pages for itself.
> >> > Thus, migrated, the pages being accessed by DMA could be changed
> >> > to other so this situation may incur that DMA accesses any pages
> >> > it doesn't want.
> >> >
> >> > The COW issue.
> >> > - while DMA of a device is using the pages to VMAs, if current
> >> > process was forked then the pages being accessed by the DMA
> >> > would be copied into child's pages.(Copy On Write) so
> >> > these pages may not have coherency with parent's ones if
> >> > child process wrote something on those pages so we need to
> >> > flag VM_DONTCOPY to prevent pages from being COWed.
> >>
> >> Note that this is a massive change in behavior of anonymous mapping;
> >> this effectively completely changes the linux API from the application
> >> point of view on your platform. Any application that has memory
> >> mapped by your ioctl will have different fork behavior than other
> >> applications. I think this should be stressed; it's one of the things I
> >> am really uncomfortable with. I would rather not have the dont copy
> >> flag and have the page cowed and have the child not working with the
> >> 3d/2d/drm driver. That would mean that your driver (opengl
> >> implementation for instance) would have to detect fork and work around
> >> it; the nvidia closed source driver does that.
> >>
> >
> > First of all, thank you for your comments.
> >
> > Right, VM_DONTCOPY flag would change original behavior of user. Do you
> > think this way has no problem but is not a generic way? anyway our issue
> > was that the pages to VMAs are copied into child's ones(COW) so we
> > prevented those pages from being COWed by using VM_DONTCOPY flag.
> >
> > For this, I have three questions below
> >
> > 1. in case of not using VM_DONTCOPY flag, you think that the application
> > using our userptr feature has COW issue; parent's pages being accessed by
> > DMA of some device would be copied into child's ones if the child wrote
> > something on the pages. after that, DMA of a device could access pages
> > user doesn't want. I'm not sure but I think such behavior is not a problem
> > and is generic behavior and it's the role of user to do fork or not. Do
> > you think such COW behavior could create any issue I'm not aware of so we
> > have to prevent that somehow?
> 
> My point is the father will keep the page that the GPU knows about as
> long as the father doesn't destroy the associated object. But if the
> child expects to be able to use the same GPU object and still be able
> to change the content through its anonymous mapping then I would
> consider this behavior buggy (ie the application has wrong expectations).
> So I am all for only the father being able to keep its memory mapped into
> GPU address space through the same GEM object.
> 
> > 2. so we added VM_DONTCOPY flag to prevent the pages from being COWed but
> > this changes original behavior of user. Do you think this is not a generic
> > way or could create any issue also?
> 
> I would say don't add the flag and treat applications that fork
> as a special case in userspace. See below for how I would handle it.
> 
> > 3. and last one, what is the difference between flagging VM_DONTCOPY and
> > detecting fork? I mean the device driver should do whatever it needs
> > after detecting fork. and I'm not sure but I think that something may also
> > change original behavior of user.
> >
> > Please let me know if I am missing something.
> 
> I would detect fork by storing process id along gem object. So
> something like (userspace code that could be in your pixman library):
> 
> struct gpu_object_process {
>   struct list list;
>   uint32_t gem_handle;
>   unsigned process_id;
> };
> 
> struct gpu_object {
>   struct list gpu_object_process;
>   void *ptr;
>   unsigned size;
>   ...
> };
> 
> When creating a GPU object from userptr you fill the above structure
> in the userspace code. Then whenever your library wants to use this
> object it calls something like:
> 
> int gpu_object_validate(struct gpu_object *bo)
> 
> Which checks whether the current process id is in the
> gpu_object_process list; if it is then use the gem object
> handle, otherwise you create a new GEM object using this userptr and
> same size and other properties.
> 
> Note you really need this only in case you expect applications using
> your library to fork and still expect to use your gpu accelerated
> library in the same way.
> 
> So doing this you conserve proper unix fork behavior: child changes to
> anonymous memory don't reflect into the father's anonymous memory and
> that should be the expected behavior even regarding GPU objects. Of
> course this means there would be a memcpy between father and child on
> write but that's the expected behavior of fork.
>
> Note also that I don't expect any of your graphic applications to use
> fork so in most cases your gpu_object_process list would have only one
> element.
> 
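
The per-process validation scheme quoted above can be sketched in user space
roughly as follows. This is only an illustration under stated assumptions:
fake_userptr_import() is a hypothetical stand-in for the userptr ioctl and
just hands out increasing numbers, a plain singly linked list replaces the
"struct list" of the quoted example, and an out parameter for the handle has
been added to the quoted gpu_object_validate() signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* One (process, GEM handle) pair recorded for a GPU object. */
struct gpu_object_process {
    struct gpu_object_process *next;
    uint32_t gem_handle;
    pid_t process_id;
};

struct gpu_object {
    struct gpu_object_process *procs; /* list of per-process handles */
    void *ptr;                        /* user address of the buffer */
    size_t size;
};

/* Hypothetical stand-in for the userptr import; returns a fake handle. */
static uint32_t fake_next_handle = 1;

static uint32_t fake_userptr_import(void *ptr, size_t size)
{
    (void)ptr;
    (void)size;
    return fake_next_handle++;
}

/*
 * Store in *handle a GEM handle that belongs to the calling process.
 * An existing entry is reused; otherwise (for example in a forked child)
 * the same user range is imported again. Returns 0 on success, -1 if the
 * bookkeeping entry cannot be allocated.
 */
static int gpu_object_validate(struct gpu_object *bo, uint32_t *handle)
{
    pid_t self = getpid();
    struct gpu_object_process *p;

    for (p = bo->procs; p != NULL; p = p->next) {
        if (p->process_id == self) {
            *handle = p->gem_handle;
            return 0;
        }
    }

    p = malloc(sizeof(*p));
    if (p == NULL)
        return -1;

    p->gem_handle = fake_userptr_import(bo->ptr, bo->size);
    p->process_id = self;
    p->next = bo->procs;
    bo->procs = p;
    *handle = p->gem_handle;
    return 0;
}
```

In a process that never forks the list stays at one element, so the lookup
costs a single comparison per use.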

Thanks for the detailed example; that is very helpful to me. I also think
gpu_object_validate() should be called in the normal (non-userptr) case. A
user can allocate a new gem and map it into its own user space; if fork is
done after that, it would have the same issue as userptr, so
gpu_object_validate() should also be called before mapping the gem into user
space. I understand what you mean and I will unset the VM_DONTCOPY flag; in
our case, the EXA backend will do that check. I am going to add enough
comments about this to the next patch. Please let me know if I am missing
anything.
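
The two fork behaviors discussed in this thread can be observed from user
space with a small test program. The sketch below is not part of the patch:
parent_value_after_child_write() is a hypothetical helper, and it uses
madvise(MADV_DONTFORK), which marks a range so that it is not inherited
across fork, as the user-space way to get the VM_DONTCOPY behavior.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and MADV_DONTFORK */
#endif
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one private anonymous page, store 1 in it, fork, and let the child
 * store 2 through the same address. Returns the value the parent reads
 * afterwards, or -1 on error. *child_status receives the child's wait
 * status. With dontfork set, the range is excluded from the child.
 */
static int parent_value_after_child_write(int dontfork, int *child_status)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *map;
    volatile int *p;
    pid_t pid;
    int value;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1;
    p = map;
    *p = 1;

    if (dontfork && madvise(map, len, MADV_DONTFORK) != 0) {
        munmap(map, len);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        munmap(map, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: triggers COW, or faults if the range was not inherited. */
        *p = 2;
        _exit(0);
    }

    if (waitpid(pid, child_status, 0) < 0) {
        munmap(map, len);
        return -1;
    }

    value = *p;
    munmap(map, len);
    return value;
}
```

Without the flag the child's write lands in its own copy and the parent
still reads 1. With the flag the parent also reads 1, but the child is
killed by SIGSEGV because the mapping does not exist in its address space,
which is the change in application-visible fork behavior raised above.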

Thanks,
Inki Dae

> Cheers,
> Jerome

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .

^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-15  7:34   ` [PATCH 3/4] " Rob Clark
  2012-05-15  8:17     ` Inki Dae
@ 2012-05-16  9:22     ` Dave Airlie
  2012-05-16 10:20       ` Inki Dae
  1 sibling, 1 reply; 77+ messages in thread
From: Dave Airlie @ 2012-05-16  9:22 UTC (permalink / raw)
  To: Rob Clark; +Cc: Inki Dae, kyungmin.park, sw0312.kim, dri-devel

On Tue, May 15, 2012 at 8:34 AM, Rob Clark <rob.clark@linaro.org> wrote:
> On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> this feature could be used to use memory region allocated by malloc() in user
>> mode and mmaped memory region allocated by other memory allocators. userptr
>> interface can identify memory type through vm_flags value and would get
>> pages or page frame numbers to user space appropriately.
>
> I apologize for being a little late to jump in on this thread, but...
>
> I must confess to not being a huge fan of userptr.  It really is
> opening a can of worms, and seems best avoided if at all possible.
> I'm not entirely sure the use-case for which you require this, but I
> wonder if there isn't an alternative way?   I mean, the main case I
> could think of for mapping userspace memory would be something like
> texture upload.  But could that be handled in an alternative way,
> something like a pwrite or texture_upload API, which could temporarily
> pin the userspace memory, kick off a dma from that user buffer to a
> proper GEM buffer, and then unpin the user buffer when the DMA
> completes?

I'm with Rob on this, I really hate the userptr idea, and my problem
with letting it into exynos is it sets a benchmark for others to do
things the same way. I'm still not 100% sure how it's going to be used
even with all your explanations.

Since we've agreed only the X server can access the interface, it
makes 0 sense to me to exist at all, as the X server can avoid malloc
memory for all objects it accesses.

I don't think pixman is at the level where you should be accelerating
it directly. I thought the point of pixman was a fast SW engine, not
something to be trunked down to a hw engine. The idea being you use
cairo and backend it onto something.

I know ssp had some ideas for making pixman be able to do hw accel,
but userptr doesn't seem like the proper solution, it seems like a
hack that needs a lot more VM work to operate properly, and by setting
a precedent for one GPU driver, I'll have 20 implementations of this
from ARM vendors and nobody will ever go back and fix things properly.

So I'm really not sure of the best way to move this forward, maybe a very
clear set of use cases of where stuff plugs into this, and why dma-buf
or some other method isn't sufficient, but I'm having trouble getting
past the fact it's setting a dangerous precedent.

Dave.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-16  9:22     ` Dave Airlie
@ 2012-05-16 10:20       ` Inki Dae
  2012-05-16 12:12         ` Rob Clark
  0 siblings, 1 reply; 77+ messages in thread
From: Inki Dae @ 2012-05-16 10:20 UTC (permalink / raw)
  To: 'Dave Airlie', 'Rob Clark'
  Cc: kyungmin.park, sw0312.kim, dri-devel



> -----Original Message-----
> From: Dave Airlie [mailto:airlied@gmail.com]
> Sent: Wednesday, May 16, 2012 6:23 PM
> To: Rob Clark
> Cc: Inki Dae; kyungmin.park@samsung.com; sw0312.kim@samsung.com; dri-
> devel@lists.freedesktop.org
> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
> 
> On Tue, May 15, 2012 at 8:34 AM, Rob Clark <rob.clark@linaro.org> wrote:
> > On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >> this feature could be used to use memory region allocated by malloc() in user
> >> mode and mmaped memory region allocated by other memory allocators. userptr
> >> interface can identify memory type through vm_flags value and would get
> >> pages or page frame numbers to user space appropriately.
> >
> > I apologize for being a little late to jump in on this thread, but...
> >
> > I must confess to not being a huge fan of userptr.  It really is
> > opening a can of worms, and seems best avoided if at all possible.
> > I'm not entirely sure the use-case for which you require this, but I
> > wonder if there isn't an alternative way?   I mean, the main case I
> > could think of for mapping userspace memory would be something like
> > texture upload.  But could that be handled in an alternative way,
> > something like a pwrite or texture_upload API, which could temporarily
> > pin the userspace memory, kick off a dma from that user buffer to a
> > proper GEM buffer, and then unpin the user buffer when the DMA
> > completes?
> 
> I'm with Rob on this, I really hate the userptr idea, and my problem
> with letting it into exynos is it sets a benchmark for others to do
> things the same way. I'm still not 100% sure how it's going to be used
> even with all your explanations.
> 
> Since we've agreed only the X server can access the interface, it
> makes 0 sense to me to exist at all, as the X server can avoid malloc
> memory for all objects it accesses.
> 
> I don't think pixman is at the level where you should be accelerating
> it directly. I thought the point of pixman was a fast SW engine, not
> something to be trunked down to a hw engine. The idea being you use
> cairo and backend it onto something.
> 

For more understanding: PIXMAN draws something into shmem, then the id of
the shmem is sent to X, then X gets the user address for that id, and then
EXA's gpu backend does BitBLT using gpu hardware. However, the gpu backend
isn't aware of the user address, so the purpose of userptr is to import the
shmem into a gem object so that the gpu is aware of the memory region as a
source. So pixman would use the SW engine as is.

> I know ssp had some ideas for making pixman be able to do hw accel,
> but userptr doesn't seem like the proper solution, it seems like a
> hack that needs a lot more VM work to operate properly, and by setting
> a precedent for one GPU driver, I'll have 20 implementations of this
> from ARM vendors and nobody will ever go back and fix things properly.
> 

I'm not sure whether this is a hack or not, but something similar is being
used in the via driver. You can refer to the via_lock_all_dma_pages function
in via_dmablit.c; that driver also uses get_user_pages() to lock all the
pages of the user space region so that DMA can access it, maybe for BitBLT.
And I really wonder whether you think just using get_user_pages() is the
hack, or importing a user address into a gem object is. There may be a
misunderstanding on my side, so please give me any comments.

Thanks,
Inki Dae

> So I'm really not sure of the best way to move this forward, maybe a very
> clear set of use cases of where stuff plugs into this, and why dma-buf
> or some other method isn't sufficient, but I'm having trouble getting
> past the fact it's setting a dangerous precedent.
> 
> Dave.

^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-16  8:42               ` Rob Clark
@ 2012-05-16 10:30                 ` Inki Dae
  0 siblings, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-16 10:30 UTC (permalink / raw)
  To: 'Rob Clark'; +Cc: kyungmin.park, sw0312.kim, dri-devel



> -----Original Message-----
> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of Rob
> Clark
> Sent: Wednesday, May 16, 2012 5:43 PM
> To: Inki Dae
> Cc: InKi Dae; kyungmin.park@samsung.com; sw0312.kim@samsung.com; dri-
> devel@lists.freedesktop.org
> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
> 
> On Wed, May 16, 2012 at 12:04 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >
> >
> >> -----Original Message-----
> >> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of Rob
> >> Clark
> >> Sent: Tuesday, May 15, 2012 11:29 PM
> >> To: InKi Dae
> >> Cc: Inki Dae; kyungmin.park@samsung.com; sw0312.kim@samsung.com; dri-
> >> devel@lists.freedesktop.org
> >> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
> >>
> >> On Tue, May 15, 2012 at 7:40 AM, InKi Dae <daeinki@gmail.com> wrote:
> >> > 2012/5/15 Rob Clark <rob.clark@linaro.org>:
> >> >> On Tue, May 15, 2012 at 2:17 AM, Inki Dae <inki.dae@samsung.com>
> wrote:
> >> >>> Hi Rob,
> >> >>>
> >> >>>> -----Original Message-----
> >> >>>> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf
> Of
> >> Rob
> >> >>>> Clark
> >> >>>> Sent: Tuesday, May 15, 2012 4:35 PM
> >> >>>> To: Inki Dae
> >> >>>> Cc: airlied@linux.ie; dri-devel@lists.freedesktop.org;
> >> >>>> kyungmin.park@samsung.com; sw0312.kim@samsung.com
> >> >>>> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
> >> >>>>
> >> >>>> On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com>
> >> wrote:
> >> >>>> > this feature could be used to use memory region allocated by
> >> >>>> > malloc() in user mode and mmaped memory region allocated by other
> >> >>>> > memory allocators. userptr interface can identify memory type
> >> >>>> > through vm_flags value and would get pages or page frame numbers
> >> >>>> > to user space appropriately.
> >> >>>>
> >> >>>> I apologize for being a little late to jump in on this thread, but...
> >> >>>>
> >> >>>> I must confess to not being a huge fan of userptr.  It really is
> >> >>>> opening a can of worms, and seems best avoided if at all possible.
> >> >>>> I'm not entirely sure the use-case for which you require this, but I
> >> >>>> wonder if there isn't an alternative way?   I mean, the main case I
> >> >>>> could think of for mapping userspace memory would be something like
> >> >>>> texture upload.  But could that be handled in an alternative way,
> >> >>>> something like a pwrite or texture_upload API, which could temporarily
> >> >>>> pin the userspace memory, kick off a dma from that user buffer to a
> >> >>>> proper GEM buffer, and then unpin the user buffer when the DMA
> >> >>>> completes?
> >> >>>>
> >> >>>
> >> >>> This feature has been discussed with the drm and linux-mm guys and I
> >> >>> have posted this patch four times.
> >> >>> So please, see the e-mail thread below.
> >> >>> http://www.spinics.net/lists/dri-devel/msg22729.html
> >> >>>
> >> >>> and then please, give me your opinions.
> >> >>
> >> >>
> >> >> Yeah, I read that and understand that the purpose is to support
> >> >> malloc()'d memory (and btw, all the other changes like limits to the
> >> >> amount of userptr buffers are good if we do decide to support userptr
> >> >> buffers).  But my question was more about why do you need to support
> >> >> malloc'd buffers... other than "just because v4l2 does" ;-)
> >> >>
> >> >> I don't really like the userptr feature in v4l2 either, and view it as
> >> >> a necessary evil because at the time there was nothing like dmabuf.
> >> >> But now that we have dmabuf to share buffers with zero copy between
> >> >> two devices, do we *really* still need userptr?
> >> >>
> >> >
> >> > Definitely no, as I mentioned on this email thread, we are using the
> >> > userptr feature for pixman and evas backends to use gpu acceleration
> >> > directly(zero-copy). as you may know, Evas is part of the Enlightenment
> >> > Foundation Libraries(EFL) and Elementary for pixman. all the
> >> > applications based on them use user addresses to draw something so the
> >> > backends(such as gpu accelerators) can refer only to the user address of
> >> > the rendered buffer. do you think we can avoid memory copy to use those
> >> > GPUs without such userptr feature? I think there is no way, and the only
> >> > way is for the gpu to use the user address to render without memory
> >> > copy. and as mentioned on this thread, this feature had been tried by
> >> > i915 in the desktop world. for this, you can refer to Daniel's comments.
> >>
> >> oh, yeah, for something like pixman it is hard to accelerate.  But I
> >> wonder if there is still some benefit.  I don't know as much about
> >> evas, but pixman is assumed to be a synchronous API, so you must
> >> always block until the operation completes which loses a lot of the
> >> potential benefit of a blitter.  And also I think you have no control
> >> over when the client free()'s a buffer.  So you end up having to
> >> map/unmap the buffer to the gpu/blitter/whatever every single
> >> operation.  Not to mention possible need for cache operations, etc.  I
> >> wonder if trying to accel it in hw really brings any benefit vs just
> >> using NEON?
> >>
> >
> > Yes, actually, we are also using the NEON backend of PIXMAN for best
> > performance. like this: PIXMAN's NEON backend for small images and the 2D GPU
> > backend of EXA for large images, because NEON is faster than the 2d gpu for
> > small images but not for large images.
> >
> >> I thought EFL has a GLES backend..  I could be wrong about that, but
> >> if it does probably you'd get better performance by using the GLES
> >> backend vs trying to accelerate what is intended as a sw path..
> >>
> > Right, EFL has a GLES backend and also PIXMAN for the software path. an
> > application using Elementary can use PIXMAN for the software path and also
> > GLES for hardware acceleration. so the purpose of using the userptr feature is
> > for a buffer drawn by any user of Evas to be rendered by gpu hardware.
> 
> 
> hmm, I don't suppose it is primarily a one-way transfer (ie. like one
> time blit from sw rendered surface to texture)?  I'd be more
> comfortable w/ an API that somehow just created a temporary mapping
> that unpins the pages before the syscall returns, ie. like an
> EXYNOS_IOCTL_BLIT_FROM_USERPTR type interface, which does some blit
> from a user ptr to a GEM bo.
> 

Do you mean using such an interface to do the BitBLT instead of using userptr?
The via driver also works this way.

> BR,
> -R
> 
> >
> > Thanks,
> > Inki Dae
> >
> >> > and as you know, note that we are already using dmabuf to share a
> >> > buffer between device drivers or v4l2 and drm framework. for this, we
> >> > also tested drm prime feature with v4l2 world and we applied umm
> >> > concept to ump for 3d gpu(known as mali) also.
> >> >
> >> > Anyway, this userptr feature is different from what you have in mind. and
> >> > maybe you also need such a feature for performance enhancement. Please give
> >> > me any idea if you have a good way instead of this userptr feature.
> >> > there may be a good idea I don't know of.
> >>
> >> well, I just try and point people to APIs that are actually designed
> >> with hw acceleration in mind in the first place (x11/EXA, GLES, etc)
> >> ;-)
> >>
> >> There's really an API missing for client side 2d accel.  Well, I guess
> >> there is OpenVG, although I'm not sure if anyone actually ever used
> >> it.. maybe we do need something for better supporting 2d blitters in
> >> weston compositor, for example.
> >>
> >> BR,
> >> -R
> >>
> >>
> >> > Thanks,
> >> > Inki Dae
> >> >
> >> >> I guess if I understood better the use-case, maybe I could try to
> >> >> think of some alternatives.
> >> >>
> >> >> BR,
> >> >> -R
> >> >>
> >> >>> Thanks,
> >> >>> Inki Dae
> >> >>>
> >> >>>> BR,
> >> >>>> -R
> >> >>>>
> >> >>>> > Signed-off-by: Inki Dae <inki.dae@samsung.com>
> >> >>>> > Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
> >> >>>> > ---
> >> >>>> >  drivers/gpu/drm/exynos/exynos_drm_drv.c |    2 +
> >> >>>> >  drivers/gpu/drm/exynos/exynos_drm_gem.c |  258 +++++++++++++++++++++++++++++++
> >> >>>> >  drivers/gpu/drm/exynos/exynos_drm_gem.h |   13 ++-
> >> >>>> >  include/drm/exynos_drm.h                |   25 +++-
> >> >>>> >  4 files changed, 296 insertions(+), 2 deletions(-)
> >> >>>> >
> >> >>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> >> >>>> > index f58a487..5bb0361 100644
> >> >>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
> >> >>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
> >> >>>> > @@ -211,6 +211,8 @@ static struct drm_ioctl_desc exynos_ioctls[] = {
> >> >>>> >                        DRM_AUTH),
> >> >>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_GEM_MMAP,
> >> >>>> >                        exynos_drm_gem_mmap_ioctl, DRM_UNLOCKED | DRM_AUTH),
> >> >>>> > +       DRM_IOCTL_DEF_DRV(EXYNOS_GEM_USERPTR,
> >> >>>> > +                       exynos_drm_gem_userptr_ioctl, DRM_UNLOCKED),
> >> >>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_PLANE_SET_ZPOS, exynos_plane_set_zpos_ioctl,
> >> >>>> >                        DRM_UNLOCKED | DRM_AUTH),
> >> >>>> >        DRM_IOCTL_DEF_DRV(EXYNOS_VIDI_CONNECTION,
> >> >>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> >> >>>> > index afd0cd4..b68d4ea 100644
> >> >>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
> >> >>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
> >> >>>> > @@ -66,6 +66,43 @@ static int check_gem_flags(unsigned int flags)
> >> >>>> >        return 0;
> >> >>>> >  }
> >> >>>> >
> >> >>>> > +static struct vm_area_struct *get_vma(struct vm_area_struct *vma)
> >> >>>> > +{
> >> >>>> > +       struct vm_area_struct *vma_copy;
> >> >>>> > +
> >> >>>> > +       vma_copy = kmalloc(sizeof(*vma_copy), GFP_KERNEL);
> >> >>>> > +       if (!vma_copy)
> >> >>>> > +               return NULL;
> >> >>>> > +
> >> >>>> > +       if (vma->vm_ops && vma->vm_ops->open)
> >> >>>> > +               vma->vm_ops->open(vma);
> >> >>>> > +
> >> >>>> > +       if (vma->vm_file)
> >> >>>> > +               get_file(vma->vm_file);
> >> >>>> > +
> >> >>>> > +       memcpy(vma_copy, vma, sizeof(*vma));
> >> >>>> > +
> >> >>>> > +       vma_copy->vm_mm = NULL;
> >> >>>> > +       vma_copy->vm_next = NULL;
> >> >>>> > +       vma_copy->vm_prev = NULL;
> >> >>>> > +
> >> >>>> > +       return vma_copy;
> >> >>>> > +}
> >> >>>> > +
> >> >>>> > +static void put_vma(struct vm_area_struct *vma)
> >> >>>> > +{
> >> >>>> > +       if (!vma)
> >> >>>> > +               return;
> >> >>>> > +
> >> >>>> > +       if (vma->vm_ops && vma->vm_ops->close)
> >> >>>> > +               vma->vm_ops->close(vma);
> >> >>>> > +
> >> >>>> > +       if (vma->vm_file)
> >> >>>> > +               fput(vma->vm_file);
> >> >>>> > +
> >> >>>> > +       kfree(vma);
> >> >>>> > +}
> >> >>>> > +
> >> >>>> >  static void update_vm_cache_attr(struct exynos_drm_gem_obj *obj,
> >> >>>> >                                        struct vm_area_struct *vma)
> >> >>>> >  {
> >> >>>> > @@ -254,6 +291,41 @@ static void exynos_drm_gem_put_pages(struct drm_gem_object *obj)
> >> >>>> >        /* add some codes for UNCACHED type here. TODO */
> >> >>>> >  }
> >> >>>> >
> >> >>>> > +static void exynos_drm_put_userptr(struct drm_gem_object *obj)
> >> >>>> > +{
> >> >>>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
> >> >>>> > +       struct exynos_drm_gem_buf *buf;
> >> >>>> > +       struct vm_area_struct *vma;
> >> >>>> > +       int npages;
> >> >>>> > +
> >> >>>> > +       exynos_gem_obj = to_exynos_gem_obj(obj);
> >> >>>> > +       buf = exynos_gem_obj->buffer;
> >> >>>> > +       vma = exynos_gem_obj->vma;
> >> >>>> > +
> >> >>>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> >> >>>> > +               put_vma(exynos_gem_obj->vma);
> >> >>>> > +               goto out;
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +       npages = buf->size >> PAGE_SHIFT;
> >> >>>> > +
> >> >>>> > +       npages--;
> >> >>>> > +       while (npages >= 0) {
> >> >>>> > +               if (buf->write)
> >> >>>> > +                       set_page_dirty_lock(buf->pages[npages]);
> >> >>>> > +
> >> >>>> > +               put_page(buf->pages[npages]);
> >> >>>> > +               npages--;
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +out:
> >> >>>> > +       kfree(buf->pages);
> >> >>>> > +       buf->pages = NULL;
> >> >>>> > +
> >> >>>> > +       kfree(buf->sgt);
> >> >>>> > +       buf->sgt = NULL;
> >> >>>> > +}
> >> >>>> > +
> >> >>>> >  static int exynos_drm_gem_handle_create(struct drm_gem_object *obj,
> >> >>>> >                                        struct drm_file *file_priv,
> >> >>>> >                                        unsigned int *handle)
> >> >>>> > @@ -293,6 +365,8 @@ void exynos_drm_gem_destroy(struct exynos_drm_gem_obj *exynos_gem_obj)
> >> >>>> >
> >> >>>> >        if (exynos_gem_obj->flags & EXYNOS_BO_NONCONTIG)
> >> >>>> >                exynos_drm_gem_put_pages(obj);
> >> >>>> > +       else if (exynos_gem_obj->flags & EXYNOS_BO_USERPTR)
> >> >>>> > +               exynos_drm_put_userptr(obj);
> >> >>>> >        else
> >> >>>> >                exynos_drm_free_buf(obj->dev, exynos_gem_obj->flags, buf);
> >> >>>> >
> >> >>>> > @@ -606,6 +680,190 @@ int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >> >>>> >        return 0;
> >> >>>> >  }
> >> >>>> >
> >> >>>> > +static int exynos_drm_get_userptr(struct drm_device *dev,
> >> >>>> > +                               struct exynos_drm_gem_obj *obj,
> >> >>>> > +                               unsigned long userptr,
> >> >>>> > +                               unsigned int write)
> >> >>>> > +{
> >> >>>> > +       unsigned int get_npages;
> >> >>>> > +       unsigned long npages = 0;
> >> >>>> > +       struct vm_area_struct *vma;
> >> >>>> > +       struct exynos_drm_gem_buf *buf = obj->buffer;
> >> >>>> > +
> >> >>>> > +       vma = find_vma(current->mm, userptr);
> >> >>>> > +
> >> >>>> > +       /* the memory region mmaped with VM_PFNMAP. */
> >> >>>> > +       if (vma && (vma->vm_flags & VM_PFNMAP) && (vma->vm_pgoff)) {
> >> >>>> > +               unsigned long this_pfn, prev_pfn, pa;
> >> >>>> > +               unsigned long start, end, offset;
> >> >>>> > +               struct scatterlist *sgl;
> >> >>>> > +
> >> >>>> > +               start = userptr;
> >> >>>> > +               offset = userptr & ~PAGE_MASK;
> >> >>>> > +               end = start + buf->size;
> >> >>>> > +               sgl = buf->sgt->sgl;
> >> >>>> > +
> >> >>>> > +               for (prev_pfn = 0; start < end; start += PAGE_SIZE) {
> >> >>>> > +                       int ret = follow_pfn(vma, start, &this_pfn);
> >> >>>> > +                       if (ret)
> >> >>>> > +                               return ret;
> >> >>>> > +
> >> >>>> > +                       if (prev_pfn == 0)
> >> >>>> > +                               pa = this_pfn << PAGE_SHIFT;
> >> >>>> > +                       else if (this_pfn != prev_pfn + 1)
> >> >>>> > +                               return -EFAULT;
> >> >>>> > +
> >> >>>> > +                       sg_dma_address(sgl) = (pa + offset);
> >> >>>> > +                       sg_dma_len(sgl) = PAGE_SIZE;
> >> >>>> > +                       prev_pfn = this_pfn;
> >> >>>> > +                       pa += PAGE_SIZE;
> >> >>>> > +                       npages++;
> >> >>>> > +                       sgl = sg_next(sgl);
> >> >>>> > +               }
> >> >>>> > +
> >> >>>> > +               buf->dma_addr = pa + offset;
> >> >>>> > +
> >> >>>> > +               obj->vma = get_vma(vma);
> >> >>>> > +               if (!obj->vma)
> >> >>>> > +                       return -ENOMEM;
> >> >>>> > +
> >> >>>> > +               buf->pfnmap = true;
> >> >>>> > +
> >> >>>> > +               return npages;
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +       buf->write = write;
> >> >>>> > +       npages = buf->size >> PAGE_SHIFT;
> >> >>>> > +
> >> >>>> > +       down_read(&current->mm->mmap_sem);
> >> >>>> > +       get_npages = get_user_pages(current, current->mm, userptr,
> >> >>>> > +                                       npages, write, 1, buf->pages, NULL);
> >> >>>> > +       up_read(&current->mm->mmap_sem);
> >> >>>> > +       if (get_npages != npages)
> >> >>>> > +               DRM_ERROR("failed to get user_pages.\n");
> >> >>>> > +
> >> >>>> > +       buf->pfnmap = false;
> >> >>>> > +
> >> >>>> > +       return get_npages;
> >> >>>> > +}
> >> >>>> > +
> >> >>>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> >> >>>> > +                                     struct drm_file *file_priv)
> >> >>>> > +{
> >> >>>> > +       struct exynos_drm_gem_obj *exynos_gem_obj;
> >> >>>> > +       struct drm_exynos_gem_userptr *args = data;
> >> >>>> > +       struct exynos_drm_gem_buf *buf;
> >> >>>> > +       struct scatterlist *sgl;
> >> >>>> > +       unsigned long size, userptr;
> >> >>>> > +       unsigned int npages;
> >> >>>> > +       int ret, get_npages;
> >> >>>> > +
> >> >>>> > +       DRM_DEBUG_KMS("%s\n", __FILE__);
> >> >>>> > +
> >> >>>> > +       if (!args->size) {
> >> >>>> > +               DRM_ERROR("invalid size.\n");
> >> >>>> > +               return -EINVAL;
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +       ret = check_gem_flags(args->flags);
> >> >>>> > +       if (ret)
> >> >>>> > +               return ret;
> >> >>>> > +
> >> >>>> > +       size = roundup_gem_size(args->size, EXYNOS_BO_USERPTR);
> >> >>>> > +       userptr = args->userptr;
> >> >>>> > +
> >> >>>> > +       buf = exynos_drm_init_buf(dev, size);
> >> >>>> > +       if (!buf)
> >> >>>> > +               return -ENOMEM;
> >> >>>> > +
> >> >>>> > +       exynos_gem_obj = exynos_drm_gem_init(dev, size);
> >> >>>> > +       if (!exynos_gem_obj) {
> >> >>>> > +               ret = -ENOMEM;
> >> >>>> > +               goto err_free_buffer;
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +       buf->sgt = kzalloc(sizeof(struct sg_table), GFP_KERNEL);
> >> >>>> > +       if (!buf->sgt) {
> >> >>>> > +               DRM_ERROR("failed to allocate buf->sgt.\n");
> >> >>>> > +               ret = -ENOMEM;
> >> >>>> > +               goto err_release_gem;
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +       npages = size >> PAGE_SHIFT;
> >> >>>> > +
> >> >>>> > +       ret = sg_alloc_table(buf->sgt, npages, GFP_KERNEL);
> >> >>>> > +       if (ret < 0) {
> >> >>>> > +               DRM_ERROR("failed to initailize sg table.\n");
> >> >>>> > +               goto err_free_sgt;
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +       buf->pages = kzalloc(npages * sizeof(struct page *), GFP_KERNEL);
> >> >>>> > +       if (!buf->pages) {
> >> >>>> > +               DRM_ERROR("failed to allocate buf->pages\n");
> >> >>>> > +               ret = -ENOMEM;
> >> >>>> > +               goto err_free_table;
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +       exynos_gem_obj->buffer = buf;
> >> >>>> > +
> >> >>>> > +       get_npages = exynos_drm_get_userptr(dev, exynos_gem_obj, userptr, 1);
> >> >>>> > +       if (get_npages != npages) {
> >> >>>> > +               DRM_ERROR("failed to get user_pages.\n");
> >> >>>> > +               ret = get_npages;
> >> >>>> > +               goto err_release_userptr;
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +       ret = exynos_drm_gem_handle_create(&exynos_gem_obj->base, file_priv,
> >> >>>> > +                                               &args->handle);
> >> >>>> > +       if (ret < 0) {
> >> >>>> > +               DRM_ERROR("failed to create gem handle.\n");
> >> >>>> > +               goto err_release_userptr;
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +       sgl = buf->sgt->sgl;
> >> >>>> > +
> >> >>>> > +       /*
> >> >>>> > +        * if buf->pfnmap is true then update sgl of sgt with pages but
> >> >>>> > +        * if buf->pfnmap is false then it means the sgl was updated already
> >> >>>> > +        * so it doesn't need to update the sgl.
> >> >>>> > +        */
> >> >>>> > +       if (!buf->pfnmap) {
> >> >>>> > +               unsigned int i = 0;
> >> >>>> > +
> >> >>>> > +               /* set all pages to sg list. */
> >> >>>> > +               while (i < npages) {
> >> >>>> > +                       sg_set_page(sgl, buf->pages[i], PAGE_SIZE, 0);
> >> >>>> > +                       sg_dma_address(sgl) = page_to_phys(buf->pages[i]);
> >> >>>> > +                       i++;
> >> >>>> > +                       sgl = sg_next(sgl);
> >> >>>> > +               }
> >> >>>> > +       }
> >> >>>> > +
> >> >>>> > +       /* always use EXYNOS_BO_USERPTR as memory type for userptr. */
> >> >>>> > +       exynos_gem_obj->flags |= EXYNOS_BO_USERPTR;
> >> >>>> > +
> >> >>>> > +       return 0;
> >> >>>> > +
> >> >>>> > +err_release_userptr:
> >> >>>> > +       get_npages--;
> >> >>>> > +       while (get_npages >= 0)
> >> >>>> > +               put_page(buf->pages[get_npages--]);
> >> >>>> > +       kfree(buf->pages);
> >> >>>> > +       buf->pages = NULL;
> >> >>>> > +err_free_table:
> >> >>>> > +       sg_free_table(buf->sgt);
> >> >>>> > +err_free_sgt:
> >> >>>> > +       kfree(buf->sgt);
> >> >>>> > +       buf->sgt = NULL;
> >> >>>> > +err_release_gem:
> >> >>>> > +       drm_gem_object_release(&exynos_gem_obj->base);
> >> >>>> > +       kfree(exynos_gem_obj);
> >> >>>> > +       exynos_gem_obj = NULL;
> >> >>>> > +err_free_buffer:
> >> >>>> > +       exynos_drm_free_buf(dev, 0, buf);
> >> >>>> > +       return ret;
> >> >>>> > +}
> >> >>>> > +
> >> >>>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj)
> >> >>>> >  {
> >> >>>> >        DRM_DEBUG_KMS("%s\n", __FILE__);
> >> >>>> > diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> >> >>>> > index efc8252..1de2241 100644
> >> >>>> > --- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
> >> >>>> > +++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
> >> >>>> > @@ -29,7 +29,8 @@
> >> >>>> >  #define to_exynos_gem_obj(x)   container_of(x,\
> >> >>>> >                        struct exynos_drm_gem_obj, base)
> >> >>>> >
> >> >>>> > -#define IS_NONCONTIG_BUFFER(f)         (f & EXYNOS_BO_NONCONTIG)
> >> >>>> > +#define IS_NONCONTIG_BUFFER(f)         ((f & EXYNOS_BO_NONCONTIG) ||\
> >> >>>> > +                                       (f & EXYNOS_BO_USERPTR))
> >> >>>> >
> >> >>>> >  /*
> >> >>>> >  * exynos drm gem buffer structure.
> >> >>>> > @@ -38,18 +39,23 @@
> >> >>>> >  * @dma_addr: bus address(accessed by dma) to allocated memory region.
> >> >>>> >  *     - this address could be physical address without IOMMU and
> >> >>>> >  *     device address with IOMMU.
> >> >>>> > + * @write: whether pages will be written to by the caller.
> >> >>>> >  * @sgt: sg table to transfer page data.
> >> >>>> >  * @pages: contain all pages to allocated memory region.
> >> >>>> >  * @page_size: could be 4K, 64K or 1MB.
> >> >>>> >  * @size: size of allocated memory region.
> >> >>>> > + * @pfnmap: indicate whether memory region from userptr is mmaped with
> >> >>>> > + *     VM_PFNMAP or not.
> >> >>>> >  */
> >> >>>> >  struct exynos_drm_gem_buf {
> >> >>>> >        void __iomem            *kvaddr;
> >> >>>> >        dma_addr_t              dma_addr;
> >> >>>> > +       unsigned int            write;
> >> >>>> >        struct sg_table         *sgt;
> >> >>>> >        struct page             **pages;
> >> >>>> >        unsigned long           page_size;
> >> >>>> >        unsigned long           size;
> >> >>>> > +       bool                    pfnmap;
> >> >>>> >  };
> >> >>>> >
> >> >>>> >  /*
> >> >>>> > @@ -73,6 +79,7 @@ struct exynos_drm_gem_obj {
> >> >>>> >        struct drm_gem_object           base;
> >> >>>> >        struct exynos_drm_gem_buf       *buffer;
> >> >>>> >        unsigned long                   size;
> >> >>>> > +       struct vm_area_struct           *vma;
> >> >>>> >        unsigned int                    flags;
> >> >>>> >  };
> >> >>>> >
> >> >>>> > @@ -127,6 +134,10 @@ int exynos_drm_gem_map_offset_ioctl(struct drm_device *dev, void *data,
> >> >>>> >  int exynos_drm_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >> >>>> >                              struct drm_file *file_priv);
> >> >>>> >
> >> >>>> > +/* map user space allocated by malloc to pages. */
> >> >>>> > +int exynos_drm_gem_userptr_ioctl(struct drm_device *dev, void *data,
> >> >>>> > +                                     struct drm_file *file_priv);
> >> >>>> > +
> >> >>>> >  /* initialize gem object. */
> >> >>>> >  int exynos_drm_gem_init_object(struct drm_gem_object *obj);
> >> >>>> >
> >> >>>> > diff --git a/include/drm/exynos_drm.h b/include/drm/exynos_drm.h
> >> >>>> > index 2d6eb06..48eda6e 100644
> >> >>>> > --- a/include/drm/exynos_drm.h
> >> >>>> > +++ b/include/drm/exynos_drm.h
> >> >>>> > @@ -75,6 +75,23 @@ struct drm_exynos_gem_mmap {
> >> >>>> >  };
> >> >>>> >
> >> >>>> >  /**
> >> >>>> > + * User-requested user space importing structure
> >> >>>> > + *
> >> >>>> > + * @userptr: user space address allocated by malloc.
> >> >>>> > + * @size: size to the buffer allocated by malloc.
> >> >>>> > + * @flags: indicate user-desired cache attribute to map the allocated buffer
> >> >>>> > + *     to kernel space.
> >> >>>> > + * @handle: a returned handle to created gem object.
> >> >>>> > + *     - this handle will be set by gem module of kernel side.
> >> >>>> > + */
> >> >>>> > +struct drm_exynos_gem_userptr {
> >> >>>> > +       uint64_t userptr;
> >> >>>> > +       uint64_t size;
> >> >>>> > +       unsigned int flags;
> >> >>>> > +       unsigned int handle;
> >> >>>> > +};
> >> >>>> > +
> >> >>>> > +/**
> >> >>>> >  * A structure for user connection request of virtual display.
> >> >>>> >  *
> >> >>>> >  * @connection: indicate whether doing connetion or not by user.
> >> >>>> > @@ -105,13 +122,16 @@ enum e_drm_exynos_gem_mem_type {
> >> >>>> >        EXYNOS_BO_CACHABLE      = 1 << 1,
> >> >>>> >        /* write-combine mapping. */
> >> >>>> >        EXYNOS_BO_WC            = 1 << 2,
> >> >>>> > +       /* user space memory allocated by malloc. */
> >> >>>> > +       EXYNOS_BO_USERPTR       = 1 << 3,
> >> >>>> >        EXYNOS_BO_MASK          = EXYNOS_BO_NONCONTIG | EXYNOS_BO_CACHABLE |
> >> >>>> > -                                       EXYNOS_BO_WC
> >> >>>> > +                                       EXYNOS_BO_WC | EXYNOS_BO_USERPTR
> >> >>>> >  };
> >> >>>> >
> >> >>>> >  #define DRM_EXYNOS_GEM_CREATE          0x00
> >> >>>> >  #define DRM_EXYNOS_GEM_MAP_OFFSET      0x01
> >> >>>> >  #define DRM_EXYNOS_GEM_MMAP            0x02
> >> >>>> > +#define DRM_EXYNOS_GEM_USERPTR         0x03
> >> >>>> >  /* Reserved 0x03 ~ 0x05 for exynos specific gem ioctl */
> >> >>>> >  #define DRM_EXYNOS_PLANE_SET_ZPOS      0x06
> >> >>>> >  #define DRM_EXYNOS_VIDI_CONNECTION     0x07
> >> >>>> > @@ -125,6 +145,9 @@ enum e_drm_exynos_gem_mem_type {
> >> >>>> >  #define DRM_IOCTL_EXYNOS_GEM_MMAP      DRM_IOWR(DRM_COMMAND_BASE + \
> >> >>>> >                DRM_EXYNOS_GEM_MMAP, struct drm_exynos_gem_mmap)
> >> >>>> >
> >> >>>> > +#define DRM_IOCTL_EXYNOS_GEM_USERPTR   DRM_IOWR(DRM_COMMAND_BASE + \
> >> >>>> > +               DRM_EXYNOS_GEM_USERPTR, struct drm_exynos_gem_userptr)
> >> >>>> > +
> >> >>>> >  #define DRM_IOCTL_EXYNOS_PLANE_SET_ZPOS        DRM_IOWR(DRM_COMMAND_BASE + \
> >> >>>> >                DRM_EXYNOS_PLANE_SET_ZPOS, struct drm_exynos_plane_set_zpos)
> >> >>>> >
> >> >>>> > --
> >> >>>> > 1.7.4.1
> >> >>>> >
> >> >>>> > _______________________________________________
> >> >>>> > dri-devel mailing list
> >> >>>> > dri-devel@lists.freedesktop.org
> >> >>>> > http://lists.freedesktop.org/mailman/listinfo/dri-devel

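For readers following the VM_PFNMAP branch of exynos_drm_get_userptr() in
the quoted patch: the loop accepts the range only if every page frame number
directly follows the previous one. The stand-alone user-space sketch below
models just that check and is not driver code. The names pfn_range_base() and
SKETCH_PAGE_SHIFT are made up for illustration, and reporting the address of
the first page as the base is an assumption about the intent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Hypothetical helper, not kernel code: return 0 and store the physical
 * base address in *base if pfns[0..n-1] are physically contiguous,
 * otherwise return -1.
 */
static int pfn_range_base(const uint64_t *pfns, size_t n, uint64_t *base)
{
	size_t i;

	assert(pfns != NULL && base != NULL);

	if (n == 0)
		return -1;

	for (i = 1; i < n; i++) {
		/* each frame must directly follow its predecessor */
		if (pfns[i] != pfns[i - 1] + 1)
			return -1;
	}

	/* base address of the first page in the run */
	*base = pfns[0] << SKETCH_PAGE_SHIFT;
	return 0;
}
```

A run such as {100, 101, 102} is accepted, while {100, 102, 103} is rejected
at the first gap, which corresponds to the -EFAULT return in the quoted loop.
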
^ permalink raw reply	[flat|nested] 77+ messages in thread

* Re: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-16 10:20       ` Inki Dae
@ 2012-05-16 12:12         ` Rob Clark
  2012-05-16 13:27           ` Inki Dae
  0 siblings, 1 reply; 77+ messages in thread
From: Rob Clark @ 2012-05-16 12:12 UTC (permalink / raw)
  To: Inki Dae; +Cc: sw0312.kim, kyungmin.park, dri-devel

On Wed, May 16, 2012 at 4:20 AM, Inki Dae <inki.dae@samsung.com> wrote:
>
>
>> -----Original Message-----
>> From: Dave Airlie [mailto:airlied@gmail.com]
>> Sent: Wednesday, May 16, 2012 6:23 PM
>> To: Rob Clark
>> Cc: Inki Dae; kyungmin.park@samsung.com; sw0312.kim@samsung.com; dri-devel@lists.freedesktop.org
>> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
>>
>> On Tue, May 15, 2012 at 8:34 AM, Rob Clark <rob.clark@linaro.org> wrote:
>> > On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com> wrote:
>> >> this feature could be used to use memory region allocated by malloc() in user
>> >> mode and mmaped memory region allocated by other memory allocators. userptr
>> >> interface can identify memory type through vm_flags value and would get
>> >> pages or page frame numbers to user space appropriately.
>> >
>> > I apologize for being a little late to jump in on this thread, but...
>> >
>> > I must confess to not being a huge fan of userptr.  It really is
>> > opening a can of worms, and seems best avoided if at all possible.
>> > I'm not entirely sure the use-case for which you require this, but I
>> > wonder if there isn't an alternative way?   I mean, the main case I
>> > could think of for mapping userspace memory would be something like
>> > texture upload.  But could that be handled in an alternative way,
>> > something like a pwrite or texture_upload API, which could temporarily
>> > pin the userspace memory, kick off a dma from that user buffer to a
>> > proper GEM buffer, and then unpin the user buffer when the DMA
>> > completes?
>>
>> I'm with Rob on this, I really hate the userptr idea, and my problem
>> with letting it into exynos is it sets a benchmark for others to do
>> things the same way. I'm still not 100% sure how it's going to be used
>> even with all your explanations.
>>
>> Since we've agreed only the X server can access the interface, it
>> makes 0 sense to me to exist at all, as the X server can avoid malloc
>> memory for all objects it accesses.
>>
>> I don't think pixman is at the level where you should be accelerating
>> it directly. I thought the point of pixman was a fast SW engine, not
>> something to be trunked down to a hw engine. The idea being you use
>> cairo and backend it onto something.
>>
>
> To explain further: PIXMAN draws something into shmem, then the id of the
> shmem is sent to X, then X gets the user address for that id, and then EXA's
> gpu backend does the BitBLT using gpu hardware. However, the gpu backend isn't
> aware of the user address, so the purpose of using userptr is to import the
> shmem into a gem object so that the gpu is aware of the memory region as a
> source. so pixman would use the SW engine as is.

if this is all just for X/EXA, wouldn't it make more sense to handle
the operation at the EXA layer before falling back to sw (which would
go to pixman)?

>> I know ssp had some ideas for making pixman be able to do hw accel,
>> but userptr doesn't seem like the proper solution, it seems like a
>> hack that needs a lot more VM work to operate properly, and by setting
>> a precedent for one GPU driver, I'll have 20 implementations of this
>> from ARM vendors and nobody will ever go back and fix things properly.
>>
>
> I'm not sure whether this is a hack or not, but something similar is being
> used in the via driver. you can refer to the via_lock_all_dma_pages function in
> the via_dmablit.c file; this driver also uses get_user_pages() to lock all
> the user space pages so that DMA can access the memory region, maybe for
> BitBLT. And I really wonder whether you think just using get_user_pages() is
> the hack, or importing a user address into a gem object. I may be
> misunderstanding something, so please give me any comments.

My objection is not really the get_user_pages(), but rather converting
that into a GEM object, which might exist for longer period of time,
be exported to dmabuf and passed to other driver, etc.  (Not to
mention you don't really have control if userspace does free(ptr)
while the GEM object still exists..)

From a quick look at the via code, it appears to return while the blit
is still queued, which I don't completely like but at least the
pinning and hw use of the userspace buffer is just temporary and not
able to exist for an indefinite amount of time.
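
To make the lifetime difference concrete, here is a toy user-space model. It
is not driver code, and every name in it is invented for illustration.
blit_from_user() takes its pin and drops it before returning, while
import_userptr() leaves the pin held until the handle is destroyed, however
long that takes.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a user buffer; 'pins' models the page pin count. */
struct user_buf {
	void *ptr;
	size_t len;
	int pins;
};

/* Toy stand-in for a long-lived handle that wraps a user buffer. */
struct toy_handle {
	struct user_buf *buf;
};

/* Transient model: pin, copy, unpin, all within one call. */
static void blit_from_user(struct user_buf *src, unsigned char *dst)
{
	const unsigned char *from = src->ptr;
	size_t i;

	src->pins++;		/* pinned only for the duration of the copy */
	for (i = 0; i < src->len; i++)
		dst[i] = from[i];
	src->pins--;		/* unpinned before returning */
}

/* Persistent model: the pin outlives the call that created it. */
static struct toy_handle import_userptr(struct user_buf *src)
{
	struct toy_handle h = { .buf = src };

	src->pins++;
	return h;
}

/* The pin is dropped only when the handle goes away. */
static void destroy_handle(struct toy_handle *h)
{
	assert(h->buf != NULL && h->buf->pins > 0);
	h->buf->pins--;
	h->buf = NULL;
}
```

In this model the pin count is back to zero as soon as blit_from_user()
returns, so whatever userspace does with the buffer afterwards no longer
matters to the hardware. With import_userptr() the count stays at one until
destroy_handle() runs.
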

BR,
-R

> Thanks,
> Inki Dae
>
>> So I'm really not sure the best way to move this forward, maybe a very
>> clear set of use cases of where stuff plugs into this, and why dma-buf
>> or some other method isn't sufficient, but I'm having trouble getting
>> past the fact its setting a dangerous precedent.
>>
>> Dave.
>

^ permalink raw reply	[flat|nested] 77+ messages in thread

* RE: [PATCH 3/4] drm/exynos: added userptr feature.
  2012-05-16 12:12         ` Rob Clark
@ 2012-05-16 13:27           ` Inki Dae
  0 siblings, 0 replies; 77+ messages in thread
From: Inki Dae @ 2012-05-16 13:27 UTC (permalink / raw)
  To: 'Rob Clark'; +Cc: sw0312.kim, kyungmin.park, dri-devel



> -----Original Message-----
> From: robdclark@gmail.com [mailto:robdclark@gmail.com] On Behalf Of Rob Clark
> Sent: Wednesday, May 16, 2012 9:13 PM
> To: Inki Dae
> Cc: Dave Airlie; kyungmin.park@samsung.com; sw0312.kim@samsung.com; dri-devel@lists.freedesktop.org
> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
> 
> On Wed, May 16, 2012 at 4:20 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >
> >
> >> -----Original Message-----
> >> From: Dave Airlie [mailto:airlied@gmail.com]
> >> Sent: Wednesday, May 16, 2012 6:23 PM
> >> To: Rob Clark
> >> Cc: Inki Dae; kyungmin.park@samsung.com; sw0312.kim@samsung.com; dri-devel@lists.freedesktop.org
> >> Subject: Re: [PATCH 3/4] drm/exynos: added userptr feature.
> >>
> >> On Tue, May 15, 2012 at 8:34 AM, Rob Clark <rob.clark@linaro.org> wrote:
> >> > On Mon, Apr 23, 2012 at 7:43 AM, Inki Dae <inki.dae@samsung.com> wrote:
> >> >> this feature could be used to use memory region allocated by malloc() in user
> >> >> mode and mmaped memory region allocated by other memory allocators. userptr
> >> >> interface can identify memory type through vm_flags value and would get
> >> >> pages or page frame numbers to user space appropriately.
> >> > I apologize for being a little late to jump in on this thread, but...
> >> >
> >> > I must confess to not being a huge fan of userptr.  It really is
> >> > opening a can of worms, and seems best avoided if at all possible.
> >> > I'm not entirely sure the use-case for which you require this, but I
> >> > wonder if there isn't an alternative way?   I mean, the main case I
> >> > could think of for mapping userspace memory would be something like
> >> > texture upload.  But could that be handled in an alternative way,
> >> > something like a pwrite or texture_upload API, which could temporarily
> >> > pin the userspace memory, kick off a dma from that user buffer to a
> >> > proper GEM buffer, and then unpin the user buffer when the DMA
> >> > completes?
> >>
> >> I'm with Rob on this, I really hate the userptr idea, and my problem
> >> with letting it into exynos is it sets a benchmark for others to do
> >> things the same way. I'm still not 100% sure how it's going to be used
> >> even with all your explanations.
> >>
> >> Since we've agreed only the X server can access the interface, it
> >> makes 0 sense to me to exist at all, as the X server can avoid malloc
> >> memory for all objects it accesses.
> >>
> >> I don't think pixman is at the level where you should be accelerating
> >> it directly. I thought the point of pixman was a fast SW engine, not
> >> something to be trunked down to a hw engine. The idea being you use
> >> cairo and backend it onto something.
> >>
> >
> > For more understanding: PIXMAN draws something on shmem, then the id of
> > the shmem is sent to X, then X gets the user address for that id, and then
> > EXA's gpu backend does BitBLT using the gpu hardware. However, the gpu
> > backend isn't aware of the user address, so the purpose of using userptr
> > is to import the shmem into a gem object so that the gpu is aware of the
> > memory region as a source. So pixman would use the SW engine as is.
> 
> if this is all just for X/EXA, wouldn't it make more sense to handle
> the operation at the EXA layer before falling back to sw (which would
> go to pixman)?
> 

Note that PIXMAN is also used as the backend of Evas, and the shmem is sent
to the EXA backend of X to use the gpu (not the same process).

> >> I know ssp had some ideas for making pixman be able to do hw accel,
> >> but userptr doesn't seem like the proper solution, it seems like a
> >> hack that needs a lot more VM work to operate properly, and by setting
> >> a precedent for one GPU driver, I'll have 20 implementations of this
> >> from ARM vendors and nobody will ever go back and fix things properly.
> >>
> >
> > I'm not sure whether this is a hack or not, but something similar is
> > being used in the via driver. You can refer to the
> > via_lock_all_dma_pages function in the via_dmablit.c file; this driver
> > also uses get_user_pages() to lock all the user-space pages so that DMA
> > can access the memory region, maybe for BitBLT. And I really wonder
> > whether you think just using get_user_pages() is the hack, or importing
> > a user address into a gem object. There may be a misunderstanding on my
> > side, so please give me any comments.
> 
> My objection is not really the get_user_pages(), but rather converting
> that into a GEM object, which might exist for a longer period of time,
> be exported to dmabuf and passed to other driver, etc.  (Not to
> mention you don't really have control if userspace does free(ptr)
> while the GEM object still exists..)
> 

Hm... Good point. OK, got it. I'm going to stop pushing this forward.
Instead, we will follow the via way from now on, so do you agree that the
via way has no problems?

Thanks,
Inki Dae

> From a quick look at the via code, it appears to return while the blit
> is still queued, which I don't completely like but at least the
> pinning and hw use of the userspace buffer is just temporary and not
> able to exist for an indefinite amount of time.
> 
> BR,
> -R
> 
> > Thanks,
> > Inki Dae
> >
> >> So I'm really not sure the best way to move this forward, maybe a very
> >> clear set of use cases of where stuff plugs into this, and why dma-buf
> >> or some other method isn't sufficient, but I'm having trouble getting
> >> past the fact it's setting a dangerous precedent.
> >>
> >> Dave.
> >
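
To make the alternative discussed above concrete, here is a minimal userspace
sketch of the "temporarily pin, copy, unpin" upload that Rob describes. It is
only a model under stated assumptions, not kernel or exynos code: struct
gem_buf, pin_user_range(), unpin_user_range() and upload_from_user() are
hypothetical names, the pin is just a counter, and memcpy() stands in for the
DMA transfer that a real driver would start after get_user_pages().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a driver-owned GEM buffer (not the exynos struct). */
struct gem_buf {
	unsigned char *data;
	size_t size;
};

/* Number of user ranges currently "pinned" in this model. */
static int pinned_count;

/* Models pinning user pages; a real driver would call get_user_pages(). */
static void pin_user_range(const void *ptr, size_t len)
{
	(void)ptr;
	(void)len;
	pinned_count++;
}

/* Models releasing the pages once the transfer has completed. */
static void unpin_user_range(const void *ptr, size_t len)
{
	(void)ptr;
	(void)len;
	assert(pinned_count > 0);
	pinned_count--;
}

/*
 * Copy len bytes from a user buffer into a driver-owned buffer.
 * The user range is pinned only for the duration of the copy, so
 * nothing refers to the user pointer after this function returns.
 * Returns 0 on success, -1 on bad arguments.
 */
static int upload_from_user(struct gem_buf *dst, const void *src, size_t len)
{
	if (dst == NULL || src == NULL || len > dst->size)
		return -1;

	pin_user_range(src, len);
	memcpy(dst->data, src, len);	/* stands in for the DMA transfer */
	unpin_user_range(src, len);
	return 0;
}
```

Because the destination buffer owns its own storage in this model, the caller
can free() or reuse the source pointer as soon as upload_from_user() returns,
which is the lifetime property a long-lived GEM object wrapping the user
pointer cannot give.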


end of thread, other threads:[~2012-05-16 13:27 UTC | newest]

Thread overview: 77+ messages
2012-04-23 13:43 [PATCH 0/4] updated exynos-drm-next Inki Dae
2012-04-23 13:43 ` [PATCH 1/4] drm/exynos: added cache attribute support for gem Inki Dae
2012-04-23 13:43 ` [PATCH 2/4] drm/exynos: added drm prime feature Inki Dae
2012-04-23 13:43 ` [PATCH 3/4] drm/exynos: added userptr feature Inki Dae
2012-04-24  5:17   ` [PATCH v2 " Inki Dae
2012-04-25 10:15     ` Dave Airlie
2012-04-25 12:46       ` InKi Dae
2012-05-05 10:19       ` daeinki
2012-05-05 10:22         ` Dave Airlie
2012-05-07 18:18           ` Jerome Glisse
2012-05-08  7:59             ` Inki Dae
2012-05-08 15:05               ` Jerome Glisse
2012-05-08  6:48           ` Inki Dae
2012-05-09  6:17   ` [PATCH 0/2 v3] " Inki Dae
2012-05-09  6:17     ` [PATCH 1/2 v3] drm/exynos: added userptr limit ioctl Inki Dae
2012-05-09  6:17     ` [PATCH 2/2 v3] drm/exynos: added userptr feature Inki Dae
2012-05-09 14:45       ` Jerome Glisse
2012-05-09 18:32         ` Jerome Glisse
2012-05-10  2:44           ` Inki Dae
2012-05-10 15:05             ` Jerome Glisse
2012-05-10 15:31               ` Daniel Vetter
2012-05-10 15:31                 ` Daniel Vetter
2012-05-10 15:52                 ` Jerome Glisse
2012-05-11  1:47                   ` Inki Dae
2012-05-11  2:08                     ` Minchan Kim
2012-05-10  1:39         ` Inki Dae
2012-05-10  4:58           ` Minchan Kim
2012-05-10  6:53             ` KOSAKI Motohiro
2012-05-10  7:27               ` Minchan Kim
2012-05-10  7:31                 ` Kyungmin Park
2012-05-10  7:56                   ` Minchan Kim
2012-05-10  7:58                     ` Minchan Kim
2012-05-10  6:57             ` Inki Dae
2012-05-10  7:05               ` Minchan Kim
2012-05-10  7:59                 ` InKi Dae
2012-05-10  8:11                   ` Minchan Kim
2012-05-10  8:44                     ` Inki Dae
2012-05-10 17:53                       ` KOSAKI Motohiro
2012-05-11  0:50                         ` Minchan Kim
2012-05-11  2:51                           ` KOSAKI Motohiro
2012-05-11  3:01                             ` Jerome Glisse
2012-05-11 21:20                               ` KOSAKI Motohiro
2012-05-11 22:22                                 ` Jerome Glisse
2012-05-11 22:59                                   ` KOSAKI Motohiro
2012-05-11 22:59                                     ` KOSAKI Motohiro
2012-05-11 23:29                                     ` Jerome Glisse
2012-05-11 23:39                                       ` KOSAKI Motohiro
2012-05-12  4:48                                         ` InKi Dae
2012-05-14  4:29                                           ` Minchan Kim
2012-05-14  4:29                                             ` Minchan Kim
2012-05-14  6:17     ` [PATCH 0/2 v4] " Inki Dae
2012-05-14  6:17       ` [PATCH 1/2 v4] drm/exynos: added userptr limit ioctl Inki Dae
2012-05-14  8:12         ` Inki Dae
2012-05-14  6:17       ` [PATCH 2/2 v4] drm/exynos: added userptr feature Inki Dae
2012-05-14  6:33         ` KOSAKI Motohiro
2012-05-14  6:52           ` Inki Dae
2012-05-14  7:04             ` KOSAKI Motohiro
2012-05-14  7:21               ` Inki Dae
2012-05-14  8:13               ` Inki Dae
2012-05-14 19:26         ` Jerome Glisse
2012-05-15  4:33           ` Inki Dae
2012-05-15 14:31             ` Jerome Glisse
2012-05-16  8:49               ` Inki Dae
2012-05-14  8:12       ` [PATCH 0/2 " Inki Dae
2012-05-15  7:34   ` [PATCH 3/4] " Rob Clark
2012-05-15  8:17     ` Inki Dae
2012-05-15  9:35       ` Rob Clark
2012-05-15 13:40         ` InKi Dae
2012-05-15 14:28           ` Rob Clark
2012-05-16  6:04             ` Inki Dae
2012-05-16  8:42               ` Rob Clark
2012-05-16 10:30                 ` Inki Dae
2012-05-16  9:22     ` Dave Airlie
2012-05-16 10:20       ` Inki Dae
2012-05-16 12:12         ` Rob Clark
2012-05-16 13:27           ` Inki Dae
2012-04-23 13:43 ` [PATCH 4/4] drm/exynos: added a feature to get gem buffer information Inki Dae
