All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/5] Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT
@ 2016-01-26 14:53 Tvrtko Ursulin
  2016-01-26 14:53 ` [RFC 1/5] drm: Allow drivers setting vm_ops per vma offset node Tvrtko Ursulin
                   ` (6 more replies)
  0 siblings, 7 replies; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-26 14:53 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

I had this code laying around from an abandoned project and decided to float it
in case someone can see the usefulness of it. Assuming the approach is even
remotely reasonable.

Currently the driver implements two ioctls to implement mmap(2) functionality.
Between the i915_gem_mmap_ioctl being not the right way to do it (it makes using
Valgrind impossible etc.), and i915_gem_mmap_gtt only supporting mapping via
GTT, there is an obvious way to improve things.

So this patch series tries to do that by extending the proper ioctl to support
mapping via CPU.

Approach I took was to extend the DRM offset manager to support assigning
vm operations with each node, instead of only having one set per-driver.

Won't spend too much time describing it unless there is real interest to
develop this or something like this further and the approach is at least
generally okay-ish.

Tvrtko Ursulin (5):
  drm: Allow drivers setting vm_ops per vma offset node
  drm/i915: Extract code mapping errno to vm fault code
  drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  drm/i915: Add support for write-combined CPU mapping to
    DRM_IOCTL_I915_GEM_MMAP_GTT
  drm/i915: Announce the new DRM_IOCTL_I915_GEM_MMAP_GTT capabilities

 drivers/gpu/drm/drm_gem.c         |  53 ++++++----
 drivers/gpu/drm/drm_vma_manager.c |   1 +
 drivers/gpu/drm/i915/i915_dma.c   |   3 +
 drivers/gpu/drm/i915/i915_gem.c   | 207 +++++++++++++++++++++++++++++---------
 include/drm/drm_vma_manager.h     |  13 +++
 include/uapi/drm/i915_drm.h       |   5 +
 6 files changed, 216 insertions(+), 66 deletions(-)

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC 1/5] drm: Allow drivers setting vm_ops per vma offset node
  2016-01-26 14:53 [RFC 0/5] Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
@ 2016-01-26 14:53 ` Tvrtko Ursulin
  2016-01-26 14:53 ` [RFC 2/5] drm/i915: Extract code mapping errno to vm fault code Tvrtko Ursulin
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-26 14:53 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This allows drivers to implement per-object mmap(2) strategies.

If not set via the drm_vma_node_set_vm_ops helper, driver
default vm_ops are used preserving compatibility with the
existing code base.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/drm_gem.c         | 53 ++++++++++++++++++++++++---------------
 drivers/gpu/drm/drm_vma_manager.c |  1 +
 include/drm/drm_vma_manager.h     | 13 ++++++++++
 3 files changed, 47 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index 2e8c77e71e1f..90d9c1141af3 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -824,6 +824,31 @@ void drm_gem_vm_close(struct vm_area_struct *vma)
 }
 EXPORT_SYMBOL(drm_gem_vm_close);
 
+static int drm_gem_mmap_obj_ops(struct drm_gem_object *obj,
+				unsigned long obj_size,
+				const struct vm_operations_struct *vm_ops,
+				struct vm_area_struct *vma)
+{
+	/* Check for valid size. */
+	if (obj_size < vma->vm_end - vma->vm_start)
+		return -EINVAL;
+
+	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
+	vma->vm_ops = vm_ops;
+	vma->vm_private_data = obj;
+	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+
+	/* Take a ref for this mapping of the object, so that the fault
+	 * handler can dereference the mmap offset's pointer to the object.
+	 * This reference is cleaned up by the corresponding vm_close
+	 * (which should happen whether the vma was created by this call, or
+	 * by a vm_open due to mremap or partial unmap or whatever).
+	 */
+	drm_gem_object_reference(obj);
+
+	return 0;
+}
+
 /**
  * drm_gem_mmap_obj - memory map a GEM object
  * @obj: the GEM object to map
@@ -853,27 +878,11 @@ int drm_gem_mmap_obj(struct drm_gem_object *obj, unsigned long obj_size,
 {
 	struct drm_device *dev = obj->dev;
 
-	/* Check for valid size. */
-	if (obj_size < vma->vm_end - vma->vm_start)
-		return -EINVAL;
-
 	if (!dev->driver->gem_vm_ops)
 		return -EINVAL;
 
-	vma->vm_flags |= VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP;
-	vma->vm_ops = dev->driver->gem_vm_ops;
-	vma->vm_private_data = obj;
-	vma->vm_page_prot = pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
-
-	/* Take a ref for this mapping of the object, so that the fault
-	 * handler can dereference the mmap offset's pointer to the object.
-	 * This reference is cleaned up by the corresponding vm_close
-	 * (which should happen whether the vma was created by this call, or
-	 * by a vm_open due to mremap or partial unmap or whatever).
-	 */
-	drm_gem_object_reference(obj);
-
-	return 0;
+	return drm_gem_mmap_obj_ops(obj, obj_size, dev->driver->gem_vm_ops,
+				    vma);
 }
 EXPORT_SYMBOL(drm_gem_mmap_obj);
 
@@ -898,6 +907,7 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 	struct drm_device *dev = priv->minor->dev;
 	struct drm_gem_object *obj = NULL;
 	struct drm_vma_offset_node *node;
+	const struct vm_operations_struct *vm_ops;
 	int ret;
 
 	if (drm_device_is_unplugged(dev))
@@ -932,8 +942,11 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 		return -EACCES;
 	}
 
-	ret = drm_gem_mmap_obj(obj, drm_vma_node_size(node) << PAGE_SHIFT,
-			       vma);
+	vm_ops = node->vm_ops;
+	if (!vm_ops)
+		vm_ops = dev->driver->gem_vm_ops;
+	ret = drm_gem_mmap_obj_ops(obj, drm_vma_node_size(node) << PAGE_SHIFT,
+				   vm_ops, vma);
 
 	drm_gem_object_unreference_unlocked(obj);
 
diff --git a/drivers/gpu/drm/drm_vma_manager.c b/drivers/gpu/drm/drm_vma_manager.c
index 2f2ecde8285b..e73776757554 100644
--- a/drivers/gpu/drm/drm_vma_manager.c
+++ b/drivers/gpu/drm/drm_vma_manager.c
@@ -265,6 +265,7 @@ void drm_vma_offset_remove(struct drm_vma_offset_manager *mgr,
 		rb_erase(&node->vm_rb, &mgr->vm_addr_space_rb);
 		drm_mm_remove_node(&node->vm_node);
 		memset(&node->vm_node, 0, sizeof(node->vm_node));
+		node->vm_ops = NULL;
 	}
 
 	write_unlock(&mgr->vm_lock);
diff --git a/include/drm/drm_vma_manager.h b/include/drm/drm_vma_manager.h
index 2f63dd5e05eb..78f02911a5c5 100644
--- a/include/drm/drm_vma_manager.h
+++ b/include/drm/drm_vma_manager.h
@@ -42,6 +42,7 @@ struct drm_vma_offset_node {
 	struct drm_mm_node vm_node;
 	struct rb_node vm_rb;
 	struct rb_root vm_files;
+	const struct vm_operations_struct *vm_ops;
 };
 
 struct drm_vma_offset_manager {
@@ -244,4 +245,16 @@ static inline int drm_vma_node_verify_access(struct drm_vma_offset_node *node,
 	return drm_vma_node_is_allowed(node, filp) ? 0 : -EACCES;
 }
 
+static inline int
+drm_vma_node_set_vm_ops(struct drm_vma_offset_node *node,
+			const struct vm_operations_struct *vm_ops)
+{
+	if (node->vm_ops && node->vm_ops != vm_ops)
+		return -ENODEV;
+
+	node->vm_ops = vm_ops;
+
+	return 0;
+}
+
 #endif /* __DRM_VMA_MANAGER_H__ */
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC 2/5] drm/i915: Extract code mapping errno to vm fault code
  2016-01-26 14:53 [RFC 0/5] Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
  2016-01-26 14:53 ` [RFC 1/5] drm: Allow drivers setting vm_ops per vma offset node Tvrtko Ursulin
@ 2016-01-26 14:53 ` Tvrtko Ursulin
  2016-01-26 15:18   ` Chris Wilson
  2016-01-26 14:53 ` [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-26 14:53 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Will be used from multiple callers in a following patch.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 91 ++++++++++++++++++++++-------------------
 1 file changed, 49 insertions(+), 42 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index af15c290c71d..dacf6a0013c5 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1772,6 +1772,53 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+static int
+i915_gem_ret_to_vm_ret(struct drm_i915_private *dev_priv, int ret)
+{
+	switch (ret) {
+	case -EIO:
+		/*
+		 * We eat errors when the gpu is terminally wedged to avoid
+		 * userspace unduly crashing (gl has no provisions for mmaps to
+		 * fail). But any other -EIO isn't ours (e.g. swap in failure)
+		 * and so needs to be reported.
+		 */
+		if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
+			ret = VM_FAULT_SIGBUS;
+			break;
+		}
+	case -EAGAIN:
+		/*
+		 * EAGAIN means the gpu is hung and we'll wait for the error
+		 * handler to reset everything when re-faulting in
+		 * i915_mutex_lock_interruptible.
+		 */
+	case 0:
+	case -ERESTARTSYS:
+	case -EINTR:
+	case -EBUSY:
+		/*
+		 * EBUSY is ok: this just means that another thread
+		 * already did the job.
+		 */
+		ret = VM_FAULT_NOPAGE;
+		break;
+	case -ENOMEM:
+		ret = VM_FAULT_OOM;
+		break;
+	case -ENOSPC:
+	case -EFAULT:
+		ret = VM_FAULT_SIGBUS;
+		break;
+	default:
+		WARN_ONCE(ret, "unhandled error in page fault\n");
+		ret = VM_FAULT_SIGBUS;
+		break;
+	}
+
+	return ret;
+}
+
 /**
  * i915_gem_fault - fault a page into the GTT
  * @vma: VMA in question
@@ -1902,49 +1949,9 @@ unpin:
 unlock:
 	mutex_unlock(&dev->struct_mutex);
 out:
-	switch (ret) {
-	case -EIO:
-		/*
-		 * We eat errors when the gpu is terminally wedged to avoid
-		 * userspace unduly crashing (gl has no provisions for mmaps to
-		 * fail). But any other -EIO isn't ours (e.g. swap in failure)
-		 * and so needs to be reported.
-		 */
-		if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
-			ret = VM_FAULT_SIGBUS;
-			break;
-		}
-	case -EAGAIN:
-		/*
-		 * EAGAIN means the gpu is hung and we'll wait for the error
-		 * handler to reset everything when re-faulting in
-		 * i915_mutex_lock_interruptible.
-		 */
-	case 0:
-	case -ERESTARTSYS:
-	case -EINTR:
-	case -EBUSY:
-		/*
-		 * EBUSY is ok: this just means that another thread
-		 * already did the job.
-		 */
-		ret = VM_FAULT_NOPAGE;
-		break;
-	case -ENOMEM:
-		ret = VM_FAULT_OOM;
-		break;
-	case -ENOSPC:
-	case -EFAULT:
-		ret = VM_FAULT_SIGBUS;
-		break;
-	default:
-		WARN_ONCE(ret, "unhandled error in i915_gem_fault: %i\n", ret);
-		ret = VM_FAULT_SIGBUS;
-		break;
-	}
-
 	intel_runtime_pm_put(dev_priv);
-	return ret;
+
+	return i915_gem_ret_to_vm_ret(dev_priv, ret);
 }
 
 /**
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-26 14:53 [RFC 0/5] Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
  2016-01-26 14:53 ` [RFC 1/5] drm: Allow drivers setting vm_ops per vma offset node Tvrtko Ursulin
  2016-01-26 14:53 ` [RFC 2/5] drm/i915: Extract code mapping errno to vm fault code Tvrtko Ursulin
@ 2016-01-26 14:53 ` Tvrtko Ursulin
  2016-01-26 15:10   ` Chris Wilson
  2016-01-27 15:21   ` [PATCH v2 " Tvrtko Ursulin
  2016-01-26 14:53 ` [RFC 4/5] drm/i915: Add support for write-combined " Tvrtko Ursulin
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-26 14:53 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 96 ++++++++++++++++++++++++++++++++++++++---
 include/uapi/drm/i915_drm.h     |  3 ++
 2 files changed, 93 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index dacf6a0013c5..039d55a49fc6 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1954,6 +1954,60 @@ out:
 	return i915_gem_ret_to_vm_ret(dev_priv, ret);
 }
 
+static int
+i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	struct drm_i915_gem_object *obj = to_intel_bo(vma->vm_private_data);
+	struct drm_device *dev = obj->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
+	pgoff_t page_offset;
+	struct page *page;
+	int ret;
+
+	/* We don't use vmf->pgoff since that has the fake offset */
+	page_offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >>
+			PAGE_SHIFT;
+
+	trace_i915_gem_object_fault(obj, page_offset, true, write);
+
+	intel_runtime_pm_get(dev_priv);
+
+	ret = i915_mutex_lock_interruptible(dev);
+	if (ret)
+		goto out;
+
+	ret = i915_gem_object_set_to_cpu_domain(obj, write);
+	if (ret)
+		goto out_unlock;
+
+	ret = i915_gem_object_get_pages(obj);
+	if (ret)
+		goto out_unlock;
+
+	page = i915_gem_object_get_page(obj, page_offset);
+	if (!page) {
+		ret = -ERANGE;
+		goto out_unlock;
+	}
+
+	mutex_unlock(&dev->struct_mutex);
+
+	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address,
+			    page_to_pfn(page));
+
+	intel_runtime_pm_put(dev_priv);
+
+	return i915_gem_ret_to_vm_ret(dev_priv, ret);
+
+out_unlock:
+	mutex_unlock(&dev->struct_mutex);
+out:
+	intel_runtime_pm_put(dev_priv);
+
+	return i915_gem_ret_to_vm_ret(dev_priv, ret);
+}
+
 /**
  * i915_gem_release_mmap - remove physical page mappings
  * @obj: obj in question
@@ -2078,11 +2132,18 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
 	drm_gem_free_mmap_offset(&obj->base);
 }
 
-int
-i915_gem_mmap_gtt(struct drm_file *file,
-		  struct drm_device *dev,
-		  uint32_t handle,
-		  uint64_t *offset)
+static const struct vm_operations_struct i915_gem_cpu_vm_ops = {
+	.fault = i915_gem_cpu_fault,
+	.open = drm_gem_vm_open,
+	.close = drm_gem_vm_close,
+};
+
+static int
+i915_gem_mmap(struct drm_file *file,
+	      struct drm_device *dev,
+	      uint32_t handle,
+	      uint32_t flags,
+	      uint64_t *offset)
 {
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -2103,10 +2164,23 @@ i915_gem_mmap_gtt(struct drm_file *file,
 		goto out;
 	}
 
+	if (!obj->base.filp && (flags & I915_MMAP2_CPU)) {
+		DRM_DEBUG("Attempting to mmap non-shm based object via CPU!\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
 	ret = i915_gem_object_create_mmap_offset(obj);
 	if (ret)
 		goto out;
 
+	if (flags & I915_MMAP2_CPU) {
+		ret = drm_vma_node_set_vm_ops(&obj->base.vma_node,
+					      &i915_gem_cpu_vm_ops);
+		if (ret)
+			goto out;
+	}
+
 	*offset = drm_vma_node_offset_addr(&obj->base.vma_node);
 
 out:
@@ -2116,6 +2190,15 @@ unlock:
 	return ret;
 }
 
+int
+i915_gem_mmap_gtt(struct drm_file *file,
+		  struct drm_device *dev,
+		  uint32_t handle,
+		  uint64_t *offset)
+{
+	return i915_gem_mmap(file, dev, handle, 0, offset);
+}
+
 /**
  * i915_gem_mmap_gtt_ioctl - prepare an object for GTT mmap'ing
  * @dev: DRM device
@@ -2137,7 +2220,8 @@ i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_gem_mmap_gtt *args = data;
 
-	return i915_gem_mmap_gtt(file, dev, args->handle, &args->offset);
+	return i915_gem_mmap(file, dev, args->handle, args->flags,
+			     &args->offset);
 }
 
 /* Immediately discard the backing storage */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 6a19371391fa..359a36d604bb 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -528,6 +528,9 @@ struct drm_i915_gem_mmap_gtt {
 	 * This is a fixed-size type for 32/64 compatibility.
 	 */
 	__u64 offset;
+
+#define I915_MMAP2_CPU   0x1
+	__u64 flags;
 };
 
 struct drm_i915_gem_set_domain {
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC 4/5] drm/i915: Add support for write-combined CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-26 14:53 [RFC 0/5] Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
                   ` (2 preceding siblings ...)
  2016-01-26 14:53 ` [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
@ 2016-01-26 14:53 ` Tvrtko Ursulin
  2016-01-26 15:11   ` Chris Wilson
  2016-01-27 15:22   ` [PATCH v2 " Tvrtko Ursulin
  2016-01-26 14:53 ` [RFC 5/5] drm/i915: Announce the new DRM_IOCTL_I915_GEM_MMAP_GTT capabilities Tvrtko Ursulin
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-26 14:53 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 36 ++++++++++++++++++++++++++++++------
 include/uapi/drm/i915_drm.h     |  3 ++-
 2 files changed, 32 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 039d55a49fc6..5b805cda9b52 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1954,8 +1954,8 @@ out:
 	return i915_gem_ret_to_vm_ret(dev_priv, ret);
 }
 
-static int
-i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+static inline int
+__i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf, bool wc)
 {
 	struct drm_i915_gem_object *obj = to_intel_bo(vma->vm_private_data);
 	struct drm_device *dev = obj->base.dev;
@@ -1996,6 +1996,10 @@ i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address,
 			    page_to_pfn(page));
 
+	if (ret == 0 && wc)
+		vma->vm_page_prot =
+			pgprot_writecombine(vm_get_page_prot(vma->vm_flags));
+
 	intel_runtime_pm_put(dev_priv);
 
 	return i915_gem_ret_to_vm_ret(dev_priv, ret);
@@ -2008,6 +2012,18 @@ out:
 	return i915_gem_ret_to_vm_ret(dev_priv, ret);
 }
 
+static int
+i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	return __i915_gem_cpu_fault(vma, vmf, false);
+}
+
+static int
+i915_gem_cpu_wc_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	return __i915_gem_cpu_fault(vma, vmf, true);
+}
+
 /**
  * i915_gem_release_mmap - remove physical page mappings
  * @obj: obj in question
@@ -2138,6 +2154,12 @@ static const struct vm_operations_struct i915_gem_cpu_vm_ops = {
 	.close = drm_gem_vm_close,
 };
 
+static const struct vm_operations_struct i915_gem_cpu_wc_vm_ops = {
+	.fault = i915_gem_cpu_wc_fault,
+	.open = drm_gem_vm_open,
+	.close = drm_gem_vm_close,
+};
+
 static int
 i915_gem_mmap(struct drm_file *file,
 	      struct drm_device *dev,
@@ -2174,12 +2196,14 @@ i915_gem_mmap(struct drm_file *file,
 	if (ret)
 		goto out;
 
-	if (flags & I915_MMAP2_CPU) {
+	if (flags & I915_MMAP2_CPU_WC)
+		ret = drm_vma_node_set_vm_ops(&obj->base.vma_node,
+					      &i915_gem_cpu_wc_vm_ops);
+	else if (flags & I915_MMAP2_CPU)
 		ret = drm_vma_node_set_vm_ops(&obj->base.vma_node,
 					      &i915_gem_cpu_vm_ops);
-		if (ret)
-			goto out;
-	}
+	if (ret)
+		goto out;
 
 	*offset = drm_vma_node_offset_addr(&obj->base.vma_node);
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 359a36d604bb..05e74c7aef4f 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -529,7 +529,8 @@ struct drm_i915_gem_mmap_gtt {
 	 */
 	__u64 offset;
 
-#define I915_MMAP2_CPU   0x1
+#define I915_MMAP2_CPU    0x1
+#define I915_MMAP2_CPU_WC 0x3
 	__u64 flags;
 };
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [RFC 5/5] drm/i915: Announce the new DRM_IOCTL_I915_GEM_MMAP_GTT capabilities
  2016-01-26 14:53 [RFC 0/5] Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
                   ` (3 preceding siblings ...)
  2016-01-26 14:53 ` [RFC 4/5] drm/i915: Add support for write-combined " Tvrtko Ursulin
@ 2016-01-26 14:53 ` Tvrtko Ursulin
  2016-01-28  9:18 ` ✓ Fi.CI.BAT: success for Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT (rev3) Patchwork
  2016-01-28 16:10 ` Patchwork
  6 siblings, 0 replies; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-26 14:53 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c | 3 +++
 include/uapi/drm/i915_drm.h     | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 4725e8d61d0f..b22cbe5215d2 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -172,6 +172,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_SOFTPIN:
 		value = 1;
 		break;
+	case I915_PARAM_MMAP_GTT_VERSION:
+		value = 2;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 05e74c7aef4f..886056a3f858 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -357,6 +357,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_GPU_RESET	 35
 #define I915_PARAM_HAS_RESOURCE_STREAMER 36
 #define I915_PARAM_HAS_EXEC_SOFTPIN	 37
+#define I915_PARAM_MMAP_GTT_VERSION	 38
 
 typedef struct drm_i915_getparam {
 	__s32 param;
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-26 14:53 ` [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
@ 2016-01-26 15:10   ` Chris Wilson
  2016-01-26 16:23     ` Tvrtko Ursulin
  2016-01-27 15:21   ` [PATCH v2 " Tvrtko Ursulin
  1 sibling, 1 reply; 24+ messages in thread
From: Chris Wilson @ 2016-01-26 15:10 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Tue, Jan 26, 2016 at 02:53:31PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 96 ++++++++++++++++++++++++++++++++++++++---
>  include/uapi/drm/i915_drm.h     |  3 ++
>  2 files changed, 93 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index dacf6a0013c5..039d55a49fc6 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1954,6 +1954,60 @@ out:
>  	return i915_gem_ret_to_vm_ret(dev_priv, ret);
>  }
>  
> +static int
> +i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> +{
> +	struct drm_i915_gem_object *obj = to_intel_bo(vma->vm_private_data);
> +	struct drm_device *dev = obj->base.dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
> +	pgoff_t page_offset;
> +	struct page *page;
> +	int ret;
> +
> +	/* We don't use vmf->pgoff since that has the fake offset */
> +	page_offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >>
> +			PAGE_SHIFT;
> +
> +	trace_i915_gem_object_fault(obj, page_offset, true, write);
> +
> +	intel_runtime_pm_get(dev_priv);
> +
> +	ret = i915_mutex_lock_interruptible(dev);
> +	if (ret)
> +		goto out;
> +
> +	ret = i915_gem_object_set_to_cpu_domain(obj, write);
> +	if (ret)
> +		goto out_unlock;

That was a mistake in the GTT gem_fault(). If you do this, we also want
the nonblocking wait for obvious reasons.

> +	ret = i915_gem_object_get_pages(obj);
> +	if (ret)
> +		goto out_unlock;
> +
> +	page = i915_gem_object_get_page(obj, page_offset);
> +	if (!page) {
> +		ret = -ERANGE;
> +		goto out_unlock;
> +	}
> +
> +	mutex_unlock(&dev->struct_mutex);
> +
> +	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address,
> +			    page_to_pfn(page));

We don't have a page ref at this point, so this obj+page could be
freed (via the shrinker at least) before we insert it.

I would also be more interested in having a version that faulted the
entire object at once - though maybe we will see more random access in
future.

> +	intel_runtime_pm_put(dev_priv);
> +
> +	return i915_gem_ret_to_vm_ret(dev_priv, ret);
> +
> +out_unlock:
> +	mutex_unlock(&dev->struct_mutex);
> +out:
> +	intel_runtime_pm_put(dev_priv);
> +
> +	return i915_gem_ret_to_vm_ret(dev_priv, ret);
> +}
> +
>  /**
>   * i915_gem_release_mmap - remove physical page mappings
>   * @obj: obj in question
> @@ -2078,11 +2132,18 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
>  	drm_gem_free_mmap_offset(&obj->base);
>  }
>  
> -int
> -i915_gem_mmap_gtt(struct drm_file *file,
> -		  struct drm_device *dev,
> -		  uint32_t handle,
> -		  uint64_t *offset)
> +static const struct vm_operations_struct i915_gem_cpu_vm_ops = {
> +	.fault = i915_gem_cpu_fault,
> +	.open = drm_gem_vm_open,
> +	.close = drm_gem_vm_close,
> +};
> +
> +static int
> +i915_gem_mmap(struct drm_file *file,
> +	      struct drm_device *dev,
> +	      uint32_t handle,
> +	      uint32_t flags,
> +	      uint64_t *offset)
>  {
>  	struct drm_i915_gem_object *obj;
>  	int ret;
> @@ -2103,10 +2164,23 @@ i915_gem_mmap_gtt(struct drm_file *file,
>  		goto out;
>  	}
>  
> +	if (!obj->base.filp && (flags & I915_MMAP2_CPU)) {
> +		DRM_DEBUG("Attempting to mmap non-shm based object via CPU!\n");
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
>  	ret = i915_gem_object_create_mmap_offset(obj);
>  	if (ret)
>  		goto out;
>  
> +	if (flags & I915_MMAP2_CPU) {
> +		ret = drm_vma_node_set_vm_ops(&obj->base.vma_node,
> +					      &i915_gem_cpu_vm_ops);
> +		if (ret)
> +			goto out;
> +	}

We would also need a WC equivalent.

It looks fairly sane. I wanted this just a short while ago, but figured
out a way of using regular mmap() to give me the inheritance instead.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 4/5] drm/i915: Add support for write-combined CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-26 14:53 ` [RFC 4/5] drm/i915: Add support for write-combined " Tvrtko Ursulin
@ 2016-01-26 15:11   ` Chris Wilson
  2016-01-27 15:22   ` [PATCH v2 " Tvrtko Ursulin
  1 sibling, 0 replies; 24+ messages in thread
From: Chris Wilson @ 2016-01-26 15:11 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Tue, Jan 26, 2016 at 02:53:32PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Speak of the devil.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 2/5] drm/i915: Extract code mapping errno to vm fault code
  2016-01-26 14:53 ` [RFC 2/5] drm/i915: Extract code mapping errno to vm fault code Tvrtko Ursulin
@ 2016-01-26 15:18   ` Chris Wilson
  2016-01-26 16:24     ` Tvrtko Ursulin
  0 siblings, 1 reply; 24+ messages in thread
From: Chris Wilson @ 2016-01-26 15:18 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Tue, Jan 26, 2016 at 02:53:30PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> Will be used from multiple callers in a following patch.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 91 ++++++++++++++++++++++-------------------
>  1 file changed, 49 insertions(+), 42 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index af15c290c71d..dacf6a0013c5 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1772,6 +1772,53 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>  	return 0;
>  }
>  
> +static int
> +i915_gem_ret_to_vm_ret(struct drm_i915_private *dev_priv, int ret)
> +{
> +	switch (ret) {
> +	case -EIO:
> +		/*
> +		 * We eat errors when the gpu is terminally wedged to avoid
> +		 * userspace unduly crashing (gl has no provisions for mmaps to
> +		 * fail). But any other -EIO isn't ours (e.g. swap in failure)
> +		 * and so needs to be reported.
> +		 */
> +		if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
> +			ret = VM_FAULT_SIGBUS;
> +			break;
> +		}
> +	case -EAGAIN:
> +		/*
> +		 * EAGAIN means the gpu is hung and we'll wait for the error
> +		 * handler to reset everything when re-faulting in
> +		 * i915_mutex_lock_interruptible.
> +		 */
> +	case 0:
> +	case -ERESTARTSYS:
> +	case -EINTR:
> +	case -EBUSY:
> +		/*
> +		 * EBUSY is ok: this just means that another thread
> +		 * already did the job.
> +		 */
> +		ret = VM_FAULT_NOPAGE;
> +		break;
> +	case -ENOMEM:
> +		ret = VM_FAULT_OOM;
> +		break;
> +	case -ENOSPC:
> +	case -EFAULT:
> +		ret = VM_FAULT_SIGBUS;
> +		break;
> +	default:
> +		WARN_ONCE(ret, "unhandled error in page fault\n");
> +		ret = VM_FAULT_SIGBUS;
> +		break;
> +	}

So without having to pin (plus a few other similar changes), we only need
to report

-ENOMEM, -ENOSPC, -EIO (from shmemfs) and -EFAULT (get_pages).

Given that I'd rather have the reasoning behind each explicit.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-26 15:10   ` Chris Wilson
@ 2016-01-26 16:23     ` Tvrtko Ursulin
  2016-01-26 16:59       ` Chris Wilson
  0 siblings, 1 reply; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-26 16:23 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx


On 26/01/16 15:10, Chris Wilson wrote:
> On Tue, Jan 26, 2016 at 02:53:31PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem.c | 96 ++++++++++++++++++++++++++++++++++++++---
>>   include/uapi/drm/i915_drm.h     |  3 ++
>>   2 files changed, 93 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index dacf6a0013c5..039d55a49fc6 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -1954,6 +1954,60 @@ out:
>>   	return i915_gem_ret_to_vm_ret(dev_priv, ret);
>>   }
>>
>> +static int
>> +i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>> +{
>> +	struct drm_i915_gem_object *obj = to_intel_bo(vma->vm_private_data);
>> +	struct drm_device *dev = obj->base.dev;
>> +	struct drm_i915_private *dev_priv = dev->dev_private;
>> +	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
>> +	pgoff_t page_offset;
>> +	struct page *page;
>> +	int ret;
>> +
>> +	/* We don't use vmf->pgoff since that has the fake offset */
>> +	page_offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >>
>> +			PAGE_SHIFT;
>> +
>> +	trace_i915_gem_object_fault(obj, page_offset, true, write);
>> +
>> +	intel_runtime_pm_get(dev_priv);
>> +
>> +	ret = i915_mutex_lock_interruptible(dev);
>> +	if (ret)
>> +		goto out;
>> +
>> +	ret = i915_gem_object_set_to_cpu_domain(obj, write);
>> +	if (ret)
>> +		goto out_unlock;
>
> That was a mistake in the GTT gem_fault(). If you do this, we also want
> the nonblocking wait for obvious reasons.

You suggest leaving it for userspace?

And how would a non-blocking wait work?

>
>> +	ret = i915_gem_object_get_pages(obj);
>> +	if (ret)
>> +		goto out_unlock;
>> +
>> +	page = i915_gem_object_get_page(obj, page_offset);
>> +	if (!page) {
>> +		ret = -ERANGE;
>> +		goto out_unlock;
>> +	}
>> +
>> +	mutex_unlock(&dev->struct_mutex);
>> +
>> +	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address,
>> +			    page_to_pfn(page));
>
> We don't have a page ref at this point, so this obj+page could be
> freed (via the shrinker at least) before we insert it.

Oh yeah, need to pin the pages..

> I would also be more interested in having a version that faulted the
> entire object at once - though maybe we will see more random access in
> future.

Yeah I did not want to concern myself with more code since this was a 
proof of concept anyway.

>> +	intel_runtime_pm_put(dev_priv);
>> +
>> +	return i915_gem_ret_to_vm_ret(dev_priv, ret);
>> +
>> +out_unlock:
>> +	mutex_unlock(&dev->struct_mutex);
>> +out:
>> +	intel_runtime_pm_put(dev_priv);
>> +
>> +	return i915_gem_ret_to_vm_ret(dev_priv, ret);
>> +}
>> +
>>   /**
>>    * i915_gem_release_mmap - remove physical page mappings
>>    * @obj: obj in question
>> @@ -2078,11 +2132,18 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
>>   	drm_gem_free_mmap_offset(&obj->base);
>>   }
>>
>> -int
>> -i915_gem_mmap_gtt(struct drm_file *file,
>> -		  struct drm_device *dev,
>> -		  uint32_t handle,
>> -		  uint64_t *offset)
>> +static const struct vm_operations_struct i915_gem_cpu_vm_ops = {
>> +	.fault = i915_gem_cpu_fault,
>> +	.open = drm_gem_vm_open,
>> +	.close = drm_gem_vm_close,
>> +};
>> +
>> +static int
>> +i915_gem_mmap(struct drm_file *file,
>> +	      struct drm_device *dev,
>> +	      uint32_t handle,
>> +	      uint32_t flags,
>> +	      uint64_t *offset)
>>   {
>>   	struct drm_i915_gem_object *obj;
>>   	int ret;
>> @@ -2103,10 +2164,23 @@ i915_gem_mmap_gtt(struct drm_file *file,
>>   		goto out;
>>   	}
>>
>> +	if (!obj->base.filp && (flags & I915_MMAP2_CPU)) {
>> +		DRM_DEBUG("Attempting to mmap non-shm based object via CPU!\n");
>> +		ret = -EINVAL;
>> +		goto out;
>> +	}
>> +
>>   	ret = i915_gem_object_create_mmap_offset(obj);
>>   	if (ret)
>>   		goto out;
>>
>> +	if (flags & I915_MMAP2_CPU) {
>> +		ret = drm_vma_node_set_vm_ops(&obj->base.vma_node,
>> +					      &i915_gem_cpu_vm_ops);
>> +		if (ret)
>> +			goto out;
>> +	}
>
> We would also need a WC equivalent.
>
> It looks fairly sane. I wanted this just a short while ago, but figured
> out a way of using regular mmap() to give me the inheritance instead.

So would it be useful to cleanup and finish this work or not?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 2/5] drm/i915: Extract code mapping errno to vm fault code
  2016-01-26 15:18   ` Chris Wilson
@ 2016-01-26 16:24     ` Tvrtko Ursulin
  2016-01-26 16:42       ` Chris Wilson
  0 siblings, 1 reply; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-26 16:24 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx



On 26/01/16 15:18, Chris Wilson wrote:
> On Tue, Jan 26, 2016 at 02:53:30PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> Will be used from multiple callers in a following patch.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> ---
>>   drivers/gpu/drm/i915/i915_gem.c | 91 ++++++++++++++++++++++-------------------
>>   1 file changed, 49 insertions(+), 42 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index af15c290c71d..dacf6a0013c5 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -1772,6 +1772,53 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
>>   	return 0;
>>   }
>>
>> +static int
>> +i915_gem_ret_to_vm_ret(struct drm_i915_private *dev_priv, int ret)
>> +{
>> +	switch (ret) {
>> +	case -EIO:
>> +		/*
>> +		 * We eat errors when the gpu is terminally wedged to avoid
>> +		 * userspace unduly crashing (gl has no provisions for mmaps to
>> +		 * fail). But any other -EIO isn't ours (e.g. swap in failure)
>> +		 * and so needs to be reported.
>> +		 */
>> +		if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
>> +			ret = VM_FAULT_SIGBUS;
>> +			break;
>> +		}
>> +	case -EAGAIN:
>> +		/*
>> +		 * EAGAIN means the gpu is hung and we'll wait for the error
>> +		 * handler to reset everything when re-faulting in
>> +		 * i915_mutex_lock_interruptible.
>> +		 */
>> +	case 0:
>> +	case -ERESTARTSYS:
>> +	case -EINTR:
>> +	case -EBUSY:
>> +		/*
>> +		 * EBUSY is ok: this just means that another thread
>> +		 * already did the job.
>> +		 */
>> +		ret = VM_FAULT_NOPAGE;
>> +		break;
>> +	case -ENOMEM:
>> +		ret = VM_FAULT_OOM;
>> +		break;
>> +	case -ENOSPC:
>> +	case -EFAULT:
>> +		ret = VM_FAULT_SIGBUS;
>> +		break;
>> +	default:
>> +		WARN_ONCE(ret, "unhandled error in page fault\n");
>> +		ret = VM_FAULT_SIGBUS;
>> +		break;
>> +	}
>
> So without having to pin (plus a few other similar changes), we only need
> to report
>
> -ENOMEM, -ENOSPC, -EIO (from shmemfs) and -EFAULT (get_pages).
>
> Given that I'd rather have the reasoning behind each explicit.

This is just existing code factored out so I am not sure what you are 
suggesting?

Regards,

Tvrtko


_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 2/5] drm/i915: Extract code mapping errno to vm fault code
  2016-01-26 16:24     ` Tvrtko Ursulin
@ 2016-01-26 16:42       ` Chris Wilson
  0 siblings, 0 replies; 24+ messages in thread
From: Chris Wilson @ 2016-01-26 16:42 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Tue, Jan 26, 2016 at 04:24:46PM +0000, Tvrtko Ursulin wrote:
> 
> 
> On 26/01/16 15:18, Chris Wilson wrote:
> >On Tue, Jan 26, 2016 at 02:53:30PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >>Will be used from multiple callers in a following patch.
> >>
> >>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_gem.c | 91 ++++++++++++++++++++++-------------------
> >>  1 file changed, 49 insertions(+), 42 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>index af15c290c71d..dacf6a0013c5 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -1772,6 +1772,53 @@ i915_gem_mmap_ioctl(struct drm_device *dev, void *data,
> >>  	return 0;
> >>  }
> >>
> >>+static int
> >>+i915_gem_ret_to_vm_ret(struct drm_i915_private *dev_priv, int ret)
> >>+{
> >>+	switch (ret) {
> >>+	case -EIO:
> >>+		/*
> >>+		 * We eat errors when the gpu is terminally wedged to avoid
> >>+		 * userspace unduly crashing (gl has no provisions for mmaps to
> >>+		 * fail). But any other -EIO isn't ours (e.g. swap in failure)
> >>+		 * and so needs to be reported.
> >>+		 */
> >>+		if (!i915_terminally_wedged(&dev_priv->gpu_error)) {
> >>+			ret = VM_FAULT_SIGBUS;
> >>+			break;
> >>+		}
> >>+	case -EAGAIN:
> >>+		/*
> >>+		 * EAGAIN means the gpu is hung and we'll wait for the error
> >>+		 * handler to reset everything when re-faulting in
> >>+		 * i915_mutex_lock_interruptible.
> >>+		 */
> >>+	case 0:
> >>+	case -ERESTARTSYS:
> >>+	case -EINTR:
> >>+	case -EBUSY:
> >>+		/*
> >>+		 * EBUSY is ok: this just means that another thread
> >>+		 * already did the job.
> >>+		 */
> >>+		ret = VM_FAULT_NOPAGE;
> >>+		break;
> >>+	case -ENOMEM:
> >>+		ret = VM_FAULT_OOM;
> >>+		break;
> >>+	case -ENOSPC:
> >>+	case -EFAULT:
> >>+		ret = VM_FAULT_SIGBUS;
> >>+		break;
> >>+	default:
> >>+		WARN_ONCE(ret, "unhandled error in page fault\n");
> >>+		ret = VM_FAULT_SIGBUS;
> >>+		break;
> >>+	}
> >
> >So without having to pin (plus a few other similar changes), we only need
> >to report
> >
> >-ENOMEM, -ENOSPC, -EIO (from shmemfs) and -EFAULT (get_pages).
> >
> >Given that I'd rather have the reasoning behind each explicit.
> 
> This is just existing code factored out so I am not sure what you
> are suggesting?

I'm looking at the next user and wondering how many of these comments
should (and do) apply.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-26 16:23     ` Tvrtko Ursulin
@ 2016-01-26 16:59       ` Chris Wilson
  2016-01-27 15:24         ` Tvrtko Ursulin
  0 siblings, 1 reply; 24+ messages in thread
From: Chris Wilson @ 2016-01-26 16:59 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Tue, Jan 26, 2016 at 04:23:28PM +0000, Tvrtko Ursulin wrote:
> 
> On 26/01/16 15:10, Chris Wilson wrote:
> >On Tue, Jan 26, 2016 at 02:53:31PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >>Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>---
> >>  drivers/gpu/drm/i915/i915_gem.c | 96 ++++++++++++++++++++++++++++++++++++++---
> >>  include/uapi/drm/i915_drm.h     |  3 ++
> >>  2 files changed, 93 insertions(+), 6 deletions(-)
> >>
> >>diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >>index dacf6a0013c5..039d55a49fc6 100644
> >>--- a/drivers/gpu/drm/i915/i915_gem.c
> >>+++ b/drivers/gpu/drm/i915/i915_gem.c
> >>@@ -1954,6 +1954,60 @@ out:
> >>  	return i915_gem_ret_to_vm_ret(dev_priv, ret);
> >>  }
> >>
> >>+static int
> >>+i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> >>+{
> >>+	struct drm_i915_gem_object *obj = to_intel_bo(vma->vm_private_data);
> >>+	struct drm_device *dev = obj->base.dev;
> >>+	struct drm_i915_private *dev_priv = dev->dev_private;
> >>+	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
> >>+	pgoff_t page_offset;
> >>+	struct page *page;
> >>+	int ret;
> >>+
> >>+	/* We don't use vmf->pgoff since that has the fake offset */
> >>+	page_offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >>
> >>+			PAGE_SHIFT;
> >>+
> >>+	trace_i915_gem_object_fault(obj, page_offset, true, write);
> >>+
> >>+	intel_runtime_pm_get(dev_priv);
> >>+
> >>+	ret = i915_mutex_lock_interruptible(dev);
> >>+	if (ret)
> >>+		goto out;
> >>+
> >>+	ret = i915_gem_object_set_to_cpu_domain(obj, write);
> >>+	if (ret)
> >>+		goto out_unlock;
> >
> >That was a mistake in the GTT gem_fault(). If you do this, we also want
> >the nonblocking wait for obvious reasons.
> 
> You suggest leaving it for userspace?

It is userspace's responsibility. Page faults are random and do not
occur around every pointer acceess - userspace has to mark the domain
changes on its boundaries (and coordinate amongst its peers).
 
> And how would a non-blocking wait work?

Before set-to-cpu-domain, we do a wait_rendering_nonblocking which drops
and then reacquires the mutex. (That allows for multiple waiters which
tends to be the lowest hanging fruit with struct_mutex contention.) Then
set-to-cpu domain does a blocking wait to ensure nothing snuck in.

But I don't think we want this. And we can then reduce the
i915_mutex_lock_interruptible() to a plain mutex_lock_interruptible() as
we are not touching the GPU.

> >>+	ret = i915_gem_object_get_pages(obj);
> >>+	if (ret)
> >>+		goto out_unlock;
> >>+
> >>+	page = i915_gem_object_get_page(obj, page_offset);
> >>+	if (!page) {
> >>+		ret = -ERANGE;

ret = -EFAULT;

though it would definitely be a stack bug.

> >>+		goto out_unlock;
> >>+	}
> >>+
> >>+	mutex_unlock(&dev->struct_mutex);
> >>+
> >>+	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address,
> >>+			    page_to_pfn(page));
> >
> >We don't have a page ref at this point, so this obj+page could be
> >freed (via the shrinker at least) before we insert it.
> 
> Oh yeah, need to pin the pages..

But only whilst inserting. Once inserted they need to be evicted, and I
was wondering whether we should do the zap on put_pages(). If we don't,
it means that the shrinker is neutered.

> >I would also be more interested in having a version that faulted the
> >entire object at once - though maybe we will see more random access in
> >future.
> 
> Yeah I did not want to concern myself with more code since this was
> a proof of concept anyway.

No worries, the transformation is simple with a certain remap function.

> >It looks fairly sane. I wanted this just a short while ago, but figured
> >out a way of using regular mmap() to give me the inheritance instead.
> 
> So would it be useful to cleanup and finish this work or not?

I agree that it closes a big hole in the API - the ability to CPU mmap
non-shmemfs object (i.e. userptr, dmabuf). With a bit of polish we
should be able to offer something to take advantage of the existing GEM
infrastructure better than a regular CPU mmapping - though off the top
of my head, I don't have anything that is ratelimited by CPU pagefaults.

Another thing I realised was that this severely limits the mmap space on
32-bit systems, as the vma manager is unsigned long. The CPU mmaping was
a way around some of the restrictions. That would seem fairly easy to
lift (and I hope without consequence).
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-26 14:53 ` [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
  2016-01-26 15:10   ` Chris Wilson
@ 2016-01-27 15:21   ` Tvrtko Ursulin
  2016-01-27 15:51     ` Daniel Vetter
  1 sibling, 1 reply; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-27 15:21 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This enables mapping via CPU using the proper DRM mmap API for
better debug (Valgrind) and implementation symmetry in the
driver.

v2:
 * Use normal mutex, skip domain management and pin pages. (Chris Wilson)
 * No need to drop struct mutex over vma_insert_pfn.
 * Turn off write-combine set up by the DRM core.
 * Commit message.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 98 ++++++++++++++++++++++++++++++++++++++---
 include/uapi/drm/i915_drm.h     |  3 ++
 2 files changed, 95 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a1e2832047e2..a690cee31f20 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1954,6 +1954,62 @@ out:
 	return i915_gem_ret_to_vm_ret(dev_priv, ret);
 }
 
+static int
+i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	struct drm_i915_gem_object *obj = to_intel_bo(vma->vm_private_data);
+	struct drm_device *dev = obj->base.dev;
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
+	pgoff_t page_offset;
+	struct page *page;
+	unsigned long pgprot;
+	int ret;
+
+	/* We don't use vmf->pgoff since that has the fake offset */
+	page_offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >>
+			PAGE_SHIFT;
+
+	trace_i915_gem_object_fault(obj, page_offset, true, write);
+
+	intel_runtime_pm_get(dev_priv);
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		goto out;
+
+	ret = i915_gem_object_get_pages(obj);
+	if (ret)
+		goto out_unlock;
+
+	i915_gem_object_pin_pages(obj);
+
+	page = i915_gem_object_get_page(obj, page_offset);
+	if (!page) {
+		ret = -EFAULT;
+		goto out_unpin;
+	}
+
+	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address,
+			    page_to_pfn(page));
+	if (ret == 0) {
+		/* DRM core sets up WC by default and we want WB. */
+		pgprot = pgprot_val(vm_get_page_prot(vma->vm_flags));
+		pgprot &= ~_PAGE_CACHE_MASK;
+		pgprot |= cachemode2protval(_PAGE_CACHE_MODE_WB);
+		vma->vm_page_prot = __pgprot(pgprot);
+	}
+
+out_unpin:
+	i915_gem_object_unpin_pages(obj);
+out_unlock:
+	mutex_unlock(&dev->struct_mutex);
+out:
+	intel_runtime_pm_put(dev_priv);
+
+	return i915_gem_ret_to_vm_ret(dev_priv, ret);
+}
+
 /**
  * i915_gem_release_mmap - remove physical page mappings
  * @obj: obj in question
@@ -2078,11 +2134,18 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
 	drm_gem_free_mmap_offset(&obj->base);
 }
 
-int
-i915_gem_mmap_gtt(struct drm_file *file,
-		  struct drm_device *dev,
-		  uint32_t handle,
-		  uint64_t *offset)
+static const struct vm_operations_struct i915_gem_cpu_vm_ops = {
+	.fault = i915_gem_cpu_fault,
+	.open = drm_gem_vm_open,
+	.close = drm_gem_vm_close,
+};
+
+static int
+i915_gem_mmap(struct drm_file *file,
+	      struct drm_device *dev,
+	      uint32_t handle,
+	      uint32_t flags,
+	      uint64_t *offset)
 {
 	struct drm_i915_gem_object *obj;
 	int ret;
@@ -2103,10 +2166,23 @@ i915_gem_mmap_gtt(struct drm_file *file,
 		goto out;
 	}
 
+	if (!obj->base.filp && (flags & I915_MMAP2_CPU)) {
+		DRM_DEBUG("Attempting to mmap non-shm based object via CPU!\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
 	ret = i915_gem_object_create_mmap_offset(obj);
 	if (ret)
 		goto out;
 
+	if (flags & I915_MMAP2_CPU) {
+		ret = drm_vma_node_set_vm_ops(&obj->base.vma_node,
+					      &i915_gem_cpu_vm_ops);
+		if (ret)
+			goto out;
+	}
+
 	*offset = drm_vma_node_offset_addr(&obj->base.vma_node);
 
 out:
@@ -2116,6 +2192,15 @@ unlock:
 	return ret;
 }
 
+int
+i915_gem_mmap_gtt(struct drm_file *file,
+		  struct drm_device *dev,
+		  uint32_t handle,
+		  uint64_t *offset)
+{
+	return i915_gem_mmap(file, dev, handle, 0, offset);
+}
+
 /**
  * i915_gem_mmap_gtt_ioctl - prepare an object for GTT mmap'ing
  * @dev: DRM device
@@ -2137,7 +2222,8 @@ i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_i915_gem_mmap_gtt *args = data;
 
-	return i915_gem_mmap_gtt(file, dev, args->handle, &args->offset);
+	return i915_gem_mmap(file, dev, args->handle, args->flags,
+			     &args->offset);
 }
 
 /* Immediately discard the backing storage */
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 6a19371391fa..359a36d604bb 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -528,6 +528,9 @@ struct drm_i915_gem_mmap_gtt {
 	 * This is a fixed-size type for 32/64 compatibility.
 	 */
 	__u64 offset;
+
+#define I915_MMAP2_CPU   0x1
+	__u64 flags;
 };
 
 struct drm_i915_gem_set_domain {
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 4/5] drm/i915: Add support for write-combined CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-26 14:53 ` [RFC 4/5] drm/i915: Add support for write-combined " Tvrtko Ursulin
  2016-01-26 15:11   ` Chris Wilson
@ 2016-01-27 15:22   ` Tvrtko Ursulin
  1 sibling, 0 replies; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-27 15:22 UTC (permalink / raw)
  To: Intel-gfx

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Old DRM_IOCTL_I915_GEM_MMAP supported write-combine mapping so
add support for it here as well.

v2: Commit msg.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 34 +++++++++++++++++++++++++++-------
 include/uapi/drm/i915_drm.h     |  3 ++-
 2 files changed, 29 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a690cee31f20..642c75a5530e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1954,8 +1954,8 @@ out:
 	return i915_gem_ret_to_vm_ret(dev_priv, ret);
 }
 
-static int
-i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+static inline int
+__i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf, bool wc)
 {
 	struct drm_i915_gem_object *obj = to_intel_bo(vma->vm_private_data);
 	struct drm_device *dev = obj->base.dev;
@@ -1992,7 +1992,7 @@ i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 
 	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address,
 			    page_to_pfn(page));
-	if (ret == 0) {
+	if (ret == 0 && !wc) {
 		/* DRM core sets up WC by default and we want WB. */
 		pgprot = pgprot_val(vm_get_page_prot(vma->vm_flags));
 		pgprot &= ~_PAGE_CACHE_MASK;
@@ -2010,6 +2010,18 @@ out:
 	return i915_gem_ret_to_vm_ret(dev_priv, ret);
 }
 
+static int
+i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	return __i915_gem_cpu_fault(vma, vmf, false);
+}
+
+static int
+i915_gem_cpu_wc_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
+{
+	return __i915_gem_cpu_fault(vma, vmf, true);
+}
+
 /**
  * i915_gem_release_mmap - remove physical page mappings
  * @obj: obj in question
@@ -2140,6 +2152,12 @@ static const struct vm_operations_struct i915_gem_cpu_vm_ops = {
 	.close = drm_gem_vm_close,
 };
 
+static const struct vm_operations_struct i915_gem_cpu_wc_vm_ops = {
+	.fault = i915_gem_cpu_wc_fault,
+	.open = drm_gem_vm_open,
+	.close = drm_gem_vm_close,
+};
+
 static int
 i915_gem_mmap(struct drm_file *file,
 	      struct drm_device *dev,
@@ -2176,12 +2194,14 @@ i915_gem_mmap(struct drm_file *file,
 	if (ret)
 		goto out;
 
-	if (flags & I915_MMAP2_CPU) {
+	if (flags & I915_MMAP2_CPU_WC)
+		ret = drm_vma_node_set_vm_ops(&obj->base.vma_node,
+					      &i915_gem_cpu_wc_vm_ops);
+	else if (flags & I915_MMAP2_CPU)
 		ret = drm_vma_node_set_vm_ops(&obj->base.vma_node,
 					      &i915_gem_cpu_vm_ops);
-		if (ret)
-			goto out;
-	}
+	if (ret)
+		goto out;
 
 	*offset = drm_vma_node_offset_addr(&obj->base.vma_node);
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 359a36d604bb..05e74c7aef4f 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -529,7 +529,8 @@ struct drm_i915_gem_mmap_gtt {
 	 */
 	__u64 offset;
 
-#define I915_MMAP2_CPU   0x1
+#define I915_MMAP2_CPU    0x1
+#define I915_MMAP2_CPU_WC 0x3
 	__u64 flags;
 };
 
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-26 16:59       ` Chris Wilson
@ 2016-01-27 15:24         ` Tvrtko Ursulin
  2016-01-27 16:36           ` Chris Wilson
  2016-01-27 16:40           ` Chris Wilson
  0 siblings, 2 replies; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-27 15:24 UTC (permalink / raw)
  To: Chris Wilson, Intel-gfx


On 26/01/16 16:59, Chris Wilson wrote:
> On Tue, Jan 26, 2016 at 04:23:28PM +0000, Tvrtko Ursulin wrote:
>>
>> On 26/01/16 15:10, Chris Wilson wrote:
>>> On Tue, Jan 26, 2016 at 02:53:31PM +0000, Tvrtko Ursulin wrote:
>>>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> ---
>>>>   drivers/gpu/drm/i915/i915_gem.c | 96 ++++++++++++++++++++++++++++++++++++++---
>>>>   include/uapi/drm/i915_drm.h     |  3 ++
>>>>   2 files changed, 93 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>>>> index dacf6a0013c5..039d55a49fc6 100644
>>>> --- a/drivers/gpu/drm/i915/i915_gem.c
>>>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>>>> @@ -1954,6 +1954,60 @@ out:
>>>>   	return i915_gem_ret_to_vm_ret(dev_priv, ret);
>>>>   }
>>>>
>>>> +static int
>>>> +i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>>>> +{
>>>> +	struct drm_i915_gem_object *obj = to_intel_bo(vma->vm_private_data);
>>>> +	struct drm_device *dev = obj->base.dev;
>>>> +	struct drm_i915_private *dev_priv = dev->dev_private;
>>>> +	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
>>>> +	pgoff_t page_offset;
>>>> +	struct page *page;
>>>> +	int ret;
>>>> +
>>>> +	/* We don't use vmf->pgoff since that has the fake offset */
>>>> +	page_offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >>
>>>> +			PAGE_SHIFT;
>>>> +
>>>> +	trace_i915_gem_object_fault(obj, page_offset, true, write);
>>>> +
>>>> +	intel_runtime_pm_get(dev_priv);
>>>> +
>>>> +	ret = i915_mutex_lock_interruptible(dev);
>>>> +	if (ret)
>>>> +		goto out;
>>>> +
>>>> +	ret = i915_gem_object_set_to_cpu_domain(obj, write);
>>>> +	if (ret)
>>>> +		goto out_unlock;
>>>
>>> That was a mistake in the GTT gem_fault(). If you do this, we also want
>>> the nonblocking wait for obvious reasons.
>>
>> You suggest leaving it for userspace?
>
> It is userspace's responsibility. Page faults are random and do not
> occur around every pointer acceess - userspace has to mark the domain
> changes on its boundaries (and coordinate amongst its peers).
>
>> And how would a non-blocking wait work?
>
> Before set-to-cpu-domain, we do a wait_rendering_nonblocking which drops
> and then reacquires the mutex. (That allows for multiple waiters which
> tends to be the lowest hanging fruit with struct_mutex contention.) Then
> set-to-cpu domain does a blocking wait to ensure nothing snuck in.
>
> But I don't think we want this. And we can then reduce the
> i915_mutex_lock_interruptible() to a plain mutex_lock_interruptible() as
> we are not touching the GPU.
>
>>>> +	ret = i915_gem_object_get_pages(obj);
>>>> +	if (ret)
>>>> +		goto out_unlock;
>>>> +
>>>> +	page = i915_gem_object_get_page(obj, page_offset);
>>>> +	if (!page) {
>>>> +		ret = -ERANGE;
>
> ret = -EFAULT;
>
> though it would definitely be a stack bug.
>
>>>> +		goto out_unlock;
>>>> +	}
>>>> +
>>>> +	mutex_unlock(&dev->struct_mutex);
>>>> +
>>>> +	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address,
>>>> +			    page_to_pfn(page));
>>>
>>> We don't have a page ref at this point, so this obj+page could be
>>> freed (via the shrinker at least) before we insert it.
>>
>> Oh yeah, need to pin the pages..
>
> But only whilst inserting. Once inserted they need to be evicted, and I
> was wondering whether we should do the zap on put_pages(). If we don't,
> it means that the shrinker is neutered.

Ah I forgot this in v2. So we would need something like 
obj->fault_mappable so that i915_gem_release_mmap runs, yes?

>>> I would also be more interested in having a version that faulted the
>>> entire object at once - though maybe we will see more random access in
>>> future.
>>
>> Yeah I did not want to concern myself with more code since this was
>> a proof of concept anyway.
>
> No worries, the transformation is simple with a certain remap function.
>
>>> It looks fairly sane. I wanted this just a short while ago, but figured
>>> out a way of using regular mmap() to give me the inheritance instead.
>>
>> So would it be useful to cleanup and finish this work or not?
>
> I agree that it closes a big hole in the API - the ability to CPU mmap
> non-shmemfs object (i.e. userptr, dmabuf). With a bit of polish we
> should be able to offer something to take advantage of the existing GEM
> infrastructure better than a regular CPU mmapping - though off the top
> of my head, I don't have anything that is ratelimited by CPU pagefaults.
>
> Another thing I realised was that this severely limits the mmap space on
> 32-bit systems, as the vma manager is unsigned long. The CPU mmaping was
> a way around some of the restrictions. That would seem fairly easy to
> lift (and I hope without consequence).

I did not manage to figure out what here limits the space on 32-bit systems?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-27 15:21   ` [PATCH v2 " Tvrtko Ursulin
@ 2016-01-27 15:51     ` Daniel Vetter
  2016-01-27 16:01       ` Tvrtko Ursulin
  2016-01-27 16:32       ` Chris Wilson
  0 siblings, 2 replies; 24+ messages in thread
From: Daniel Vetter @ 2016-01-27 15:51 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Wed, Jan 27, 2016 at 03:21:48PM +0000, Tvrtko Ursulin wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> 
> This enables mapping via CPU using the proper DRM mmap API for
> better debug (Valgrind) and implementation symmetry in the
> driver.
> 
> v2:
>  * Use normal mutex, skip domain management and pin pages. (Chris Wilson)
>  * No need to drop struct mutex over vma_insert_pfn.

I think we still neeed that on first fault:

- userspac calls set_domain(GTT)
- kernel does nothing since there's no binding
- userspace starts accessing gtt mmap
- kernel faults, but doesn't update domain/flush cpu caches
-> BOOM

Needs more testcases I'd say ;-)
-Daniel

>  * Turn off write-combine set up by the DRM core.
>  * Commit message.
> 
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 98 ++++++++++++++++++++++++++++++++++++++---
>  include/uapi/drm/i915_drm.h     |  3 ++
>  2 files changed, 95 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index a1e2832047e2..a690cee31f20 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1954,6 +1954,62 @@ out:
>  	return i915_gem_ret_to_vm_ret(dev_priv, ret);
>  }
>  
> +static int
> +i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
> +{
> +	struct drm_i915_gem_object *obj = to_intel_bo(vma->vm_private_data);
> +	struct drm_device *dev = obj->base.dev;
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
> +	pgoff_t page_offset;
> +	struct page *page;
> +	unsigned long pgprot;
> +	int ret;
> +
> +	/* We don't use vmf->pgoff since that has the fake offset */
> +	page_offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >>
> +			PAGE_SHIFT;
> +
> +	trace_i915_gem_object_fault(obj, page_offset, true, write);
> +
> +	intel_runtime_pm_get(dev_priv);
> +
> +	ret = mutex_lock_interruptible(&dev->struct_mutex);
> +	if (ret)
> +		goto out;
> +
> +	ret = i915_gem_object_get_pages(obj);
> +	if (ret)
> +		goto out_unlock;
> +
> +	i915_gem_object_pin_pages(obj);
> +
> +	page = i915_gem_object_get_page(obj, page_offset);
> +	if (!page) {
> +		ret = -EFAULT;
> +		goto out_unpin;
> +	}
> +
> +	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address,
> +			    page_to_pfn(page));
> +	if (ret == 0) {
> +		/* DRM core sets up WC by default and we want WB. */
> +		pgprot = pgprot_val(vm_get_page_prot(vma->vm_flags));
> +		pgprot &= ~_PAGE_CACHE_MASK;
> +		pgprot |= cachemode2protval(_PAGE_CACHE_MODE_WB);
> +		vma->vm_page_prot = __pgprot(pgprot);
> +	}
> +
> +out_unpin:
> +	i915_gem_object_unpin_pages(obj);
> +out_unlock:
> +	mutex_unlock(&dev->struct_mutex);
> +out:
> +	intel_runtime_pm_put(dev_priv);
> +
> +	return i915_gem_ret_to_vm_ret(dev_priv, ret);
> +}
> +
>  /**
>   * i915_gem_release_mmap - remove physical page mappings
>   * @obj: obj in question
> @@ -2078,11 +2134,18 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
>  	drm_gem_free_mmap_offset(&obj->base);
>  }
>  
> -int
> -i915_gem_mmap_gtt(struct drm_file *file,
> -		  struct drm_device *dev,
> -		  uint32_t handle,
> -		  uint64_t *offset)
> +static const struct vm_operations_struct i915_gem_cpu_vm_ops = {
> +	.fault = i915_gem_cpu_fault,
> +	.open = drm_gem_vm_open,
> +	.close = drm_gem_vm_close,
> +};
> +
> +static int
> +i915_gem_mmap(struct drm_file *file,
> +	      struct drm_device *dev,
> +	      uint32_t handle,
> +	      uint32_t flags,
> +	      uint64_t *offset)
>  {
>  	struct drm_i915_gem_object *obj;
>  	int ret;
> @@ -2103,10 +2166,23 @@ i915_gem_mmap_gtt(struct drm_file *file,
>  		goto out;
>  	}
>  
> +	if (!obj->base.filp && (flags & I915_MMAP2_CPU)) {
> +		DRM_DEBUG("Attempting to mmap non-shm based object via CPU!\n");
> +		ret = -EINVAL;
> +		goto out;
> +	}
> +
>  	ret = i915_gem_object_create_mmap_offset(obj);
>  	if (ret)
>  		goto out;
>  
> +	if (flags & I915_MMAP2_CPU) {
> +		ret = drm_vma_node_set_vm_ops(&obj->base.vma_node,
> +					      &i915_gem_cpu_vm_ops);
> +		if (ret)
> +			goto out;
> +	}
> +
>  	*offset = drm_vma_node_offset_addr(&obj->base.vma_node);
>  
>  out:
> @@ -2116,6 +2192,15 @@ unlock:
>  	return ret;
>  }
>  
> +int
> +i915_gem_mmap_gtt(struct drm_file *file,
> +		  struct drm_device *dev,
> +		  uint32_t handle,
> +		  uint64_t *offset)
> +{
> +	return i915_gem_mmap(file, dev, handle, 0, offset);
> +}
> +
>  /**
>   * i915_gem_mmap_gtt_ioctl - prepare an object for GTT mmap'ing
>   * @dev: DRM device
> @@ -2137,7 +2222,8 @@ i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
>  {
>  	struct drm_i915_gem_mmap_gtt *args = data;
>  
> -	return i915_gem_mmap_gtt(file, dev, args->handle, &args->offset);
> +	return i915_gem_mmap(file, dev, args->handle, args->flags,
> +			     &args->offset);
>  }
>  
>  /* Immediately discard the backing storage */
> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index 6a19371391fa..359a36d604bb 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -528,6 +528,9 @@ struct drm_i915_gem_mmap_gtt {
>  	 * This is a fixed-size type for 32/64 compatibility.
>  	 */
>  	__u64 offset;
> +
> +#define I915_MMAP2_CPU   0x1
> +	__u64 flags;
>  };
>  
>  struct drm_i915_gem_set_domain {
> -- 
> 1.9.1
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-27 15:51     ` Daniel Vetter
@ 2016-01-27 16:01       ` Tvrtko Ursulin
  2016-01-27 16:10         ` Daniel Vetter
  2016-01-27 16:32       ` Chris Wilson
  1 sibling, 1 reply; 24+ messages in thread
From: Tvrtko Ursulin @ 2016-01-27 16:01 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx


On 27/01/16 15:51, Daniel Vetter wrote:
> On Wed, Jan 27, 2016 at 03:21:48PM +0000, Tvrtko Ursulin wrote:
>> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>
>> This enables mapping via CPU using the proper DRM mmap API for
>> better debug (Valgrind) and implementation symmetry in the
>> driver.
>>
>> v2:
>>   * Use normal mutex, skip domain management and pin pages. (Chris Wilson)
>>   * No need to drop struct mutex over vma_insert_pfn.
>
> I think we still neeed that on first fault:
>
> - userspac calls set_domain(GTT)
> - kernel does nothing since there's no binding
> - userspace starts accessing gtt mmap
> - kernel faults, but doesn't update domain/flush cpu caches
> -> BOOM

Boom what? :) (seriously I don't follow)

And as an aside, you would merge this in general or see value in it?

> Needs more testcases I'd say ;-)

Oh it has no dedicated ones, just gem_mmap_gtt hacked up to always pass 
the new flag in (so request CPU mmap via the mmap_gtt ioctl).

Regards,

Tvrtko

> -Daniel
>
>>   * Turn off write-combine set up by the DRM core.
>>   * Commit message.
>>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Chris Wilson <chris@chris-wilson.co.uk>
>> ---
>>   drivers/gpu/drm/i915/i915_gem.c | 98 ++++++++++++++++++++++++++++++++++++++---
>>   include/uapi/drm/i915_drm.h     |  3 ++
>>   2 files changed, 95 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index a1e2832047e2..a690cee31f20 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -1954,6 +1954,62 @@ out:
>>   	return i915_gem_ret_to_vm_ret(dev_priv, ret);
>>   }
>>
>> +static int
>> +i915_gem_cpu_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
>> +{
>> +	struct drm_i915_gem_object *obj = to_intel_bo(vma->vm_private_data);
>> +	struct drm_device *dev = obj->base.dev;
>> +	struct drm_i915_private *dev_priv = dev->dev_private;
>> +	bool write = !!(vmf->flags & FAULT_FLAG_WRITE);
>> +	pgoff_t page_offset;
>> +	struct page *page;
>> +	unsigned long pgprot;
>> +	int ret;
>> +
>> +	/* We don't use vmf->pgoff since that has the fake offset */
>> +	page_offset = ((unsigned long)vmf->virtual_address - vma->vm_start) >>
>> +			PAGE_SHIFT;
>> +
>> +	trace_i915_gem_object_fault(obj, page_offset, true, write);
>> +
>> +	intel_runtime_pm_get(dev_priv);
>> +
>> +	ret = mutex_lock_interruptible(&dev->struct_mutex);
>> +	if (ret)
>> +		goto out;
>> +
>> +	ret = i915_gem_object_get_pages(obj);
>> +	if (ret)
>> +		goto out_unlock;
>> +
>> +	i915_gem_object_pin_pages(obj);
>> +
>> +	page = i915_gem_object_get_page(obj, page_offset);
>> +	if (!page) {
>> +		ret = -EFAULT;
>> +		goto out_unpin;
>> +	}
>> +
>> +	ret = vm_insert_pfn(vma, (unsigned long)vmf->virtual_address,
>> +			    page_to_pfn(page));
>> +	if (ret == 0) {
>> +		/* DRM core sets up WC by default and we want WB. */
>> +		pgprot = pgprot_val(vm_get_page_prot(vma->vm_flags));
>> +		pgprot &= ~_PAGE_CACHE_MASK;
>> +		pgprot |= cachemode2protval(_PAGE_CACHE_MODE_WB);
>> +		vma->vm_page_prot = __pgprot(pgprot);
>> +	}
>> +
>> +out_unpin:
>> +	i915_gem_object_unpin_pages(obj);
>> +out_unlock:
>> +	mutex_unlock(&dev->struct_mutex);
>> +out:
>> +	intel_runtime_pm_put(dev_priv);
>> +
>> +	return i915_gem_ret_to_vm_ret(dev_priv, ret);
>> +}
>> +
>>   /**
>>    * i915_gem_release_mmap - remove physical page mappings
>>    * @obj: obj in question
>> @@ -2078,11 +2134,18 @@ static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
>>   	drm_gem_free_mmap_offset(&obj->base);
>>   }
>>
>> -int
>> -i915_gem_mmap_gtt(struct drm_file *file,
>> -		  struct drm_device *dev,
>> -		  uint32_t handle,
>> -		  uint64_t *offset)
>> +static const struct vm_operations_struct i915_gem_cpu_vm_ops = {
>> +	.fault = i915_gem_cpu_fault,
>> +	.open = drm_gem_vm_open,
>> +	.close = drm_gem_vm_close,
>> +};
>> +
>> +static int
>> +i915_gem_mmap(struct drm_file *file,
>> +	      struct drm_device *dev,
>> +	      uint32_t handle,
>> +	      uint32_t flags,
>> +	      uint64_t *offset)
>>   {
>>   	struct drm_i915_gem_object *obj;
>>   	int ret;
>> @@ -2103,10 +2166,23 @@ i915_gem_mmap_gtt(struct drm_file *file,
>>   		goto out;
>>   	}
>>
>> +	if (!obj->base.filp && (flags & I915_MMAP2_CPU)) {
>> +		DRM_DEBUG("Attempting to mmap non-shm based object via CPU!\n");
>> +		ret = -EINVAL;
>> +		goto out;
>> +	}
>> +
>>   	ret = i915_gem_object_create_mmap_offset(obj);
>>   	if (ret)
>>   		goto out;
>>
>> +	if (flags & I915_MMAP2_CPU) {
>> +		ret = drm_vma_node_set_vm_ops(&obj->base.vma_node,
>> +					      &i915_gem_cpu_vm_ops);
>> +		if (ret)
>> +			goto out;
>> +	}
>> +
>>   	*offset = drm_vma_node_offset_addr(&obj->base.vma_node);
>>
>>   out:
>> @@ -2116,6 +2192,15 @@ unlock:
>>   	return ret;
>>   }
>>
>> +int
>> +i915_gem_mmap_gtt(struct drm_file *file,
>> +		  struct drm_device *dev,
>> +		  uint32_t handle,
>> +		  uint64_t *offset)
>> +{
>> +	return i915_gem_mmap(file, dev, handle, 0, offset);
>> +}
>> +
>>   /**
>>    * i915_gem_mmap_gtt_ioctl - prepare an object for GTT mmap'ing
>>    * @dev: DRM device
>> @@ -2137,7 +2222,8 @@ i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
>>   {
>>   	struct drm_i915_gem_mmap_gtt *args = data;
>>
>> -	return i915_gem_mmap_gtt(file, dev, args->handle, &args->offset);
>> +	return i915_gem_mmap(file, dev, args->handle, args->flags,
>> +			     &args->offset);
>>   }
>>
>>   /* Immediately discard the backing storage */
>> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
>> index 6a19371391fa..359a36d604bb 100644
>> --- a/include/uapi/drm/i915_drm.h
>> +++ b/include/uapi/drm/i915_drm.h
>> @@ -528,6 +528,9 @@ struct drm_i915_gem_mmap_gtt {
>>   	 * This is a fixed-size type for 32/64 compatibility.
>>   	 */
>>   	__u64 offset;
>> +
>> +#define I915_MMAP2_CPU   0x1
>> +	__u64 flags;
>>   };
>>
>>   struct drm_i915_gem_set_domain {
>> --
>> 1.9.1
>>
>> _______________________________________________
>> Intel-gfx mailing list
>> Intel-gfx@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/intel-gfx
>
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-27 16:01       ` Tvrtko Ursulin
@ 2016-01-27 16:10         ` Daniel Vetter
  0 siblings, 0 replies; 24+ messages in thread
From: Daniel Vetter @ 2016-01-27 16:10 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Wed, Jan 27, 2016 at 04:01:26PM +0000, Tvrtko Ursulin wrote:
> 
> On 27/01/16 15:51, Daniel Vetter wrote:
> >On Wed, Jan 27, 2016 at 03:21:48PM +0000, Tvrtko Ursulin wrote:
> >>From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> >>
> >>This enables mapping via CPU using the proper DRM mmap API for
> >>better debug (Valgrind) and implementation symmetry in the
> >>driver.
> >>
> >>v2:
> >>  * Use normal mutex, skip domain management and pin pages. (Chris Wilson)
> >>  * No need to drop struct mutex over vma_insert_pfn.
> >
> >I think we still neeed that on first fault:
> >
> >- userspac calls set_domain(GTT)
> >- kernel does nothing since there's no binding
> >- userspace starts accessing gtt mmap
> >- kernel faults, but doesn't update domain/flush cpu caches
> >-> BOOM
> 
> Boom what? :) (seriously I don't follow)

I screwed up the example, this one explains why we need the set_domain for
gtt mmaps. For cpu mmaps I think we can get away with it, since we never
optimize away a set_domain(CPU). The problem is just the asymmetry in how
we treat set_domain(GTT) when there's no gtt mmaping.

> And as an aside, you would merge this in general or see value in it?

I think the idea makes sense, seems to still lack justification in form of
some open-source userspace wanting this ...
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-27 15:51     ` Daniel Vetter
  2016-01-27 16:01       ` Tvrtko Ursulin
@ 2016-01-27 16:32       ` Chris Wilson
  1 sibling, 0 replies; 24+ messages in thread
From: Chris Wilson @ 2016-01-27 16:32 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Intel-gfx

On Wed, Jan 27, 2016 at 04:51:02PM +0100, Daniel Vetter wrote:
> On Wed, Jan 27, 2016 at 03:21:48PM +0000, Tvrtko Ursulin wrote:
> > From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > 
> > This enables mapping via CPU using the proper DRM mmap API for
> > better debug (Valgrind) and implementation symmetry in the
> > driver.
> > 
> > v2:
> >  * Use normal mutex, skip domain management and pin pages. (Chris Wilson)
> >  * No need to drop struct mutex over vma_insert_pfn.
> 
> I think we still neeed that on first fault:
> 
> - userspac calls set_domain(GTT)
> - kernel does nothing since there's no binding

For GPU mmaps it would be set_domain(CPU). For WC/GTT, we already closed
that hole for precisely this reason.

However, I can accept the if the kernel shrinks between pointer accesses,
even though I don't like it. pagefault-of-doom again.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-27 15:24         ` Tvrtko Ursulin
@ 2016-01-27 16:36           ` Chris Wilson
  2016-01-27 16:40           ` Chris Wilson
  1 sibling, 0 replies; 24+ messages in thread
From: Chris Wilson @ 2016-01-27 16:36 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Wed, Jan 27, 2016 at 03:24:43PM +0000, Tvrtko Ursulin wrote:
> 
> On 26/01/16 16:59, Chris Wilson wrote:
> >Another thing I realised was that this severely limits the mmap space on
> >32-bit systems, as the vma manager is unsigned long. The CPU mmaping was
> >a way around some of the restrictions. That would seem fairly easy to
> >lift (and I hope without consequence).
> 
> I did not manage to figure out what here limits the space on 32-bit systems?

A hang-over of mine. We once used to exhaust the 32bit mmap space quite
easily. I was thinking drm_vma_manager had the same limit due to its use
of unsigned long, but it is counting in pages not bytes, so we have
44bits available of space, which will hopefully be enuogh for the last
remaining 32bit system to grow old gracefully.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
  2016-01-27 15:24         ` Tvrtko Ursulin
  2016-01-27 16:36           ` Chris Wilson
@ 2016-01-27 16:40           ` Chris Wilson
  1 sibling, 0 replies; 24+ messages in thread
From: Chris Wilson @ 2016-01-27 16:40 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Intel-gfx

On Wed, Jan 27, 2016 at 03:24:43PM +0000, Tvrtko Ursulin wrote:
> 
> On 26/01/16 16:59, Chris Wilson wrote:
> >On Tue, Jan 26, 2016 at 04:23:28PM +0000, Tvrtko Ursulin wrote:
> >>Oh yeah, need to pin the pages..
> >
> >But only whilst inserting. Once inserted they need to be evicted, and I
> >was wondering whether we should do the zap on put_pages(). If we don't,
> >it means that the shrinker is neutered.
> 
> Ah I forgot this in v2. So we would need something like
> obj->fault_mappable so that i915_gem_release_mmap runs, yes?

Something like that. I don't have a good idea, but I was wondering if
plonking a callback into the offset-node would be useful, and walking a
list of nodes on the object.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* ✓ Fi.CI.BAT: success for Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT (rev3)
  2016-01-26 14:53 [RFC 0/5] Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
                   ` (4 preceding siblings ...)
  2016-01-26 14:53 ` [RFC 5/5] drm/i915: Announce the new DRM_IOCTL_I915_GEM_MMAP_GTT capabilities Tvrtko Ursulin
@ 2016-01-28  9:18 ` Patchwork
  2016-01-28 16:10 ` Patchwork
  6 siblings, 0 replies; 24+ messages in thread
From: Patchwork @ 2016-01-28  9:18 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Summary ==

Built on 430706bace599ea1a908b9a7c6b7ea17535fe17f drm-intel-nightly: 2016y-01m-27d-16h-33m-06s UTC integration manifest

Test kms_pipe_crc_basic:
        Subgroup read-crc-pipe-b:
                dmesg-warn -> PASS       (ilk-hp8440p)

bdw-ultra        total:144  pass:138  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:144  pass:120  dwarn:0   dfail:0   fail:0   skip:24 
byt-nuc          total:144  pass:129  dwarn:0   dfail:0   fail:0   skip:15 
hsw-brixbox      total:144  pass:137  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:144  pass:140  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:144  pass:105  dwarn:0   dfail:0   fail:1   skip:38 
ivb-t430s        total:144  pass:138  dwarn:0   dfail:0   fail:0   skip:6  
skl-i5k-2        total:144  pass:135  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:144  pass:130  dwarn:0   dfail:0   fail:0   skip:14 
snb-x220t        total:144  pass:130  dwarn:0   dfail:0   fail:1   skip:13 

Results at /archive/results/CI_IGT_test/Patchwork_1276/

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

* ✓ Fi.CI.BAT: success for Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT (rev3)
  2016-01-26 14:53 [RFC 0/5] Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
                   ` (5 preceding siblings ...)
  2016-01-28  9:18 ` ✓ Fi.CI.BAT: success for Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT (rev3) Patchwork
@ 2016-01-28 16:10 ` Patchwork
  6 siblings, 0 replies; 24+ messages in thread
From: Patchwork @ 2016-01-28 16:10 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

== Summary ==

Series 2843v3 Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT
http://patchwork.freedesktop.org/api/1.0/series/2843/revisions/3/mbox/


bdw-nuci7        total:156  pass:147  dwarn:0   dfail:0   fail:0   skip:9  
bdw-ultra        total:159  pass:153  dwarn:0   dfail:0   fail:0   skip:6  
bsw-nuc-2        total:159  pass:135  dwarn:0   dfail:0   fail:0   skip:24 
byt-nuc          total:159  pass:142  dwarn:0   dfail:0   fail:0   skip:17 
hsw-brixbox      total:159  pass:152  dwarn:0   dfail:0   fail:0   skip:7  
hsw-gt2          total:159  pass:155  dwarn:0   dfail:0   fail:0   skip:4  
ilk-hp8440p      total:159  pass:114  dwarn:0   dfail:0   fail:1   skip:44 
ivb-t430s        total:159  pass:151  dwarn:0   dfail:0   fail:0   skip:8  
skl-i5k-2        total:159  pass:150  dwarn:1   dfail:0   fail:0   skip:8  
snb-dellxps      total:159  pass:141  dwarn:0   dfail:0   fail:0   skip:18 
snb-x220t        total:159  pass:141  dwarn:0   dfail:0   fail:1   skip:17 

Results at /archive/results/CI_IGT_test/Patchwork_1304/

b3f8ad64bc71f6236f05c2e9f4ad49a61745869a drm-intel-nightly: 2016y-01m-28d-10h-26m-23s UTC integration manifest
bd9f3f17bcb78179490f71c763d8555f97ceddef drm/i915: Announce the new DRM_IOCTL_I915_GEM_MMAP_GTT capabilities
3f784ee758028c10779be6c227b0674a604ab18a drm/i915: Add support for write-combined CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
4704d553d1a0ff8d69982358975fdbb3666f48f3 drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT
2757c99ee9c686ca0f0f1dbbed0eabf7015adbae drm/i915: Extract code mapping errno to vm fault code
fe46382e262524f25c06500f9725eb483cc5a2f7 drm: Allow drivers setting vm_ops per vma offset node

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2016-01-28 16:10 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-26 14:53 [RFC 0/5] Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
2016-01-26 14:53 ` [RFC 1/5] drm: Allow drivers setting vm_ops per vma offset node Tvrtko Ursulin
2016-01-26 14:53 ` [RFC 2/5] drm/i915: Extract code mapping errno to vm fault code Tvrtko Ursulin
2016-01-26 15:18   ` Chris Wilson
2016-01-26 16:24     ` Tvrtko Ursulin
2016-01-26 16:42       ` Chris Wilson
2016-01-26 14:53 ` [RFC 3/5] drm/i915: Add support for CPU mapping to DRM_IOCTL_I915_GEM_MMAP_GTT Tvrtko Ursulin
2016-01-26 15:10   ` Chris Wilson
2016-01-26 16:23     ` Tvrtko Ursulin
2016-01-26 16:59       ` Chris Wilson
2016-01-27 15:24         ` Tvrtko Ursulin
2016-01-27 16:36           ` Chris Wilson
2016-01-27 16:40           ` Chris Wilson
2016-01-27 15:21   ` [PATCH v2 " Tvrtko Ursulin
2016-01-27 15:51     ` Daniel Vetter
2016-01-27 16:01       ` Tvrtko Ursulin
2016-01-27 16:10         ` Daniel Vetter
2016-01-27 16:32       ` Chris Wilson
2016-01-26 14:53 ` [RFC 4/5] drm/i915: Add support for write-combined " Tvrtko Ursulin
2016-01-26 15:11   ` Chris Wilson
2016-01-27 15:22   ` [PATCH v2 " Tvrtko Ursulin
2016-01-26 14:53 ` [RFC 5/5] drm/i915: Announce the new DRM_IOCTL_I915_GEM_MMAP_GTT capabilities Tvrtko Ursulin
2016-01-28  9:18 ` ✓ Fi.CI.BAT: success for Adding CPU mmap support to DRM_IOCTL_I915_GEM_MMAP_GTT (rev3) Patchwork
2016-01-28 16:10 ` Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.