* [PATCH v11 0/9] Support for creating/using Stolen memory backed objects
@ 2015-12-14  5:46 ankitprasad.r.sharma
  2015-12-14  5:46 ` [PATCH 1/9] drm/i915: Allow use of get_dma_address for stolen " ankitprasad.r.sharma
                   ` (8 more replies)
  0 siblings, 9 replies; 30+ messages in thread
From: ankitprasad.r.sharma @ 2015-12-14  5:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ankitprasad Sharma, akash.goel, shashidhar.hiremath

From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>

This patch series adds support for creating/using Stolen memory backed
objects.

Despite being a unified memory architecture (UMA) some bits of memory
are more equal than others. In particular we have the thorny issue of
stolen memory, memory stolen from the system by the BIOS and reserved
for igfx use. Stolen memory is required for some functions of the GPU
and display engine, but in general it goes to waste. Whilst we cannot
return it to the system, we need to find some other method for
utilising it. As we do not support direct access to the physical address
in the stolen region, it behaves like a different class of memory,
closer in kin to local GPU memory. This strongly suggests that we need a
placement model like TTM if we are to fully utilize these discrete
chunks of differing memory.

To add support for creating Stolen memory backed objects, we extend the
drm_i915_gem_create structure with a new flag through which the user can
state a preference for allocating the object from stolen memory. If the
flag is set, an attempt is made to allocate the object from stolen
memory, subject to the availability of free space in the stolen region.

This patch series also adds support for clearing buffer objects via CPU/GTT.
This is particularly useful for clearing out memory from the stolen
region, but can also be used for shmem allocated objects. It is
currently used only for buffers allocated in the stolen region. The series
also adds support for stealing purgeable stolen pages if we run out of
stolen memory when trying to allocate an object.

v2: Added support for read/write from/to objects not backed by
shmem using the pread/pwrite interface.
Also extended the current get_aperture ioctl to retrieve the
total and available size of the stolen region.

v3: Removed the extended get_aperture ioctl patch 5 (to be submitted as
part of other patch series), addressed comments by Chris about pread/pwrite
for non shmem backed objects.

v4: Rebased to the latest drm-intel-nightly.

v5: Addressed comments, replaced patch 1/4 "Clearing buffers via blitter
engine" by "Clearing buffers via CPU/GTT".

v6: Rebased to the latest drm-intel-nightly, addressed comments, updated
stolen memory purging logic by maintaining a list of purgeable stolen
memory objects, enabled pread/pwrite for all non-shmem backed objects
without tiling restrictions.

v7: Addressed comments, compiler optimization, new patch added for correct
error code propagation to the userspace.

v8: Added a new patch to the series to migrate stolen objects before
hibernation, as stolen memory is not preserved across hibernation. Added
correct error propagation for shmem as well as non-shmem backed object allocation.

v9: Addressed comments, use the insert_page helper function to map the object
page by page, which helps when little aperture space is available.

v10: Addressed comments, use insert_page for clearing out the stolen memory

v11: Addressed comments, 3 new patches added to support allocation from Stolen
memory
1. Allow use of i915_gem_object_get_dma_address for stolen backed objects
2. Use insert_page for pwrite_fast
3. Fail the execbuff using stolen objects as batchbuffers

This can be verified using IGT tests: igt/gem_stolen, igt/gem_create

Ankitprasad Sharma (7):
  drm/i915: Allow use of i915_gem_object_get_dma_address for stolen
    backed objects
  drm/i915: Use insert_page for pwrite_fast
  drm/i915: Clearing buffer objects via CPU/GTT
  drm/i915: Support for creating Stolen memory backed objects
  drm/i915: Propagating correct error codes to the userspace
  drm/i915: Support for pread/pwrite from/to non shmem backed objects
  drm/i915: Fail the execbuff using stolen objects as batchbuffers

Chris Wilson (2):
  drm/i915: Add support for stealing purgable stolen pages
  drm/i915: Migrate stolen objects before hibernation

 drivers/gpu/drm/i915/i915_debugfs.c          |   6 +-
 drivers/gpu/drm/i915/i915_dma.c              |   3 +
 drivers/gpu/drm/i915/i915_drv.c              |  17 +-
 drivers/gpu/drm/i915/i915_drv.h              |  27 +-
 drivers/gpu/drm/i915/i915_gem.c              | 580 ++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_gem_batch_pool.c   |   4 +-
 drivers/gpu/drm/i915/i915_gem_context.c      |   4 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c   |   4 +-
 drivers/gpu/drm/i915/i915_gem_render_state.c |   7 +-
 drivers/gpu/drm/i915/i915_gem_stolen.c       | 215 ++++++++--
 drivers/gpu/drm/i915/i915_guc_submission.c   |  52 ++-
 drivers/gpu/drm/i915/intel_display.c         |   5 +-
 drivers/gpu/drm/i915/intel_fbdev.c           |  12 +-
 drivers/gpu/drm/i915/intel_lrc.c             |  10 +-
 drivers/gpu/drm/i915/intel_overlay.c         |   4 +-
 drivers/gpu/drm/i915/intel_pm.c              |  13 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c      |  27 +-
 include/uapi/drm/i915_drm.h                  |  16 +
 18 files changed, 845 insertions(+), 161 deletions(-)

-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

* [PATCH 1/9] drm/i915: Allow use of get_dma_address for stolen backed objects
  2015-12-14  5:46 [PATCH v11 0/9] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
@ 2015-12-14  5:46 ` ankitprasad.r.sharma
  2015-12-17 10:20   ` Tvrtko Ursulin
  2015-12-14  5:46 ` [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast ankitprasad.r.sharma
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: ankitprasad.r.sharma @ 2015-12-14  5:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ankitprasad Sharma, akash.goel, shashidhar.hiremath

From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>

The i915_gem_object_get_dma_address function is used to retrieve the DMA
address of a particular page so that it can be mapped into a given GTT entry
for CPU access. It will also be used for stolen backed objects, for tasks such
as pwrite and clearing of pages, so obj->get_page.sg needs to be initialized
for stolen objects as well.
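
For readers unfamiliar with the lookup cursor that this patch initializes,
the following is a minimal userspace model of a cached scatterlist walk. It is
a sketch and not the driver code: every model_ name is hypothetical, a plain
array of segments stands in for the scatterlist, and the single-segment helper
assumes (as the sg_alloc_table(st, 1, ...) call in the stolen code suggests)
that a stolen object is one contiguous range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u

/* One contiguous DMA range; hypothetical stand-in for a scatterlist entry. */
struct model_segment {
	uint64_t dma_base;	/* DMA address of the first page */
	unsigned int npages;	/* number of pages in this range */
};

/* Lookup cursor, playing the role of obj->get_page.{sg,last}. */
struct model_cursor {
	const struct model_segment *seg;	/* segment the cursor points at */
	unsigned int last;			/* index of that segment's first page */
};

struct model_object {
	const struct model_segment *segs;
	size_t nsegs;
	unsigned int total_pages;
	struct model_cursor get_page;
};

/* Mirrors the two assignments the patch adds: start at the first entry. */
static void model_init_cursor(struct model_object *obj)
{
	obj->get_page.seg = obj->segs;
	obj->get_page.last = 0;
}

/* Return the DMA address of page n, walking forward from the cached cursor. */
static uint64_t model_get_dma_address(struct model_object *obj, unsigned int n)
{
	assert(n < obj->total_pages);

	/* Going backwards: restart the walk from the first segment. */
	if (n < obj->get_page.last)
		model_init_cursor(obj);

	/* Skip whole segments until the one containing page n is reached. */
	while (obj->get_page.last + obj->get_page.seg->npages <= n) {
		obj->get_page.last += obj->get_page.seg->npages;
		obj->get_page.seg++;
	}

	return obj->get_page.seg->dma_base +
	       (uint64_t)(n - obj->get_page.last) * MODEL_PAGE_SIZE;
}

/* Stolen-like case: a single contiguous segment starting at base. */
static uint64_t model_stolen_dma_address(uint64_t base, unsigned int npages,
					 unsigned int n)
{
	const struct model_segment seg = { base, npages };
	struct model_object obj = { &seg, 1, npages, { NULL, 0 } };

	model_init_cursor(&obj);
	return model_get_dma_address(&obj, n);
}
```

In this model a lookup made before model_init_cursor() would dereference a
NULL cursor, which is the situation the patch avoids for stolen objects.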

Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_stolen.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 598ed2f..5384767 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -569,6 +569,9 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
 	if (obj->pages == NULL)
 		goto cleanup;
 
+	obj->get_page.sg = obj->pages->sgl;
+	obj->get_page.last = 0;
+
 	i915_gem_object_pin_pages(obj);
 	obj->stolen = stolen;
 
-- 
1.9.1

* [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast
  2015-12-14  5:46 [PATCH v11 0/9] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
  2015-12-14  5:46 ` [PATCH 1/9] drm/i915: Allow use of get_dma_address for stolen " ankitprasad.r.sharma
@ 2015-12-14  5:46 ` ankitprasad.r.sharma
  2015-12-14  9:54   ` Chris Wilson
  2015-12-17 10:45   ` Tvrtko Ursulin
  2015-12-14  5:46 ` [PATCH 3/9] drm/i915: Clearing buffer objects via CPU/GTT ankitprasad.r.sharma
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 30+ messages in thread
From: ankitprasad.r.sharma @ 2015-12-14  5:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ankitprasad Sharma, akash.goel, shashidhar.hiremath

From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>

In pwrite_fast, map an object page by page if obj_ggtt_pin fails. First,
we try a nonblocking pin for the whole object (since that is fastest if
reused), then failing that we try to grab one page in the mappable
aperture. It also allows us to handle objects larger than the mappable
aperture (e.g. if we need to pwrite with vGPU restricting the aperture
to a measly 8MiB or something like that).
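
The page-by-page fallback can be pictured with a small userspace sketch. It
assumes a flat byte array as the object's backing store and uses a pointer as
the one-page "window" that insert_page would re-point in the driver; the
model_ names are hypothetical and no GTT, barriers or user-copy faults are
modelled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_PAGE_SIZE  (1u << MODEL_PAGE_SHIFT)

/*
 * Copy size bytes from src into backing at byte offset, one page at a time.
 * The caller must make sure backing covers offset + size bytes.
 * Returns how many times the window had to be pointed at a page.
 */
static unsigned int model_pwrite_by_page(uint8_t *backing, uint64_t offset,
					 const uint8_t *src, uint64_t size)
{
	uint64_t remain = size;
	unsigned int remaps = 0;

	while (remain) {
		/* Offset within the current page and bytes left in that page. */
		uint32_t page_offset = (uint32_t)(offset & (MODEL_PAGE_SIZE - 1));
		uint32_t page_length = MODEL_PAGE_SIZE - page_offset;
		uint8_t *window;

		if (remain < page_length)
			page_length = (uint32_t)remain;

		/* Stand-in for insert_page(): aim the window at page offset >> shift. */
		window = backing + ((offset >> MODEL_PAGE_SHIFT) << MODEL_PAGE_SHIFT);
		remaps++;

		memcpy(window + page_offset, src, page_length);

		remain -= page_length;
		src += page_length;
		offset += page_length;
	}

	return remaps;
}
```

A 5000-byte write starting at offset 4000 is split into chunks of 96, 4096
and 808 bytes, so only one page-sized window is ever needed regardless of the
object size.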

v2: Pin pages before starting pwrite, Combined duplicate loops (Chris)

v3: Combined loops based on local patch by Chris (Chris)

v4: Added i915 wrapper function for drm_mm_insert_node_in_range (Chris)

Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 86 ++++++++++++++++++++++++++++++-----------
 1 file changed, 64 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bf7f203..46c1e75 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -61,6 +61,21 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
 	return obj->pin_display;
 }
 
+static int
+i915_gem_insert_node_in_range(struct drm_i915_private *i915,
+			      struct drm_mm_node *node, u64 size,
+			      unsigned alignment, u64 start, u64 end)
+{
+	int ret;
+
+	ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm, node,
+						  size, alignment, 0, start,
+						  end, DRM_MM_SEARCH_DEFAULT,
+						  DRM_MM_SEARCH_DEFAULT);
+
+	return ret;
+}
+
 /* some bookkeeping */
 static void i915_gem_info_add_obj(struct drm_i915_private *dev_priv,
 				  size_t size)
@@ -760,20 +775,29 @@ fast_user_write(struct io_mapping *mapping,
  * user into the GTT, uncached.
  */
 static int
-i915_gem_gtt_pwrite_fast(struct drm_device *dev,
+i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
 			 struct drm_i915_gem_object *obj,
 			 struct drm_i915_gem_pwrite *args,
 			 struct drm_file *file)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	ssize_t remain;
-	loff_t offset, page_base;
+	struct drm_mm_node node;
+	uint64_t remain, offset;
 	char __user *user_data;
-	int page_offset, page_length, ret;
+	int ret;
 
 	ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE | PIN_NONBLOCK);
-	if (ret)
-		goto out;
+	if (ret) {
+		memset(&node, 0, sizeof(node));
+		ret = i915_gem_insert_node_in_range(i915, &node, 4096, 0,
+						    0, i915->gtt.mappable_end);
+		if (ret)
+			goto out;
+
+		i915_gem_object_pin_pages(obj);
+	} else {
+		node.start = i915_gem_obj_ggtt_offset(obj);
+		node.allocated = false;
+	}
 
 	ret = i915_gem_object_set_to_gtt_domain(obj, true);
 	if (ret)
@@ -783,31 +807,39 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 	if (ret)
 		goto out_unpin;
 
-	user_data = to_user_ptr(args->data_ptr);
-	remain = args->size;
-
-	offset = i915_gem_obj_ggtt_offset(obj) + args->offset;
-
 	intel_fb_obj_invalidate(obj, ORIGIN_GTT);
+	obj->dirty = true;
 
-	while (remain > 0) {
+	user_data = to_user_ptr(args->data_ptr);
+	offset = args->offset;
+	remain = args->size;
+	while (remain) {
 		/* Operation in this page
 		 *
 		 * page_base = page offset within aperture
 		 * page_offset = offset within page
 		 * page_length = bytes to copy for this page
 		 */
-		page_base = offset & PAGE_MASK;
-		page_offset = offset_in_page(offset);
-		page_length = remain;
-		if ((page_offset + remain) > PAGE_SIZE)
-			page_length = PAGE_SIZE - page_offset;
-
+		u32 page_base = node.start;
+		unsigned page_offset = offset_in_page(offset);
+		unsigned page_length = PAGE_SIZE - page_offset;
+		page_length = remain < page_length ? remain : page_length;
+		if (node.allocated) {
+			wmb();
+			i915->gtt.base.insert_page(&i915->gtt.base,
+						   i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
+						   node.start,
+						   I915_CACHE_NONE,
+						   0);
+			wmb();
+		} else {
+			page_base += offset & PAGE_MASK;
+		}
 		/* If we get a fault while copying data, then (presumably) our
 		 * source page isn't available.  Return the error and we'll
 		 * retry in the slow path.
 		 */
-		if (fast_user_write(dev_priv->gtt.mappable, page_base,
+		if (fast_user_write(i915->gtt.mappable, page_base,
 				    page_offset, user_data, page_length)) {
 			ret = -EFAULT;
 			goto out_flush;
@@ -821,7 +853,17 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 out_flush:
 	intel_fb_obj_flush(obj, false, ORIGIN_GTT);
 out_unpin:
-	i915_gem_object_ggtt_unpin(obj);
+	if (node.allocated) {
+		wmb();
+		i915->gtt.base.clear_range(&i915->gtt.base,
+				node.start, node.size,
+				true);
+		drm_mm_remove_node(&node);
+		i915_gem_object_unpin_pages(obj);
+	}
+	else {
+		i915_gem_object_ggtt_unpin(obj);
+	}
 out:
 	return ret;
 }
@@ -1086,7 +1128,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 	if (obj->tiling_mode == I915_TILING_NONE &&
 	    obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
 	    cpu_write_needs_clflush(obj)) {
-		ret = i915_gem_gtt_pwrite_fast(dev, obj, args, file);
+		ret = i915_gem_gtt_pwrite_fast(dev_priv, obj, args, file);
 		/* Note that the gtt paths might fail with non-page-backed user
 		 * pointers (e.g. gtt mappings when moving data between
 		 * textures). Fallback to the shmem path in that case. */
-- 
1.9.1

* [PATCH 3/9] drm/i915: Clearing buffer objects via CPU/GTT
  2015-12-14  5:46 [PATCH v11 0/9] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
  2015-12-14  5:46 ` [PATCH 1/9] drm/i915: Allow use of get_dma_address for stolen " ankitprasad.r.sharma
  2015-12-14  5:46 ` [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast ankitprasad.r.sharma
@ 2015-12-14  5:46 ` ankitprasad.r.sharma
  2015-12-14  9:48   ` Chris Wilson
  2015-12-17 10:27   ` Tvrtko Ursulin
  2015-12-14  5:46 ` [PATCH 4/9] drm/i915: Support for creating Stolen memory backed objects ankitprasad.r.sharma
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 30+ messages in thread
From: ankitprasad.r.sharma @ 2015-12-14  5:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ankitprasad Sharma, akash.goel, shashidhar.hiremath

From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>

This patch adds support for clearing buffer objects via CPU/GTT. This
is particularly useful for clearing out non-shmem backed objects.
Currently we intend to use this only for buffers allocated from the
stolen region.

v2: Added kernel doc for i915_gem_clear_object(), corrected/removed
variable assignments (Tvrtko)

v3: Map object page by page to the gtt if the pinning of the whole object
to the ggtt fails, Corrected function name (Chris)

v4: Clear the buffer page by page, and not map the whole object in the gtt
aperture. Use i915 wrapper function in place of drm_mm_insert_node_in_range.

Testcase: igt/gem_stolen

Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h |  1 +
 drivers/gpu/drm/i915/i915_gem.c | 44 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a10b866..e195fee 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2897,6 +2897,7 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
 				    int *needs_clflush);
 
 int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
+int i915_gem_object_clear(struct drm_i915_gem_object *obj);
 
 static inline int __sg_page_count(struct scatterlist *sg)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 46c1e75..e50a91b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5293,3 +5293,47 @@ fail:
 	drm_gem_object_unreference(&obj->base);
 	return ERR_PTR(ret);
 }
+
+/**
+ * i915_gem_object_clear() - Clear buffer object via CPU/GTT
+ * @obj: Buffer object to be cleared
+ *
+ * Return: 0 - success, non-zero - failure
+ */
+int i915_gem_object_clear(struct drm_i915_gem_object *obj)
+{
+	int ret, i;
+	char __iomem *base;
+	size_t size = obj->base.size;
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct drm_mm_node node;
+
+	lockdep_assert_held(&obj->base.dev->struct_mutex);
+	memset(&node, 0, sizeof(node));
+	ret = i915_gem_insert_node_in_range(i915, &node, 4096, 0,
+					    0, i915->gtt.mappable_end);
+	if (ret)
+		goto out;
+
+	i915_gem_object_pin_pages(obj);
+	base = io_mapping_map_wc(i915->gtt.mappable, node.start);
+	for (i = 0; i < size/PAGE_SIZE; i++) {
+		wmb();
+		i915->gtt.base.insert_page(&i915->gtt.base,
+					   i915_gem_object_get_dma_address(obj, i),
+					   node.start,
+					   I915_CACHE_NONE, 0);
+		wmb();
+		memset_io(base, 0, 4096);
+	}
+
+	wmb();
+	io_mapping_unmap(base);
+	i915->gtt.base.clear_range(&i915->gtt.base,
+			node.start, node.size,
+			true);
+	drm_mm_remove_node(&node);
+	i915_gem_object_unpin_pages(obj);
+out:
+	return ret;
+}
-- 
1.9.1

* [PATCH 4/9] drm/i915: Support for creating Stolen memory backed objects
  2015-12-14  5:46 [PATCH v11 0/9] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
                   ` (2 preceding siblings ...)
  2015-12-14  5:46 ` [PATCH 3/9] drm/i915: Clearing buffer objects via CPU/GTT ankitprasad.r.sharma
@ 2015-12-14  5:46 ` ankitprasad.r.sharma
  2015-12-14 10:05   ` Chris Wilson
  2015-12-14  5:46 ` [PATCH 5/9] drm/i915: Propagating correct error codes to the userspace ankitprasad.r.sharma
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: ankitprasad.r.sharma @ 2015-12-14  5:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ankitprasad Sharma, akash.goel, shashidhar.hiremath

From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>

Extend the drm_i915_gem_create structure to add support for
creating Stolen memory backed objects. A new flag lets the user
state a preference for allocating the object from stolen memory.
If the flag is set, an attempt is made to allocate the object
from stolen memory, subject to the availability of free space in
the stolen region.
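
The flag check relies on a negated-shift mask, which is easy to misread. The
sketch below reproduces that arithmetic in userspace. The MODEL_ macros are
local copies of the two defines from the uapi hunk (written with an unsigned
constant here), and model_check_create_flags() is a hypothetical helper that
only mirrors the order of the checks in i915_gem_create().

```c
#include <assert.h>	/* for callers that want to assert on the results */
#include <errno.h>
#include <stdint.h>

/* Local copies of the patch's defines; the MODEL_ prefix marks them as such. */
#define MODEL_CREATE_PLACEMENT_STOLEN	(1u << 0)
/* -(highest known flag << 1) sets every bit above the highest known flag. */
#define MODEL_CREATE_UNKNOWN_FLAGS	(-(MODEL_CREATE_PLACEMENT_STOLEN << 1))

/*
 * Returns -EINVAL if any unknown bit is set, 1 if stolen placement was
 * requested, and 0 for a regular (shmem backed) allocation.
 */
static int model_check_create_flags(uint32_t flags)
{
	if (flags & MODEL_CREATE_UNKNOWN_FLAGS)
		return -EINVAL;

	return (flags & MODEL_CREATE_PLACEMENT_STOLEN) ? 1 : 0;
}
```

With a single known flag the mask evaluates to 0xfffffffe, so flags 0 and 1
pass while anything with a higher bit set is rejected. Adding a second flag
later would only require basing the mask on the new highest bit. Userspace is
expected to query I915_PARAM_CREATE_VERSION (which this patch reports as 2)
before relying on the flags field.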

v2: Rebased to the latest drm-intel-nightly (Ankit)

v3: Changed versioning of GEM_CREATE param, added new comments (Tvrtko)

v4: Changed size from 32b to 64b to prevent userspace overflow (Tvrtko)
Corrected function arguments ordering (Chris)

v5: Corrected function name (Chris)

Testcase: igt/gem_stolen

Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_dma.c        |  3 +++
 drivers/gpu/drm/i915/i915_drv.h        |  2 +-
 drivers/gpu/drm/i915/i915_gem.c        | 30 +++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/i915_gem_stolen.c |  4 ++--
 include/uapi/drm/i915_drm.h            | 16 ++++++++++++++++
 5 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 52b8289..5d2189c 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -172,6 +172,9 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_SOFTPIN:
 		value = 1;
 		break;
+	case I915_PARAM_CREATE_VERSION:
+		value = 2;
+		break;
 	default:
 		DRM_DEBUG("Unknown parameter %d\n", param->param);
 		return -EINVAL;
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e195fee..dcdfb97 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3255,7 +3255,7 @@ void i915_gem_stolen_remove_node(struct drm_i915_private *dev_priv,
 int i915_gem_init_stolen(struct drm_device *dev);
 void i915_gem_cleanup_stolen(struct drm_device *dev);
 struct drm_i915_gem_object *
-i915_gem_object_create_stolen(struct drm_device *dev, u32 size);
+i915_gem_object_create_stolen(struct drm_device *dev, u64 size);
 struct drm_i915_gem_object *
 i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 					       u32 stolen_offset,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e50a91b..0a859b0 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -390,6 +390,7 @@ static int
 i915_gem_create(struct drm_file *file,
 		struct drm_device *dev,
 		uint64_t size,
+		uint32_t flags,
 		uint32_t *handle_p)
 {
 	struct drm_i915_gem_object *obj;
@@ -400,8 +401,31 @@ i915_gem_create(struct drm_file *file,
 	if (size == 0)
 		return -EINVAL;
 
+	if (flags & __I915_CREATE_UNKNOWN_FLAGS)
+		return -EINVAL;
+
 	/* Allocate the new object */
-	obj = i915_gem_alloc_object(dev, size);
+	if (flags & I915_CREATE_PLACEMENT_STOLEN) {
+		mutex_lock(&dev->struct_mutex);
+		obj = i915_gem_object_create_stolen(dev, size);
+		if (!obj) {
+			mutex_unlock(&dev->struct_mutex);
+			return -ENOMEM;
+		}
+
+		/* Always clear fresh buffers before handing to userspace */
+		ret = i915_gem_object_clear(obj);
+		if (ret) {
+			drm_gem_object_unreference(&obj->base);
+			mutex_unlock(&dev->struct_mutex);
+			return ret;
+		}
+
+		mutex_unlock(&dev->struct_mutex);
+	} else {
+		obj = i915_gem_alloc_object(dev, size);
+	}
+
 	if (obj == NULL)
 		return -ENOMEM;
 
@@ -424,7 +448,7 @@ i915_gem_dumb_create(struct drm_file *file,
 	args->pitch = ALIGN(args->width * DIV_ROUND_UP(args->bpp, 8), 64);
 	args->size = args->pitch * args->height;
 	return i915_gem_create(file, dev,
-			       args->size, &args->handle);
+			       args->size, 0, &args->handle);
 }
 
 /**
@@ -437,7 +461,7 @@ i915_gem_create_ioctl(struct drm_device *dev, void *data,
 	struct drm_i915_gem_create *args = data;
 
 	return i915_gem_create(file, dev,
-			       args->size, &args->handle);
+			       args->size, args->flags, &args->handle);
 }
 
 static inline int
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 5384767..17d679e 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -586,7 +586,7 @@ cleanup:
 }
 
 struct drm_i915_gem_object *
-i915_gem_object_create_stolen(struct drm_device *dev, u32 size)
+i915_gem_object_create_stolen(struct drm_device *dev, u64 size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
@@ -596,7 +596,7 @@ i915_gem_object_create_stolen(struct drm_device *dev, u32 size)
 	if (!drm_mm_initialized(&dev_priv->mm.stolen))
 		return NULL;
 
-	DRM_DEBUG_KMS("creating stolen object: size=%x\n", size);
+	DRM_DEBUG_KMS("creating stolen object: size=%llx\n", size);
 	if (size == 0)
 		return NULL;
 
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index d727b49..ebce8c9 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -357,6 +357,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_GPU_RESET	 35
 #define I915_PARAM_HAS_RESOURCE_STREAMER 36
 #define I915_PARAM_HAS_EXEC_SOFTPIN	 37
+#define I915_PARAM_CREATE_VERSION	 38
 
 typedef struct drm_i915_getparam {
 	__s32 param;
@@ -456,6 +457,21 @@ struct drm_i915_gem_create {
 	 */
 	__u32 handle;
 	__u32 pad;
+	/**
+	 * Requested flags (currently used for placement
+	 * (which memory domain))
+	 *
+	 * You can request that the object be created from special memory
+	 * rather than regular system pages using this parameter. Such
+	 * irregular objects may have certain restrictions (such as CPU
+	 * access to a stolen object is verboten).
+	 *
+	 * This can be used in the future for other purposes too
+	 * e.g. specifying tiling/caching/madvise
+	 */
+	__u32 flags;
+#define I915_CREATE_PLACEMENT_STOLEN 	(1<<0) /* Cannot use CPU mmaps */
+#define __I915_CREATE_UNKNOWN_FLAGS	-(I915_CREATE_PLACEMENT_STOLEN << 1)
 };
 
 struct drm_i915_gem_pread {
-- 
1.9.1

* [PATCH 5/9] drm/i915: Propagating correct error codes to the userspace
  2015-12-14  5:46 [PATCH v11 0/9] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
                   ` (3 preceding siblings ...)
  2015-12-14  5:46 ` [PATCH 4/9] drm/i915: Support for creating Stolen memory backed objects ankitprasad.r.sharma
@ 2015-12-14  5:46 ` ankitprasad.r.sharma
  2015-12-14 10:10   ` Chris Wilson
  2015-12-14  5:46 ` [PATCH 6/9] drm/i915: Add support for stealing purgable stolen pages ankitprasad.r.sharma
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: ankitprasad.r.sharma @ 2015-12-14  5:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ankitprasad Sharma, akash.goel, shashidhar.hiremath

From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>

Propagate correct error codes to userspace by using the ERR_PTR and
PTR_ERR macros for stolen memory based object allocation. We generally
return -ENOMEM to the user whenever there is a failure in object
allocation. This patch helps the user identify the actual reason for the
failure rather than seeing -ENOMEM each time.
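
As background on the error-pointer convention, here is a small userspace
re-creation of the idea. It is a sketch under assumptions: the model_ helpers
imitate what ERR_PTR, PTR_ERR and IS_ERR do in the kernel (encoding a negative
errno in the top 4095 values of the address space), and the allocator is a
hypothetical stand-in whose checks only loosely follow
i915_gem_object_create_stolen().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static inline void *model_err_ptr(intptr_t error)
{
	return (void *)error;
}

/* Recover the errno value from an error pointer. */
static inline intptr_t model_ptr_err(const void *ptr)
{
	return (intptr_t)ptr;
}

/* True only for pointers in the last MODEL_MAX_ERRNO values; NULL is not one. */
static inline int model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

/* Hypothetical allocator: each failure carries its own reason. */
static void *model_create_stolen(int stolen_ready, uint64_t size)
{
	void *obj;

	if (!stolen_ready)
		return model_err_ptr(-ENODEV);
	if (size == 0)
		return model_err_ptr(-EINVAL);

	obj = malloc(1);	/* placeholder for the real object */
	if (!obj)
		return model_err_ptr(-ENOMEM);

	return obj;
}

/* Caller side: pass the encoded reason through instead of assuming -ENOMEM. */
static int model_create(int stolen_ready, uint64_t size)
{
	void *obj = model_create_stolen(stolen_ready, size);

	if (model_is_err(obj))
		return (int)model_ptr_err(obj);

	free(obj);
	return 0;
}
```

Because NULL is not an error pointer under this scheme, every caller that
used to test for NULL has to be converted to an IS_ERR-style check at the same
time as the callee, which is why the patch touches so many files.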

v2: Moved the patch up in the series, added error propagation for
i915_gem_alloc_object too (Chris)

v3: Removed storing of error pointer inside structs, Corrected error
propagation in caller functions (Chris)

v4: Remove assignments inside the predicate (Chris)

v5: Removed unnecessary initializations, updated kerneldoc for
i915_guc_client, corrected missed error pointer handling (Tvrtko)

Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c              | 16 +++++----
 drivers/gpu/drm/i915/i915_gem_batch_pool.c   |  4 +--
 drivers/gpu/drm/i915/i915_gem_context.c      |  4 +--
 drivers/gpu/drm/i915/i915_gem_render_state.c |  7 ++--
 drivers/gpu/drm/i915/i915_gem_stolen.c       | 46 ++++++++++++------------
 drivers/gpu/drm/i915/i915_guc_submission.c   | 52 ++++++++++++++++++----------
 drivers/gpu/drm/i915/intel_display.c         |  2 +-
 drivers/gpu/drm/i915/intel_fbdev.c           |  6 ++--
 drivers/gpu/drm/i915/intel_lrc.c             | 10 +++---
 drivers/gpu/drm/i915/intel_overlay.c         |  4 +--
 drivers/gpu/drm/i915/intel_pm.c              |  7 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c      | 21 +++++------
 12 files changed, 101 insertions(+), 78 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0a859b0..05505de 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -408,9 +408,9 @@ i915_gem_create(struct drm_file *file,
 	if (flags & I915_CREATE_PLACEMENT_STOLEN) {
 		mutex_lock(&dev->struct_mutex);
 		obj = i915_gem_object_create_stolen(dev, size);
-		if (!obj) {
+		if (IS_ERR(obj)) {
 			mutex_unlock(&dev->struct_mutex);
-			return -ENOMEM;
+			return PTR_ERR(obj);
 		}
 
 		/* Always clear fresh buffers before handing to userspace */
@@ -426,8 +426,8 @@ i915_gem_create(struct drm_file *file,
 		obj = i915_gem_alloc_object(dev, size);
 	}
 
-	if (obj == NULL)
-		return -ENOMEM;
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
 
 	ret = drm_gem_handle_create(file, &obj->base, &handle);
 	/* drop reference from allocate - handle holds it now */
@@ -4451,14 +4451,16 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 	struct drm_i915_gem_object *obj;
 	struct address_space *mapping;
 	gfp_t mask;
+	int ret;
 
 	obj = i915_gem_object_alloc(dev);
 	if (obj == NULL)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
-	if (drm_gem_object_init(dev, &obj->base, size) != 0) {
+	ret = drm_gem_object_init(dev, &obj->base, size);
+	if (ret) {
 		i915_gem_object_free(obj);
-		return NULL;
+		return ERR_PTR(ret);
 	}
 
 	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
diff --git a/drivers/gpu/drm/i915/i915_gem_batch_pool.c b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
index 7bf2f3f..d79caa2 100644
--- a/drivers/gpu/drm/i915/i915_gem_batch_pool.c
+++ b/drivers/gpu/drm/i915/i915_gem_batch_pool.c
@@ -135,8 +135,8 @@ i915_gem_batch_pool_get(struct i915_gem_batch_pool *pool,
 		int ret;
 
 		obj = i915_gem_alloc_object(pool->dev, size);
-		if (obj == NULL)
-			return ERR_PTR(-ENOMEM);
+		if (IS_ERR(obj))
+			return obj;
 
 		ret = i915_gem_object_get_pages(obj);
 		if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 43761c5..9754894 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -179,8 +179,8 @@ i915_gem_alloc_context_obj(struct drm_device *dev, size_t size)
 	int ret;
 
 	obj = i915_gem_alloc_object(dev, size);
-	if (obj == NULL)
-		return ERR_PTR(-ENOMEM);
+	if (IS_ERR(obj))
+		return obj;
 
 	/*
 	 * Try to make the context utilize L3 as well as LLC.
diff --git a/drivers/gpu/drm/i915/i915_gem_render_state.c b/drivers/gpu/drm/i915/i915_gem_render_state.c
index 5026a62..2bfdd49 100644
--- a/drivers/gpu/drm/i915/i915_gem_render_state.c
+++ b/drivers/gpu/drm/i915/i915_gem_render_state.c
@@ -58,8 +58,11 @@ static int render_state_init(struct render_state *so, struct drm_device *dev)
 		return -EINVAL;
 
 	so->obj = i915_gem_alloc_object(dev, 4096);
-	if (so->obj == NULL)
-		return -ENOMEM;
+	if (IS_ERR(so->obj)) {
+		ret = PTR_ERR(so->obj);
+		so->obj = NULL;
+		return ret;
+	}
 
 	ret = i915_gem_obj_ggtt_pin(so->obj, 4096, 0);
 	if (ret)
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 17d679e..366080b9 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -492,6 +492,7 @@ i915_pages_create_for_stolen(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct sg_table *st;
 	struct scatterlist *sg;
+	int ret;
 
 	DRM_DEBUG_DRIVER("offset=0x%x, size=%d\n", offset, size);
 	BUG_ON(offset > dev_priv->gtt.stolen_size - size);
@@ -503,11 +504,12 @@ i915_pages_create_for_stolen(struct drm_device *dev,
 
 	st = kmalloc(sizeof(*st), GFP_KERNEL);
 	if (st == NULL)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
-	if (sg_alloc_table(st, 1, GFP_KERNEL)) {
+	ret = sg_alloc_table(st, 1, GFP_KERNEL);
+	if (ret) {
 		kfree(st);
-		return NULL;
+		return ERR_PTR(ret);
 	}
 
 	sg = st->sgl;
@@ -559,15 +561,17 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
 
 	obj = i915_gem_object_alloc(dev);
 	if (obj == NULL)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	drm_gem_private_object_init(dev, &obj->base, stolen->size);
 	i915_gem_object_init(obj, &i915_gem_object_stolen_ops);
 
 	obj->pages = i915_pages_create_for_stolen(dev,
 						  stolen->start, stolen->size);
-	if (obj->pages == NULL)
-		goto cleanup;
+	if (IS_ERR(obj->pages)) {
+		i915_gem_object_free(obj);
+		return (void*) obj->pages;
+	}
 
 	obj->get_page.sg = obj->pages->sgl;
 	obj->get_page.last = 0;
@@ -579,10 +583,6 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
 	obj->cache_level = HAS_LLC(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
 
 	return obj;
-
-cleanup:
-	i915_gem_object_free(obj);
-	return NULL;
 }
 
 struct drm_i915_gem_object *
@@ -594,29 +594,29 @@ i915_gem_object_create_stolen(struct drm_device *dev, u64 size)
 	int ret;
 
 	if (!drm_mm_initialized(&dev_priv->mm.stolen))
-		return NULL;
+		return ERR_PTR(-ENODEV);
 
 	DRM_DEBUG_KMS("creating stolen object: size=%llx\n", size);
 	if (size == 0)
-		return NULL;
+		return ERR_PTR(-EINVAL);
 
 	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
 	if (!stolen)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	ret = i915_gem_stolen_insert_node(dev_priv, stolen, size, 4096);
 	if (ret) {
 		kfree(stolen);
-		return NULL;
+		return ERR_PTR(ret);
 	}
 
 	obj = _i915_gem_object_create_stolen(dev, stolen);
-	if (obj)
+	if (!IS_ERR(obj))
 		return obj;
 
 	i915_gem_stolen_remove_node(dev_priv, stolen);
 	kfree(stolen);
-	return NULL;
+	return obj;
 }
 
 struct drm_i915_gem_object *
@@ -633,7 +633,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	int ret;
 
 	if (!drm_mm_initialized(&dev_priv->mm.stolen))
-		return NULL;
+		return ERR_PTR(-ENODEV);
 
 	DRM_DEBUG_KMS("creating preallocated stolen object: stolen_offset=%x, gtt_offset=%x, size=%x\n",
 			stolen_offset, gtt_offset, size);
@@ -641,11 +641,11 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	/* KISS and expect everything to be page-aligned */
 	if (WARN_ON(size == 0) || WARN_ON(size & 4095) ||
 	    WARN_ON(stolen_offset & 4095))
-		return NULL;
+		return ERR_PTR(-EINVAL);
 
 	stolen = kzalloc(sizeof(*stolen), GFP_KERNEL);
 	if (!stolen)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	stolen->start = stolen_offset;
 	stolen->size = size;
@@ -655,15 +655,15 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (ret) {
 		DRM_DEBUG_KMS("failed to allocate stolen space\n");
 		kfree(stolen);
-		return NULL;
+		return ERR_PTR(ret);
 	}
 
 	obj = _i915_gem_object_create_stolen(dev, stolen);
-	if (obj == NULL) {
+	if (IS_ERR(obj)) {
 		DRM_DEBUG_KMS("failed to allocate stolen object\n");
 		i915_gem_stolen_remove_node(dev_priv, stolen);
 		kfree(stolen);
-		return NULL;
+		return obj;
 	}
 
 	/* Some objects just need physical mem from stolen space */
@@ -701,5 +701,5 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 
 err:
 	drm_gem_object_unreference(&obj->base);
-	return NULL;
+	return ERR_PTR(ret);
 }
diff --git a/drivers/gpu/drm/i915/i915_guc_submission.c b/drivers/gpu/drm/i915/i915_guc_submission.c
index 0d23785b..ac042c6 100644
--- a/drivers/gpu/drm/i915/i915_guc_submission.c
+++ b/drivers/gpu/drm/i915/i915_guc_submission.c
@@ -629,27 +629,30 @@ int i915_guc_submit(struct i915_guc_client *client,
  * object needs to be pinned lifetime. Also we must pin it to gtt space other
  * than [0, GUC_WOPCM_TOP) because this range is reserved inside GuC.
  *
- * Return:	A drm_i915_gem_object if successful, otherwise NULL.
+ * Return:	A drm_i915_gem_object if successful, otherwise error pointer.
  */
 static struct drm_i915_gem_object *gem_allocate_guc_obj(struct drm_device *dev,
 							u32 size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
+	int ret;
 
 	obj = i915_gem_alloc_object(dev, size);
-	if (!obj)
-		return NULL;
+	if (IS_ERR(obj))
+		return obj;
 
-	if (i915_gem_object_get_pages(obj)) {
+	ret = i915_gem_object_get_pages(obj);
+	if (ret) {
 		drm_gem_object_unreference(&obj->base);
-		return NULL;
+		return ERR_PTR(ret);
 	}
 
-	if (i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
-			PIN_OFFSET_BIAS | GUC_WOPCM_TOP)) {
+	ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
+				    PIN_OFFSET_BIAS | GUC_WOPCM_TOP);
+	if (ret) {
 		drm_gem_object_unreference(&obj->base);
-		return NULL;
+		return ERR_PTR(ret);
 	}
 
 	/* Invalidate GuC TLB to let GuC take the latest updates to GTT. */
@@ -717,7 +720,7 @@ static void guc_client_free(struct drm_device *dev,
  * @ctx:	the context that owns the client (we use the default render
  * 		context)
  *
- * Return:	An i915_guc_client object if success.
+ * Return:	An i915_guc_client object if success, error pointer on failure.
  */
 static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
 						uint32_t priority,
@@ -727,10 +730,11 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_guc *guc = &dev_priv->guc;
 	struct drm_i915_gem_object *obj;
+	int ret;
 
 	client = kzalloc(sizeof(*client), GFP_KERNEL);
 	if (!client)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	client->doorbell_id = GUC_INVALID_DOORBELL_ID;
 	client->priority = priority;
@@ -741,13 +745,16 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
 			GUC_MAX_GPU_CONTEXTS, GFP_KERNEL);
 	if (client->ctx_index >= GUC_MAX_GPU_CONTEXTS) {
 		client->ctx_index = GUC_INVALID_CTX_ID;
+		ret = -EINVAL;
 		goto err;
 	}
 
 	/* The first page is doorbell/proc_desc. Two followed pages are wq. */
 	obj = gem_allocate_guc_obj(dev, GUC_DB_SIZE + GUC_WQ_SIZE);
-	if (!obj)
+	if (IS_ERR(obj)) {
+		ret = PTR_ERR(obj);
 		goto err;
+	}
 
 	client->client_obj = obj;
 	client->wq_offset = GUC_DB_SIZE;
@@ -766,9 +773,11 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
 		client->proc_desc_offset = (GUC_DB_SIZE / 2);
 
 	client->doorbell_id = assign_doorbell(guc, client->priority);
-	if (client->doorbell_id == GUC_INVALID_DOORBELL_ID)
+	if (client->doorbell_id == GUC_INVALID_DOORBELL_ID) {
 		/* XXX: evict a doorbell instead */
+		ret = -EINVAL;
 		goto err;
+	}
 
 	guc_init_proc_desc(guc, client);
 	guc_init_ctx_desc(guc, client);
@@ -776,7 +785,8 @@ static struct i915_guc_client *guc_client_alloc(struct drm_device *dev,
 
 	/* XXX: Any cache flushes needed? General domain mgmt calls? */
 
-	if (host2guc_allocate_doorbell(guc, client))
+	ret = host2guc_allocate_doorbell(guc, client);
+	if (ret)
 		goto err;
 
 	DRM_DEBUG_DRIVER("new priority %u client %p: ctx_index %u db_id %u\n",
@@ -788,7 +798,7 @@ err:
 	DRM_ERROR("FAILED to create priority %u GuC client!\n", priority);
 
 	guc_client_free(dev, client);
-	return NULL;
+	return ERR_PTR(ret);
 }
 
 static void guc_create_log(struct intel_guc *guc)
@@ -813,7 +823,7 @@ static void guc_create_log(struct intel_guc *guc)
 	obj = guc->log_obj;
 	if (!obj) {
 		obj = gem_allocate_guc_obj(dev_priv->dev, size);
-		if (!obj) {
+		if (IS_ERR(obj)) {
 			/* logging will be off */
 			i915.guc_log_level = -1;
 			return;
@@ -843,6 +853,7 @@ int i915_guc_submission_init(struct drm_device *dev)
 	const size_t poolsize = GUC_MAX_GPU_CONTEXTS * ctxsize;
 	const size_t gemsize = round_up(poolsize, PAGE_SIZE);
 	struct intel_guc *guc = &dev_priv->guc;
+	int ret;
 
 	if (!i915.enable_guc_submission)
 		return 0; /* not enabled  */
@@ -851,8 +862,11 @@ int i915_guc_submission_init(struct drm_device *dev)
 		return 0; /* already allocated */
 
 	guc->ctx_pool_obj = gem_allocate_guc_obj(dev_priv->dev, gemsize);
-	if (!guc->ctx_pool_obj)
-		return -ENOMEM;
+	if (IS_ERR(guc->ctx_pool_obj)) {
+		ret = PTR_ERR(guc->ctx_pool_obj);
+		guc->ctx_pool_obj = NULL;
+		return ret;
+	}
 
 	ida_init(&guc->ctx_ids);
 
@@ -870,9 +884,9 @@ int i915_guc_submission_enable(struct drm_device *dev)
 
 	/* client for execbuf submission */
 	client = guc_client_alloc(dev, GUC_CTX_PRIORITY_KMD_NORMAL, ctx);
-	if (!client) {
+	if (IS_ERR(client)) {
 		DRM_ERROR("Failed to create execbuf guc_client\n");
-		return -ENOMEM;
+		return PTR_ERR(client);
 	}
 
 	guc->execbuf_client = client;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index dd0e966..006d43a 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2543,7 +2543,7 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
 							     base_aligned,
 							     base_aligned,
 							     size_aligned);
-	if (!obj)
+	if (IS_ERR(obj))
 		return false;
 
 	obj->tiling_mode = plane_config->tiling;
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 7ccde58..b2f134a 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -148,11 +148,11 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 	 * features. */
 	if (size * 2 < dev_priv->gtt.stolen_usable_size)
 		obj = i915_gem_object_create_stolen(dev, size);
-	if (obj == NULL)
+	if (IS_ERR_OR_NULL(obj))
 		obj = i915_gem_alloc_object(dev, size);
-	if (!obj) {
+	if (IS_ERR(obj)) {
 		DRM_ERROR("failed to allocate framebuffer\n");
-		ret = -ENOMEM;
+		ret = PTR_ERR(obj);
 		goto out;
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4ebafab..5ca4c06 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1371,9 +1371,11 @@ static int lrc_setup_wa_ctx_obj(struct intel_engine_cs *ring, u32 size)
 	int ret;
 
 	ring->wa_ctx.obj = i915_gem_alloc_object(ring->dev, PAGE_ALIGN(size));
-	if (!ring->wa_ctx.obj) {
+	if (IS_ERR(ring->wa_ctx.obj)) {
 		DRM_DEBUG_DRIVER("alloc LRC WA ctx backing obj failed.\n");
-		return -ENOMEM;
+		ret = PTR_ERR(ring->wa_ctx.obj);
+		ring->wa_ctx.obj = NULL;
+		return ret;
 	}
 
 	ret = i915_gem_obj_ggtt_pin(ring->wa_ctx.obj, PAGE_SIZE, 0);
@@ -2456,9 +2458,9 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
 	context_size += PAGE_SIZE * LRC_PPHWSP_PN;
 
 	ctx_obj = i915_gem_alloc_object(dev, context_size);
-	if (!ctx_obj) {
+	if (IS_ERR(ctx_obj)) {
 		DRM_DEBUG_DRIVER("Alloc LRC backing obj failed.\n");
-		return -ENOMEM;
+		return PTR_ERR(ctx_obj);
 	}
 
 	ringbuf = intel_engine_create_ringbuffer(ring, 4 * PAGE_SIZE);
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 76f1980..3a65858 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -1392,9 +1392,9 @@ void intel_setup_overlay(struct drm_device *dev)
 	reg_bo = NULL;
 	if (!OVERLAY_NEEDS_PHYSICAL(dev))
 		reg_bo = i915_gem_object_create_stolen(dev, PAGE_SIZE);
-	if (reg_bo == NULL)
+	if (IS_ERR_OR_NULL(reg_bo))
 		reg_bo = i915_gem_alloc_object(dev, PAGE_SIZE);
-	if (reg_bo == NULL)
+	if (IS_ERR(reg_bo))
 		goto out_free;
 	overlay->reg_bo = reg_bo;
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 9968c66..0afb819 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5174,10 +5174,11 @@ static void valleyview_setup_pctx(struct drm_device *dev)
 								      pcbr_offset,
 								      I915_GTT_OFFSET_NONE,
 								      pctx_size);
-		goto out;
+		if (!IS_ERR(pctx))
+			goto out;
 	}
 
-	DRM_DEBUG_DRIVER("BIOS didn't set up PCBR, fixing up\n");
+	DRM_DEBUG_DRIVER("BIOS didn't set up PCBR or prealloc failed, fixing up\n");
 
 	/*
 	 * From the Gunit register HAS:
@@ -5188,7 +5189,7 @@ static void valleyview_setup_pctx(struct drm_device *dev)
 	 * memory, or any other relevant ranges.
 	 */
 	pctx = i915_gem_object_create_stolen(dev, pctx_size);
-	if (!pctx) {
+	if (IS_ERR(pctx)) {
 		DRM_DEBUG("not enough stolen space for PCTX, disabling\n");
 		return;
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e5359eb..56d8375 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -679,9 +679,10 @@ intel_init_pipe_control(struct intel_engine_cs *ring)
 	WARN_ON(ring->scratch.obj);
 
 	ring->scratch.obj = i915_gem_alloc_object(ring->dev, 4096);
-	if (ring->scratch.obj == NULL) {
+	if (IS_ERR(ring->scratch.obj)) {
 		DRM_ERROR("Failed to allocate seqno page\n");
-		ret = -ENOMEM;
+		ret = PTR_ERR(ring->scratch.obj);
+		ring->scratch.obj = NULL;
 		goto err;
 	}
 
@@ -1939,9 +1940,9 @@ static int init_status_page(struct intel_engine_cs *ring)
 		int ret;
 
 		obj = i915_gem_alloc_object(ring->dev, 4096);
-		if (obj == NULL) {
+		if (IS_ERR(obj)) {
 			DRM_ERROR("Failed to allocate status page\n");
-			return -ENOMEM;
+			return PTR_ERR(obj);
 		}
 
 		ret = i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
@@ -2088,10 +2089,10 @@ static int intel_alloc_ringbuffer_obj(struct drm_device *dev,
 	obj = NULL;
 	if (!HAS_LLC(dev))
 		obj = i915_gem_object_create_stolen(dev, ringbuf->size);
-	if (obj == NULL)
+	if (IS_ERR_OR_NULL(obj))
 		obj = i915_gem_alloc_object(dev, ringbuf->size);
-	if (obj == NULL)
-		return -ENOMEM;
+	if (IS_ERR(obj))
+		return PTR_ERR(obj);
 
 	/* mark ring buffers as read-only from GPU side by default */
 	obj->gt_ro = 1;
@@ -2682,7 +2683,7 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 	if (INTEL_INFO(dev)->gen >= 8) {
 		if (i915_semaphore_is_enabled(dev)) {
 			obj = i915_gem_alloc_object(dev, 4096);
-			if (obj == NULL) {
+			if (IS_ERR(obj)) {
 				DRM_ERROR("Failed to allocate semaphore bo. Disabling semaphores\n");
 				i915.semaphores = 0;
 			} else {
@@ -2789,9 +2790,9 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 	/* Workaround batchbuffer to combat CS tlb bug. */
 	if (HAS_BROKEN_CS_TLB(dev)) {
 		obj = i915_gem_alloc_object(dev, I830_WA_SIZE);
-		if (obj == NULL) {
+		if (IS_ERR(obj)) {
 			DRM_ERROR("Failed to allocate batch bo\n");
-			return -ENOMEM;
+			return PTR_ERR(obj);
 		}
 
 		ret = i915_gem_obj_ggtt_pin(obj, 0, 0);
-- 
1.9.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 6/9] drm/i915: Add support for stealing purgable stolen pages
  2015-12-14  5:46 [PATCH v11 0/9] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
                   ` (4 preceding siblings ...)
  2015-12-14  5:46 ` [PATCH 5/9] drm/i915: Propagating correct error codes to the userspace ankitprasad.r.sharma
@ 2015-12-14  5:46 ` ankitprasad.r.sharma
  2015-12-14 10:13   ` Chris Wilson
  2015-12-17 10:51   ` Tvrtko Ursulin
  2015-12-14  5:46 ` [PATCH 7/9] drm/i915: Support for pread/pwrite from/to non shmem backed objects ankitprasad.r.sharma
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 30+ messages in thread
From: ankitprasad.r.sharma @ 2015-12-14  5:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ankitprasad Sharma, akash.goel, shashidhar.hiremath

From: Chris Wilson <chris at chris-wilson.co.uk>

If we run out of stolen memory when trying to allocate an object, see if
we can reap enough purgeable objects to free up enough contiguous free
space for the allocation. This is in principle very much like evicting
objects to free up enough contiguous space in the vma when binding
a new object - and you will be forgiven for thinking that the code looks
very similar.

At the moment, we do not allow userspace to allocate objects in stolen,
so there is neither the memory pressure to trigger stolen eviction nor
any purgeable objects inside the stolen arena. However, this will change
in the near future, and so better management and defragmentation of
stolen memory will become a real issue.

v2: Remember to remove the drm_mm_node.

v3: Rebased to the latest drm-intel-nightly (Ankit)

v4: corrected if-else braces format (Tvrtko/kerneldoc)

v5: Rebased to the latest drm-intel-nightly (Ankit)
Added a separate list to maintain purgeable objects from stolen memory
region (Chris/Daniel)

v6: Compiler optimization (merging 2 single loops into one for() loop),
corrected code for object eviction, retire_requests before starting
object eviction (Chris)

v7: Added kernel doc for i915_gem_object_create_stolen()

v8: Check for struct_mutex lock before creating object from stolen
region (Tvrtko)

v9: Renamed variables to make usage clear, added comment, removed a macro
that was used only once (Tvrtko)

v10: Avoid masking of error when stolen_alloc fails (Tvrtko)

Testcase: igt/gem_stolen

Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c    |   6 +-
 drivers/gpu/drm/i915/i915_drv.h        |  17 +++-
 drivers/gpu/drm/i915/i915_gem.c        |  16 ++++
 drivers/gpu/drm/i915/i915_gem_stolen.c | 170 +++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_pm.c        |   4 +-
 5 files changed, 188 insertions(+), 25 deletions(-)
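
Note for readers (not part of the commit): the candidate ordering described
in the commit message can be modelled in user space. The sketch below is
only an illustration under stated assumptions: struct toy_obj and
toy_pick_victims() are hypothetical names, and the model counts bytes
instead of searching for a contiguous hole, which the patch itself leaves
to the drm_mm scan helpers. It shows the two passes (idle objects first,
then active ones) and the skipping of objects that are not marked
purgeable or are pinned for display.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a stolen-backed object; not a kernel type. */
struct toy_obj {
	uint64_t size;
	bool purgeable;	/* models obj->madv == I915_MADV_DONTNEED */
	bool pinned;	/* models obj->pin_display */
	bool active;	/* models obj->active */
	bool victim;	/* result: selected for eviction */
};

/*
 * Select victims until at least @want bytes are covered. Idle objects are
 * visited in the first pass and active objects in the second, so busy
 * buffers are only chosen when idle ones do not suffice.
 *
 * Simplification (assumption of this sketch): bytes are summed without
 * checking that the freed ranges are contiguous.
 *
 * Returns true if enough bytes were found; otherwise clears the selection.
 */
static bool toy_pick_victims(struct toy_obj *objs, size_t n, uint64_t want)
{
	uint64_t found = 0;
	size_t i;
	int pass;

	assert(objs != NULL || n == 0);

	for (i = 0; i < n; i++)
		objs[i].victim = false;

	for (pass = 0; pass <= 1; pass++) {
		for (i = 0; i < n; i++) {
			struct toy_obj *o = &objs[i];

			/* pass 0 takes idle objects, pass 1 active ones */
			if (o->active != (pass == 1))
				continue;

			/* only purgeable, unpinned objects may be reaped */
			if (!o->purgeable || o->pinned)
				continue;

			o->victim = true;
			found += o->size;
			if (found >= want)
				return true;
		}
	}

	/* Not enough purgeable space: undo the selection. */
	for (i = 0; i < n; i++)
		objs[i].victim = false;
	return false;
}
```

A caller would evict every object whose victim flag is set and then retry
the allocation. In this model a failed selection leaves all objects
untouched.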

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a8721fc..f0aa3d4 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -174,7 +174,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 			seq_puts(m, ")");
 	}
 	if (obj->stolen)
-		seq_printf(m, " (stolen: %08llx)", obj->stolen->start);
+		seq_printf(m, " (stolen: %08llx)", obj->stolen->base.start);
 	if (obj->pin_display || obj->fault_mappable) {
 		char s[3], *t = s;
 		if (obj->pin_display)
@@ -253,9 +253,9 @@ static int obj_rank_by_stolen(void *priv,
 	struct drm_i915_gem_object *b =
 		container_of(B, struct drm_i915_gem_object, obj_exec_link);
 
-	if (a->stolen->start < b->stolen->start)
+	if (a->stolen->base.start < b->stolen->base.start)
 		return -1;
-	if (a->stolen->start > b->stolen->start)
+	if (a->stolen->base.start > b->stolen->base.start)
 		return 1;
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index dcdfb97..479703b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -841,6 +841,12 @@ struct i915_ctx_hang_stats {
 	bool banned;
 };
 
+struct i915_stolen_node {
+	struct drm_mm_node base;
+	struct list_head mm_link;
+	struct drm_i915_gem_object *obj;
+};
+
 /* This must match up with the value previously used for execbuf2.rsvd1. */
 #define DEFAULT_CONTEXT_HANDLE 0
 
@@ -1251,6 +1257,13 @@ struct i915_gem_mm {
 	 */
 	struct list_head unbound_list;
 
+	/**
+	 * List of stolen objects that have been marked as purgeable and
+	 * thus available for reaping if we need more space for a new
+	 * allocation. Ordered by time of marking purgeable.
+	 */
+	struct list_head stolen_list;
+
 	/** Usable portion of the GTT for GEM */
 	unsigned long stolen_base; /* limited to low memory (32-bit) */
 
@@ -2031,7 +2044,7 @@ struct drm_i915_gem_object {
 	struct list_head vma_list;
 
 	/** Stolen memory for this object, instead of being backed by shmem. */
-	struct drm_mm_node *stolen;
+	struct i915_stolen_node *stolen;
 	struct list_head global_list;
 
 	struct list_head ring_list[I915_NUM_RINGS];
@@ -2039,6 +2052,8 @@ struct drm_i915_gem_object {
 	struct list_head obj_exec_link;
 
 	struct list_head batch_pool_link;
+	/** Used during stolen memory allocations to temporarily hold a ref */
+	struct list_head stolen_link;
 
 	/**
 	 * This is set if the object is on the active lists (has pending
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 05505de..8a508cd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4411,6 +4411,20 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	if (obj->madv == I915_MADV_DONTNEED && obj->pages == NULL)
 		i915_gem_object_truncate(obj);
 
+	if (obj->stolen) {
+		switch (obj->madv) {
+		case I915_MADV_WILLNEED:
+			list_del_init(&obj->stolen->mm_link);
+			break;
+		case I915_MADV_DONTNEED:
+			list_move(&obj->stolen->mm_link,
+				  &dev_priv->mm.stolen_list);
+			break;
+		default:
+			break;
+		}
+	}
+
 	args->retained = obj->madv != __I915_MADV_PURGED;
 
 out:
@@ -4431,6 +4445,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&obj->obj_exec_link);
 	INIT_LIST_HEAD(&obj->vma_list);
 	INIT_LIST_HEAD(&obj->batch_pool_link);
+	INIT_LIST_HEAD(&obj->stolen_link);
 
 	obj->ops = ops;
 
@@ -5046,6 +5061,7 @@ i915_gem_load(struct drm_device *dev)
 	INIT_LIST_HEAD(&dev_priv->context_list);
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
+	INIT_LIST_HEAD(&dev_priv->mm.stolen_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
 	for (i = 0; i < I915_NUM_RINGS; i++)
 		init_ring_lists(&dev_priv->ring[i]);
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 366080b9..014d478 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -542,7 +542,8 @@ i915_gem_object_release_stolen(struct drm_i915_gem_object *obj)
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 
 	if (obj->stolen) {
-		i915_gem_stolen_remove_node(dev_priv, obj->stolen);
+		list_del(&obj->stolen->mm_link);
+		i915_gem_stolen_remove_node(dev_priv, &obj->stolen->base);
 		kfree(obj->stolen);
 		obj->stolen = NULL;
 	}
@@ -555,7 +556,7 @@ static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
 
 static struct drm_i915_gem_object *
 _i915_gem_object_create_stolen(struct drm_device *dev,
-			       struct drm_mm_node *stolen)
+			       struct i915_stolen_node *stolen)
 {
 	struct drm_i915_gem_object *obj;
 
@@ -563,11 +564,12 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
 	if (obj == NULL)
 		return ERR_PTR(-ENOMEM);
 
-	drm_gem_private_object_init(dev, &obj->base, stolen->size);
+	drm_gem_private_object_init(dev, &obj->base, stolen->base.size);
 	i915_gem_object_init(obj, &i915_gem_object_stolen_ops);
 
 	obj->pages = i915_pages_create_for_stolen(dev,
-						  stolen->start, stolen->size);
+						  stolen->base.start,
+						  stolen->base.size);
 	if (IS_ERR(obj->pages)) {
 		i915_gem_object_free(obj);
 		return (void*) obj->pages;
@@ -579,24 +581,111 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
 	i915_gem_object_pin_pages(obj);
 	obj->stolen = stolen;
 
+	stolen->obj = obj;
+	INIT_LIST_HEAD(&stolen->mm_link);
+
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
 	obj->cache_level = HAS_LLC(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
 
 	return obj;
 }
 
-struct drm_i915_gem_object *
-i915_gem_object_create_stolen(struct drm_device *dev, u64 size)
+static bool
+mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
+{
+	BUG_ON(obj->stolen == NULL);
+
+	if (obj->madv != I915_MADV_DONTNEED)
+		return false;
+
+	if (obj->pin_display)
+		return false;
+
+	list_add(&obj->stolen_link, unwind);
+	return drm_mm_scan_add_block(&obj->stolen->base);
+}
+
+static int
+stolen_evict(struct drm_i915_private *dev_priv, u64 size)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
-	struct drm_mm_node *stolen;
-	int ret;
+	struct list_head unwind, evict;
+	struct i915_stolen_node *iter;
+	int ret, active;
 
-	if (!drm_mm_initialized(&dev_priv->mm.stolen))
-		return ERR_PTR(-ENODEV);
+	drm_mm_init_scan(&dev_priv->mm.stolen, size, 0, 0);
+	INIT_LIST_HEAD(&unwind);
+
+	/* Retire all requests before creating the evict list */
+	i915_gem_retire_requests(dev_priv->dev);
+
+	for (active = 0; active <= 1; active++) {
+		list_for_each_entry(iter, &dev_priv->mm.stolen_list, mm_link) {
+			if (iter->obj->active != active)
+				continue;
+
+			if (mark_free(iter->obj, &unwind))
+				goto found;
+		}
+	}
+
+found:
+	INIT_LIST_HEAD(&evict);
+	while (!list_empty(&unwind)) {
+		obj = list_first_entry(&unwind,
+				       struct drm_i915_gem_object,
+				       stolen_link);
+		list_del(&obj->stolen_link);
+
+		if (drm_mm_scan_remove_block(&obj->stolen->base)) {
+			list_add(&obj->stolen_link, &evict);
+			drm_gem_object_reference(&obj->base);
+		}
+	}
+
+	ret = 0;
+	while (!list_empty(&evict)) {
+		obj = list_first_entry(&evict,
+				       struct drm_i915_gem_object,
+				       stolen_link);
+		list_del(&obj->stolen_link);
+
+		if (ret == 0) {
+			struct i915_vma *vma, *vma_next;
+
+			list_for_each_entry_safe(vma, vma_next,
+						 &obj->vma_list,
+						 vma_link)
+				if (i915_vma_unbind(vma))
+					break;
+
+			/* Stolen pins its pages to prevent the
+			 * normal shrinker from processing stolen
+			 * objects.
+			 */
+			i915_gem_object_unpin_pages(obj);
+
+			ret = i915_gem_object_put_pages(obj);
+			if (ret == 0) {
+				i915_gem_object_release_stolen(obj);
+				obj->madv = __I915_MADV_PURGED;
+			} else {
+				i915_gem_object_pin_pages(obj);
+			}
+		}
+
+		drm_gem_object_unreference(&obj->base);
+	}
+
+	return ret;
+}
+
+static struct i915_stolen_node *
+stolen_alloc(struct drm_i915_private *dev_priv, u64 size)
+{
+	struct i915_stolen_node *stolen;
+	int ret;
 
-	DRM_DEBUG_KMS("creating stolen object: size=%llx\n", size);
 	if (size == 0)
 		return ERR_PTR(-EINVAL);
 
@@ -604,17 +693,60 @@ i915_gem_object_create_stolen(struct drm_device *dev, u64 size)
 	if (!stolen)
 		return ERR_PTR(-ENOMEM);
 
-	ret = i915_gem_stolen_insert_node(dev_priv, stolen, size, 4096);
+	ret = i915_gem_stolen_insert_node(dev_priv, &stolen->base, size, 4096);
+	if (ret == 0)
+		goto out;
+
+	/* No more stolen memory available, or too fragmented.
+	 * Try evicting purgeable objects and search again.
+	 */
+	ret = stolen_evict(dev_priv, size);
+	if (ret == 0)
+		ret = i915_gem_stolen_insert_node(dev_priv, &stolen->base,
+						  size, 4096);
+out:
 	if (ret) {
 		kfree(stolen);
 		return ERR_PTR(ret);
 	}
 
+	return stolen;
+}
+
+/**
+ * i915_gem_object_create_stolen() - creates object using the stolen memory
+ * @dev:	drm device
+ * @size:	size of the object requested
+ *
+ * i915_gem_object_create_stolen() tries to allocate memory for the object
+ * from the stolen memory region. If not enough memory is found, it tries
+ * evicting purgeable objects and searching again.
+ *
+ * Returns: Object pointer - success and error pointer - failure
+ */
+struct drm_i915_gem_object *
+i915_gem_object_create_stolen(struct drm_device *dev, u64 size)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj;
+	struct i915_stolen_node *stolen;
+
+	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
+
+	if (!drm_mm_initialized(&dev_priv->mm.stolen))
+		return ERR_PTR(-ENODEV);
+
+	DRM_DEBUG_KMS("creating stolen object: size=%llx\n", size);
+
+	stolen = stolen_alloc(dev_priv, size);
+	if (IS_ERR(stolen))
+		return (void*) stolen;
+
 	obj = _i915_gem_object_create_stolen(dev, stolen);
 	if (!IS_ERR(obj))
 		return obj;
 
-	i915_gem_stolen_remove_node(dev_priv, stolen);
+	i915_gem_stolen_remove_node(dev_priv, &stolen->base);
 	kfree(stolen);
 	return obj;
 }
@@ -628,7 +760,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct i915_address_space *ggtt = &dev_priv->gtt.base;
 	struct drm_i915_gem_object *obj;
-	struct drm_mm_node *stolen;
+	struct i915_stolen_node *stolen;
 	struct i915_vma *vma;
 	int ret;
 
@@ -647,10 +779,10 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	if (!stolen)
 		return ERR_PTR(-ENOMEM);
 
-	stolen->start = stolen_offset;
-	stolen->size = size;
+	stolen->base.start = stolen_offset;
+	stolen->base.size = size;
 	mutex_lock(&dev_priv->mm.stolen_lock);
-	ret = drm_mm_reserve_node(&dev_priv->mm.stolen, stolen);
+	ret = drm_mm_reserve_node(&dev_priv->mm.stolen, &stolen->base);
 	mutex_unlock(&dev_priv->mm.stolen_lock);
 	if (ret) {
 		DRM_DEBUG_KMS("failed to allocate stolen space\n");
@@ -661,7 +793,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
 	obj = _i915_gem_object_create_stolen(dev, stolen);
 	if (IS_ERR(obj)) {
 		DRM_DEBUG_KMS("failed to allocate stolen object\n");
-		i915_gem_stolen_remove_node(dev_priv, stolen);
+		i915_gem_stolen_remove_node(dev_priv, &stolen->base);
 		kfree(stolen);
 		return obj;
 	}
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 0afb819..c94b39b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5119,7 +5119,7 @@ static void valleyview_check_pctx(struct drm_i915_private *dev_priv)
 	unsigned long pctx_addr = I915_READ(VLV_PCBR) & ~4095;
 
 	WARN_ON(pctx_addr != dev_priv->mm.stolen_base +
-			     dev_priv->vlv_pctx->stolen->start);
+			     dev_priv->vlv_pctx->stolen->base.start);
 }
 
 
@@ -5194,7 +5194,7 @@ static void valleyview_setup_pctx(struct drm_device *dev)
 		return;
 	}
 
-	pctx_paddr = dev_priv->mm.stolen_base + pctx->stolen->start;
+	pctx_paddr = dev_priv->mm.stolen_base + pctx->stolen->base.start;
 	I915_WRITE(VLV_PCBR, pctx_paddr);
 
 out:
-- 
1.9.1



* [PATCH 7/9] drm/i915: Support for pread/pwrite from/to non shmem backed objects
  2015-12-14  5:46 [PATCH v11 0/9] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
                   ` (5 preceding siblings ...)
  2015-12-14  5:46 ` [PATCH 6/9] drm/i915: Add support for stealing purgable stolen pages ankitprasad.r.sharma
@ 2015-12-14  5:46 ` ankitprasad.r.sharma
  2015-12-14  9:43   ` Chris Wilson
  2015-12-14  5:46 ` [PATCH 8/9] drm/i915: Migrate stolen objects before hibernation ankitprasad.r.sharma
  2015-12-14  5:46 ` [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers ankitprasad.r.sharma
  8 siblings, 1 reply; 30+ messages in thread
From: ankitprasad.r.sharma @ 2015-12-14  5:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ankitprasad Sharma, akash.goel, shashidhar.hiremath

From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>

This patch extends the pread/pwrite functionality to objects not backed
by shmem. Such objects are accessed through the GTT interface. This
covers objects backed by stolen memory as well as other non-shmem backed
objects.

v2: Drop locks around slow_user_access, prefault the pages before
access (Chris)

v3: Rebased to the latest drm-intel-nightly (Ankit)

v4: Moved page base & offset calculations outside the copy loop,
corrected data types for size and offset variables, corrected if-else
braces format (Tvrtko/kerneldocs)

v5: Enabled pread/pwrite for all non-shmem backed objects including
without tiling restrictions (Ankit)

v6: Using pwrite_fast for non-shmem backed objects as well (Chris)

v7: Updated commit message, Renamed i915_gem_gtt_read to i915_gem_gtt_copy,
added pwrite slow path for non-shmem backed objects (Chris/Tvrtko)

v8: Updated v7 commit message, mutex unlock around pwrite slow path for
non-shmem backed objects (Tvrtko)

v9: Corrected check during pread_ioctl, to avoid shmem_pread being
called for non-shmem backed objects (Tvrtko)

Testcase: igt/gem_stolen

Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c | 151 +++++++++++++++++++++++++++++++++-------
 1 file changed, 127 insertions(+), 24 deletions(-)
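
Note for readers (not part of the commit): a copy through a page-sized
mapping window has to be split into per-page chunks. The user-space sketch
below illustrates one way to do that split and is not the code from this
patch. TOY_PAGE_SIZE, struct toy_chunk and toy_split_by_page() are
hypothetical names. The sketch assumes a 4096-byte page and advances the
base by a whole page after each chunk so that the base stays page-aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* One per-page piece of a larger copy; hypothetical, not a kernel type. */
struct toy_chunk {
	uint64_t page_base;		/* page-aligned start of the page */
	unsigned int page_offset;	/* first byte touched in that page */
	unsigned int length;		/* bytes to copy from that page */
};

/*
 * Split the byte range [offset, offset + size) into per-page chunks.
 * At most @max chunks are written to @out; the number written is returned.
 */
static size_t toy_split_by_page(uint64_t offset, uint64_t size,
				struct toy_chunk *out, size_t max)
{
	uint64_t page_base = offset & ~(uint64_t)(TOY_PAGE_SIZE - 1);
	unsigned int page_offset =
		(unsigned int)(offset & (TOY_PAGE_SIZE - 1));
	uint64_t remain = size;
	size_t n = 0;

	assert(out != NULL || max == 0);

	while (remain > 0 && n < max) {
		/* bytes left in the current page, capped by what remains */
		unsigned int length = TOY_PAGE_SIZE - page_offset;

		if (remain < length)
			length = (unsigned int)remain;

		out[n].page_base = page_base;
		out[n].page_offset = page_offset;
		out[n].length = length;
		n++;

		remain -= length;
		page_base += TOY_PAGE_SIZE;	/* next page, still aligned */
		page_offset = 0;		/* later chunks start at 0 */
	}

	return n;
}
```

Only the first chunk can start at a non-zero offset inside its page. Every
later chunk starts at the beginning of the following page.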

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8a508cd..ad61783 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -629,6 +629,99 @@ shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
 	return ret ? - EFAULT : 0;
 }
 
+static inline uint64_t
+slow_user_access(struct io_mapping *mapping,
+		 uint64_t page_base, int page_offset,
+		 char __user *user_data,
+		 int length, bool pwrite)
+{
+	void __iomem *vaddr_inatomic;
+	void *vaddr;
+	uint64_t unwritten;
+
+	vaddr_inatomic = io_mapping_map_wc(mapping, page_base);
+	/* We can use the cpu mem copy function because this is X86. */
+	vaddr = (void __force *)vaddr_inatomic + page_offset;
+	if (pwrite)
+		unwritten = __copy_from_user(vaddr, user_data, length);
+	else
+		unwritten = __copy_to_user(user_data, vaddr, length);
+
+	io_mapping_unmap(vaddr_inatomic);
+	return unwritten;
+}
+
+static int
+i915_gem_gtt_copy(struct drm_device *dev,
+		   struct drm_i915_gem_object *obj, uint64_t size,
+		   uint64_t data_offset, uint64_t data_ptr)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	char __user *user_data;
+	uint64_t remain;
+	uint64_t offset, page_base;
+	int page_offset, page_length, ret = 0;
+
+	ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE);
+	if (ret)
+		goto out;
+
+	ret = i915_gem_object_set_to_gtt_domain(obj, false);
+	if (ret)
+		goto out_unpin;
+
+	ret = i915_gem_object_put_fence(obj);
+	if (ret)
+		goto out_unpin;
+
+	user_data = to_user_ptr(data_ptr);
+	remain = size;
+	offset = i915_gem_obj_ggtt_offset(obj) + data_offset;
+
+	mutex_unlock(&dev->struct_mutex);
+	if (likely(!i915.prefault_disable))
+		ret = fault_in_multipages_writeable(user_data, remain);
+
+	/*
+	 * page_offset = offset within page
+	 * page_base = page offset within aperture
+	 */
+	page_offset = offset_in_page(offset);
+	page_base = offset & PAGE_MASK;
+
+	while (remain > 0) {
+		/* page_length = bytes to copy for this page */
+		page_length = remain;
+		if ((page_offset + remain) > PAGE_SIZE)
+			page_length = PAGE_SIZE - page_offset;
+
+		/* This is a slow read/write as it tries to read from
+		 * and write to user memory, which may result in page
+		 * faults.
+		 */
+		ret = slow_user_access(dev_priv->gtt.mappable, page_base,
+				       page_offset, user_data,
+				       page_length, false);
+
+		if (ret) {
+			ret = -EFAULT;
+			break;
+		}
+
+		remain -= page_length;
+		user_data += page_length;
+		page_base += PAGE_SIZE;
+		page_offset = 0;
+	}
+
+	mutex_lock(&dev->struct_mutex);
+
+out_unpin:
+	i915_gem_object_ggtt_unpin(obj);
+out:
+	return ret;
+}
+
 static int
 i915_gem_shmem_pread(struct drm_device *dev,
 		     struct drm_i915_gem_object *obj,
@@ -752,17 +845,14 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
 		goto out;
 	}
 
-	/* prime objects have no backing filp to GEM pread/pwrite
-	 * pages from.
-	 */
-	if (!obj->base.filp) {
-		ret = -EINVAL;
-		goto out;
-	}
-
 	trace_i915_gem_object_pread(obj, args->offset, args->size);
 
-	ret = i915_gem_shmem_pread(dev, obj, args, file);
+	/* pread for non shmem backed objects */
+	if (!obj->base.filp && obj->tiling_mode == I915_TILING_NONE)
+		ret = i915_gem_gtt_copy(dev, obj, args->size,
+					args->offset, args->data_ptr);
+	else if (obj->base.filp)
+		ret = i915_gem_shmem_pread(dev, obj, args, file);
 
 out:
 	drm_gem_object_unreference(&obj->base);
@@ -804,10 +894,12 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
 			 struct drm_i915_gem_pwrite *args,
 			 struct drm_file *file)
 {
+	struct drm_device *dev = obj->base.dev;
 	struct drm_mm_node node;
 	uint64_t remain, offset;
 	char __user *user_data;
 	int ret;
+	bool faulted = false;
 
 	ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE | PIN_NONBLOCK);
 	if (ret) {
@@ -862,11 +954,29 @@ i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
 		/* If we get a fault while copying data, then (presumably) our
 		 * source page isn't available.  Return the error and we'll
 		 * retry in the slow path.
+		 * If the object is not shmem backed, we retry with the
+		 * path that handles page faults.
 		 */
-		if (fast_user_write(i915->gtt.mappable, page_base,
-				    page_offset, user_data, page_length)) {
-			ret = -EFAULT;
-			goto out_flush;
+		if (faulted || fast_user_write(i915->gtt.mappable,
+						page_base, page_offset,
+						user_data, page_length)) {
+			if (!obj->base.filp) {
+				faulted = true;
+				mutex_unlock(&dev->struct_mutex);
+				if (slow_user_access(i915->gtt.mappable,
+						     page_base,
+						     page_offset, user_data,
+						     page_length, true)) {
+					ret = -EFAULT;
+					mutex_lock(&dev->struct_mutex);
+					goto out_flush;
+				}
+
+				mutex_lock(&dev->struct_mutex);
+			} else {
+				ret = -EFAULT;
+				goto out_flush;
+			}
 		}
 
 		remain -= page_length;
@@ -1132,14 +1242,6 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 		goto out;
 	}
 
-	/* prime objects have no backing filp to GEM pread/pwrite
-	 * pages from.
-	 */
-	if (!obj->base.filp) {
-		ret = -EINVAL;
-		goto out;
-	}
-
 	trace_i915_gem_object_pwrite(obj, args->offset, args->size);
 
 	ret = -EFAULT;
@@ -1150,8 +1252,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 	 * perspective, requiring manual detiling by the client.
 	 */
 	if (obj->tiling_mode == I915_TILING_NONE &&
-	    obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
-	    cpu_write_needs_clflush(obj)) {
+	    (!obj->base.filp ||
+	    (obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
+	    cpu_write_needs_clflush(obj)))) {
 		ret = i915_gem_gtt_pwrite_fast(dev_priv, obj, args, file);
 		/* Note that the gtt paths might fail with non-page-backed user
 		 * pointers (e.g. gtt mappings when moving data between
@@ -1161,7 +1264,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 	if (ret == -EFAULT || ret == -ENOSPC) {
 		if (obj->phys_handle)
 			ret = i915_gem_phys_pwrite(obj, args, file);
-		else
+		else if (obj->base.filp)
 			ret = i915_gem_shmem_pwrite(dev, obj, args, file);
 	}
 
-- 
1.9.1


* [PATCH 8/9] drm/i915: Migrate stolen objects before hibernation
  2015-12-14  5:46 [PATCH v11 0/9] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
                   ` (6 preceding siblings ...)
  2015-12-14  5:46 ` [PATCH 7/9] drm/i915: Support for pread/pwrite from/to non shmem backed objects ankitprasad.r.sharma
@ 2015-12-14  5:46 ` ankitprasad.r.sharma
  2015-12-14 10:31   ` Chris Wilson
  2015-12-14  5:46 ` [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers ankitprasad.r.sharma
  8 siblings, 1 reply; 30+ messages in thread
From: ankitprasad.r.sharma @ 2015-12-14  5:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ankitprasad Sharma, akash.goel, shashidhar.hiremath

From: Chris Wilson <chris@chris-wilson.co.uk>

Ville reminded us that stolen memory is not preserved across
hibernation. As a result, context objects now being allocated from
stolen were being corrupted on S4 and promptly hanging the GPU on
resume.

We want to utilise stolen for as much as possible (nothing else will use
that wasted memory otherwise), so we need a strategy for handling
general objects allocated from stolen and hibernation. A simple solution
is to do a CPU copy through the GTT of the stolen object into a fresh
shmemfs backing store and thenceforth treat it as a normal object. This
can be refined in future to use a GPU copy to avoid the slow uncached
reads (though it's hibernation!) and to recreate stolen objects upon
resume/first-use. For now, a simple approach should suffice for
testing the object migration.
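
As an illustration only, the page-by-page copy through a single mapping slot can be modelled in userspace as follows. This is a sketch under stated assumptions: struct sketch_window, sketch_window_map and sketch_migrate are hypothetical names, ordinary memory stands in for both the stolen pages and the shmemfs pages, and the real patch additionally needs PTE updates and write barriers that are not modelled here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/*
 * Hypothetical stand-in for one reserved slot in the mappable aperture:
 * it points at whichever source page is currently mapped through it.
 */
struct sketch_window {
	const unsigned char *mapped;
};

/* Model of rebinding the single slot to a different source page. */
static void sketch_window_map(struct sketch_window *w,
			      const unsigned char *src_page)
{
	w->mapped = src_page;
}

/*
 * Copy 'size' bytes from the source ("stolen") buffer into the
 * destination ("shmemfs") buffer one page at a time, always reading
 * through the same window. Returns 0 on success, -EINVAL if size is
 * not a multiple of the page size.
 */
static int sketch_migrate(const unsigned char *stolen,
			  unsigned char *shmem, size_t size)
{
	struct sketch_window w = { NULL };
	size_t i;

	if (size % SKETCH_PAGE_SIZE)
		return -EINVAL;

	for (i = 0; i < size / SKETCH_PAGE_SIZE; i++) {
		/* Point the one window at source page i, then copy it out. */
		sketch_window_map(&w, stolen + i * SKETCH_PAGE_SIZE);
		assert(w.mapped);
		memcpy(shmem + i * SKETCH_PAGE_SIZE, w.mapped,
		       SKETCH_PAGE_SIZE);
	}

	w.mapped = NULL; /* drop the mapping once the copy is done */
	return 0;
}
```

The point of the single window is that the copy needs only one page of mappable aperture regardless of object size.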

v2:
Swap PTE for pinned bindings over to the shmemfs. This adds a
complicated dance, but is required as many stolen objects are likely to
be pinned for use by the hardware. Swapping the PTEs should not result
in externally visible behaviour, as each PTE update should be atomic and
the two pages identical. (danvet)

Be safe by default, following the principle of least surprise. We need
a new flag to mark objects that we can wilfully discard and recreate
across hibernation. (danvet)

Just use the global_list rather than invent a new stolen_list. This is
the slowpath hibernate and so adding a new list and the associated
complexity isn't worth it.

v3: Rebased on drm-intel-nightly (Ankit)

v4: Use insert_page to map stolen memory backed pages for migration to
shmem (Chris)

v5: Acquire mutex lock while copying stolen buffer objects to shmem (Chris)

v6: Handled file leak, split the object migration function, added
kerneldoc for the migrate_stolen_to_shmemfs() function (Tvrtko)
Use i915 wrapper function for drm_mm_insert_node_in_range()
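
The selection rule described in the v2 notes (migrate stolen objects unless they are flagged as discardable) can be sketched in isolation. This is a simplified userspace model, not the driver code: struct sketch_obj, sketch_needs_migration and sketch_freeze are hypothetical names, and a flat array stands in for the bound and unbound lists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced object state relevant to the freeze pass. */
struct sketch_obj {
	bool stolen;            /* backed by stolen memory */
	bool internal_volatile; /* contents may be discarded on hibernate */
	bool migrated;          /* set once moved to normal memory */
};

/* Only stolen objects whose contents must survive need migrating. */
static bool sketch_needs_migration(const struct sketch_obj *obj)
{
	return obj->stolen && !obj->internal_volatile;
}

/* Walk all objects and migrate those that qualify; returns the count. */
static size_t sketch_freeze(struct sketch_obj *objs, size_t count)
{
	size_t i, migrated = 0;

	assert(objs || count == 0);
	for (i = 0; i < count; i++) {
		if (!sketch_needs_migration(&objs[i]))
			continue;
		objs[i].migrated = true;
		migrated++;
	}
	return migrated;
}
```

Defaulting internal_volatile to false is what makes the scheme safe by default: an object is preserved unless its creator explicitly opts out.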

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c         |  17 ++-
 drivers/gpu/drm/i915/i915_drv.h         |   7 +
 drivers/gpu/drm/i915/i915_gem.c         | 243 ++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/intel_display.c    |   3 +
 drivers/gpu/drm/i915/intel_fbdev.c      |   6 +
 drivers/gpu/drm/i915/intel_pm.c         |   2 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |   6 +
 7 files changed, 272 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index e6935f1..8f675ae7 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -979,6 +979,21 @@ static int i915_pm_suspend(struct device *dev)
 	return i915_drm_suspend(drm_dev);
 }
 
+static int i915_pm_freeze(struct device *dev)
+{
+	int ret;
+
+	ret = i915_gem_freeze(pci_get_drvdata(to_pci_dev(dev)));
+	if (ret)
+		return ret;
+
+	ret = i915_pm_suspend(dev);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
 static int i915_pm_suspend_late(struct device *dev)
 {
 	struct drm_device *drm_dev = dev_to_i915(dev)->dev;
@@ -1607,7 +1622,7 @@ static const struct dev_pm_ops i915_pm_ops = {
 	 * @restore, @restore_early : called after rebooting and restoring the
 	 *                            hibernation image [PMSG_RESTORE]
 	 */
-	.freeze = i915_pm_suspend,
+	.freeze = i915_pm_freeze,
 	.freeze_late = i915_pm_suspend_late,
 	.thaw_early = i915_pm_resume_early,
 	.thaw = i915_pm_resume,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 479703b..b874292 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2079,6 +2079,12 @@ struct drm_i915_gem_object {
 	 * Advice: are the backing pages purgeable?
 	 */
 	unsigned int madv:2;
+	/**
+	 * Whereas madv is for userspace, there are certain situations
+	 * where we want I915_MADV_DONTNEED behaviour on internal objects
+	 * without conflating the userspace setting.
+	 */
+	unsigned int internal_volatile:1;
 
 	/**
 	 * Current tiling mode for the object.
@@ -3047,6 +3053,7 @@ int i915_gem_l3_remap(struct drm_i915_gem_request *req, int slice);
 void i915_gem_init_swizzling(struct drm_device *dev);
 void i915_gem_cleanup_ringbuffer(struct drm_device *dev);
 int __must_check i915_gpu_idle(struct drm_device *dev);
+int __must_check i915_gem_freeze(struct drm_device *dev);
 int __must_check i915_gem_suspend(struct drm_device *dev);
 void __i915_add_request(struct drm_i915_gem_request *req,
 			struct drm_i915_gem_object *batch_obj,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ad61783..ae3729f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4563,12 +4563,27 @@ static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
 	.put_pages = i915_gem_object_put_pages_gtt,
 };
 
+static struct address_space *
+i915_gem_set_inode_gfp(struct drm_device *dev, struct file *file)
+{
+	struct address_space *mapping = file_inode(file)->i_mapping;
+	gfp_t mask;
+
+	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
+	if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
+		/* 965gm cannot relocate objects above 4GiB. */
+		mask &= ~__GFP_HIGHMEM;
+		mask |= __GFP_DMA32;
+	}
+	mapping_set_gfp_mask(mapping, mask);
+
+	return mapping;
+}
+
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size)
 {
 	struct drm_i915_gem_object *obj;
-	struct address_space *mapping;
-	gfp_t mask;
 	int ret;
 
 	obj = i915_gem_object_alloc(dev);
@@ -4581,15 +4596,7 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 		return ERR_PTR(ret);
 	}
 
-	mask = GFP_HIGHUSER | __GFP_RECLAIMABLE;
-	if (IS_CRESTLINE(dev) || IS_BROADWATER(dev)) {
-		/* 965gm cannot relocate objects above 4GiB. */
-		mask &= ~__GFP_HIGHMEM;
-		mask |= __GFP_DMA32;
-	}
-
-	mapping = file_inode(obj->base.filp)->i_mapping;
-	mapping_set_gfp_mask(mapping, mask);
+	i915_gem_set_inode_gfp(dev, obj->base.filp);
 
 	i915_gem_object_init(obj, &i915_gem_object_ops);
 
@@ -4764,6 +4771,220 @@ i915_gem_stop_ringbuffers(struct drm_device *dev)
 		dev_priv->gt.stop_ring(ring);
 }
 
+static int
+copy_content(struct drm_i915_gem_object *obj,
+		struct drm_i915_private *i915,
+		struct address_space *mapping)
+{
+	struct drm_mm_node node;
+	int ret, i;
+
+	/* stolen objects are already pinned to prevent shrinkage */
+	memset(&node, 0, sizeof(node));
+	ret = i915_gem_insert_node_in_range(i915, &node, 4096, 0,
+					    0, i915->gtt.mappable_end);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
+		struct page *page;
+		void __iomem *src;
+		void *dst;
+
+		wmb();
+		i915->gtt.base.insert_page(&i915->gtt.base,
+					   i915_gem_object_get_dma_address(obj, i),
+					   node.start,
+					   I915_CACHE_NONE,
+					   0);
+		wmb();
+
+		page = shmem_read_mapping_page(mapping, i);
+		if (IS_ERR(page)) {
+			ret = PTR_ERR(page);
+			break;
+		}
+
+		src = io_mapping_map_atomic_wc(i915->gtt.mappable, node.start);
+		dst = kmap_atomic(page);
+		memcpy_fromio(dst, src, PAGE_SIZE);
+		kunmap_atomic(dst);
+		io_mapping_unmap_atomic(src);
+
+		page_cache_release(page);
+	}
+
+	wmb();
+	i915->gtt.base.clear_range(&i915->gtt.base,
+				   node.start, node.size,
+				   true);
+	drm_mm_remove_node(&node);
+	return ret;
+}
+
+/**
+ * i915_gem_object_migrate_stolen_to_shmemfs() - migrates a stolen backed
+ * object to shmemfs
+ * @obj: stolen backed object to be migrated
+ *
+ * Returns: 0 on successful migration, errno on failure
+ */
+
+static int
+i915_gem_object_migrate_stolen_to_shmemfs(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
+	struct i915_vma *vma, *vn;
+	struct file *file;
+	struct address_space *mapping;
+	struct sg_table *stolen_pages, *shmemfs_pages;
+	int ret;
+
+	if (WARN_ON_ONCE(i915_gem_object_needs_bit17_swizzle(obj)))
+		return -EINVAL;
+
+	ret = i915_gem_object_set_to_gtt_domain(obj, false);
+	if (ret)
+		return ret;
+
+	file = shmem_file_setup("drm mm object", obj->base.size, VM_NORESERVE);
+	if (IS_ERR(file))
+		return PTR_ERR(file);
+	mapping = i915_gem_set_inode_gfp(obj->base.dev, file);
+
+	list_for_each_entry_safe(vma, vn, &obj->vma_list, vma_link)
+		if (i915_vma_unbind(vma))
+			continue;
+
+	if (obj->madv != I915_MADV_WILLNEED && list_empty(&obj->vma_list)) {
+		/* Discard the stolen reservation, and replace with
+		 * an unpopulated shmemfs object.
+		 */
+		obj->madv = __I915_MADV_PURGED;
+		goto swap_pages;
+	}
+
+	ret = copy_content(obj, i915, mapping);
+	if (ret)
+		goto err_file;
+
+swap_pages:
+	stolen_pages = obj->pages;
+	obj->pages = NULL;
+
+	obj->base.filp = file;
+	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
+	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+
+	/* Recreate any pinned binding with pointers to the new storage */
+	if (!list_empty(&obj->vma_list)) {
+		ret = i915_gem_object_get_pages_gtt(obj);
+		if (ret) {
+			obj->pages = stolen_pages;
+			goto err_file;
+		}
+
+		ret = i915_gem_object_set_to_gtt_domain(obj, true);
+		if (ret) {
+			i915_gem_object_put_pages_gtt(obj);
+			obj->pages = stolen_pages;
+			goto err_file;
+		}
+
+		obj->get_page.sg = obj->pages->sgl;
+		obj->get_page.last = 0;
+
+		list_for_each_entry(vma, &obj->vma_list, vma_link) {
+			if (!drm_mm_node_allocated(&vma->node))
+				continue;
+
+			WARN_ON(i915_vma_bind(vma,
+					      obj->cache_level,
+					      PIN_UPDATE));
+		}
+	} else
+		list_del(&obj->global_list);
+
+	/* drop the stolen pin and backing */
+	shmemfs_pages = obj->pages;
+	obj->pages = stolen_pages;
+
+	i915_gem_object_unpin_pages(obj);
+	obj->ops->put_pages(obj);
+	if (obj->ops->release)
+		obj->ops->release(obj);
+
+	obj->ops = &i915_gem_object_ops;
+	obj->pages = shmemfs_pages;
+
+	return 0;
+
+err_file:
+	fput(file);
+	obj->base.filp = NULL;
+	return ret;
+}
+
+int
+i915_gem_freeze(struct drm_device *dev)
+{
+	/* Called before i915_gem_suspend() when hibernating */
+	struct drm_i915_private *i915 = to_i915(dev);
+	struct drm_i915_gem_object *obj, *tmp;
+	struct list_head *phase[] = {
+		&i915->mm.unbound_list, &i915->mm.bound_list, NULL
+	}, **p;
+	int ret;
+
+	ret = i915_mutex_lock_interruptible(dev);
+	if (ret)
+		return ret;
+	/* Across hibernation, the stolen area is not preserved.
+	 * Anything inside stolen must be copied back to normal
+	 * memory if we wish to preserve it.
+	 */
+	for (p = phase; *p; p++) {
+		struct list_head migrate;
+		int ret;
+
+		INIT_LIST_HEAD(&migrate);
+		list_for_each_entry_safe(obj, tmp, *p, global_list) {
+			if (obj->stolen == NULL)
+				continue;
+
+			if (obj->internal_volatile)
+				continue;
+
+			/* In the general case, this object may only be alive
+			 * due to an active reference, and that may disappear
+			 * when we unbind any of the objects (and so wait upon
+			 * the GPU and retire requests). To prevent one of the
+			 * objects from disappearing beneath us, we need to
+			 * take a reference to each as we build the migration
+			 * list.
+			 *
+			 * This is similar to the strategy required whilst
+			 * shrinking or evicting objects (for the same reason).
+			 */
+			drm_gem_object_reference(&obj->base);
+			list_move(&obj->global_list, &migrate);
+		}
+
+		ret = 0;
+		list_for_each_entry_safe(obj, tmp, &migrate, global_list) {
+			if (ret == 0)
+				ret = i915_gem_object_migrate_stolen_to_shmemfs(obj);
+			drm_gem_object_unreference(&obj->base);
+		}
+		list_splice(&migrate, *p);
+		if (ret)
+			break;
+	}
+
+	mutex_unlock(&dev->struct_mutex);
+	return ret;
+}
+
 int
 i915_gem_suspend(struct drm_device *dev)
 {
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 006d43a..ca2cd44 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2546,6 +2546,9 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
 	if (IS_ERR(obj))
 		return false;
 
+	/* Not to be preserved across hibernation */
+	obj->internal_volatile = true;
+
 	obj->tiling_mode = plane_config->tiling;
 	if (obj->tiling_mode == I915_TILING_X)
 		obj->stride = fb->pitches[0];
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index b2f134a..e162249 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -156,6 +156,12 @@ static int intelfb_alloc(struct drm_fb_helper *helper,
 		goto out;
 	}
 
+	/* Discard the contents of the BIOS fb across hibernation.
+	 * We really want to completely throw away the earlier fbdev
+	 * and reconfigure it anyway.
+	 */
+	obj->internal_volatile = true;
+
 	fb = __intel_framebuffer_create(dev, &mode_cmd, obj);
 	if (IS_ERR(fb)) {
 		drm_gem_object_unreference(&obj->base);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c94b39b..3ffd181 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -5198,6 +5198,8 @@ static void valleyview_setup_pctx(struct drm_device *dev)
 	I915_WRITE(VLV_PCBR, pctx_paddr);
 
 out:
+	/* The power context need not be preserved across hibernation */
+	pctx->internal_volatile = true;
 	DRM_DEBUG_DRIVER("PCBR: 0x%08x\n", I915_READ(VLV_PCBR));
 	dev_priv->vlv_pctx = pctx;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 56d8375..412212e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2094,6 +2094,12 @@ static int intel_alloc_ringbuffer_obj(struct drm_device *dev,
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
+	/* Ringbuffer objects are by definition volatile - only the commands
+	 * between HEAD and TAIL need to be preserved and whilst there are
+	 * any commands there, the ringbuffer is pinned by activity.
+	 */
+	obj->internal_volatile = true;
+
 	/* mark ring buffers as read-only from GPU side by default */
 	obj->gt_ro = 1;
 
-- 
1.9.1


* [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers
  2015-12-14  5:46 [PATCH v11 0/9] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
                   ` (7 preceding siblings ...)
  2015-12-14  5:46 ` [PATCH 8/9] drm/i915: Migrate stolen objects before hibernation ankitprasad.r.sharma
@ 2015-12-14  5:46 ` ankitprasad.r.sharma
  2015-12-14  9:44   ` Chris Wilson
  2015-12-15 14:41   ` Dave Gordon
  8 siblings, 2 replies; 30+ messages in thread
From: ankitprasad.r.sharma @ 2015-12-14  5:46 UTC (permalink / raw)
  To: intel-gfx; +Cc: Ankitprasad Sharma, akash.goel, shashidhar.hiremath

From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>

Using stolen backed objects as a batchbuffer may result in a kernel
panic during relocation. Add a check to prevent the panic and fail
the execbuffer call. It is not recommended to use a stolen object as
a batchbuffer.

Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 48ec484..d342f10 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -462,7 +462,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 	if (obj->active && pagefault_disabled())
 		return -EFAULT;
 
-	if (use_cpu_reloc(obj))
+	if (obj->stolen)
+		ret = -EINVAL;
+	else if (use_cpu_reloc(obj))
 		ret = relocate_entry_cpu(obj, reloc, target_offset);
 	else if (obj->map_and_fenceable)
 		ret = relocate_entry_gtt(obj, reloc, target_offset);
-- 
1.9.1


* Re: [PATCH 7/9] drm/i915: Support for pread/pwrite from/to non shmem backed objects
  2015-12-14  5:46 ` [PATCH 7/9] drm/i915: Support for pread/pwrite from/to non shmem backed objects ankitprasad.r.sharma
@ 2015-12-14  9:43   ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2015-12-14  9:43 UTC (permalink / raw)
  To: ankitprasad.r.sharma; +Cc: intel-gfx, akash.goel, shashidhar.hiremath

On Mon, Dec 14, 2015 at 11:16:09AM +0530, ankitprasad.r.sharma@intel.com wrote:
> @@ -1150,8 +1252,9 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
>  	 * perspective, requiring manual detiling by the client.
>  	 */
>  	if (obj->tiling_mode == I915_TILING_NONE &&
> -	    obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
> -	    cpu_write_needs_clflush(obj)) {
> +	    (!obj->base.filp ||
> +	    (obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
> +	    cpu_write_needs_clflush(obj)))) {

This is too confusing. Move the write_domain check to needs_clflush
ala:

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 8edc56a34caa..fd3c73c8ab77 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -49,6 +49,9 @@ static bool cpu_cache_is_coherent(struct drm_device *dev,
 
 static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
 {
+       if (obj->base.write_domain == I915_GEM_DOMAIN_CPU)
+               return false;
+
        if (!cpu_cache_is_coherent(obj->base.dev, obj->cache_level))
                return true;
 
@@ -1073,7 +1076,6 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
         * perspective, requiring manual detiling by the client.
         */
        if (obj->tiling_mode == I915_TILING_NONE &&
-           obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
            cpu_write_needs_clflush(obj)) {
                ret = i915_gem_gtt_pwrite_fast(dev, obj, args, file);
                /* Note that the gtt paths might fail with non-page-backed user
@@ -3159,9 +3161,7 @@ out:
         * object is now coherent at its new cache level (with respect
         * to the access domain).
         */
-       if (obj->cache_dirty &&
-           obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
-           cpu_write_needs_clflush(obj)) {
+       if (obj->cache_dirty && cpu_write_needs_clflush(obj)) {
                if (i915_gem_clflush_object(obj, true))
                        i915_gem_chipset_flush(obj->base.dev);
        }


and the negative (tiling mode) test to i915_gem_gtt_pwrite_fast.

Then it reads as

if (obj->base.filp == NULL || cpu_write_needs_clflush(obj))
	ret = i915_gem_gtt_pwrite_fast(dev_priv, obj, args, file);

>  		/* Note that the gtt paths might fail with non-page-backed user
>  		 * pointers (e.g. gtt mappings when moving data between
> @@ -1161,7 +1264,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
>  	if (ret == -EFAULT || ret == -ENOSPC) {
>  		if (obj->phys_handle)
>  			ret = i915_gem_phys_pwrite(obj, args, file);
> -		else
> +		else if (obj->base.filp)
>  			ret = i915_gem_shmem_pwrite(dev, obj, args, file);
Forgot
		else
			ret = -ENODEV;
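
Putting the two suggestions above together, the resulting dispatch can be modelled in userspace roughly as below. This is only a sketch under stated assumptions: struct sketch_obj reduces the object to a few booleans, the sketch_* functions and the 'fault' parameter (which simulates a fault in the GTT fast path) are hypothetical, and the real paths perform actual copies rather than just reporting which path was taken.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the object state consulted here. */
struct sketch_obj {
	bool has_filp;         /* shmem backed (obj->base.filp != NULL) */
	bool has_phys_handle;  /* obj->phys_handle != NULL */
	bool tiled;            /* tiling_mode != I915_TILING_NONE */
	bool write_domain_cpu; /* write_domain == I915_GEM_DOMAIN_CPU */
	bool cache_coherent;   /* CPU cache coherent at this cache level */
	bool pin_display;      /* pinned for scanout */
};

enum sketch_path { SKETCH_NONE, SKETCH_GTT, SKETCH_PHYS, SKETCH_SHMEM };

/* write_domain check folded into the helper, as suggested above. */
static bool sketch_write_needs_clflush(const struct sketch_obj *obj)
{
	if (obj->write_domain_cpu)
		return false;
	if (!obj->cache_coherent)
		return true;
	return obj->pin_display;
}

/* Tiling test moved into the fast path; 'fault' simulates a copy fault. */
static int sketch_gtt_pwrite_fast(const struct sketch_obj *obj, bool fault)
{
	if (obj->tiled)
		return -EFAULT;
	return fault ? -EFAULT : 0;
}

static int sketch_pwrite(const struct sketch_obj *obj, bool fault,
			 enum sketch_path *path)
{
	int ret = -EFAULT;

	assert(obj && path);
	*path = SKETCH_NONE;

	if (!obj->has_filp || sketch_write_needs_clflush(obj)) {
		ret = sketch_gtt_pwrite_fast(obj, fault);
		if (ret == 0)
			*path = SKETCH_GTT;
	}

	if (ret == -EFAULT || ret == -ENOSPC) {
		if (obj->has_phys_handle) {
			*path = SKETCH_PHYS;
			ret = 0;
		} else if (obj->has_filp) {
			*path = SKETCH_SHMEM;
			ret = 0;
		} else {
			/* no fallback exists for non-shmem objects */
			ret = -ENODEV;
		}
	}
	return ret;
}
```

With the final else branch in place, an object without a backing filp whose GTT write faults gets a distinct error instead of a bare -EFAULT with no fallback attempted.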

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers
  2015-12-14  5:46 ` [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers ankitprasad.r.sharma
@ 2015-12-14  9:44   ` Chris Wilson
  2015-12-15 14:41   ` Dave Gordon
  1 sibling, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2015-12-14  9:44 UTC (permalink / raw)
  To: ankitprasad.r.sharma; +Cc: intel-gfx, akash.goel, shashidhar.hiremath

On Mon, Dec 14, 2015 at 11:16:11AM +0530, ankitprasad.r.sharma@intel.com wrote:
> From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> 
> Using stolen backed objects as a batchbuffer may result into a kernel
> panic during relocation. Added a check to prevent the panic and fail
> the execbuffer call. It is not recommended to use stolen object as
> a batchbuffer.

Nope. Let's fix it properly.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 3/9] drm/i915: Clearing buffer objects via CPU/GTT
  2015-12-14  5:46 ` [PATCH 3/9] drm/i915: Clearing buffer objects via CPU/GTT ankitprasad.r.sharma
@ 2015-12-14  9:48   ` Chris Wilson
  2015-12-17 10:27   ` Tvrtko Ursulin
  1 sibling, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2015-12-14  9:48 UTC (permalink / raw)
  To: ankitprasad.r.sharma; +Cc: intel-gfx, akash.goel, shashidhar.hiremath

On Mon, Dec 14, 2015 at 11:16:05AM +0530, ankitprasad.r.sharma@intel.com wrote:
> From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> 
> This patch adds support for clearing buffer objects via CPU/GTT. This
> is particularly useful for clearing out the non shmem backed objects.
> Currently intend to use this only for buffers allocated from stolen
> region.
> 
> v2: Added kernel doc for i915_gem_clear_object(), corrected/removed
> variable assignments (Tvrtko)
> 
> v3: Map object page by page to the gtt if the pinning of the whole object
> to the ggtt fails, Corrected function name (Chris)
> 
> v4: Clear the buffer page by page, and not map the whole object in the gtt
> aperture. Use i915 wrapper function in place of drm_mm_insert_node_in_range.
> 
> Testcase: igt/gem_stolen
> 
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h |  1 +
>  drivers/gpu/drm/i915/i915_gem.c | 44 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 45 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a10b866..e195fee 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2897,6 +2897,7 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
>  				    int *needs_clflush);
>  
>  int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
> +int i915_gem_object_clear(struct drm_i915_gem_object *obj);
>  
>  static inline int __sg_page_count(struct scatterlist *sg)
>  {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 46c1e75..e50a91b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5293,3 +5293,47 @@ fail:
>  	drm_gem_object_unreference(&obj->base);
>  	return ERR_PTR(ret);
>  }
> +
> +/**
> + * i915_gem_object_clear() - Clear buffer object via CPU/GTT
> + * @obj: Buffer object to be cleared
> + *
> + * Return: 0 - success, non-zero - failure
> + */
> +int i915_gem_object_clear(struct drm_i915_gem_object *obj)
> +{
> +	int ret, i;
> +	char __iomem *base;
> +	size_t size = obj->base.size;
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +	struct drm_mm_node node;
> +
> +	lockdep_assert_held(&obj->base.dev->struct_mutex);
> +	memset(&node, 0, sizeof(node));
> +	ret = i915_gem_insert_node_in_range(i915, &node, 4096, 0,
> +					    0, i915->gtt.mappable_end);

I do not like this wrapper because you have no idea what it is meant to
do.

> +	if (ret)
> +		goto out;
> +
> +	i915_gem_object_pin_pages(obj);
> +	base = io_mapping_map_wc(i915->gtt.mappable, node.start);
> +	for (i = 0; i < size/PAGE_SIZE; i++) {
> +		wmb();
> +		i915->gtt.base.insert_page(&i915->gtt.base,
> +					   i915_gem_object_get_dma_address(obj, i),
> +					   node.start,
> +					   I915_CACHE_NONE, 0);
> +		wmb();
> +		memset_io(base, 0, 4096);

The barriers can just be written as:

for (;;) {
	insert_page();
	wmb();
	memset_io()
	wmb();
}
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast
  2015-12-14  5:46 ` [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast ankitprasad.r.sharma
@ 2015-12-14  9:54   ` Chris Wilson
  2015-12-14 10:48     ` Chris Wilson
  2015-12-17 10:45   ` Tvrtko Ursulin
  1 sibling, 1 reply; 30+ messages in thread
From: Chris Wilson @ 2015-12-14  9:54 UTC (permalink / raw)
  To: ankitprasad.r.sharma; +Cc: intel-gfx, akash.goel, shashidhar.hiremath

On Mon, Dec 14, 2015 at 11:16:04AM +0530, ankitprasad.r.sharma@intel.com wrote:
> From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> 
> In pwrite_fast, map an object page by page if obj_ggtt_pin fails. First,
> we try a nonblocking pin for the whole object (since that is fastest if
> reused), then failing that we try to grab one page in the mappable
> aperture. It also allows us to handle objects larger than the mappable
> aperture (e.g. if we need to pwrite with vGPU restricting the aperture
> to a measely 8MiB or something like that).
> 
> v2: Pin pages before starting pwrite, Combined duplicate loops (Chris)
> 
> v3: Combined loops based on local patch by Chris (Chris)
> 
> v4: Added i915 wrapper function for drm_mm_insert_node_in_range (Chris)
> 
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem.c | 86 ++++++++++++++++++++++++++++++-----------
>  1 file changed, 64 insertions(+), 22 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index bf7f203..46c1e75 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -61,6 +61,21 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
>  	return obj->pin_display;
>  }
>  
> +static int
> +i915_gem_insert_node_in_range(struct drm_i915_private *i915,
> +			      struct drm_mm_node *node, u64 size,
> +			      unsigned alignment, u64 start, u64 end)
> +{
> +	int ret;
> +
> +	ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm, node,
> +						  size, alignment, 0, start,
> +						  end, DRM_MM_SEARCH_DEFAULT,
> +						  DRM_MM_SEARCH_DEFAULT);
> +
> +	return ret;
> +}

No. It encodes a very bad assumption (i915->gtt) that is not made clear
in any way.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 4/9] drm/i915: Support for creating Stolen memory backed objects
  2015-12-14  5:46 ` [PATCH 4/9] drm/i915: Support for creating Stolen memory backed objects ankitprasad.r.sharma
@ 2015-12-14 10:05   ` Chris Wilson
  2015-12-15  6:10     ` Ankitprasad Sharma
  0 siblings, 1 reply; 30+ messages in thread
From: Chris Wilson @ 2015-12-14 10:05 UTC (permalink / raw)
  To: ankitprasad.r.sharma; +Cc: intel-gfx, akash.goel, shashidhar.hiremath

> diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> index d727b49..ebce8c9 100644
> --- a/include/uapi/drm/i915_drm.h
> +++ b/include/uapi/drm/i915_drm.h
> @@ -357,6 +357,7 @@ typedef struct drm_i915_irq_wait {
>  #define I915_PARAM_HAS_GPU_RESET	 35
>  #define I915_PARAM_HAS_RESOURCE_STREAMER 36
>  #define I915_PARAM_HAS_EXEC_SOFTPIN	 37
> +#define I915_PARAM_CREATE_VERSION	 38
>  
>  typedef struct drm_i915_getparam {
>  	__s32 param;
> @@ -456,6 +457,21 @@ struct drm_i915_gem_create {
>  	 */
>  	__u32 handle;
>  	__u32 pad;
> +	/**
> +	 * Requested flags (currently used for placement
> +	 * (which memory domain))
> +	 *
> +	 * You can request that the object be created from special memory
> +	 * rather than regular system pages using this parameter. Such
> +	 * irregular objects may have certain restrictions (such as CPU
> +	 * access to a stolen object is verboten).
> +	 *
> +	 * This can be used in the future for other purposes too
> +	 * e.g. specifying tiling/caching/madvise
> +	 */
> +	__u32 flags;
> +#define I915_CREATE_PLACEMENT_STOLEN 	(1<<0) /* Cannot use CPU mmaps */
> +#define __I915_CREATE_UNKNOWN_FLAGS	-(I915_CREATE_PLACEMENT_STOLEN << 1)

Alignment. sizeof(drm_i915_gem_create) must be aligned to u64 since we
contain u64 (to keep ABI compat for 32bit).
-Chris
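
A rough userspace sketch of the alignment concern above, assuming the existing
struct starts with a u64 size followed by the u32 handle and pad shown in the
diff. The struct names are hypothetical stand-ins for drm_i915_gem_create, and
the packed/aligned(4) variant only emulates the i386 rule that a u64 needs
4-byte alignment; it relies on a GCC/Clang attribute.

```c
#include <stdint.h>
#include <stdio.h>

/* Layout as posted: one u64 followed by three u32 fields. */
struct create_as_posted {
	uint64_t size;
	uint32_t handle;
	uint32_t pad;
	uint32_t flags;
};

/* Same fields, emulating the i386 ABI where u64 is only 4-byte aligned. */
struct create_as_posted_i386 {
	uint64_t size;
	uint32_t handle;
	uint32_t pad;
	uint32_t flags;
} __attribute__((packed, aligned(4)));

/* One possible fix: widen flags so the size is a multiple of 8 bytes. */
struct create_fixed {
	uint64_t size;
	uint32_t handle;
	uint32_t pad;
	uint64_t flags;
};

/* The fixed layout has no tail padding for the compiler to disagree on. */
_Static_assert(sizeof(struct create_fixed) % sizeof(uint64_t) == 0,
	       "ioctl struct size must be a multiple of u64");

/* Print the three sizes so the 32-bit/64-bit mismatch is visible. */
void print_sizes(void)
{
	printf("as posted (native): %zu\n", sizeof(struct create_as_posted));
	printf("as posted (i386):   %zu\n", sizeof(struct create_as_posted_i386));
	printf("fixed:              %zu\n", sizeof(struct create_fixed));
}
```

With the posted layout a 64-bit kernel pads the struct out to the next 8-byte
boundary while a 32-bit userspace does not, so the two sides would disagree on
the ioctl argument size. Either widening flags or adding an explicit trailing
pad avoids that.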

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 5/9] drm/i915: Propagating correct error codes to the userspace
  2015-12-14  5:46 ` [PATCH 5/9] drm/i915: Propagating correct error codes to the userspace ankitprasad.r.sharma
@ 2015-12-14 10:10   ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2015-12-14 10:10 UTC (permalink / raw)
  To: ankitprasad.r.sharma; +Cc: intel-gfx, akash.goel, shashidhar.hiremath

On Mon, Dec 14, 2015 at 11:16:07AM +0530, ankitprasad.r.sharma@intel.com wrote:
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 17d679e..366080b9 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -492,6 +492,7 @@ i915_pages_create_for_stolen(struct drm_device *dev,
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct sg_table *st;
>  	struct scatterlist *sg;
> +	int ret;
>  
>  	DRM_DEBUG_DRIVER("offset=0x%x, size=%d\n", offset, size);
>  	BUG_ON(offset > dev_priv->gtt.stolen_size - size);
> @@ -503,11 +504,12 @@ i915_pages_create_for_stolen(struct drm_device *dev,
>  
>  	st = kmalloc(sizeof(*st), GFP_KERNEL);
>  	if (st == NULL)
> -		return NULL;
> +		return ERR_PTR(-ENOMEM);
>  
> -	if (sg_alloc_table(st, 1, GFP_KERNEL)) {
> +	ret = sg_alloc_table(st, 1, GFP_KERNEL);
> +	if (ret) {
>  		kfree(st);
> -		return NULL;
> +		return ERR_PTR(ret);
>  	}
>  
>  	sg = st->sgl;
> @@ -559,15 +561,17 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
>  
>  	obj = i915_gem_object_alloc(dev);
>  	if (obj == NULL)
> -		return NULL;
> +		return ERR_PTR(-ENOMEM);
>  
>  	drm_gem_private_object_init(dev, &obj->base, stolen->size);
>  	i915_gem_object_init(obj, &i915_gem_object_stolen_ops);
>  
>  	obj->pages = i915_pages_create_for_stolen(dev,
>  						  stolen->start, stolen->size);
> -	if (obj->pages == NULL)
> -		goto cleanup;
> +	if (IS_ERR(obj->pages)) {
> +		i915_gem_object_free(obj);
> +		return (void*) obj->pages;

This is a bad idiom to use. Looks ok here (as only one caller sees the
invalid obj->pages) but it was an immediate red-flag for me as a reader
of the code (since you are storing an invalid pointer in a very common
field).

Anyway the correct use is return ERR_CAST(obj->pages);

However, I would much prefer a temporary variable:

pages = i915_pages_create_for_stolen();
if (IS_ERR(pages)) {
	object_free(obj);
	return ERR_CAST(pages);
}
obj->pages = pages;

Just so that I don't have to think about who might chase that invalid
pointer, today or in the future.
-Chris
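
The temporary-variable idiom suggested above can be tried out in userspace
with a minimal sketch like the following. The ERR_PTR/PTR_ERR/IS_ERR/ERR_CAST
helpers here are simplified re-implementations of the ones in the kernel's
linux/err.h, and struct pages, struct object, pages_create() and
object_create() are hypothetical stand-ins, not the i915 functions.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-ins for the helpers in the kernel's <linux/err.h>. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	/* Error codes live in the top MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}
static inline void *ERR_CAST(const void *ptr) { return (void *)(uintptr_t)ptr; }

struct pages { int nents; };		/* hypothetical stand-in for sg_table */
struct object { struct pages *pages; };	/* hypothetical stand-in for the GEM object */

/* Hypothetical allocator that can be told to fail with -ENOMEM. */
static struct pages *pages_create(int fail)
{
	struct pages *p;

	if (fail)
		return ERR_PTR(-ENOMEM);
	p = calloc(1, sizeof(*p));
	return p ? p : ERR_PTR(-ENOMEM);
}

struct object *object_create(int fail)
{
	struct object *obj;
	struct pages *pages;

	obj = calloc(1, sizeof(*obj));
	if (!obj)
		return ERR_PTR(-ENOMEM);

	/* Keep the possibly-invalid pointer in a local, never in obj->pages. */
	pages = pages_create(fail);
	if (IS_ERR(pages)) {
		free(obj);
		return ERR_CAST(pages);
	}

	obj->pages = pages;
	return obj;
}
```

Because obj->pages is only assigned after the IS_ERR() check, no reader of the
object can ever observe an encoded errno in that field, and the error is read
from the local rather than from the object that has just been freed.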

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 6/9] drm/i915: Add support for stealing purgable stolen pages
  2015-12-14  5:46 ` [PATCH 6/9] drm/i915: Add support for stealing purgable stolen pages ankitprasad.r.sharma
@ 2015-12-14 10:13   ` Chris Wilson
  2015-12-17 10:51   ` Tvrtko Ursulin
  1 sibling, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2015-12-14 10:13 UTC (permalink / raw)
  To: ankitprasad.r.sharma; +Cc: intel-gfx, akash.goel, shashidhar.hiremath

On Mon, Dec 14, 2015 at 11:16:08AM +0530, ankitprasad.r.sharma@intel.com wrote:
> @@ -2039,6 +2052,8 @@ struct drm_i915_gem_object {
>  	struct list_head obj_exec_link;
>  
>  	struct list_head batch_pool_link;
> +	/** Used during stolen memory allocations to temporarily hold a ref */
> +	struct list_head stolen_link;

It would be very useful to me if you could rename this tmp_link.

/** Used by eviction/debugfs only under the struct_mutex when the caller
 * is certain they are the only users of this list.
 */

> +	INIT_LIST_HEAD(&obj->stolen_link);

Doesn't require init.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 8/9] drm/i915: Migrate stolen objects before hibernation
  2015-12-14  5:46 ` [PATCH 8/9] drm/i915: Migrate stolen objects before hibernation ankitprasad.r.sharma
@ 2015-12-14 10:31   ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2015-12-14 10:31 UTC (permalink / raw)
  To: ankitprasad.r.sharma; +Cc: intel-gfx, akash.goel, shashidhar.hiremath

On Mon, Dec 14, 2015 at 11:16:10AM +0530, ankitprasad.r.sharma@intel.com wrote:
> +static int
> +copy_content(struct drm_i915_gem_object *obj,
> +		struct drm_i915_private *i915,
> +		struct address_space *mapping)
> +{
> +	struct drm_mm_node node;
> +	int ret, i;
> +
> +	/* stolen objects are already pinned to prevent shrinkage */
> +	memset(&node, 0, sizeof(node));
> +	ret = i915_gem_insert_node_in_range(i915, &node, 4096, 0,
> +					    0, i915->gtt.mappable_end);
> +	if (ret)
> +		return ret;
> +
> +	for (i = 0; i < obj->base.size / PAGE_SIZE; i++) {
> +		struct page *page;
> +		void *__iomem src;
> +		void *dst;
> +
> +		wmb();
> +		i915->gtt.base.insert_page(&i915->gtt.base,
> +					   i915_gem_object_get_dma_address(obj, i),
> +					   node.start,
> +					   I915_CACHE_NONE,
> +					   0);
> +		wmb();
> +
> +		page = shmem_read_mapping_page(mapping, i);
> +		if (IS_ERR(page)) {
> +			ret = PTR_ERR(page);
> +			break;
> +		}
> +
> +		src = io_mapping_map_atomic_wc(i915->gtt.mappable, node.start);
> +		dst = kmap_atomic(page);

The wmb() barriers are here...
> +		memcpy_fromio(dst, src, PAGE_SIZE);
...and here.

> +		kunmap_atomic(dst);
> +		io_mapping_unmap_atomic(src);
> +
> +		page_cache_release(page);
> +	}
> +
> +	wmb();
> +	i915->gtt.base.clear_range(&i915->gtt.base,
> +				   node.start, node.size,
> +				   true);
> +	drm_mm_remove_node(&node);
> +	return ret;
> +}
> +
> +/**
> + * i915_gem_object_migrate_stolen_to_shmemfs() - migrates a stolen backed
> + * object to shmemfs
> + * @obj: stolen backed object to be migrated
> + *
> + * Returns: 0 on successful migration, errno on failure
> + */
> +
> +static int
> +i915_gem_object_migrate_stolen_to_shmemfs(struct drm_i915_gem_object *obj)
> +{
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +	struct i915_vma *vma, *vn;
> +	struct file *file;
> +	struct address_space *mapping;
> +	struct sg_table *stolen_pages, *shmemfs_pages;
> +	int ret;
> +
> +	if (WARN_ON_ONCE(i915_gem_object_needs_bit17_swizzle(obj)))
> +		return -EINVAL;
> +
> +	ret = i915_gem_object_set_to_gtt_domain(obj, false);
> +	if (ret)
> +		return ret;

This should be in copy_content.

> +	file = shmem_file_setup("drm mm object", obj->base.size, VM_NORESERVE);
> +	if (IS_ERR(file))
> +		return PTR_ERR(file);
> +	mapping = i915_gem_set_inode_gfp(obj->base.dev, file);
> +
> +	list_for_each_entry_safe(vma, vn, &obj->vma_list, vma_link)
> +		if (i915_vma_unbind(vma))
> +			continue;
> +
> +	if (obj->madv != I915_MADV_WILLNEED && list_empty(&obj->vma_list)) {
> +		/* Discard the stolen reservation, and replace with
> +		 * an unpopulated shmemfs object.
> +		 */
> +		obj->madv = __I915_MADV_PURGED;
> +		goto swap_pages;

A goto over one line? else?

> +	}
> +
> +	ret = copy_content(obj, i915, mapping);
> +	if (ret)
> +		goto err_file;
> +
> +swap_pages:
> +	stolen_pages = obj->pages;
> +	obj->pages = NULL;
> +
> +	obj->base.filp = file;
> +	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
> +	obj->base.write_domain = I915_GEM_DOMAIN_CPU;

Again, these domains are a result of copy_content.

> +
> +	/* Recreate any pinned binding with pointers to the new storage */
> +	if (!list_empty(&obj->vma_list)) {
> +		ret = i915_gem_object_get_pages_gtt(obj);
> +		if (ret) {
> +			obj->pages = stolen_pages;
> +			goto err_file;
> +		}
> +
> +		ret = i915_gem_object_set_to_gtt_domain(obj, true);

Why? The pages are allocated, the domain is irrelevant (just so long as
it is accurate, see above).

> +		obj->get_page.sg = obj->pages->sgl;
> +		obj->get_page.last = 0;
> +
> +		list_for_each_entry(vma, &obj->vma_list, vma_link) {
> +			if (!drm_mm_node_allocated(&vma->node))
> +				continue;
> +
> +			WARN_ON(i915_vma_bind(vma,
> +					      obj->cache_level,
> +					      PIN_UPDATE));
> +		}
> +	} else
> +		list_del(&obj->global_list);

This is very confusing (and wrong). This should only be a result of
setting PURGED above.

> +	/* drop the stolen pin and backing */
> +	shmemfs_pages = obj->pages;
> +	obj->pages = stolen_pages;
> +
> +	i915_gem_object_unpin_pages(obj);
> +	obj->ops->put_pages(obj);
> +	if (obj->ops->release)
> +		obj->ops->release(obj);
> +
> +	obj->ops = &i915_gem_object_ops;
> +	obj->pages = shmemfs_pages;
> +
> +	return 0;
> +
> +err_file:
> +	fput(file);
> +	obj->base.filp = NULL;
> +	return ret;
> +}
> +
> +int
> +i915_gem_freeze(struct drm_device *dev)
> +{
> +	/* Called before i915_gem_suspend() when hibernating */
> +	struct drm_i915_private *i915 = to_i915(dev);
> +	struct drm_i915_gem_object *obj, *tmp;
> +	struct list_head *phase[] = {
> +		&i915->mm.unbound_list, &i915->mm.bound_list, NULL
> +	}, **p;
> +	int ret;
> +
> +	ret = i915_mutex_lock_interruptible(dev);
> +	if (ret)
> +		return ret;

Whitespace.
-Chris

> +	/* Across hibernation, the stolen area is not preserved.
> +	 * Anything inside stolen must be copied back to normal
> +	 * memory if we wish to preserve it.
> +	 */

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast
  2015-12-14  9:54   ` Chris Wilson
@ 2015-12-14 10:48     ` Chris Wilson
  2015-12-14 11:22       ` Chris Wilson
  0 siblings, 1 reply; 30+ messages in thread
From: Chris Wilson @ 2015-12-14 10:48 UTC (permalink / raw)
  To: ankitprasad.r.sharma, intel-gfx, akash.goel, shashidhar.hiremath,
	tvrtko.ursulin

On Mon, Dec 14, 2015 at 09:54:06AM +0000, Chris Wilson wrote:
> On Mon, Dec 14, 2015 at 11:16:04AM +0530, ankitprasad.r.sharma@intel.com wrote:
> > From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> > 
> > In pwrite_fast, map an object page by page if obj_ggtt_pin fails. First,
> > we try a nonblocking pin for the whole object (since that is fastest if
> > reused), then failing that we try to grab one page in the mappable
> > aperture. It also allows us to handle objects larger than the mappable
> > aperture (e.g. if we need to pwrite with vGPU restricting the aperture
> > to a measly 8MiB or something like that).
> > 
> > v2: Pin pages before starting pwrite, Combined duplicate loops (Chris)
> > 
> > v3: Combined loops based on local patch by Chris (Chris)
> > 
> > v4: Added i915 wrapper function for drm_mm_insert_node_in_range (Chris)
> > 
> > Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 86 ++++++++++++++++++++++++++++++-----------
> >  1 file changed, 64 insertions(+), 22 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index bf7f203..46c1e75 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -61,6 +61,21 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
> >  	return obj->pin_display;
> >  }
> >  
> > +static int
> > +i915_gem_insert_node_in_range(struct drm_i915_private *i915,
> > +			      struct drm_mm_node *node, u64 size,
> > +			      unsigned alignment, u64 start, u64 end)
> > +{
> > +	int ret;
> > +
> > +	ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm, node,
> > +						  size, alignment, 0, start,
> > +						  end, DRM_MM_SEARCH_DEFAULT,
> > +						  DRM_MM_SEARCH_DEFAULT);
> > +
> > +	return ret;
> > +}
> 
> No. It encodes a very bad assumption (i915->gtt) that is not made clear
> in any way.

static int
insert_mappable_node(struct drm_i915_private *i915,
		     struct drm_mm_node *node)
{
	return drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm, node,
						   4096, 0, 0,
						   0, i915->gtt.mappable_end,
						   DRM_MM_SEARCH_DEFAULT,
						   DRM_MM_SEARCH_DEFAULT);
}

Should do the trick
-Chris
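
For context, the single 4096-byte node reserved by a helper like this is what
the page-by-page pwrite fallback from the quoted commit message would reuse
for every page of the object. Below is a userspace sketch of that loop only.
struct one_page_window, window_map() and pwrite_by_page() are hypothetical
names, and window_map() merely models rebinding the one-page window (the role
insert_page plays in the patch); no GTT is involved.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical model of a one-page window onto a larger object. */
struct one_page_window {
	unsigned char *backing;	/* the whole object's storage */
	size_t mapped_page;	/* page the window currently points at */
};

/* Point the window at one page; stands in for rebinding the PTE. */
static unsigned char *window_map(struct one_page_window *w, size_t page)
{
	w->mapped_page = page;
	return w->backing + page * SKETCH_PAGE_SIZE;
}

/* Copy len bytes to the object at offset, one page-sized chunk at a time. */
size_t pwrite_by_page(struct one_page_window *w, size_t offset,
		      const unsigned char *src, size_t len)
{
	size_t written = 0;

	while (len > 0) {
		size_t page = offset / SKETCH_PAGE_SIZE;
		size_t page_off = offset % SKETCH_PAGE_SIZE;
		size_t chunk = SKETCH_PAGE_SIZE - page_off;
		unsigned char *dst;

		/* Never cross a page boundary within a single copy. */
		if (chunk > len)
			chunk = len;

		dst = window_map(w, page);
		memcpy(dst + page_off, src, chunk);

		src += chunk;
		offset += chunk;
		len -= chunk;
		written += chunk;
	}

	return written;
}
```

Only one page of aperture is needed no matter how large the object is, which
is why this path can still service objects bigger than the mappable aperture.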

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast
  2015-12-14 10:48     ` Chris Wilson
@ 2015-12-14 11:22       ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2015-12-14 11:22 UTC (permalink / raw)
  To: ankitprasad.r.sharma, intel-gfx, akash.goel, shashidhar.hiremath,
	tvrtko.ursulin

On Mon, Dec 14, 2015 at 10:48:51AM +0000, Chris Wilson wrote:
> On Mon, Dec 14, 2015 at 09:54:06AM +0000, Chris Wilson wrote:
> > On Mon, Dec 14, 2015 at 11:16:04AM +0530, ankitprasad.r.sharma@intel.com wrote:
> > > From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> > > 
> > > In pwrite_fast, map an object page by page if obj_ggtt_pin fails. First,
> > > we try a nonblocking pin for the whole object (since that is fastest if
> > > reused), then failing that we try to grab one page in the mappable
> > > aperture. It also allows us to handle objects larger than the mappable
> > > aperture (e.g. if we need to pwrite with vGPU restricting the aperture
> > > to a measly 8MiB or something like that).
> > > 
> > > v2: Pin pages before starting pwrite, Combined duplicate loops (Chris)
> > > 
> > > v3: Combined loops based on local patch by Chris (Chris)
> > > 
> > > v4: Added i915 wrapper function for drm_mm_insert_node_in_range (Chris)
> > > 
> > > Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > ---
> > >  drivers/gpu/drm/i915/i915_gem.c | 86 ++++++++++++++++++++++++++++++-----------
> > >  1 file changed, 64 insertions(+), 22 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > > index bf7f203..46c1e75 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > > @@ -61,6 +61,21 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
> > >  	return obj->pin_display;
> > >  }
> > >  
> > > +static int
> > > +i915_gem_insert_node_in_range(struct drm_i915_private *i915,
> > > +			      struct drm_mm_node *node, u64 size,
> > > +			      unsigned alignment, u64 start, u64 end)
> > > +{
> > > +	int ret;
> > > +
> > > +	ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm, node,
> > > +						  size, alignment, 0, start,
> > > +						  end, DRM_MM_SEARCH_DEFAULT,
> > > +						  DRM_MM_SEARCH_DEFAULT);
> > > +
> > > +	return ret;
> > > +}
> > 
> > No. It encodes a very bad assumption (i915->gtt) that is not made clear
> > in any way.
> 
> static int
> insert_mappable_node(struct drm_i915_private *i915,
> 		     struct drm_mm_node *node)
> {
> 	return drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm, node,
> 						   4096, 0, 0,
> 						   0, i915->gtt.mappable_end,
> 						   DRM_MM_SEARCH_DEFAULT,
> 						   DRM_MM_SEARCH_DEFAULT);

Correction: the last two arguments should be
DRM_MM_SEARCH_DEFAULT, DRM_MM_CREATE_DEFAULT
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 4/9] drm/i915: Support for creating Stolen memory backed objects
  2015-12-14 10:05   ` Chris Wilson
@ 2015-12-15  6:10     ` Ankitprasad Sharma
  0 siblings, 0 replies; 30+ messages in thread
From: Ankitprasad Sharma @ 2015-12-15  6:10 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, akash.goel, shashidhar.hiremath

On Mon, 2015-12-14 at 10:05 +0000, Chris Wilson wrote:
> > diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
> > index d727b49..ebce8c9 100644
> > --- a/include/uapi/drm/i915_drm.h
> > +++ b/include/uapi/drm/i915_drm.h
> > @@ -357,6 +357,7 @@ typedef struct drm_i915_irq_wait {
> >  #define I915_PARAM_HAS_GPU_RESET	 35
> >  #define I915_PARAM_HAS_RESOURCE_STREAMER 36
> >  #define I915_PARAM_HAS_EXEC_SOFTPIN	 37
> > +#define I915_PARAM_CREATE_VERSION	 38
> >  
> >  typedef struct drm_i915_getparam {
> >  	__s32 param;
> > @@ -456,6 +457,21 @@ struct drm_i915_gem_create {
> >  	 */
> >  	__u32 handle;
> >  	__u32 pad;
> > +	/**
> > +	 * Requested flags (currently used for placement
> > +	 * (which memory domain))
> > +	 *
> > +	 * You can request that the object be created from special memory
> > +	 * rather than regular system pages using this parameter. Such
> > +	 * irregular objects may have certain restrictions (such as CPU
> > +	 * access to a stolen object is verboten).
> > +	 *
> > +	 * This can be used in the future for other purposes too
> > +	 * e.g. specifying tiling/caching/madvise
> > +	 */
> > +	__u32 flags;
> > +#define I915_CREATE_PLACEMENT_STOLEN 	(1<<0) /* Cannot use CPU mmaps */
> > +#define __I915_CREATE_UNKNOWN_FLAGS	-(I915_CREATE_PLACEMENT_STOLEN << 1)
> 
> Alignment. sizeof(drm_i915_gem_create) must be aligned to u64 since we
> contain u64 (to keep ABI compat for 32bit).
> -Chris
Sure, will update __u32 flags to __u64 flags.

Thanks,
Ankit
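
As a side note on the flag mask in the quoted hunk, here is a small sketch of
how a negated "next bit" mask rejects every undefined flag, including the
upper 32 bits once flags is widened to 64 bits. The SKETCH_ macro names and
check_create_flags() are hypothetical; only the mask construction mirrors
__I915_CREATE_UNKNOWN_FLAGS, with outer parentheses added for safety.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical copies of the flag definitions from the quoted patch. */
#define SKETCH_CREATE_PLACEMENT_STOLEN	(1 << 0)
/* -(highest known flag << 1) sets every bit above the known flags. */
#define SKETCH_CREATE_UNKNOWN_FLAGS	(-(SKETCH_CREATE_PLACEMENT_STOLEN << 1))

/* Return 0 if only known flags are set, -EINVAL otherwise. */
int check_create_flags(uint64_t flags)
{
	/*
	 * The int mask (-2) is converted to uint64_t for the AND, which
	 * sign-extends it to 0xfffffffffffffffe, so bits 32..63 are
	 * rejected as well.
	 */
	if (flags & SKETCH_CREATE_UNKNOWN_FLAGS)
		return -EINVAL;

	return 0;
}
```

When a second flag is added, only the shift base in the mask has to move to
the new highest flag.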
> 



* Re: [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers
  2015-12-14  5:46 ` [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers ankitprasad.r.sharma
  2015-12-14  9:44   ` Chris Wilson
@ 2015-12-15 14:41   ` Dave Gordon
  2015-12-15 14:54     ` Chris Wilson
  1 sibling, 1 reply; 30+ messages in thread
From: Dave Gordon @ 2015-12-15 14:41 UTC (permalink / raw)
  To: ankitprasad.r.sharma, intel-gfx; +Cc: akash.goel, shashidhar.hiremath

On 14/12/15 05:46, ankitprasad.r.sharma@intel.com wrote:
> From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
>
> Using stolen backed objects as a batchbuffer may result into a kernel
> panic during relocation. Added a check to prevent the panic and fail
> the execbuffer call. It is not recommended to use stolen object as
> a batchbuffer.
>
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 48ec484..d342f10 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -462,7 +462,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>   	if (obj->active && pagefault_disabled())
>   		return -EFAULT;
>
> -	if (use_cpu_reloc(obj))
> +	if (obj->stolen)
> +		ret = -EINVAL;

I'd rather reject ALL "weird" gem objects at the first opportunity,
so that none of the execbuffer code has to worry about stolen, phys,
dmabuf, etc ...

	if (obj->ops != &i915_gem_object_ops)
		ret = -EINVAL;		/* No exotica please */

.Dave.

> +	else if (use_cpu_reloc(obj))
>   		ret = relocate_entry_cpu(obj, reloc, target_offset);
>   	else if (obj->map_and_fenceable)
>   		ret = relocate_entry_gtt(obj, reloc, target_offset);
>


* Re: [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers
  2015-12-15 14:41   ` Dave Gordon
@ 2015-12-15 14:54     ` Chris Wilson
  2015-12-15 17:50       ` Dave Gordon
  0 siblings, 1 reply; 30+ messages in thread
From: Chris Wilson @ 2015-12-15 14:54 UTC (permalink / raw)
  To: Dave Gordon
  Cc: ankitprasad.r.sharma, intel-gfx, akash.goel, shashidhar.hiremath

On Tue, Dec 15, 2015 at 02:41:47PM +0000, Dave Gordon wrote:
> On 14/12/15 05:46, ankitprasad.r.sharma@intel.com wrote:
> >From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> >
> >Using stolen backed objects as a batchbuffer may result into a kernel
> >panic during relocation. Added a check to prevent the panic and fail
> >the execbuffer call. It is not recommended to use stolen object as
> >a batchbuffer.
> >
> >Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> >---
> >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >index 48ec484..d342f10 100644
> >--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >@@ -462,7 +462,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> >  	if (obj->active && pagefault_disabled())
> >  		return -EFAULT;
> >
> >-	if (use_cpu_reloc(obj))
> >+	if (obj->stolen)
> >+		ret = -EINVAL;
> 
> I'd rather reject ALL "weird" gem objects at the first opportunity,
> so that none of the execbuffer code has to worry about stolen, phys,
> dmabuf, etc ...
> 
> 	if (obj->ops != &i915_gem_object_ops)
> 		ret = -EINVAL;		/* No exotica please */

No. All GEM objects are supposed to be first-class so that they are
interchangeable through all aspects of the API (that becomes even more
important with dma-buf interoperation). We have had to relax that for a
couple of special categories (basically CPU mmapping) for certain classes
that are not struct file backed. Though in principle, a gemfs would work
just fine.

The only restrictions we should ideally impose are those determined by
hardware.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers
  2015-12-15 14:54     ` Chris Wilson
@ 2015-12-15 17:50       ` Dave Gordon
  2015-12-16 12:35         ` Chris Wilson
  0 siblings, 1 reply; 30+ messages in thread
From: Dave Gordon @ 2015-12-15 17:50 UTC (permalink / raw)
  To: Chris Wilson, ankitprasad.r.sharma, intel-gfx, akash.goel,
	shashidhar.hiremath

On 15/12/15 14:54, Chris Wilson wrote:
> On Tue, Dec 15, 2015 at 02:41:47PM +0000, Dave Gordon wrote:
>> On 14/12/15 05:46, ankitprasad.r.sharma@intel.com wrote:
>>> From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
>>>
>>> Using stolen backed objects as a batchbuffer may result into a kernel
>>> panic during relocation. Added a check to prevent the panic and fail
>>> the execbuffer call. It is not recommended to use stolen object as
>>> a batchbuffer.
>>>
>>> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> index 48ec484..d342f10 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>>> @@ -462,7 +462,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
>>>   	if (obj->active && pagefault_disabled())
>>>   		return -EFAULT;
>>>
>>> -	if (use_cpu_reloc(obj))
>>> +	if (obj->stolen)
>>> +		ret = -EINVAL;
>>
>> I'd rather reject ALL "weird" gem objects at the first opportunity,
>> so that none of the execbuffer code has to worry about stolen, phys,
>> dmabuf, etc ...
>>
>> 	if (obj->ops != &i915_gem_object_ops)
>> 		ret = -EINVAL;		/* No exotica please */
>
> No. All GEM objects are supposed to be first-class so that they are
> interchangeable through all aspects of the API (that becomes even more
> important with dma-buf interoperation). We have had to relax that for a
> couple of special categories (basically CPU mmapping) for certain classes
> that are not struct file backed. Though in principle, a gemfs would work
> just fine.
>
> The only restrictions we should ideally impose are those determined by
> hardware.
> -Chris

I don't think it's reasonable to place objects that the kernel driver 
cares about -- i.e. understands and decodes -- in memory areas that it 
does not manage, and which may be subject to arbitrary uncontrolled 
access by external hardware and/or processes.

And I thought we couldn't kmap stolen anyway?

.Dave.

* Re: [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers
  2015-12-15 17:50       ` Dave Gordon
@ 2015-12-16 12:35         ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2015-12-16 12:35 UTC (permalink / raw)
  To: Dave Gordon
  Cc: ankitprasad.r.sharma, intel-gfx, akash.goel, shashidhar.hiremath

On Tue, Dec 15, 2015 at 05:50:36PM +0000, Dave Gordon wrote:
> On 15/12/15 14:54, Chris Wilson wrote:
> >On Tue, Dec 15, 2015 at 02:41:47PM +0000, Dave Gordon wrote:
> >>On 14/12/15 05:46, ankitprasad.r.sharma@intel.com wrote:
> >>>From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> >>>
> >>>Using stolen backed objects as a batchbuffer may result into a kernel
> >>>panic during relocation. Added a check to prevent the panic and fail
> >>>the execbuffer call. It is not recommended to use stolen object as
> >>>a batchbuffer.
> >>>
> >>>Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> >>>---
> >>>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +++-
> >>>  1 file changed, 3 insertions(+), 1 deletion(-)
> >>>
> >>>diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >>>index 48ec484..d342f10 100644
> >>>--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >>>+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> >>>@@ -462,7 +462,9 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
> >>>  	if (obj->active && pagefault_disabled())
> >>>  		return -EFAULT;
> >>>
> >>>-	if (use_cpu_reloc(obj))
> >>>+	if (obj->stolen)
> >>>+		ret = -EINVAL;
> >>
> >>I'd rather reject ALL "weird" gem objects at the first opportunity,
> >>so that none of the execbuffer code has to worry about stolen, phys,
> >>dmabuf, etc ...
> >>
> >>	if (obj->ops != &i915_gem_object_ops)
> >>		ret = -EINVAL;		/* No exotica please */
> >
> >No. All GEM objects are supposed to be first-class so that they are
> >interchangeable through all aspects of the API (that becomes even more
> >important with dma-buf interoperation). We have had to relax that for a
> >couple of special categories (basically CPU mmapping) for certain classes
> >that are not struct file backed. Though in principle, a gemfs would work
> >just fine.
> >
> >The only restrictions we should ideally impose are those determined by
> >hardware.
> >-Chris
> 
> I don't think it's reasonable to place objects that the kernel
> driver cares about -- i.e. understands and decodes -- in memory
> areas that it does not manage, and which may be subject to arbitrary
> uncontrolled access by external hardware and/or processes.

We don't though. As for these objects, they are exposed no matter what
since the user can access them concurrently and remap them to other
devices without our intervention if they should so choose. The reloc
patching depends on a userspace handshake, we have no idea if they are
lying - let alone changing the contents on the fly.

> And I thought we couldn't kmap stolen anyway?

We can on gen2-5, but that isn't the point. The point is that the API is
consistent and not piecemeal.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 1/9] drm/i915: Allow use of get_dma_address for stolen backed objects
  2015-12-14  5:46 ` [PATCH 1/9] drm/i915: Allow use of get_dma_address for stolen " ankitprasad.r.sharma
@ 2015-12-17 10:20   ` Tvrtko Ursulin
  0 siblings, 0 replies; 30+ messages in thread
From: Tvrtko Ursulin @ 2015-12-17 10:20 UTC (permalink / raw)
  To: ankitprasad.r.sharma, intel-gfx; +Cc: akash.goel, shashidhar.hiremath


On 14/12/15 05:46, ankitprasad.r.sharma@intel.com wrote:
> From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
>
> i915_gem_object_get_dma_address function is used to retrieve the dma address
> of a particular page so as to map it in a given GTT entry for CPU access.
> This function would be used for stolen backed objects also for tasks like
> pwrite,  clearing of the pages etc. So the obj->get_page.sg needs to be
> initialized for the stolen objects also.
>
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_stolen.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 598ed2f..5384767 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -569,6 +569,9 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
>   	if (obj->pages == NULL)
>   		goto cleanup;
>
> +	obj->get_page.sg = obj->pages->sgl;
> +	obj->get_page.last = 0;
> +
>   	i915_gem_object_pin_pages(obj);
>   	obj->stolen = stolen;
>
>

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
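
A minimal userspace sketch of why the lookup cache has to be seeded before i915_gem_object_get_dma_address() can be used, assuming a simplified model of a scatterlist as an array of contiguous ranges. The names `seg`, `page_cache`, `page_cache_init` and `lookup_dma_address` are hypothetical and are not i915 API; `MODEL_PAGE_SIZE` is an assumed 4 KiB page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Hypothetical stand-in for one scatterlist entry: a contiguous DMA range. */
struct seg {
	uint64_t dma_start;	/* DMA address of the first page */
	unsigned int npages;	/* number of pages in this range */
};

/* Hypothetical stand-in for obj->get_page: where the last lookup ended. */
struct page_cache {
	const struct seg *sg;	/* segment the cache currently points at */
	unsigned int last;	/* index of the first page of that segment */
};

/* Seed the cache at the head of the list, as the patch does for stolen. */
static void page_cache_init(struct page_cache *c, const struct seg *head)
{
	c->sg = head;
	c->last = 0;
}

/*
 * Return the DMA address of page n. The caller guarantees that n is within
 * the total page count of the list. The walk resumes from the cached
 * segment and rewinds to the head when an earlier page is requested.
 */
static uint64_t lookup_dma_address(struct page_cache *c, const struct seg *head,
				   unsigned int n)
{
	assert(c->sg != NULL);	/* a zeroed, never-seeded cache is caught here */

	if (n < c->last)
		page_cache_init(c, head);

	while (c->last + c->sg->npages <= n) {
		c->last += c->sg->npages;
		c->sg++;
	}
	return c->sg->dma_start + (uint64_t)(n - c->last) * MODEL_PAGE_SIZE;
}
```

In this model, skipping `page_cache_init()` leaves `sg` unset, which corresponds to the case the patch avoids by initializing obj->get_page.sg and obj->get_page.last when the stolen object is created.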

* Re: [PATCH 3/9] drm/i915: Clearing buffer objects via CPU/GTT
  2015-12-14  5:46 ` [PATCH 3/9] drm/i915: Clearing buffer objects via CPU/GTT ankitprasad.r.sharma
  2015-12-14  9:48   ` Chris Wilson
@ 2015-12-17 10:27   ` Tvrtko Ursulin
  1 sibling, 0 replies; 30+ messages in thread
From: Tvrtko Ursulin @ 2015-12-17 10:27 UTC (permalink / raw)
  To: ankitprasad.r.sharma, intel-gfx; +Cc: akash.goel, shashidhar.hiremath


Hi,

On 14/12/15 05:46, ankitprasad.r.sharma@intel.com wrote:
> From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
>
> This patch adds support for clearing buffer objects via CPU/GTT. This
> is particularly useful for clearing out the non shmem backed objects.
> Currently intend to use this only for buffers allocated from stolen
> region.
>
> v2: Added kernel doc for i915_gem_clear_object(), corrected/removed
> variable assignments (Tvrtko)
>
> v3: Map object page by page to the gtt if the pinning of the whole object
> to the ggtt fails, Corrected function name (Chris)
>
> v4: Clear the buffer page by page, and not map the whole object in the gtt
> aperture. Use i915 wrapper function in place of drm_mm_insert_node_in_range.
>
> Testcase: igt/gem_stolen
>
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h |  1 +
>   drivers/gpu/drm/i915/i915_gem.c | 44 +++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 45 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index a10b866..e195fee 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2897,6 +2897,7 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
>   				    int *needs_clflush);
>
>   int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
> +int i915_gem_object_clear(struct drm_i915_gem_object *obj);
>
>   static inline int __sg_page_count(struct scatterlist *sg)
>   {
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 46c1e75..e50a91b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5293,3 +5293,47 @@ fail:
>   	drm_gem_object_unreference(&obj->base);
>   	return ERR_PTR(ret);
>   }
> +
> +/**
> + * i915_gem_object_clear() - Clear buffer object via CPU/GTT
> + * @obj: Buffer object to be cleared
> + *
> + * Return: 0 - success, non-zero - failure
> + */
> +int i915_gem_object_clear(struct drm_i915_gem_object *obj)
> +{
> +	int ret, i;
> +	char __iomem *base;
> +	size_t size = obj->base.size;
> +	struct drm_i915_private *i915 = to_i915(obj->base.dev);
> +	struct drm_mm_node node;
> +
> +	lockdep_assert_held(&obj->base.dev->struct_mutex);
> +	memset(&node, 0, sizeof(node));
> +	ret = i915_gem_insert_node_in_range(i915, &node, 4096, 0,

Use PAGE_SIZE instead of 4096 since it is used in the for loop below?

> +					    0, i915->gtt.mappable_end);
> +	if (ret)
> +		goto out;
> +
> +	i915_gem_object_pin_pages(obj);

Does it need a call to i915_gem_object_get_pages to work with all 
objects in all scenarios?

> +	base = io_mapping_map_wc(i915->gtt.mappable, node.start);
> +	for (i = 0; i < size/PAGE_SIZE; i++) {
> +		wmb();
> +		i915->gtt.base.insert_page(&i915->gtt.base,
> +					   i915_gem_object_get_dma_address(obj, i),
> +					   node.start,
> +					   I915_CACHE_NONE, 0);
> +		wmb();
> +		memset_io(base, 0, 4096);

Again, maybe also use PAGE_SIZE so it is consistent with the for loop?

> +	}
> +
> +	wmb();
> +	io_mapping_unmap(base);
> +	i915->gtt.base.clear_range(&i915->gtt.base,
> +			node.start, node.size,
> +			true);
> +	drm_mm_remove_node(&node);
> +	i915_gem_object_unpin_pages(obj);
> +out:
> +	return ret;
> +}
>

Regards,

Tvrtko
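
The page-by-page clear discussed above can be sketched in userspace, with one constant driving both the loop bound and the memset length, which is the consistency the PAGE_SIZE comments ask for. This is only a model under stated assumptions: `clear_object_by_page` and `MODEL_PAGE_SIZE` are hypothetical names, and pointing `window` at a page stands in for rebinding the single reserved GTT slot with insert_page().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/*
 * Clear an object one page at a time through a single "window".
 * pages[i] is a hypothetical stand-in for the i-th backing page.
 * Returns the number of pages cleared.
 */
static size_t clear_object_by_page(unsigned char **pages, size_t size)
{
	size_t i, npages = size / MODEL_PAGE_SIZE;

	assert(size % MODEL_PAGE_SIZE == 0);	/* model handles whole pages only */

	for (i = 0; i < npages; i++) {
		unsigned char *window = pages[i];	/* rebind the window */

		/* Same constant as the loop bound, so the two cannot drift. */
		memset(window, 0, MODEL_PAGE_SIZE);
	}
	return npages;
}
```

Only one page of address space is ever needed for the window, regardless of the object size, which is the property the v4 changelog entry describes.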

* Re: [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast
  2015-12-14  5:46 ` [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast ankitprasad.r.sharma
  2015-12-14  9:54   ` Chris Wilson
@ 2015-12-17 10:45   ` Tvrtko Ursulin
  2015-12-17 11:19     ` Chris Wilson
  1 sibling, 1 reply; 30+ messages in thread
From: Tvrtko Ursulin @ 2015-12-17 10:45 UTC (permalink / raw)
  To: ankitprasad.r.sharma, intel-gfx; +Cc: akash.goel, shashidhar.hiremath


Hi,

On 14/12/15 05:46, ankitprasad.r.sharma@intel.com wrote:
> From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
>
> In pwrite_fast, map an object page by page if obj_ggtt_pin fails. First,
> we try a nonblocking pin for the whole object (since that is fastest if
> reused), then failing that we try to grab one page in the mappable
> aperture. It also allows us to handle objects larger than the mappable
> aperture (e.g. if we need to pwrite with vGPU restricting the aperture
> to a measly 8MiB or something like that).
>
> v2: Pin pages before starting pwrite, Combined duplicate loops (Chris)
>
> v3: Combined loops based on local patch by Chris (Chris)
>
> v4: Added i915 wrapper function for drm_mm_insert_node_in_range (Chris)
>
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem.c | 86 ++++++++++++++++++++++++++++++-----------
>   1 file changed, 64 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index bf7f203..46c1e75 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -61,6 +61,21 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
>   	return obj->pin_display;
>   }
>
> +static int
> +i915_gem_insert_node_in_range(struct drm_i915_private *i915,
> +			      struct drm_mm_node *node, u64 size,
> +			      unsigned alignment, u64 start, u64 end)
> +{
> +	int ret;
> +
> +	ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm, node,
> +						  size, alignment, 0, start,
> +						  end, DRM_MM_SEARCH_DEFAULT,
> +						  DRM_MM_SEARCH_DEFAULT);
> +
> +	return ret;
> +}
> +
>   /* some bookkeeping */
>   static void i915_gem_info_add_obj(struct drm_i915_private *dev_priv,
>   				  size_t size)
> @@ -760,20 +775,29 @@ fast_user_write(struct io_mapping *mapping,
>    * user into the GTT, uncached.
>    */
>   static int
> -i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> +i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
>   			 struct drm_i915_gem_object *obj,
>   			 struct drm_i915_gem_pwrite *args,
>   			 struct drm_file *file)
>   {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	ssize_t remain;
> -	loff_t offset, page_base;
> +	struct drm_mm_node node;
> +	uint64_t remain, offset;
>   	char __user *user_data;
> -	int page_offset, page_length, ret;
> +	int ret;
>
>   	ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE | PIN_NONBLOCK);
> -	if (ret)
> -		goto out;
> +	if (ret) {
> +		memset(&node, 0, sizeof(node));
> +		ret = i915_gem_insert_node_in_range(i915, &node, 4096, 0,
> +						    0, i915->gtt.mappable_end);

Suggest PAGE_SIZE instead of 4096 to match the main loop below.

> +		if (ret)
> +			goto out;
> +
> +		i915_gem_object_pin_pages(obj);

i915_gem_object_get_pages is missing again before pin pages I think.

If true it means we need an IGT to exercise this path. Should be easy 
with a huge object and just pwrite a small chunk?

> +	} else {
> +		node.start = i915_gem_obj_ggtt_offset(obj);
> +		node.allocated = false;
> +	}
>
>   	ret = i915_gem_object_set_to_gtt_domain(obj, true);
>   	if (ret)
> @@ -783,31 +807,39 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
>   	if (ret)
>   		goto out_unpin;
>
> -	user_data = to_user_ptr(args->data_ptr);
> -	remain = args->size;
> -
> -	offset = i915_gem_obj_ggtt_offset(obj) + args->offset;
> -
>   	intel_fb_obj_invalidate(obj, ORIGIN_GTT);
> +	obj->dirty = true;
>
> -	while (remain > 0) {
> +	user_data = to_user_ptr(args->data_ptr);
> +	offset = args->offset;
> +	remain = args->size;
> +	while (remain) {
>   		/* Operation in this page
>   		 *
>   		 * page_base = page offset within aperture
>   		 * page_offset = offset within page
>   		 * page_length = bytes to copy for this page
>   		 */
> -		page_base = offset & PAGE_MASK;
> -		page_offset = offset_in_page(offset);
> -		page_length = remain;
> -		if ((page_offset + remain) > PAGE_SIZE)
> -			page_length = PAGE_SIZE - page_offset;
> -
> +		u32 page_base = node.start;

Compiler does not complain about possible truncation?

> +		unsigned page_offset = offset_in_page(offset);
> +		unsigned page_length = PAGE_SIZE - page_offset;
> +		page_length = remain < page_length ? remain : page_length;
> +		if (node.allocated) {
> +			wmb();
> +			i915->gtt.base.insert_page(&i915->gtt.base,
> +						   i915_gem_object_get_dma_address(obj, offset >> PAGE_SHIFT),
> +						   node.start,
> +						   I915_CACHE_NONE,
> +						   0);
> +			wmb();
> +		} else {
> +			page_base += offset & PAGE_MASK;
> +		}
>   		/* If we get a fault while copying data, then (presumably) our
>   		 * source page isn't available.  Return the error and we'll
>   		 * retry in the slow path.
>   		 */
> -		if (fast_user_write(dev_priv->gtt.mappable, page_base,
> +		if (fast_user_write(i915->gtt.mappable, page_base,
>   				    page_offset, user_data, page_length)) {
>   			ret = -EFAULT;
>   			goto out_flush;
> @@ -821,7 +853,17 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
>   out_flush:
>   	intel_fb_obj_flush(obj, false, ORIGIN_GTT);
>   out_unpin:
> -	i915_gem_object_ggtt_unpin(obj);
> +	if (node.allocated) {
> +		wmb();
> +		i915->gtt.base.clear_range(&i915->gtt.base,
> +				node.start, node.size,
> +				true);
> +		drm_mm_remove_node(&node);
> +		i915_gem_object_unpin_pages(obj);
> +	}
> +	else {
> +		i915_gem_object_ggtt_unpin(obj);
> +	}
>   out:
>   	return ret;
>   }
> @@ -1086,7 +1128,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
>   	if (obj->tiling_mode == I915_TILING_NONE &&
>   	    obj->base.write_domain != I915_GEM_DOMAIN_CPU &&
>   	    cpu_write_needs_clflush(obj)) {
> -		ret = i915_gem_gtt_pwrite_fast(dev, obj, args, file);
> +		ret = i915_gem_gtt_pwrite_fast(dev_priv, obj, args, file);
>   		/* Note that the gtt paths might fail with non-page-backed user
>   		 * pointers (e.g. gtt mappings when moving data between
>   		 * textures). Fallback to the shmem path in that case. */
>

Regards,

Tvrtko
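
The per-page arithmetic in the reworked pwrite loop can be sketched on its own, assuming 4 KiB pages. `struct chunk`, `next_chunk` and `count_chunks` are hypothetical names for illustration; the fields mirror page_offset and page_length in the patch, and `page_index` mirrors `offset >> PAGE_SHIFT`.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/* One step of the copy loop: which page, where in it, how many bytes. */
struct chunk {
	uint64_t page_index;		/* offset >> PAGE_SHIFT in the patch */
	unsigned int page_offset;	/* offset_in_page(offset) */
	unsigned int page_length;	/* bytes to copy in this step */
};

/* Compute the next step for a write starting at offset with remain bytes left. */
static struct chunk next_chunk(uint64_t offset, uint64_t remain)
{
	struct chunk c;

	assert(remain > 0);
	c.page_index = offset / MODEL_PAGE_SIZE;
	c.page_offset = (unsigned int)(offset % MODEL_PAGE_SIZE);
	c.page_length = MODEL_PAGE_SIZE - c.page_offset;
	if (remain < c.page_length)
		c.page_length = (unsigned int)remain;
	return c;
}

/* Count the loop iterations needed to write size bytes starting at offset. */
static unsigned int count_chunks(uint64_t offset, uint64_t size)
{
	unsigned int n = 0;

	while (size) {
		struct chunk c = next_chunk(offset, size);

		offset += c.page_length;
		size -= c.page_length;
		n++;
	}
	return n;
}
```

The sketch keeps `page_index` as a 64-bit value; the truncation question raised above concerns the patch storing node.start in a u32 page_base, which this model does not reproduce.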

* Re: [PATCH 6/9] drm/i915: Add support for stealing purgable stolen pages
  2015-12-14  5:46 ` [PATCH 6/9] drm/i915: Add support for stealing purgable stolen pages ankitprasad.r.sharma
  2015-12-14 10:13   ` Chris Wilson
@ 2015-12-17 10:51   ` Tvrtko Ursulin
  1 sibling, 0 replies; 30+ messages in thread
From: Tvrtko Ursulin @ 2015-12-17 10:51 UTC (permalink / raw)
  To: ankitprasad.r.sharma, intel-gfx; +Cc: akash.goel, shashidhar.hiremath



On 14/12/15 05:46, ankitprasad.r.sharma@intel.com wrote:
> From: Chris Wilson <chris at chris-wilson.co.uk>
>
> If we run out of stolen memory when trying to allocate an object, see if
> we can reap enough purgeable objects to free up enough contiguous free
> space for the allocation. This is in principle very much like evicting
> objects to free up enough contiguous space in the vma when binding
> a new object - and you will be forgiven for thinking that the code looks
> very similar.
>
> At the moment, we do not allow userspace to allocate objects in stolen,
> so there is neither the memory pressure to trigger stolen eviction nor
> any purgeable objects inside the stolen arena. However, this will change
> in the near future, and so better management and defragmentation of
> stolen memory will become a real issue.
>
> v2: Remember to remove the drm_mm_node.
>
> v3: Rebased to the latest drm-intel-nightly (Ankit)
>
> v4: corrected if-else braces format (Tvrtko/kerneldoc)
>
> v5: Rebased to the latest drm-intel-nightly (Ankit)
> Added a separate list to maintain purgeable objects from stolen memory
> region (Chris/Daniel)
>
> v6: Compiler optimization (merging 2 single loops into one for() loop),
> corrected code for object eviction, retire_requests before starting
> object eviction (Chris)
>
> v7: Added kernel doc for i915_gem_object_create_stolen()
>
> v8: Check for struct_mutex lock before creating object from stolen
> region (Tvrtko)
>
> v9: Renamed variables to make usage clear, added comment, removed onetime
> used macro (Tvrtko)
>
> v10: Avoid masking of error when stolen_alloc fails (Tvrtko)
>
> Testcase: igt/gem_stolen
>
> Signed-off-by: Chris Wilson <chris at chris-wilson.co.uk>
> Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_debugfs.c    |   6 +-
>   drivers/gpu/drm/i915/i915_drv.h        |  17 +++-
>   drivers/gpu/drm/i915/i915_gem.c        |  16 ++++
>   drivers/gpu/drm/i915/i915_gem_stolen.c | 170 +++++++++++++++++++++++++++++----
>   drivers/gpu/drm/i915/intel_pm.c        |   4 +-
>   5 files changed, 188 insertions(+), 25 deletions(-)


Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko


> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index a8721fc..f0aa3d4 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -174,7 +174,7 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
>   			seq_puts(m, ")");
>   	}
>   	if (obj->stolen)
> -		seq_printf(m, " (stolen: %08llx)", obj->stolen->start);
> +		seq_printf(m, " (stolen: %08llx)", obj->stolen->base.start);
>   	if (obj->pin_display || obj->fault_mappable) {
>   		char s[3], *t = s;
>   		if (obj->pin_display)
> @@ -253,9 +253,9 @@ static int obj_rank_by_stolen(void *priv,
>   	struct drm_i915_gem_object *b =
>   		container_of(B, struct drm_i915_gem_object, obj_exec_link);
>
> -	if (a->stolen->start < b->stolen->start)
> +	if (a->stolen->base.start < b->stolen->base.start)
>   		return -1;
> -	if (a->stolen->start > b->stolen->start)
> +	if (a->stolen->base.start > b->stolen->base.start)
>   		return 1;
>   	return 0;
>   }
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index dcdfb97..479703b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -841,6 +841,12 @@ struct i915_ctx_hang_stats {
>   	bool banned;
>   };
>
> +struct i915_stolen_node {
> +	struct drm_mm_node base;
> +	struct list_head mm_link;
> +	struct drm_i915_gem_object *obj;
> +};
> +
>   /* This must match up with the value previously used for execbuf2.rsvd1. */
>   #define DEFAULT_CONTEXT_HANDLE 0
>
> @@ -1251,6 +1257,13 @@ struct i915_gem_mm {
>   	 */
>   	struct list_head unbound_list;
>
> +	/**
> +	 * List of stolen objects that have been marked as purgeable and
> +	 * thus available for reaping if we need more space for a new
> +	 * allocation. Ordered by time of marking purgeable.
> +	 */
> +	struct list_head stolen_list;
> +
>   	/** Usable portion of the GTT for GEM */
>   	unsigned long stolen_base; /* limited to low memory (32-bit) */
>
> @@ -2031,7 +2044,7 @@ struct drm_i915_gem_object {
>   	struct list_head vma_list;
>
>   	/** Stolen memory for this object, instead of being backed by shmem. */
> -	struct drm_mm_node *stolen;
> +	struct i915_stolen_node *stolen;
>   	struct list_head global_list;
>
>   	struct list_head ring_list[I915_NUM_RINGS];
> @@ -2039,6 +2052,8 @@ struct drm_i915_gem_object {
>   	struct list_head obj_exec_link;
>
>   	struct list_head batch_pool_link;
> +	/** Used during stolen memory allocations to temporarily hold a ref */
> +	struct list_head stolen_link;
>
>   	/**
>   	 * This is set if the object is on the active lists (has pending
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 05505de..8a508cd 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -4411,6 +4411,20 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
>   	if (obj->madv == I915_MADV_DONTNEED && obj->pages == NULL)
>   		i915_gem_object_truncate(obj);
>
> +	if (obj->stolen) {
> +		switch (obj->madv) {
> +		case I915_MADV_WILLNEED:
> +			list_del_init(&obj->stolen->mm_link);
> +			break;
> +		case I915_MADV_DONTNEED:
> +			list_move(&obj->stolen->mm_link,
> +				  &dev_priv->mm.stolen_list);
> +			break;
> +		default:
> +			break;
> +		}
> +	}
> +
>   	args->retained = obj->madv != __I915_MADV_PURGED;
>
>   out:
> @@ -4431,6 +4445,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
>   	INIT_LIST_HEAD(&obj->obj_exec_link);
>   	INIT_LIST_HEAD(&obj->vma_list);
>   	INIT_LIST_HEAD(&obj->batch_pool_link);
> +	INIT_LIST_HEAD(&obj->stolen_link);
>
>   	obj->ops = ops;
>
> @@ -5046,6 +5061,7 @@ i915_gem_load(struct drm_device *dev)
>   	INIT_LIST_HEAD(&dev_priv->context_list);
>   	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
>   	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
> +	INIT_LIST_HEAD(&dev_priv->mm.stolen_list);
>   	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
>   	for (i = 0; i < I915_NUM_RINGS; i++)
>   		init_ring_lists(&dev_priv->ring[i]);
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index 366080b9..014d478 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -542,7 +542,8 @@ i915_gem_object_release_stolen(struct drm_i915_gem_object *obj)
>   	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>
>   	if (obj->stolen) {
> -		i915_gem_stolen_remove_node(dev_priv, obj->stolen);
> +		list_del(&obj->stolen->mm_link);
> +		i915_gem_stolen_remove_node(dev_priv, &obj->stolen->base);
>   		kfree(obj->stolen);
>   		obj->stolen = NULL;
>   	}
> @@ -555,7 +556,7 @@ static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
>
>   static struct drm_i915_gem_object *
>   _i915_gem_object_create_stolen(struct drm_device *dev,
> -			       struct drm_mm_node *stolen)
> +			       struct i915_stolen_node *stolen)
>   {
>   	struct drm_i915_gem_object *obj;
>
> @@ -563,11 +564,12 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
>   	if (obj == NULL)
>   		return ERR_PTR(-ENOMEM);
>
> -	drm_gem_private_object_init(dev, &obj->base, stolen->size);
> +	drm_gem_private_object_init(dev, &obj->base, stolen->base.size);
>   	i915_gem_object_init(obj, &i915_gem_object_stolen_ops);
>
>   	obj->pages = i915_pages_create_for_stolen(dev,
> -						  stolen->start, stolen->size);
> +						  stolen->base.start,
> +						  stolen->base.size);
>   	if (IS_ERR(obj->pages)) {
>   		i915_gem_object_free(obj);
>   		return (void*) obj->pages;
> @@ -579,24 +581,111 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
>   	i915_gem_object_pin_pages(obj);
>   	obj->stolen = stolen;
>
> +	stolen->obj = obj;
> +	INIT_LIST_HEAD(&stolen->mm_link);
> +
>   	obj->base.read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
>   	obj->cache_level = HAS_LLC(dev) ? I915_CACHE_LLC : I915_CACHE_NONE;
>
>   	return obj;
>   }
>
> -struct drm_i915_gem_object *
> -i915_gem_object_create_stolen(struct drm_device *dev, u64 size)
> +static bool
> +mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
> +{
> +	BUG_ON(obj->stolen == NULL);
> +
> +	if (obj->madv != I915_MADV_DONTNEED)
> +		return false;
> +
> +	if (obj->pin_display)
> +		return false;
> +
> +	list_add(&obj->stolen_link, unwind);
> +	return drm_mm_scan_add_block(&obj->stolen->base);
> +}
> +
> +static int
> +stolen_evict(struct drm_i915_private *dev_priv, u64 size)
>   {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
>   	struct drm_i915_gem_object *obj;
> -	struct drm_mm_node *stolen;
> -	int ret;
> +	struct list_head unwind, evict;
> +	struct i915_stolen_node *iter;
> +	int ret, active;
>
> -	if (!drm_mm_initialized(&dev_priv->mm.stolen))
> -		return ERR_PTR(-ENODEV);
> +	drm_mm_init_scan(&dev_priv->mm.stolen, size, 0, 0);
> +	INIT_LIST_HEAD(&unwind);
> +
> +	/* Retire all requests before creating the evict list */
> +	i915_gem_retire_requests(dev_priv->dev);
> +
> +	for (active = 0; active <= 1; active++) {
> +		list_for_each_entry(iter, &dev_priv->mm.stolen_list, mm_link) {
> +			if (iter->obj->active != active)
> +				continue;
> +
> +			if (mark_free(iter->obj, &unwind))
> +				goto found;
> +		}
> +	}
> +
> +found:
> +	INIT_LIST_HEAD(&evict);
> +	while (!list_empty(&unwind)) {
> +		obj = list_first_entry(&unwind,
> +				       struct drm_i915_gem_object,
> +				       stolen_link);
> +		list_del(&obj->stolen_link);
> +
> +		if (drm_mm_scan_remove_block(&obj->stolen->base)) {
> +			list_add(&obj->stolen_link, &evict);
> +			drm_gem_object_reference(&obj->base);
> +		}
> +	}
> +
> +	ret = 0;
> +	while (!list_empty(&evict)) {
> +		obj = list_first_entry(&evict,
> +				       struct drm_i915_gem_object,
> +				       stolen_link);
> +		list_del(&obj->stolen_link);
> +
> +		if (ret == 0) {
> +			struct i915_vma *vma, *vma_next;
> +
> +			list_for_each_entry_safe(vma, vma_next,
> +						 &obj->vma_list,
> +						 vma_link)
> +				if (i915_vma_unbind(vma))
> +					break;
> +
> +			/* Stolen pins its pages to prevent the
> +			 * normal shrinker from processing stolen
> +			 * objects.
> +			 */
> +			i915_gem_object_unpin_pages(obj);
> +
> +			ret = i915_gem_object_put_pages(obj);
> +			if (ret == 0) {
> +				i915_gem_object_release_stolen(obj);
> +				obj->madv = __I915_MADV_PURGED;
> +			} else {
> +				i915_gem_object_pin_pages(obj);
> +			}
> +		}
> +
> +		drm_gem_object_unreference(&obj->base);
> +	}
> +
> +	return ret;
> +}
> +
> +static struct i915_stolen_node *
> +stolen_alloc(struct drm_i915_private *dev_priv, u64 size)
> +{
> +	struct i915_stolen_node *stolen;
> +	int ret;
>
> -	DRM_DEBUG_KMS("creating stolen object: size=%llx\n", size);
>   	if (size == 0)
>   		return ERR_PTR(-EINVAL);
>
> @@ -604,17 +693,60 @@ i915_gem_object_create_stolen(struct drm_device *dev, u64 size)
>   	if (!stolen)
>   		return ERR_PTR(-ENOMEM);
>
> -	ret = i915_gem_stolen_insert_node(dev_priv, stolen, size, 4096);
> +	ret = i915_gem_stolen_insert_node(dev_priv, &stolen->base, size, 4096);
> +	if (ret == 0)
> +		goto out;
> +
> +	/* No more stolen memory available, or too fragmented.
> +	 * Try evicting purgeable objects and search again.
> +	 */
> +	ret = stolen_evict(dev_priv, size);
> +	if (ret == 0)
> +		ret = i915_gem_stolen_insert_node(dev_priv, &stolen->base,
> +						  size, 4096);
> +out:
>   	if (ret) {
>   		kfree(stolen);
>   		return ERR_PTR(ret);
>   	}
>
> +	return stolen;
> +}
> +
> +/**
> + * i915_gem_object_create_stolen() - creates object using the stolen memory
> + * @dev:	drm device
> + * @size:	size of the object requested
> + *
> + * i915_gem_object_create_stolen() tries to allocate memory for the object
> + * from the stolen memory region. If not enough memory is found, it tries
> + * evicting purgeable objects and searching again.
> + *
> + * Returns: Object pointer - success and error pointer - failure
> + */
> +struct drm_i915_gem_object *
> +i915_gem_object_create_stolen(struct drm_device *dev, u64 size)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_i915_gem_object *obj;
> +	struct i915_stolen_node *stolen;
> +
> +	WARN_ON(!mutex_is_locked(&dev->struct_mutex));
> +
> +	if (!drm_mm_initialized(&dev_priv->mm.stolen))
> +		return ERR_PTR(-ENODEV);
> +
> +	DRM_DEBUG_KMS("creating stolen object: size=%llx\n", size);
> +
> +	stolen = stolen_alloc(dev_priv, size);
> +	if (IS_ERR(stolen))
> +		return (void*) stolen;
> +
>   	obj = _i915_gem_object_create_stolen(dev, stolen);
>   	if (!IS_ERR(obj))
>   		return obj;
>
> -	i915_gem_stolen_remove_node(dev_priv, stolen);
> +	i915_gem_stolen_remove_node(dev_priv, &stolen->base);
>   	kfree(stolen);
>   	return obj;
>   }
> @@ -628,7 +760,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>   	struct drm_i915_private *dev_priv = dev->dev_private;
>   	struct i915_address_space *ggtt = &dev_priv->gtt.base;
>   	struct drm_i915_gem_object *obj;
> -	struct drm_mm_node *stolen;
> +	struct i915_stolen_node *stolen;
>   	struct i915_vma *vma;
>   	int ret;
>
> @@ -647,10 +779,10 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>   	if (!stolen)
>   		return ERR_PTR(-ENOMEM);
>
> -	stolen->start = stolen_offset;
> -	stolen->size = size;
> +	stolen->base.start = stolen_offset;
> +	stolen->base.size = size;
>   	mutex_lock(&dev_priv->mm.stolen_lock);
> -	ret = drm_mm_reserve_node(&dev_priv->mm.stolen, stolen);
> +	ret = drm_mm_reserve_node(&dev_priv->mm.stolen, &stolen->base);
>   	mutex_unlock(&dev_priv->mm.stolen_lock);
>   	if (ret) {
>   		DRM_DEBUG_KMS("failed to allocate stolen space\n");
> @@ -661,7 +793,7 @@ i915_gem_object_create_stolen_for_preallocated(struct drm_device *dev,
>   	obj = _i915_gem_object_create_stolen(dev, stolen);
>   	if (IS_ERR(obj)) {
>   		DRM_DEBUG_KMS("failed to allocate stolen object\n");
> -		i915_gem_stolen_remove_node(dev_priv, stolen);
> +		i915_gem_stolen_remove_node(dev_priv, &stolen->base);
>   		kfree(stolen);
>   		return obj;
>   	}
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 0afb819..c94b39b 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5119,7 +5119,7 @@ static void valleyview_check_pctx(struct drm_i915_private *dev_priv)
>   	unsigned long pctx_addr = I915_READ(VLV_PCBR) & ~4095;
>
>   	WARN_ON(pctx_addr != dev_priv->mm.stolen_base +
> -			     dev_priv->vlv_pctx->stolen->start);
> +			     dev_priv->vlv_pctx->stolen->base.start);
>   }
>
>
> @@ -5194,7 +5194,7 @@ static void valleyview_setup_pctx(struct drm_device *dev)
>   		return;
>   	}
>
> -	pctx_paddr = dev_priv->mm.stolen_base + pctx->stolen->start;
> +	pctx_paddr = dev_priv->mm.stolen_base + pctx->stolen->base.start;
>   	I915_WRITE(VLV_PCBR, pctx_paddr);
>
>   out:
>
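
The allocate, evict, retry flow of stolen_alloc() can be sketched with a toy slot allocator. This is a simplification under stated assumptions, not the patch's algorithm: the patch uses the drm_mm scan helpers to pick only the purgeable objects needed to open a large enough hole, whereas this model reaps every purgeable slot. All names (`slot_state`, `alloc_contig`, `evict_purgeable`, `alloc_with_evict`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum slot_state { SLOT_FREE, SLOT_USED, SLOT_PURGEABLE };

/* Find n contiguous free slots, mark them used, return the first index or -1. */
static int alloc_contig(enum slot_state *arena, size_t len, size_t n)
{
	size_t i, run = 0;

	assert(n > 0);
	for (i = 0; i < len; i++) {
		run = (arena[i] == SLOT_FREE) ? run + 1 : 0;
		if (run == n) {
			size_t first = i + 1 - n, j;

			for (j = first; j <= i; j++)
				arena[j] = SLOT_USED;
			return (int)first;
		}
	}
	return -1;
}

/* Simplified eviction: free every purgeable slot. Returns how many were reaped. */
static size_t evict_purgeable(enum slot_state *arena, size_t len)
{
	size_t i, reaped = 0;

	for (i = 0; i < len; i++) {
		if (arena[i] == SLOT_PURGEABLE) {
			arena[i] = SLOT_FREE;
			reaped++;
		}
	}
	return reaped;
}

/* Try to allocate; on failure evict purgeable slots and search once more. */
static int alloc_with_evict(enum slot_state *arena, size_t len, size_t n)
{
	int first = alloc_contig(arena, len, n);

	if (first >= 0)
		return first;
	if (evict_purgeable(arena, len) == 0)
		return -1;	/* nothing to reap, report the failure */
	return alloc_contig(arena, len, n);
}
```

As in the patch, eviction is attempted only after the first search fails, so the common case of a successful allocation never touches the purgeable objects.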

* Re: [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast
  2015-12-17 10:45   ` Tvrtko Ursulin
@ 2015-12-17 11:19     ` Chris Wilson
  0 siblings, 0 replies; 30+ messages in thread
From: Chris Wilson @ 2015-12-17 11:19 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: ankitprasad.r.sharma, intel-gfx, akash.goel, shashidhar.hiremath

On Thu, Dec 17, 2015 at 10:45:15AM +0000, Tvrtko Ursulin wrote:
> 
> Hi,
> 
> On 14/12/15 05:46, ankitprasad.r.sharma@intel.com wrote:
> >From: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> >
> >In pwrite_fast, map an object page by page if obj_ggtt_pin fails. First,
> >we try a nonblocking pin for the whole object (since that is fastest if
> >reused), then failing that we try to grab one page in the mappable
> >aperture. It also allows us to handle objects larger than the mappable
> >aperture (e.g. if we need to pwrite with vGPU restricting the aperture
> >to a measly 8MiB or something like that).
> >
> >v2: Pin pages before starting pwrite, Combined duplicate loops (Chris)
> >
> >v3: Combined loops based on local patch by Chris (Chris)
> >
> >v4: Added i915 wrapper function for drm_mm_insert_node_in_range (Chris)
> >
> >Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >---
> >  drivers/gpu/drm/i915/i915_gem.c | 86 ++++++++++++++++++++++++++++++-----------
> >  1 file changed, 64 insertions(+), 22 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index bf7f203..46c1e75 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -61,6 +61,21 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
> >  	return obj->pin_display;
> >  }
> >
> >+static int
> >+i915_gem_insert_node_in_range(struct drm_i915_private *i915,
> >+			      struct drm_mm_node *node, u64 size,
> >+			      unsigned alignment, u64 start, u64 end)
> >+{
> >+	int ret;
> >+
> >+	ret = drm_mm_insert_node_in_range_generic(&i915->gtt.base.mm, node,
> >+						  size, alignment, 0, start,
> >+						  end, DRM_MM_SEARCH_DEFAULT,
> >+						  DRM_MM_SEARCH_DEFAULT);
> >+
> >+	return ret;
> >+}
> >+
> >  /* some bookkeeping */
> >  static void i915_gem_info_add_obj(struct drm_i915_private *dev_priv,
> >  				  size_t size)
> >@@ -760,20 +775,29 @@ fast_user_write(struct io_mapping *mapping,
> >   * user into the GTT, uncached.
> >   */
> >  static int
> >-i915_gem_gtt_pwrite_fast(struct drm_device *dev,
> >+i915_gem_gtt_pwrite_fast(struct drm_i915_private *i915,
> >  			 struct drm_i915_gem_object *obj,
> >  			 struct drm_i915_gem_pwrite *args,
> >  			 struct drm_file *file)
> >  {
> >-	struct drm_i915_private *dev_priv = dev->dev_private;
> >-	ssize_t remain;
> >-	loff_t offset, page_base;
> >+	struct drm_mm_node node;
> >+	uint64_t remain, offset;
> >  	char __user *user_data;
> >-	int page_offset, page_length, ret;
> >+	int ret;
> >
> >  	ret = i915_gem_obj_ggtt_pin(obj, 0, PIN_MAPPABLE | PIN_NONBLOCK);
> >-	if (ret)
> >-		goto out;
> >+	if (ret) {
> >+		memset(&node, 0, sizeof(node));
> >+		ret = i915_gem_insert_node_in_range(i915, &node, 4096, 0,
> >+						    0, i915->gtt.mappable_end);
> 
> Suggest PAGE_SIZE instead of 4096 to match the main loop below.
> 
> >+		if (ret)
> >+			goto out;
> >+
> >+		i915_gem_object_pin_pages(obj);
> 
> i915_gem_object_get_pages is missing again before pin pages I think.

That's due to rebasing my patch where I merge the get_pages call into
pin_pages, sorry.

> If true it means we need an IGT to exercise this path. Should be
> easy with a huge object and just pwrite a small chunk?

Hmm, it should be hit by gem_pwrite/big-gtt + huge-gtt. If not, then we
do indeed need more testing.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

Thread overview: 30+ messages
2015-12-14  5:46 [PATCH v11 0/9] Support for creating/using Stolen memory backed objects ankitprasad.r.sharma
2015-12-14  5:46 ` [PATCH 1/9] drm/i915: Allow use of get_dma_address for stolen " ankitprasad.r.sharma
2015-12-17 10:20   ` Tvrtko Ursulin
2015-12-14  5:46 ` [PATCH 2/9] drm/i915: Use insert_page for pwrite_fast ankitprasad.r.sharma
2015-12-14  9:54   ` Chris Wilson
2015-12-14 10:48     ` Chris Wilson
2015-12-14 11:22       ` Chris Wilson
2015-12-17 10:45   ` Tvrtko Ursulin
2015-12-17 11:19     ` Chris Wilson
2015-12-14  5:46 ` [PATCH 3/9] drm/i915: Clearing buffer objects via CPU/GTT ankitprasad.r.sharma
2015-12-14  9:48   ` Chris Wilson
2015-12-17 10:27   ` Tvrtko Ursulin
2015-12-14  5:46 ` [PATCH 4/9] drm/i915: Support for creating Stolen memory backed objects ankitprasad.r.sharma
2015-12-14 10:05   ` Chris Wilson
2015-12-15  6:10     ` Ankitprasad Sharma
2015-12-14  5:46 ` [PATCH 5/9] drm/i915: Propagating correct error codes to the userspace ankitprasad.r.sharma
2015-12-14 10:10   ` Chris Wilson
2015-12-14  5:46 ` [PATCH 6/9] drm/i915: Add support for stealing purgable stolen pages ankitprasad.r.sharma
2015-12-14 10:13   ` Chris Wilson
2015-12-17 10:51   ` Tvrtko Ursulin
2015-12-14  5:46 ` [PATCH 7/9] drm/i915: Support for pread/pwrite from/to non shmem backed objects ankitprasad.r.sharma
2015-12-14  9:43   ` Chris Wilson
2015-12-14  5:46 ` [PATCH 8/9] drm/i915: Migrate stolen objects before hibernation ankitprasad.r.sharma
2015-12-14 10:31   ` Chris Wilson
2015-12-14  5:46 ` [PATCH 9/9] drm/i915: Fail the execbuff using stolen objects as batchbuffers ankitprasad.r.sharma
2015-12-14  9:44   ` Chris Wilson
2015-12-15 14:41   ` Dave Gordon
2015-12-15 14:54     ` Chris Wilson
2015-12-15 17:50       ` Dave Gordon
2015-12-16 12:35         ` Chris Wilson
