intel-gfx.lists.freedesktop.org archive mirror
* Stolen pages, with a little surprise
@ 2012-08-11 14:40 Chris Wilson
  2012-08-11 14:41 ` [PATCH 01/29] drm/i915: Track unbound pages Chris Wilson
                   ` (28 more replies)
  0 siblings, 29 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:40 UTC (permalink / raw)
  To: intel-gfx

I've reworked and incorporated dmabuf into the grand scheme of things,
unifying how we track pages for the various types of object (shmemfs,
dmabuf, stolen), so the series is quite a bit more complicated than it
was before.

Please review, flame and generally suggest cleaner methods.
-Chris

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH 01/29] drm/i915: Track unbound pages
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-20  9:00   ` [PATCH 1/2] drm/i915: move functions around Daniel Vetter
  2012-08-11 14:41 ` [PATCH 02/29] drm/i915: Show (count, size) of purgeable objects in i915_gem_objects Chris Wilson
                   ` (27 subsequent siblings)
  28 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

When dealing with a working set larger than the GATT, or even the
mappable aperture when touching through the GTT, we end up evicting
objects only to rebind them at a new offset again later. Moving an
object into and out of the GTT requires clflushing the pages, thus
causing a double-clflush penalty for rebinding.

To avoid having to clflush on rebinding, we can track the pages as they
are evicted from the GTT and only relinquish those pages on memory
pressure.

As usual, if it were not for handling the out-of-memory condition and
having to manually shrink our own bo caches, it would be a net reduction
of code. Alas.
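
To make the intent concrete, the new object lifecycle is roughly the
following (a simplified sketch of the flow implemented below, with all
error handling and fencing omitted):

	/* unbind: tear down the GTT mapping but keep the pages attached */
	i915_gem_gtt_finish_object(obj);
	list_del(&obj->mm_list);
	list_move_tail(&obj->gtt_list, &dev_priv->mm.unbound_list);
	obj->gtt_space = NULL;		/* no clflush, no page release here */

	/* only the shrinker / OOM paths clflush and release the pages */
	i915_gem_object_put_pages_gtt(obj);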

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |   14 +-
 drivers/gpu/drm/i915/i915_drv.h            |   13 +-
 drivers/gpu/drm/i915/i915_gem.c            |  397 ++++++++++++++--------------
 drivers/gpu/drm/i915/i915_gem_dmabuf.c     |   20 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |   13 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    9 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c        |    2 +-
 drivers/gpu/drm/i915/i915_irq.c            |    4 +-
 drivers/gpu/drm/i915/i915_trace.h          |   10 +-
 9 files changed, 240 insertions(+), 242 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0e8f14d..a7eb093 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -211,7 +211,7 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 		   dev_priv->mm.object_memory);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&dev_priv->mm.gtt_list, gtt_list);
+	count_objects(&dev_priv->mm.bound_list, gtt_list);
 	seq_printf(m, "%u [%u] objects, %zu [%zu] bytes in gtt\n",
 		   count, mappable_count, size, mappable_size);
 
@@ -225,8 +225,13 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
+	size = count = 0;
+	list_for_each_entry(obj, &dev_priv->mm.unbound_list, gtt_list)
+		size += obj->base.size, ++count;
+	seq_printf(m, "%u unbound objects, %zu bytes\n", count, size);
+
 	size = count = mappable_size = mappable_count = 0;
-	list_for_each_entry(obj, &dev_priv->mm.gtt_list, gtt_list) {
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list) {
 		if (obj->fault_mappable) {
 			size += obj->gtt_space->size;
 			++count;
@@ -264,7 +269,7 @@ static int i915_gem_gtt_info(struct seq_file *m, void* data)
 		return ret;
 
 	total_obj_size = total_gtt_size = count = 0;
-	list_for_each_entry(obj, &dev_priv->mm.gtt_list, gtt_list) {
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list) {
 		if (list == PINNED_LIST && obj->pin_count == 0)
 			continue;
 
@@ -526,7 +531,8 @@ static int i915_gem_fence_regs_info(struct seq_file *m, void *data)
 	for (i = 0; i < dev_priv->num_fence_regs; i++) {
 		struct drm_i915_gem_object *obj = dev_priv->fence_regs[i].obj;
 
-		seq_printf(m, "Fenced object[%2d] = ", i);
+		seq_printf(m, "Fence %d, pin count = %d, object = ",
+			   i, dev_priv->fence_regs[i].pin_count);
 		if (obj == NULL)
 			seq_printf(m, "unused");
 		else
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 261fe21..e252947 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -692,7 +692,13 @@ typedef struct drm_i915_private {
 		struct drm_mm gtt_space;
 		/** List of all objects in gtt_space. Used to restore gtt
 		 * mappings on resume */
-		struct list_head gtt_list;
+		struct list_head bound_list;
+		/**
+		 * List of objects which are not bound to the GTT (thus
+		 * are idle and not used by the GPU) but still have
+		 * (presumably uncached) pages still attached.
+		 */
+		struct list_head unbound_list;
 
 		/** Usable portion of the GTT for GEM */
 		unsigned long gtt_start;
@@ -1307,8 +1313,7 @@ int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
 void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
 void i915_gem_lastclose(struct drm_device *dev);
 
-int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj,
-				  gfp_t gfpmask);
+int __must_check i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj);
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
@@ -1450,7 +1455,7 @@ int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
 					  unsigned alignment,
 					  unsigned cache_level,
 					  bool mappable);
-int i915_gem_evict_everything(struct drm_device *dev, bool purgeable_only);
+int i915_gem_evict_everything(struct drm_device *dev);
 
 /* i915_gem_stolen.c */
 int i915_gem_init_stolen(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0514593..3a7ac38 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -55,6 +55,8 @@ static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj,
 
 static int i915_gem_inactive_shrink(struct shrinker *shrinker,
 				    struct shrink_control *sc);
+static long i915_gem_purge(struct drm_i915_private *dev_priv, long target);
+static void i915_gem_shrink_all(struct drm_i915_private *dev_priv);
 static void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
 
 static inline void i915_gem_object_fence_lost(struct drm_i915_gem_object *obj)
@@ -140,7 +142,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 static inline bool
 i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
 {
-	return !obj->active;
+	return obj->gtt_space && !obj->active;
 }
 
 int
@@ -179,7 +181,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 
 	pinned = 0;
 	mutex_lock(&dev->struct_mutex);
-	list_for_each_entry(obj, &dev_priv->mm.gtt_list, gtt_list)
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list)
 		if (obj->pin_count)
 			pinned += obj->gtt_space->size;
 	mutex_unlock(&dev->struct_mutex);
@@ -423,9 +425,11 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		 * anyway again before the next pread happens. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush = 1;
-		ret = i915_gem_object_set_to_gtt_domain(obj, false);
-		if (ret)
-			return ret;
+		if (obj->gtt_space) {
+			ret = i915_gem_object_set_to_gtt_domain(obj, false);
+			if (ret)
+				return ret;
+		}
 	}
 
 	offset = args->offset;
@@ -751,9 +755,11 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		 * right away and we therefore have to clflush anyway. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush_after = 1;
-		ret = i915_gem_object_set_to_gtt_domain(obj, true);
-		if (ret)
-			return ret;
+		if (obj->gtt_space) {
+			ret = i915_gem_object_set_to_gtt_domain(obj, true);
+			if (ret)
+				return ret;
+		}
 	}
 	/* Same trick applies for invalidate partially written cachelines before
 	 * writing.  */
@@ -1340,64 +1346,54 @@ i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
 	return i915_gem_mmap_gtt(file, dev, args->handle, &args->offset);
 }
 
-int
-i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj,
-			      gfp_t gfpmask)
+/* Immediately discard the backing storage */
+static void
+i915_gem_object_truncate(struct drm_i915_gem_object *obj)
 {
-	int page_count, i;
-	struct address_space *mapping;
 	struct inode *inode;
-	struct page *page;
-
-	if (obj->pages || obj->sg_table)
-		return 0;
 
-	/* Get the list of pages out of our struct file.  They'll be pinned
-	 * at this point until we release them.
+	/* Our goal here is to return as much of the memory as
+	 * is possible back to the system as we are called from OOM.
+	 * To do this we must instruct the shmfs to drop all of its
+	 * backing pages, *now*.
 	 */
-	page_count = obj->base.size / PAGE_SIZE;
-	BUG_ON(obj->pages != NULL);
-	obj->pages = drm_malloc_ab(page_count, sizeof(struct page *));
-	if (obj->pages == NULL)
-		return -ENOMEM;
-
 	inode = obj->base.filp->f_path.dentry->d_inode;
-	mapping = inode->i_mapping;
-	gfpmask |= mapping_gfp_mask(mapping);
-
-	for (i = 0; i < page_count; i++) {
-		page = shmem_read_mapping_page_gfp(mapping, i, gfpmask);
-		if (IS_ERR(page))
-			goto err_pages;
-
-		obj->pages[i] = page;
-	}
-
-	if (i915_gem_object_needs_bit17_swizzle(obj))
-		i915_gem_object_do_bit_17_swizzle(obj);
+	shmem_truncate_range(inode, 0, (loff_t)-1);
 
-	return 0;
+	if (obj->base.map_list.map)
+		drm_gem_free_mmap_offset(&obj->base);
 
-err_pages:
-	while (i--)
-		page_cache_release(obj->pages[i]);
+	obj->madv = __I915_MADV_PURGED;
+}
 
-	drm_free_large(obj->pages);
-	obj->pages = NULL;
-	return PTR_ERR(page);
+static inline int
+i915_gem_object_is_purgeable(struct drm_i915_gem_object *obj)
+{
+	return obj->madv == I915_MADV_DONTNEED;
 }
 
-static void
+static int
 i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 {
 	int page_count = obj->base.size / PAGE_SIZE;
-	int i;
+	int ret, i;
 
-	if (!obj->pages)
-		return;
+	if (obj->pages == NULL)
+		return 0;
 
+	BUG_ON(obj->gtt_space);
 	BUG_ON(obj->madv == __I915_MADV_PURGED);
 
+	ret = i915_gem_object_set_to_cpu_domain(obj, true);
+	if (ret) {
+		/* In the event of a disaster, abandon all caches and
+		 * hope for the best.
+		 */
+		WARN_ON(ret != -EIO);
+		i915_gem_clflush_object(obj);
+		obj->base.read_domains = obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+	}
+
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_save_bit_17_swizzle(obj);
 
@@ -1417,6 +1413,129 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 
 	drm_free_large(obj->pages);
 	obj->pages = NULL;
+
+	list_del(&obj->gtt_list);
+
+	if (i915_gem_object_is_purgeable(obj))
+		i915_gem_object_truncate(obj);
+
+	return 0;
+}
+
+static long
+i915_gem_purge(struct drm_i915_private *dev_priv, long target)
+{
+	struct drm_i915_gem_object *obj, *next;
+	long count = 0;
+
+	list_for_each_entry_safe(obj, next,
+				 &dev_priv->mm.unbound_list,
+				 gtt_list) {
+		if (i915_gem_object_is_purgeable(obj) &&
+		    i915_gem_object_put_pages_gtt(obj) == 0) {
+			count += obj->base.size >> PAGE_SHIFT;
+			if (count >= target)
+				return count;
+		}
+	}
+
+	list_for_each_entry_safe(obj, next,
+				 &dev_priv->mm.inactive_list,
+				 mm_list) {
+		if (i915_gem_object_is_purgeable(obj) &&
+		    i915_gem_object_unbind(obj) == 0 &&
+		    i915_gem_object_put_pages_gtt(obj) == 0) {
+			count += obj->base.size >> PAGE_SHIFT;
+			if (count >= target)
+				return count;
+		}
+	}
+
+	return count;
+}
+
+static void
+i915_gem_shrink_all(struct drm_i915_private *dev_priv)
+{
+	struct drm_i915_gem_object *obj, *next;
+
+	i915_gem_evict_everything(dev_priv->dev);
+
+	list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, gtt_list)
+		i915_gem_object_put_pages_gtt(obj);
+}
+
+int
+i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	int page_count, i;
+	struct address_space *mapping;
+	struct page *page;
+	gfp_t gfp;
+
+	if (obj->pages || obj->sg_table)
+		return 0;
+
+	/* Assert that the object is not currently in any GPU domain. As it
+	 * wasn't in the GTT, there shouldn't be any way it could have been in
+	 * a GPU cache
+	 */
+	BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
+	BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
+
+	/* Get the list of pages out of our struct file.  They'll be pinned
+	 * at this point until we release them.
+	 */
+	page_count = obj->base.size / PAGE_SIZE;
+	obj->pages = kmalloc(page_count*sizeof(struct page *), GFP_KERNEL);
+	if (obj->pages == NULL)
+		return -ENOMEM;
+
+	/* Fail silently without starting the shrinker */
+	mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
+	gfp = mapping_gfp_mask(mapping);
+	gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD;
+	gfp &= ~(__GFP_IO | __GFP_WAIT);
+	for (i = 0; i < page_count; i++) {
+		page = shmem_read_mapping_page_gfp(mapping, i, gfp);
+		if (IS_ERR(page)) {
+			i915_gem_purge(dev_priv, page_count);
+			page = shmem_read_mapping_page_gfp(mapping, i, gfp);
+		}
+		if (IS_ERR(page)) {
+			/* We've tried hard to allocate the memory by reaping
+			 * our own buffer, now let the real VM do its job and
+			 * go down in flames if truly OOM.
+			 */
+			gfp &= ~(__GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD);
+			gfp |= __GFP_IO | __GFP_WAIT;
+
+			i915_gem_shrink_all(dev_priv);
+			page = shmem_read_mapping_page_gfp(mapping, i, gfp);
+			if (IS_ERR(page))
+				goto err_pages;
+
+			gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD;
+			gfp &= ~(__GFP_IO | __GFP_WAIT);
+		}
+
+		obj->pages[i] = page;
+	}
+
+	if (i915_gem_object_needs_bit17_swizzle(obj))
+		i915_gem_object_do_bit_17_swizzle(obj);
+
+	list_add_tail(&obj->gtt_list, &dev_priv->mm.unbound_list);
+	return 0;
+
+err_pages:
+	while (i--)
+		page_cache_release(obj->pages[i]);
+
+	drm_free_large(obj->pages);
+	obj->pages = NULL;
+	return PTR_ERR(page);
 }
 
 void
@@ -1486,32 +1605,6 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 	WARN_ON(i915_verify_lists(dev));
 }
 
-/* Immediately discard the backing storage */
-static void
-i915_gem_object_truncate(struct drm_i915_gem_object *obj)
-{
-	struct inode *inode;
-
-	/* Our goal here is to return as much of the memory as
-	 * is possible back to the system as we are called from OOM.
-	 * To do this we must instruct the shmfs to drop all of its
-	 * backing pages, *now*.
-	 */
-	inode = obj->base.filp->f_path.dentry->d_inode;
-	shmem_truncate_range(inode, 0, (loff_t)-1);
-
-	if (obj->base.map_list.map)
-		drm_gem_free_mmap_offset(&obj->base);
-
-	obj->madv = __I915_MADV_PURGED;
-}
-
-static inline int
-i915_gem_object_is_purgeable(struct drm_i915_gem_object *obj)
-{
-	return obj->madv == I915_MADV_DONTNEED;
-}
-
 static u32
 i915_gem_get_seqno(struct drm_device *dev)
 {
@@ -1681,7 +1774,7 @@ static void i915_gem_reset_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct drm_i915_gem_object *obj;
+	struct drm_i915_gem_object *obj, *next;
 	struct intel_ring_buffer *ring;
 	int i;
 
@@ -1698,6 +1791,10 @@ void i915_gem_reset(struct drm_device *dev)
 		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
 	}
 
+
+	list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, gtt_list)
+		i915_gem_object_put_pages_gtt(obj);
+
 	/* The fence registers are invalidated so clear them out */
 	i915_gem_reset_fences(dev);
 }
@@ -2209,22 +2306,6 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 
 	i915_gem_object_finish_gtt(obj);
 
-	/* Move the object to the CPU domain to ensure that
-	 * any possible CPU writes while it's not in the GTT
-	 * are flushed when we go to remap it.
-	 */
-	if (ret == 0)
-		ret = i915_gem_object_set_to_cpu_domain(obj, 1);
-	if (ret == -ERESTARTSYS)
-		return ret;
-	if (ret) {
-		/* In the event of a disaster, abandon all caches and
-		 * hope for the best.
-		 */
-		i915_gem_clflush_object(obj);
-		obj->base.read_domains = obj->base.write_domain = I915_GEM_DOMAIN_CPU;
-	}
-
 	/* release the fence reg _after_ flushing */
 	ret = i915_gem_object_put_fence(obj);
 	if (ret)
@@ -2240,10 +2321,8 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	}
 	i915_gem_gtt_finish_object(obj);
 
-	i915_gem_object_put_pages_gtt(obj);
-
-	list_del_init(&obj->gtt_list);
-	list_del_init(&obj->mm_list);
+	list_del(&obj->mm_list);
+	list_move_tail(&obj->gtt_list, &dev_priv->mm.unbound_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
@@ -2251,10 +2330,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	obj->gtt_space = NULL;
 	obj->gtt_offset = 0;
 
-	if (i915_gem_object_is_purgeable(obj))
-		i915_gem_object_truncate(obj);
-
-	return ret;
+	return 0;
 }
 
 static int i915_ring_idle(struct intel_ring_buffer *ring)
@@ -2667,7 +2743,6 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_mm_node *free_space;
-	gfp_t gfpmask = __GFP_NORETRY | __GFP_NOWARN;
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
 	int ret;
@@ -2707,6 +2782,10 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 		return -E2BIG;
 	}
 
+	ret = i915_gem_object_get_pages_gtt(obj);
+	if (ret)
+		return ret;
+
  search_free:
 	if (map_and_fenceable)
 		free_space =
@@ -2733,9 +2812,6 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 							 false);
 	}
 	if (obj->gtt_space == NULL) {
-		/* If the gtt is empty and we're still having trouble
-		 * fitting our object in, we're out of memory.
-		 */
 		ret = i915_gem_evict_something(dev, size, alignment,
 					       obj->cache_level,
 					       map_and_fenceable);
@@ -2752,55 +2828,20 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 		return -EINVAL;
 	}
 
-	ret = i915_gem_object_get_pages_gtt(obj, gfpmask);
-	if (ret) {
-		drm_mm_put_block(obj->gtt_space);
-		obj->gtt_space = NULL;
-
-		if (ret == -ENOMEM) {
-			/* first try to reclaim some memory by clearing the GTT */
-			ret = i915_gem_evict_everything(dev, false);
-			if (ret) {
-				/* now try to shrink everyone else */
-				if (gfpmask) {
-					gfpmask = 0;
-					goto search_free;
-				}
-
-				return -ENOMEM;
-			}
-
-			goto search_free;
-		}
-
-		return ret;
-	}
 
 	ret = i915_gem_gtt_prepare_object(obj);
 	if (ret) {
-		i915_gem_object_put_pages_gtt(obj);
 		drm_mm_put_block(obj->gtt_space);
 		obj->gtt_space = NULL;
-
-		if (i915_gem_evict_everything(dev, false))
-			return ret;
-
-		goto search_free;
+		return ret;
 	}
 
 	if (!dev_priv->mm.aliasing_ppgtt)
 		i915_gem_gtt_bind_object(obj, obj->cache_level);
 
-	list_add_tail(&obj->gtt_list, &dev_priv->mm.gtt_list);
+	list_move_tail(&obj->gtt_list, &dev_priv->mm.bound_list);
 	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
 
-	/* Assert that the object is not currently in any GPU domain. As it
-	 * wasn't in the GTT, there shouldn't be any way it could have been in
-	 * a GPU cache
-	 */
-	BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
-	BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
-
 	obj->gtt_offset = obj->gtt_space->start;
 
 	fenceable =
@@ -3464,9 +3505,8 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	if (obj->madv != __I915_MADV_PURGED)
 		obj->madv = args->madv;
 
-	/* if the object is no longer bound, discard its backing storage */
-	if (i915_gem_object_is_purgeable(obj) &&
-	    obj->gtt_space == NULL)
+	/* if the object is no longer attached, discard its backing storage */
+	if (i915_gem_object_is_purgeable(obj) && obj->pages == NULL)
 		i915_gem_object_truncate(obj);
 
 	args->retained = obj->madv != __I915_MADV_PURGED;
@@ -3573,6 +3613,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		dev_priv->mm.interruptible = was_interruptible;
 	}
 
+	i915_gem_object_put_pages_gtt(obj);
 	if (obj->base.map_list.map)
 		drm_gem_free_mmap_offset(&obj->base);
 
@@ -3605,7 +3646,7 @@ i915_gem_idle(struct drm_device *dev)
 
 	/* Under UMS, be paranoid and evict. */
 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
-		i915_gem_evict_everything(dev, false);
+		i915_gem_evict_everything(dev);
 
 	i915_gem_reset_fences(dev);
 
@@ -3963,8 +4004,9 @@ i915_gem_load(struct drm_device *dev)
 
 	INIT_LIST_HEAD(&dev_priv->mm.active_list);
 	INIT_LIST_HEAD(&dev_priv->mm.inactive_list);
+	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
+	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
-	INIT_LIST_HEAD(&dev_priv->mm.gtt_list);
 	for (i = 0; i < I915_NUM_RINGS; i++)
 		init_ring_lists(&dev_priv->ring[i]);
 	for (i = 0; i < I915_MAX_NUM_FENCES; i++)
@@ -4209,13 +4251,6 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 }
 
 static int
-i915_gpu_is_active(struct drm_device *dev)
-{
-	drm_i915_private_t *dev_priv = dev->dev_private;
-	return !list_empty(&dev_priv->mm.active_list);
-}
-
-static int
 i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 {
 	struct drm_i915_private *dev_priv =
@@ -4223,60 +4258,26 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 			     struct drm_i915_private,
 			     mm.inactive_shrinker);
 	struct drm_device *dev = dev_priv->dev;
-	struct drm_i915_gem_object *obj, *next;
+	struct drm_i915_gem_object *obj;
 	int nr_to_scan = sc->nr_to_scan;
 	int cnt;
 
 	if (!mutex_trylock(&dev->struct_mutex))
 		return 0;
 
-	/* "fast-path" to count number of available objects */
-	if (nr_to_scan == 0) {
-		cnt = 0;
-		list_for_each_entry(obj,
-				    &dev_priv->mm.inactive_list,
-				    mm_list)
-			cnt++;
-		mutex_unlock(&dev->struct_mutex);
-		return cnt / 100 * sysctl_vfs_cache_pressure;
-	}
-
-rescan:
-	/* first scan for clean buffers */
-	i915_gem_retire_requests(dev);
-
-	list_for_each_entry_safe(obj, next,
-				 &dev_priv->mm.inactive_list,
-				 mm_list) {
-		if (i915_gem_object_is_purgeable(obj)) {
-			if (i915_gem_object_unbind(obj) == 0 &&
-			    --nr_to_scan == 0)
-				break;
-		}
+	if (nr_to_scan) {
+		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
+		if (nr_to_scan > 0)
+			i915_gem_shrink_all(dev_priv);
 	}
 
-	/* second pass, evict/count anything still on the inactive list */
 	cnt = 0;
-	list_for_each_entry_safe(obj, next,
-				 &dev_priv->mm.inactive_list,
-				 mm_list) {
-		if (nr_to_scan &&
-		    i915_gem_object_unbind(obj) == 0)
-			nr_to_scan--;
-		else
-			cnt++;
-	}
+	list_for_each_entry(obj, &dev_priv->mm.unbound_list, gtt_list)
+		cnt += obj->base.size >> PAGE_SHIFT;
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list)
+		if (obj->pin_count == 0)
+			cnt += obj->base.size >> PAGE_SHIFT;
 
-	if (nr_to_scan && i915_gpu_is_active(dev)) {
-		/*
-		 * We are desperate for pages, so as a last resort, wait
-		 * for the GPU to finish and discard whatever we can.
-		 * This has a dramatic impact to reduce the number of
-		 * OOM-killer events whilst running the GPU aggressively.
-		 */
-		if (i915_gpu_idle(dev) == 0)
-			goto rescan;
-	}
 	mutex_unlock(&dev->struct_mutex);
-	return cnt / 100 * sysctl_vfs_cache_pressure;
+	return cnt;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index aa308e1..e5f0375 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -33,7 +33,7 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
 	struct drm_i915_gem_object *obj = attachment->dmabuf->priv;
 	struct drm_device *dev = obj->base.dev;
 	int npages = obj->base.size / PAGE_SIZE;
-	struct sg_table *sg = NULL;
+	struct sg_table *sg;
 	int ret;
 	int nents;
 
@@ -41,10 +41,10 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
 	if (ret)
 		return ERR_PTR(ret);
 
-	if (!obj->pages) {
-		ret = i915_gem_object_get_pages_gtt(obj, __GFP_NORETRY | __GFP_NOWARN);
-		if (ret)
-			goto out;
+	ret = i915_gem_object_get_pages_gtt(obj);
+	if (ret) {
+		sg = ERR_PTR(ret);
+		goto out;
 	}
 
 	/* link the pages into an SG then map the sg */
@@ -89,12 +89,10 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 		goto out_unlock;
 	}
 
-	if (!obj->pages) {
-		ret = i915_gem_object_get_pages_gtt(obj, __GFP_NORETRY | __GFP_NOWARN);
-		if (ret) {
-			mutex_unlock(&dev->struct_mutex);
-			return ERR_PTR(ret);
-		}
+	ret = i915_gem_object_get_pages_gtt(obj);
+	if (ret) {
+		mutex_unlock(&dev->struct_mutex);
+		return ERR_PTR(ret);
 	}
 
 	obj->dma_buf_vmapping = vmap(obj->pages, obj->base.size / PAGE_SIZE, 0, PAGE_KERNEL);
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 7279c31..74635da 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -148,7 +148,7 @@ found:
 }
 
 int
-i915_gem_evict_everything(struct drm_device *dev, bool purgeable_only)
+i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj, *next;
@@ -160,7 +160,7 @@ i915_gem_evict_everything(struct drm_device *dev, bool purgeable_only)
 	if (lists_empty)
 		return -ENOSPC;
 
-	trace_i915_gem_evict_everything(dev, purgeable_only);
+	trace_i915_gem_evict_everything(dev);
 
 	/* The gpu_idle will flush everything in the write domain to the
 	 * active list. Then we must move everything off the active list
@@ -174,12 +174,9 @@ i915_gem_evict_everything(struct drm_device *dev, bool purgeable_only)
 
 	/* Having flushed everything, unbind() should never raise an error */
 	list_for_each_entry_safe(obj, next,
-				 &dev_priv->mm.inactive_list, mm_list) {
-		if (!purgeable_only || obj->madv != I915_MADV_WILLNEED) {
-			if (obj->pin_count == 0)
-				WARN_ON(i915_gem_object_unbind(obj));
-		}
-	}
+				 &dev_priv->mm.inactive_list, mm_list)
+		if (obj->pin_count == 0)
+			WARN_ON(i915_gem_object_unbind(obj));
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 25b2c54..50b07a1 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -502,17 +502,12 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			}
 		}
 
-		if (ret != -ENOSPC || retry > 1)
+		if (ret != -ENOSPC || retry++)
 			return ret;
 
-		/* First attempt, just clear anything that is purgeable.
-		 * Second attempt, clear the entire GTT.
-		 */
-		ret = i915_gem_evict_everything(ring->dev, retry == 0);
+		ret = i915_gem_evict_everything(ring->dev);
 		if (ret)
 			return ret;
-
-		retry++;
 	} while (1);
 
 err:
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 4584f7f..3c36d3b 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -348,7 +348,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 	intel_gtt_clear_range(dev_priv->mm.gtt_start / PAGE_SIZE,
 			      (dev_priv->mm.gtt_end - dev_priv->mm.gtt_start) / PAGE_SIZE);
 
-	list_for_each_entry(obj, &dev_priv->mm.gtt_list, gtt_list) {
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list) {
 		i915_gem_clflush_object(obj);
 		i915_gem_gtt_bind_object(obj, obj->cache_level);
 	}
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 0c37101..4153c75 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1222,7 +1222,7 @@ static void i915_capture_error_state(struct drm_device *dev)
 	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list)
 		i++;
 	error->active_bo_count = i;
-	list_for_each_entry(obj, &dev_priv->mm.gtt_list, gtt_list)
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list)
 		if (obj->pin_count)
 			i++;
 	error->pinned_bo_count = i - error->active_bo_count;
@@ -1247,7 +1247,7 @@ static void i915_capture_error_state(struct drm_device *dev)
 		error->pinned_bo_count =
 			capture_pinned_bo(error->pinned_bo,
 					  error->pinned_bo_count,
-					  &dev_priv->mm.gtt_list);
+					  &dev_priv->mm.bound_list);
 
 	do_gettimeofday(&error->time);
 
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index fe90b3a..3c4093d 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -214,22 +214,18 @@ TRACE_EVENT(i915_gem_evict,
 );
 
 TRACE_EVENT(i915_gem_evict_everything,
-	    TP_PROTO(struct drm_device *dev, bool purgeable),
-	    TP_ARGS(dev, purgeable),
+	    TP_PROTO(struct drm_device *dev),
+	    TP_ARGS(dev),
 
 	    TP_STRUCT__entry(
 			     __field(u32, dev)
-			     __field(bool, purgeable)
 			    ),
 
 	    TP_fast_assign(
 			   __entry->dev = dev->primary->index;
-			   __entry->purgeable = purgeable;
 			  ),
 
-	    TP_printk("dev=%d%s",
-		      __entry->dev,
-		      __entry->purgeable ? ", purgeable only" : "")
+	    TP_printk("dev=%d", __entry->dev)
 );
 
 TRACE_EVENT(i915_gem_ring_dispatch,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 02/29] drm/i915: Show (count, size) of purgeable objects in i915_gem_objects
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
  2012-08-11 14:41 ` [PATCH 01/29] drm/i915: Track unbound pages Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-20  9:04   ` Daniel Vetter
  2012-08-11 14:41 ` [PATCH 03/29] drm/i915: Show pin count in debugfs Chris Wilson
                   ` (26 subsequent siblings)
  28 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c |   19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a7eb093..16e8701 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -197,8 +197,8 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 	struct drm_info_node *node = (struct drm_info_node *) m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	u32 count, mappable_count;
-	size_t size, mappable_size;
+	u32 count, mappable_count, purgeable_count;
+	size_t size, mappable_size, purgeable_size;
 	struct drm_i915_gem_object *obj;
 	int ret;
 
@@ -225,9 +225,12 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
-	size = count = 0;
-	list_for_each_entry(obj, &dev_priv->mm.unbound_list, gtt_list)
+	size = count = purgeable_size = purgeable_count = 0;
+	list_for_each_entry(obj, &dev_priv->mm.unbound_list, gtt_list) {
 		size += obj->base.size, ++count;
+		if (obj->madv == I915_MADV_DONTNEED)
+			purgeable_size += obj->base.size, ++purgeable_count;
+	}
 	seq_printf(m, "%u unbound objects, %zu bytes\n", count, size);
 
 	size = count = mappable_size = mappable_count = 0;
@@ -237,10 +240,16 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 			++count;
 		}
 		if (obj->pin_mappable) {
-			mappable_size += obj->gtt_space->size;
+			mappable_size += obj->gtt_space->size,
 			++mappable_count;
 		}
+		if (obj->madv == I915_MADV_DONTNEED) {
+			purgeable_size += obj->base.size;
+			++purgeable_count;
+		}
 	}
+	seq_printf(m, "%u purgeable objects, %zu bytes\n",
+		   purgeable_count, purgeable_size);
 	seq_printf(m, "%u pinned mappable objects, %zu bytes\n",
 		   mappable_count, mappable_size);
 	seq_printf(m, "%u fault mappable objects, %zu bytes\n",
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 03/29] drm/i915: Show pin count in debugfs
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
  2012-08-11 14:41 ` [PATCH 01/29] drm/i915: Track unbound pages Chris Wilson
  2012-08-11 14:41 ` [PATCH 02/29] drm/i915: Show (count, size) of purgeable objects in i915_gem_objects Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 04/29] drm/i915: Try harder to allocate an mmap_offset Chris Wilson
                   ` (25 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 16e8701..229cf27 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -118,6 +118,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 		   obj->madv == I915_MADV_DONTNEED ? " purgeable" : "");
 	if (obj->base.name)
 		seq_printf(m, " (name: %d)", obj->base.name);
+	if (obj->pin_count)
+		seq_printf(m, " (pinned x %d)", obj->pin_count);
 	if (obj->fence_reg != I915_FENCE_REG_NONE)
 		seq_printf(m, " (fence: %d)", obj->fence_reg);
 	if (obj->gtt_space != NULL)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 04/29] drm/i915: Try harder to allocate an mmap_offset
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (2 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 03/29] drm/i915: Show pin count in debugfs Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-20  9:37   ` Daniel Vetter
  2012-08-11 14:41 ` [PATCH 05/29] drm/i915: Only pwrite through the GTT if there is space in the aperture Chris Wilson
                   ` (24 subsequent siblings)
  28 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Given the persistence of an offset for the lifetime of an object, it is
easy to contemplate how the mmap space becomes badly fragmented to the
point that further allocations fail with ENOSPC. Our only recourse at
this point is to try to purge objects to release some space and
reattempt the allocation.
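
The escalation is deliberately gentle first and drastic second; roughly
(a simplified sketch of the retry sequence in the patch below):

	ret = drm_gem_create_mmap_offset(&obj->base);
	if (ret == -ENOSPC) {
		/* reap just enough purgeable objects to cover this one */
		i915_gem_purge(dev_priv, obj->base.size >> PAGE_SHIFT);
		ret = drm_gem_create_mmap_offset(&obj->base);
	}
	if (ret == -ENOSPC) {
		/* last resort: evict and release everything we can */
		i915_gem_shrink_all(dev_priv);
		ret = drm_gem_create_mmap_offset(&obj->base);
	}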

References: https://bugs.freedesktop.org/show_bug.cgi?id=39552
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |   50 ++++++++++++++++++++++++++++++++-------
 1 file changed, 41 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 3a7ac38..0e0fc1e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1276,6 +1276,42 @@ i915_gem_get_unfenced_gtt_alignment(struct drm_device *dev,
 	return i915_gem_get_gtt_size(dev, size, tiling_mode);
 }
 
+static int i915_gem_object_create_mmap_offset(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	int ret;
+
+	if (obj->base.map_list.map)
+		return 0;
+
+	ret = drm_gem_create_mmap_offset(&obj->base);
+	if (ret != -ENOSPC)
+		return ret;
+
+	/* Badly fragmented mmap space? The only way we can recover
+	 * space is by destroying unwanted objects. We can't randomly release
+	 * mmap_offsets as userspace expects them to be persistent for the
+	 * lifetime of the objects. The closest we can do is to release the
+	 * offsets on purgeable objects by truncating it and marking it purged,
+	 * which prevents userspace from ever using that object again.
+	 */
+	i915_gem_purge(dev_priv, obj->base.size >> PAGE_SHIFT);
+	ret = drm_gem_create_mmap_offset(&obj->base);
+	if (ret != -ENOSPC)
+		return ret;
+
+	i915_gem_shrink_all(dev_priv);
+	return drm_gem_create_mmap_offset(&obj->base);
+}
+
+static void i915_gem_object_free_mmap_offset(struct drm_i915_gem_object *obj)
+{
+	if (!obj->base.map_list.map)
+		return;
+
+	drm_gem_free_mmap_offset(&obj->base);
+}
+
 int
 i915_gem_mmap_gtt(struct drm_file *file,
 		  struct drm_device *dev,
@@ -1307,11 +1343,9 @@ i915_gem_mmap_gtt(struct drm_file *file,
 		goto out;
 	}
 
-	if (!obj->base.map_list.map) {
-		ret = drm_gem_create_mmap_offset(&obj->base);
-		if (ret)
-			goto out;
-	}
+	ret = i915_gem_object_create_mmap_offset(obj);
+	if (ret)
+		goto out;
 
 	*offset = (u64)obj->base.map_list.hash.key << PAGE_SHIFT;
 
@@ -1360,8 +1394,7 @@ i915_gem_object_truncate(struct drm_i915_gem_object *obj)
 	inode = obj->base.filp->f_path.dentry->d_inode;
 	shmem_truncate_range(inode, 0, (loff_t)-1);
 
-	if (obj->base.map_list.map)
-		drm_gem_free_mmap_offset(&obj->base);
+	i915_gem_object_free_mmap_offset(obj);
 
 	obj->madv = __I915_MADV_PURGED;
 }
@@ -3614,8 +3647,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	}
 
 	i915_gem_object_put_pages_gtt(obj);
-	if (obj->base.map_list.map)
-		drm_gem_free_mmap_offset(&obj->base);
+	i915_gem_object_free_mmap_offset(obj);
 
 	drm_gem_object_release(&obj->base);
 	i915_gem_info_remove_obj(dev_priv, obj->base.size);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 05/29] drm/i915: Only pwrite through the GTT if there is space in the aperture
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (3 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 04/29] drm/i915: Try harder to allocate an mmap_offset Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 06/29] drm/i915: Protect private gem objects from truncate (such as imported dmabuf) Chris Wilson
                   ` (23 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Avoid stalling and waiting for the GPU by checking to see if there is
sufficient inactive space in the aperture for us to bind the buffer
prior to writing through the GTT. If there is inadequate space we will
have to stall waiting for the GPU, and incur overheads moving objects
about. Instead, only incur the clflush overhead on the target object by
writing through shmem.
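
In terms of the pwrite ioctl, the resulting flow is roughly (a sketch;
the pin itself happens inside the fast path, now with nonblocking=true
so eviction will not wait on active objects):

	ret = i915_gem_gtt_pwrite_fast(dev, obj, args, file);
	if (ret == -EFAULT || ret == -ENOSPC)
		ret = i915_gem_shmem_pwrite(dev, obj, args, file);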

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h            |    6 ++++--
 drivers/gpu/drm/i915/i915_gem.c            |   29 +++++++++++++++-------------
 drivers/gpu/drm/i915/i915_gem_context.c    |    4 ++--
 drivers/gpu/drm/i915/i915_gem_evict.c      |    6 +++++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    2 +-
 drivers/gpu/drm/i915/intel_overlay.c       |    2 +-
 drivers/gpu/drm/i915/intel_pm.c            |    2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |    6 +++---
 8 files changed, 33 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e252947..78eed86 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1307,7 +1307,8 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 void i915_gem_free_object(struct drm_gem_object *obj);
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     uint32_t alignment,
-				     bool map_and_fenceable);
+				     bool map_and_fenceable,
+				     bool nonblocking);
 void i915_gem_object_unpin(struct drm_i915_gem_object *obj);
 int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
 void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
@@ -1454,7 +1455,8 @@ void i915_gem_init_global_gtt(struct drm_device *dev,
 int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
 					  unsigned alignment,
 					  unsigned cache_level,
-					  bool mappable);
+					  bool mappable,
+					  bool nonblock);
 int i915_gem_evict_everything(struct drm_device *dev);
 
 /* i915_gem_stolen.c */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0e0fc1e..b16b5ff 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -41,7 +41,8 @@ static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *o
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
 static __must_check int i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 						    unsigned alignment,
-						    bool map_and_fenceable);
+						    bool map_and_fenceable,
+						    bool nonblocking);
 static int i915_gem_phys_pwrite(struct drm_device *dev,
 				struct drm_i915_gem_object *obj,
 				struct drm_i915_gem_pwrite *args,
@@ -609,7 +610,7 @@ i915_gem_gtt_pwrite_fast(struct drm_device *dev,
 	char __user *user_data;
 	int page_offset, page_length, ret;
 
-	ret = i915_gem_object_pin(obj, 0, true);
+	ret = i915_gem_object_pin(obj, 0, true, true);
 	if (ret)
 		goto out;
 
@@ -925,10 +926,8 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 		goto out;
 	}
 
-	if (obj->gtt_space &&
-	    obj->cache_level == I915_CACHE_NONE &&
+	if (obj->cache_level == I915_CACHE_NONE &&
 	    obj->tiling_mode == I915_TILING_NONE &&
-	    obj->map_and_fenceable &&
 	    obj->base.write_domain != I915_GEM_DOMAIN_CPU) {
 		ret = i915_gem_gtt_pwrite_fast(dev, obj, args, file);
 		/* Note that the gtt paths might fail with non-page-backed user
@@ -936,7 +935,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 		 * textures). Fallback to the shmem path in that case. */
 	}
 
-	if (ret == -EFAULT)
+	if (ret == -EFAULT || ret == -ENOSPC)
 		ret = i915_gem_shmem_pwrite(dev, obj, args, file);
 
 out:
@@ -1115,7 +1114,7 @@ int i915_gem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
 			goto unlock;
 	}
 	if (!obj->gtt_space) {
-		ret = i915_gem_object_bind_to_gtt(obj, 0, true);
+		ret = i915_gem_object_bind_to_gtt(obj, 0, true, false);
 		if (ret)
 			goto unlock;
 
@@ -2771,7 +2770,8 @@ static void i915_gem_verify_gtt(struct drm_device *dev)
 static int
 i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 			    unsigned alignment,
-			    bool map_and_fenceable)
+			    bool map_and_fenceable,
+			    bool nonblocking)
 {
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
@@ -2847,7 +2847,8 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	if (obj->gtt_space == NULL) {
 		ret = i915_gem_evict_something(dev, size, alignment,
 					       obj->cache_level,
-					       map_and_fenceable);
+					       map_and_fenceable,
+					       nonblocking);
 		if (ret)
 			return ret;
 
@@ -3187,7 +3188,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 	 * (e.g. libkms for the bootup splash), we have to ensure that we
 	 * always use map_and_fenceable for all scanout buffers.
 	 */
-	ret = i915_gem_object_pin(obj, alignment, true);
+	ret = i915_gem_object_pin(obj, alignment, true, false);
 	if (ret)
 		return ret;
 
@@ -3324,7 +3325,8 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 int
 i915_gem_object_pin(struct drm_i915_gem_object *obj,
 		    uint32_t alignment,
-		    bool map_and_fenceable)
+		    bool map_and_fenceable,
+		    bool nonblocking)
 {
 	int ret;
 
@@ -3348,7 +3350,8 @@ i915_gem_object_pin(struct drm_i915_gem_object *obj,
 
 	if (obj->gtt_space == NULL) {
 		ret = i915_gem_object_bind_to_gtt(obj, alignment,
-						  map_and_fenceable);
+						  map_and_fenceable,
+						  nonblocking);
 		if (ret)
 			return ret;
 	}
@@ -3406,7 +3409,7 @@ i915_gem_pin_ioctl(struct drm_device *dev, void *data,
 	obj->user_pin_count++;
 	obj->pin_filp = file;
 	if (obj->user_pin_count == 1) {
-		ret = i915_gem_object_pin(obj, args->alignment, true);
+		ret = i915_gem_object_pin(obj, args->alignment, true, false);
 		if (ret)
 			goto out;
 	}
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index f5c721b..184753e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -228,7 +228,7 @@ static int create_default_context(struct drm_i915_private *dev_priv)
 	 * default context.
 	 */
 	dev_priv->ring[RCS].default_context = ctx;
-	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false);
+	ret = i915_gem_object_pin(ctx->obj, CONTEXT_ALIGN, false, false);
 	if (ret)
 		goto err_destroy;
 
@@ -379,7 +379,7 @@ static int do_switch(struct i915_hw_context *to)
 	if (from_obj == to->obj)
 		return 0;
 
-	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false);
+	ret = i915_gem_object_pin(to->obj, CONTEXT_ALIGN, false, false);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 74635da..a2d8acd 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -45,7 +45,7 @@ mark_free(struct drm_i915_gem_object *obj, struct list_head *unwind)
 int
 i915_gem_evict_something(struct drm_device *dev, int min_size,
 			 unsigned alignment, unsigned cache_level,
-			 bool mappable)
+			 bool mappable, bool nonblocking)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct list_head eviction_list, unwind_list;
@@ -92,12 +92,16 @@ i915_gem_evict_something(struct drm_device *dev, int min_size,
 			goto found;
 	}
 
+	if (nonblocking)
+		goto none;
+
 	/* Now merge in the soon-to-be-expired objects... */
 	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list) {
 		if (mark_free(obj, &unwind_list))
 			goto found;
 	}
 
+none:
 	/* Nothing found, clean up and bail out! */
 	while (!list_empty(&unwind_list)) {
 		obj = list_first_entry(&unwind_list,
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 50b07a1..dbb003d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -354,7 +354,7 @@ pin_and_fence_object(struct drm_i915_gem_object *obj,
 		obj->tiling_mode != I915_TILING_NONE;
 	need_mappable = need_fence || need_reloc_mappable(obj);
 
-	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable);
+	ret = i915_gem_object_pin(obj, entry->alignment, need_mappable, false);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 830d0dd..7a98459 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -1439,7 +1439,7 @@ void intel_setup_overlay(struct drm_device *dev)
 		}
 		overlay->flip_addr = reg_bo->phys_obj->handle->busaddr;
 	} else {
-		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true);
+		ret = i915_gem_object_pin(reg_bo, PAGE_SIZE, true, false);
 		if (ret) {
 			DRM_ERROR("failed to pin overlay register bo\n");
 			goto out_free_bo;
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index d64dffb..3021c18 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -2138,7 +2138,7 @@ intel_alloc_context_page(struct drm_device *dev)
 		return NULL;
 	}
 
-	ret = i915_gem_object_pin(ctx, 4096, true);
+	ret = i915_gem_object_pin(ctx, 4096, true, false);
 	if (ret) {
 		DRM_ERROR("failed to pin power context: %d\n", ret);
 		goto err_unref;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 95144b1..80d8791 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -379,7 +379,7 @@ init_pipe_control(struct intel_ring_buffer *ring)
 
 	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
 
-	ret = i915_gem_object_pin(obj, 4096, true);
+	ret = i915_gem_object_pin(obj, 4096, true, false);
 	if (ret)
 		goto err_unref;
 
@@ -967,7 +967,7 @@ static int init_status_page(struct intel_ring_buffer *ring)
 
 	i915_gem_object_set_cache_level(obj, I915_CACHE_LLC);
 
-	ret = i915_gem_object_pin(obj, 4096, true);
+	ret = i915_gem_object_pin(obj, 4096, true, false);
 	if (ret != 0) {
 		goto err_unref;
 	}
@@ -1024,7 +1024,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 
 	ring->obj = obj;
 
-	ret = i915_gem_object_pin(obj, PAGE_SIZE, true);
+	ret = i915_gem_object_pin(obj, PAGE_SIZE, true, false);
 	if (ret)
 		goto err_unref;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 06/29] drm/i915: Protect private gem objects from truncate (such as imported dmabuf)
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (4 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 05/29] drm/i915: Only pwrite through the GTT if there is space in the aperture Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 07/29] drm/i915: Extract general object init routine Chris Wilson
                   ` (22 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

If the object has no backing shmemfs filp, then we obviously cannot
perform a truncation operation upon it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b16b5ff..2f4a113 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1385,6 +1385,11 @@ i915_gem_object_truncate(struct drm_i915_gem_object *obj)
 {
 	struct inode *inode;
 
+	i915_gem_object_free_mmap_offset(obj);
+
+	if (obj->base.filp == NULL)
+		return;
+
 	/* Our goal here is to return as much of the memory as
 	 * is possible back to the system as we are called from OOM.
 	 * To do this we must instruct the shmfs to drop all of its
@@ -1393,8 +1398,6 @@ i915_gem_object_truncate(struct drm_i915_gem_object *obj)
 	inode = obj->base.filp->f_path.dentry->d_inode;
 	shmem_truncate_range(inode, 0, (loff_t)-1);
 
-	i915_gem_object_free_mmap_offset(obj);
-
 	obj->madv = __I915_MADV_PURGED;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 07/29] drm/i915: Extract general object init routine
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (5 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 06/29] drm/i915: Protect private gem objects from truncate (such as imported dmabuf) Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-24  0:05   ` Daniel Vetter
  2012-08-11 14:41 ` [PATCH 08/29] drm/i915: Introduce drm_i915_gem_object_ops Chris Wilson
                   ` (21 subsequent siblings)
  28 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

As we wish to create specialised object constructors in the near
future that share the same basic GEM object struct, export the default
initializer.
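
A future specialised constructor can then reuse the common setup; a
hypothetical sketch (the kzalloc and GEM base init shown here are
illustrative, not part of this patch):

	struct drm_i915_gem_object *obj;

	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
	if (obj == NULL)
		return NULL;

	if (drm_gem_object_init(dev, &obj->base, size) != 0) {
		kfree(obj);
		return NULL;
	}

	i915_gem_object_init(obj);	/* shared lists, madv and fence defaults */
	/* ...backing-storage specific setup goes here... */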

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h |    1 +
 drivers/gpu/drm/i915/i915_gem.c |   30 ++++++++++++++++++------------
 2 files changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 78eed86..bbc51ef 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1302,6 +1302,7 @@ int i915_gem_wait_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file_priv);
 void i915_gem_load(struct drm_device *dev);
 int i915_gem_init_object(struct drm_gem_object *obj);
+void i915_gem_object_init(struct drm_i915_gem_object *obj);
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size);
 void i915_gem_free_object(struct drm_gem_object *obj);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2f4a113..9c8787e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3557,10 +3557,26 @@ unlock:
 	return ret;
 }
 
+void i915_gem_object_init(struct drm_i915_gem_object *obj)
+{
+	obj->base.driver_private = NULL;
+
+	INIT_LIST_HEAD(&obj->mm_list);
+	INIT_LIST_HEAD(&obj->gtt_list);
+	INIT_LIST_HEAD(&obj->ring_list);
+	INIT_LIST_HEAD(&obj->exec_list);
+
+	obj->fence_reg = I915_FENCE_REG_NONE;
+	obj->madv = I915_MADV_WILLNEED;
+	/* Avoid an unnecessary call to unbind on the first bind. */
+	obj->map_and_fenceable = true;
+
+	i915_gem_info_add_obj(obj->base.dev->dev_private, obj->base.size);
+}
+
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj;
 	struct address_space *mapping;
 	u32 mask;
@@ -3584,7 +3600,7 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 	mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
 	mapping_set_gfp_mask(mapping, mask);
 
-	i915_gem_info_add_obj(dev_priv, size);
+	i915_gem_object_init(obj);
 
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
@@ -3606,16 +3622,6 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 	} else
 		obj->cache_level = I915_CACHE_NONE;
 
-	obj->base.driver_private = NULL;
-	obj->fence_reg = I915_FENCE_REG_NONE;
-	INIT_LIST_HEAD(&obj->mm_list);
-	INIT_LIST_HEAD(&obj->gtt_list);
-	INIT_LIST_HEAD(&obj->ring_list);
-	INIT_LIST_HEAD(&obj->exec_list);
-	obj->madv = I915_MADV_WILLNEED;
-	/* Avoid an unnecessary call to unbind on the first bind. */
-	obj->map_and_fenceable = true;
-
 	return obj;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 08/29] drm/i915: Introduce drm_i915_gem_object_ops
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (6 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 07/29] drm/i915: Extract general object init routine Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-20 19:35   ` Daniel Vetter
  2012-08-11 14:41 ` [PATCH 09/29] drm/i915: Pin backing pages whilst exporting through a dmabuf vmap Chris Wilson
                   ` (20 subsequent siblings)
  28 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

In order to specialise functions depending upon the type of object, we
can attach vfuncs to each object via its obj->driver_private pointer,
bringing that pointer back to life!

For instance, this will be used in future patches to only bind pages from
a dma-buf for the duration that the object is used by the GPU - and so
avoid pinning those pages for the entire lifetime of the object.
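
Concretely, the default shmemfs backend becomes just one ops table, and
the get/put pages entry points dispatch through it; a sketch of the idea
(the table name is illustrative, the wiring happens via
i915_gem_object_init() in the full patch):

	static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
		.get_pages = i915_gem_object_get_pages_gtt,
		.put_pages = i915_gem_object_put_pages_gtt,
	};

	/* callers now go through the indirection */
	const struct drm_i915_gem_object_ops *ops = obj->base.driver_private;
	ret = ops->get_pages(obj);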

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |   10 ++++-
 drivers/gpu/drm/i915/i915_gem.c        |   65 ++++++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_dmabuf.c |    4 +-
 3 files changed, 56 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bbc51ef..c42190b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -895,6 +895,11 @@ enum i915_cache_level {
 	I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */
 };
 
+struct drm_i915_gem_object_ops {
+	int (*get_pages)(struct drm_i915_gem_object *);
+	void (*put_pages)(struct drm_i915_gem_object *);
+};
+
 struct drm_i915_gem_object {
 	struct drm_gem_object base;
 
@@ -1302,7 +1307,8 @@ int i915_gem_wait_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file_priv);
 void i915_gem_load(struct drm_device *dev);
 int i915_gem_init_object(struct drm_gem_object *obj);
-void i915_gem_object_init(struct drm_i915_gem_object *obj);
+void i915_gem_object_init(struct drm_i915_gem_object *obj,
+			 const struct drm_i915_gem_object_ops *ops);
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size);
 void i915_gem_free_object(struct drm_gem_object *obj);
@@ -1315,7 +1321,7 @@ int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
 void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
 void i915_gem_lastclose(struct drm_device *dev);
 
-int __must_check i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj);
+int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 9c8787e..ed6a1ec 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1407,15 +1407,12 @@ i915_gem_object_is_purgeable(struct drm_i915_gem_object *obj)
 	return obj->madv == I915_MADV_DONTNEED;
 }
 
-static int
+static void
 i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 {
 	int page_count = obj->base.size / PAGE_SIZE;
 	int ret, i;
 
-	if (obj->pages == NULL)
-		return 0;
-
 	BUG_ON(obj->gtt_space);
 	BUG_ON(obj->madv == __I915_MADV_PURGED);
 
@@ -1448,9 +1445,19 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 
 	drm_free_large(obj->pages);
 	obj->pages = NULL;
+}
 
-	list_del(&obj->gtt_list);
+static int
+i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
+{
+	const struct drm_i915_gem_object_ops *ops = obj->base.driver_private;
 
+	if (obj->sg_table || obj->pages == NULL)
+		return 0;
+
+	ops->put_pages(obj);
+
+	list_del(&obj->gtt_list);
 	if (i915_gem_object_is_purgeable(obj))
 		i915_gem_object_truncate(obj);
 
@@ -1467,7 +1474,7 @@ i915_gem_purge(struct drm_i915_private *dev_priv, long target)
 				 &dev_priv->mm.unbound_list,
 				 gtt_list) {
 		if (i915_gem_object_is_purgeable(obj) &&
-		    i915_gem_object_put_pages_gtt(obj) == 0) {
+		    i915_gem_object_put_pages(obj) == 0) {
 			count += obj->base.size >> PAGE_SHIFT;
 			if (count >= target)
 				return count;
@@ -1479,7 +1486,7 @@ i915_gem_purge(struct drm_i915_private *dev_priv, long target)
 				 mm_list) {
 		if (i915_gem_object_is_purgeable(obj) &&
 		    i915_gem_object_unbind(obj) == 0 &&
-		    i915_gem_object_put_pages_gtt(obj) == 0) {
+		    i915_gem_object_put_pages(obj) == 0) {
 			count += obj->base.size >> PAGE_SHIFT;
 			if (count >= target)
 				return count;
@@ -1497,10 +1504,10 @@ i915_gem_shrink_all(struct drm_i915_private *dev_priv)
 	i915_gem_evict_everything(dev_priv->dev);
 
 	list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, gtt_list)
-		i915_gem_object_put_pages_gtt(obj);
+		i915_gem_object_put_pages(obj);
 }
 
-int
+static int
 i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 {
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
@@ -1509,9 +1516,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	struct page *page;
 	gfp_t gfp;
 
-	if (obj->pages || obj->sg_table)
-		return 0;
-
 	/* Assert that the object is not currently in any GPU domain. As it
 	 * wasn't in the GTT, there shouldn't be any way it could have been in
 	 * a GPU cache
@@ -1561,7 +1565,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_do_bit_17_swizzle(obj);
 
-	list_add_tail(&obj->gtt_list, &dev_priv->mm.unbound_list);
 	return 0;
 
 err_pages:
@@ -1573,6 +1576,24 @@ err_pages:
 	return PTR_ERR(page);
 }
 
+int
+i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	const struct drm_i915_gem_object_ops *ops = obj->base.driver_private;
+	int ret;
+
+	if (obj->sg_table || obj->pages)
+		return 0;
+
+	ret = ops->get_pages(obj);
+	if (ret)
+		return ret;
+
+	list_add_tail(&obj->gtt_list, &dev_priv->mm.unbound_list);
+	return 0;
+}
+
 void
 i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 			       struct intel_ring_buffer *ring,
@@ -1828,7 +1849,7 @@ void i915_gem_reset(struct drm_device *dev)
 
 
 	list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, gtt_list)
-		i915_gem_object_put_pages_gtt(obj);
+		i915_gem_object_put_pages(obj);
 
 	/* The fence registers are invalidated so clear them out */
 	i915_gem_reset_fences(dev);
@@ -2818,7 +2839,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 		return -E2BIG;
 	}
 
-	ret = i915_gem_object_get_pages_gtt(obj);
+	ret = i915_gem_object_get_pages(obj);
 	if (ret)
 		return ret;
 
@@ -3557,9 +3578,10 @@ unlock:
 	return ret;
 }
 
-void i915_gem_object_init(struct drm_i915_gem_object *obj)
+void i915_gem_object_init(struct drm_i915_gem_object *obj,
+			  const struct drm_i915_gem_object_ops *ops)
 {
-	obj->base.driver_private = NULL;
+	obj->base.driver_private = (void *)ops;
 
 	INIT_LIST_HEAD(&obj->mm_list);
 	INIT_LIST_HEAD(&obj->gtt_list);
@@ -3574,6 +3596,11 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj)
 	i915_gem_info_add_obj(obj->base.dev->dev_private, obj->base.size);
 }
 
+static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
+	.get_pages = i915_gem_object_get_pages_gtt,
+	.put_pages = i915_gem_object_put_pages_gtt,
+};
+
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size)
 {
@@ -3600,7 +3627,7 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 	mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
 	mapping_set_gfp_mask(mapping, mask);
 
-	i915_gem_object_init(obj);
+	i915_gem_object_init(obj, &i915_gem_object_ops);
 
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
@@ -3658,7 +3685,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		dev_priv->mm.interruptible = was_interruptible;
 	}
 
-	i915_gem_object_put_pages_gtt(obj);
+	i915_gem_object_put_pages(obj);
 	i915_gem_object_free_mmap_offset(obj);
 
 	drm_gem_object_release(&obj->base);
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index e5f0375..1203460 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -41,7 +41,7 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
 	if (ret)
 		return ERR_PTR(ret);
 
-	ret = i915_gem_object_get_pages_gtt(obj);
+	ret = i915_gem_object_get_pages(obj);
 	if (ret) {
 		sg = ERR_PTR(ret);
 		goto out;
@@ -89,7 +89,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 		goto out_unlock;
 	}
 
-	ret = i915_gem_object_get_pages_gtt(obj);
+	ret = i915_gem_object_get_pages(obj);
 	if (ret) {
 		mutex_unlock(&dev->struct_mutex);
 		return ERR_PTR(ret);
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 09/29] drm/i915: Pin backing pages whilst exporting through a dmabuf vmap
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (7 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 08/29] drm/i915: Introduce drm_i915_gem_object_ops Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 10/29] drm/i915: Pin backing pages for pwrite Chris Wilson
                   ` (19 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

We need to refcount our pages in order to prevent reaping them at
inopportune times, such as when they are currently vmapped or exported to
another driver. However, we also wish to keep the lazy deallocation of
our pages, so we need to take a pin/unpin approach rather than a
simple refcount.
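
The rule the pin count enforces can be sketched in a few lines of
stand-alone C (a toy model only; the -EBUSY behaviour mirrors the patch,
everything else is illustrative):

#include <assert.h>
#include <errno.h>

struct object {
	void *pages;
	int pages_pin_count;
};

static void pin_pages(struct object *obj)
{
	assert(obj->pages != NULL);	/* only already-populated pages may be pinned */
	obj->pages_pin_count++;
}

static void unpin_pages(struct object *obj)
{
	assert(obj->pages_pin_count > 0);
	obj->pages_pin_count--;
}

/* Lazy reaping: refuse while anyone still holds a pin. */
static int put_pages(struct object *obj)
{
	if (obj->pages == NULL)
		return 0;
	if (obj->pages_pin_count)
		return -EBUSY;
	obj->pages = NULL;
	return 0;
}

int main(void)
{
	static char backing[4096];
	struct object obj = { .pages = backing };

	pin_pages(&obj);
	assert(put_pages(&obj) == -EBUSY);	/* shrinker-style reap is refused */
	unpin_pages(&obj);
	return put_pages(&obj);			/* now succeeds, returning 0 */
}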

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |   12 ++++++++++++
 drivers/gpu/drm/i915/i915_gem.c        |   11 +++++++++--
 drivers/gpu/drm/i915/i915_gem_dmabuf.c |    9 +++++++--
 3 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c42190b..0805040f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -989,6 +989,7 @@ struct drm_i915_gem_object {
 	unsigned int has_global_gtt_mapping:1;
 
 	struct page **pages;
+	int pages_pin_count;
 
 	/**
 	 * DMAR support
@@ -1322,6 +1323,17 @@ void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
 void i915_gem_lastclose(struct drm_device *dev);
 
 int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
+static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
+{
+	BUG_ON(obj->pages == NULL);
+	obj->pages_pin_count++;
+}
+static inline void i915_gem_object_unpin_pages(struct drm_i915_gem_object *obj)
+{
+	BUG_ON(obj->pages_pin_count == 0);
+	obj->pages_pin_count--;
+}
+
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ed6a1ec..c90e265 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1455,6 +1455,9 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 	if (obj->sg_table || obj->pages == NULL)
 		return 0;
 
+	if (obj->pages_pin_count)
+		return -EBUSY;
+
 	ops->put_pages(obj);
 
 	list_del(&obj->gtt_list);
@@ -1586,6 +1589,8 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 	if (obj->sg_table || obj->pages)
 		return 0;
 
+	BUG_ON(obj->pages_pin_count);
+
 	ret = ops->get_pages(obj);
 	if (ret)
 		return ret;
@@ -3685,6 +3690,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		dev_priv->mm.interruptible = was_interruptible;
 	}
 
+	obj->pages_pin_count = 0;
 	i915_gem_object_put_pages(obj);
 	i915_gem_object_free_mmap_offset(obj);
 
@@ -4344,9 +4350,10 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 
 	cnt = 0;
 	list_for_each_entry(obj, &dev_priv->mm.unbound_list, gtt_list)
-		cnt += obj->base.size >> PAGE_SHIFT;
+		if (obj->pages_pin_count == 0)
+			cnt += obj->base.size >> PAGE_SHIFT;
 	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list)
-		if (obj->pin_count == 0)
+		if (obj->pin_count == 0 && obj->pages_pin_count == 0)
 			cnt += obj->base.size >> PAGE_SHIFT;
 
 	mutex_unlock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 1203460..4a6982e 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -50,6 +50,8 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
 	/* link the pages into an SG then map the sg */
 	sg = drm_prime_pages_to_sg(obj->pages, npages);
 	nents = dma_map_sg(attachment->dev, sg->sgl, sg->nents, dir);
+	i915_gem_object_pin_pages(obj);
+
 out:
 	mutex_unlock(&dev->struct_mutex);
 	return sg;
@@ -72,6 +74,7 @@ static void i915_gem_dmabuf_release(struct dma_buf *dma_buf)
 		obj->base.export_dma_buf = NULL;
 		drm_gem_object_unreference_unlocked(&obj->base);
 	}
+	i915_gem_object_unpin_pages(obj);
 }
 
 static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
@@ -102,6 +105,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 	}
 
 	obj->vmapping_count = 1;
+	i915_gem_object_pin_pages(obj);
 out_unlock:
 	mutex_unlock(&dev->struct_mutex);
 	return obj->dma_buf_vmapping;
@@ -117,10 +121,11 @@ static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
 	if (ret)
 		return;
 
-	--obj->vmapping_count;
-	if (obj->vmapping_count == 0) {
+	if (--obj->vmapping_count == 0) {
 		vunmap(obj->dma_buf_vmapping);
 		obj->dma_buf_vmapping = NULL;
+
+		i915_gem_object_unpin_pages(obj);
 	}
 	mutex_unlock(&dev->struct_mutex);
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 10/29] drm/i915: Pin backing pages for pwrite
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (8 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 09/29] drm/i915: Pin backing pages whilst exporting through a dmabuf vmap Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 11/29] drm/i915: Pin backing pages for pread Chris Wilson
                   ` (18 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

By using the recently introduced pinning of pages, we can safely drop
the mutex in the knowledge that the pages are not going to disappear
beneath us, and so we can simplify the code for iterating over the pages.
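
Roughly, the locking pattern becomes the following (a user-space sketch
with a pthread mutex standing in for struct_mutex; none of this is the
actual driver code):

#include <pthread.h>
#include <string.h>

static pthread_mutex_t dev_mutex = PTHREAD_MUTEX_INITIALIZER;	/* struct_mutex stand-in */

struct object {
	char pages[4096];
	int pages_pin_count;
};

/* Copy user data into the object's pages.  The pin taken under the lock is
 * what lets us drop the lock for the slow path without the pages vanishing. */
static int pwrite_object(struct object *obj, const char *src, size_t len)
{
	if (len > sizeof(obj->pages))
		return -1;

	pthread_mutex_lock(&dev_mutex);
	obj->pages_pin_count++;			/* i915_gem_object_pin_pages() */
	pthread_mutex_unlock(&dev_mutex);	/* slow path may fault or sleep */

	memcpy(obj->pages, src, len);

	pthread_mutex_lock(&dev_mutex);
	obj->pages_pin_count--;			/* i915_gem_object_unpin_pages() */
	pthread_mutex_unlock(&dev_mutex);
	return 0;
}

int main(void)
{
	static struct object obj;
	return pwrite_object(&obj, "hello", 6);
}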

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |   37 +++++++++++++------------------------
 1 file changed, 13 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c90e265..2e5cecc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -690,7 +690,7 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
 				       page_length);
 	kunmap_atomic(vaddr);
 
-	return ret;
+	return ret ? -EFAULT : 0;
 }
 
 /* Only difference to the fast-path function is that this can handle bit17
@@ -724,7 +724,7 @@ shmem_pwrite_slow(struct page *page, int shmem_page_offset, int page_length,
 					     page_do_bit17_swizzling);
 	kunmap(page);
 
-	return ret;
+	return ret ? -EFAULT : 0;
 }
 
 static int
@@ -733,7 +733,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		      struct drm_i915_gem_pwrite *args,
 		      struct drm_file *file)
 {
-	struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
 	ssize_t remain;
 	loff_t offset;
 	char __user *user_data;
@@ -742,7 +741,6 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 	int hit_slowpath = 0;
 	int needs_clflush_after = 0;
 	int needs_clflush_before = 0;
-	int release_page;
 
 	user_data = (char __user *) (uintptr_t) args->data_ptr;
 	remain = args->size;
@@ -768,6 +766,12 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 	    && obj->cache_level == I915_CACHE_NONE)
 		needs_clflush_before = 1;
 
+	ret = i915_gem_object_get_pages(obj);
+	if (ret)
+		return ret;
+
+	i915_gem_object_pin_pages(obj);
+
 	offset = args->offset;
 	obj->dirty = 1;
 
@@ -793,18 +797,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 			((shmem_page_offset | page_length)
 				& (boot_cpu_data.x86_clflush_size - 1));
 
-		if (obj->pages) {
-			page = obj->pages[offset >> PAGE_SHIFT];
-			release_page = 0;
-		} else {
-			page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
-			if (IS_ERR(page)) {
-				ret = PTR_ERR(page);
-				goto out;
-			}
-			release_page = 1;
-		}
-
+		page = obj->pages[offset >> PAGE_SHIFT];
 		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
 			(page_to_phys(page) & (1 << 17)) != 0;
 
@@ -816,26 +809,20 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 			goto next_page;
 
 		hit_slowpath = 1;
-		page_cache_get(page);
 		mutex_unlock(&dev->struct_mutex);
-
 		ret = shmem_pwrite_slow(page, shmem_page_offset, page_length,
 					user_data, page_do_bit17_swizzling,
 					partial_cacheline_write,
 					needs_clflush_after);
 
 		mutex_lock(&dev->struct_mutex);
-		page_cache_release(page);
+
 next_page:
 		set_page_dirty(page);
 		mark_page_accessed(page);
-		if (release_page)
-			page_cache_release(page);
 
-		if (ret) {
-			ret = -EFAULT;
+		if (ret)
 			goto out;
-		}
 
 		remain -= page_length;
 		user_data += page_length;
@@ -843,6 +830,8 @@ next_page:
 	}
 
 out:
+	i915_gem_object_unpin_pages(obj);
+
 	if (hit_slowpath) {
 		/* Fixup: Kill any reinstated backing storage pages */
 		if (obj->madv == __I915_MADV_PURGED)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 11/29] drm/i915: Pin backing pages for pread
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (9 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 10/29] drm/i915: Pin backing pages for pwrite Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 12/29] drm/i915: Replace the array of pages with a scatterlist Chris Wilson
                   ` (17 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

By using the recently introduced pinning of pages, we can safely drop
the mutex in the knowledge that the pages are not going to disappear
beneath us, and so we can simplify the code for iterating over the pages.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |   36 +++++++++++++-----------------------
 1 file changed, 13 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2e5cecc..cd17434 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -343,7 +343,7 @@ shmem_pread_fast(struct page *page, int shmem_page_offset, int page_length,
 				      page_length);
 	kunmap_atomic(vaddr);
 
-	return ret;
+	return ret ? -EFAULT : 0;
 }
 
 static void
@@ -394,7 +394,7 @@ shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
 				     page_length);
 	kunmap(page);
 
-	return ret;
+	return ret ? -EFAULT : 0;
 }
 
 static int
@@ -403,7 +403,6 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		     struct drm_i915_gem_pread *args,
 		     struct drm_file *file)
 {
-	struct address_space *mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
 	char __user *user_data;
 	ssize_t remain;
 	loff_t offset;
@@ -412,7 +411,6 @@ i915_gem_shmem_pread(struct drm_device *dev,
 	int hit_slowpath = 0;
 	int prefaulted = 0;
 	int needs_clflush = 0;
-	int release_page;
 
 	user_data = (char __user *) (uintptr_t) args->data_ptr;
 	remain = args->size;
@@ -433,6 +431,12 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		}
 	}
 
+	ret = i915_gem_object_get_pages(obj);
+	if (ret)
+		return ret;
+
+	i915_gem_object_pin_pages(obj);
+
 	offset = args->offset;
 
 	while (remain > 0) {
@@ -448,18 +452,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		if ((shmem_page_offset + page_length) > PAGE_SIZE)
 			page_length = PAGE_SIZE - shmem_page_offset;
 
-		if (obj->pages) {
-			page = obj->pages[offset >> PAGE_SHIFT];
-			release_page = 0;
-		} else {
-			page = shmem_read_mapping_page(mapping, offset >> PAGE_SHIFT);
-			if (IS_ERR(page)) {
-				ret = PTR_ERR(page);
-				goto out;
-			}
-			release_page = 1;
-		}
-
+		page = obj->pages[offset >> PAGE_SHIFT];
 		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
 			(page_to_phys(page) & (1 << 17)) != 0;
 
@@ -470,7 +463,6 @@ i915_gem_shmem_pread(struct drm_device *dev,
 			goto next_page;
 
 		hit_slowpath = 1;
-		page_cache_get(page);
 		mutex_unlock(&dev->struct_mutex);
 
 		if (!prefaulted) {
@@ -488,16 +480,12 @@ i915_gem_shmem_pread(struct drm_device *dev,
 				       needs_clflush);
 
 		mutex_lock(&dev->struct_mutex);
-		page_cache_release(page);
+
 next_page:
 		mark_page_accessed(page);
-		if (release_page)
-			page_cache_release(page);
 
-		if (ret) {
-			ret = -EFAULT;
+		if (ret)
 			goto out;
-		}
 
 		remain -= page_length;
 		user_data += page_length;
@@ -505,6 +493,8 @@ next_page:
 	}
 
 out:
+	i915_gem_object_unpin_pages(obj);
+
 	if (hit_slowpath) {
 		/* Fixup: Kill any reinstated backing storage pages */
 		if (obj->madv == __I915_MADV_PURGED)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 12/29] drm/i915: Replace the array of pages with a scatterlist
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (10 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 11/29] drm/i915: Pin backing pages for pread Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 13/29] drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops Chris Wilson
                   ` (16 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Rather than have multiple data structures for describing our page layout
in conjunction with the array of pages, we can migrate all users over to
a scatterlist.

One major advantage this offers, other than unifying the page tracking
structures, is that we replace the vmalloc'ed array (which can be up to
a megabyte in size) with a chain of individual pages, which helps reduce
memory pressure.

The disadvantage is that we then do not have a simple array to iterate
over, or to access randomly. The common case for random access is the
relocation processing, which will typically fit within a single
scatterlist page and so be almost the same cost as the simple array. For
iterating over the array, the extra function call could be optimised
away, but in reality it is an insignificant cost compared to either
binding the pages or performing the pwrite/pread.
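
The access-pattern trade-off can be illustrated with a toy chain of small
page tables (purely illustrative C, not the scatterlist API; TABLE_SIZE
plays the role of SG_MAX_SINGLE_ALLOC):

#include <stdio.h>

#define TABLE_SIZE 4	/* stand-in for SG_MAX_SINGLE_ALLOC */

struct page_table {
	void *pages[TABLE_SIZE];
	struct page_table *next;	/* chained, like a scatterlist chain entry */
};

/* Random access: hop from table to table until the right chunk is found. */
static void *lookup(struct page_table *t, int n)
{
	while (n >= TABLE_SIZE) {
		t = t->next;
		n -= TABLE_SIZE;
	}
	return t->pages[n];
}

/* Sequential walk: one extra pointer chase per TABLE_SIZE entries, which is
 * the "extra function call" cost the scatterlist iterator pays. */
static void walk(struct page_table *t, int count, void (*fn)(void *page))
{
	for (int i = 0; i < count; i++) {
		fn(t->pages[i % TABLE_SIZE]);
		if ((i + 1) % TABLE_SIZE == 0)
			t = t->next;
	}
}

static void show(void *page)
{
	printf("page %p\n", page);
}

int main(void)
{
	static char pages[6][4096];
	struct page_table second = { { pages[4], pages[5] } };
	struct page_table first = { { pages[0], pages[1], pages[2], pages[3] }, &second };

	walk(&first, 6, show);
	return lookup(&first, 5) != pages[5];
}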

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/char/agp/intel-gtt.c               |   51 +++++-------
 drivers/gpu/drm/drm_cache.c                |   23 ++++++
 drivers/gpu/drm/i915/i915_drv.h            |   18 +++--
 drivers/gpu/drm/i915/i915_gem.c            |   79 ++++++++++++------
 drivers/gpu/drm/i915/i915_gem_dmabuf.c     |  113 ++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    3 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c        |  120 ++++++----------------------
 drivers/gpu/drm/i915/i915_gem_tiling.c     |   16 ++--
 drivers/gpu/drm/i915/i915_irq.c            |   25 +++---
 drivers/gpu/drm/i915/intel_ringbuffer.c    |    9 ++-
 include/drm/drmP.h                         |    1 +
 include/drm/intel-gtt.h                    |   10 +--
 12 files changed, 250 insertions(+), 218 deletions(-)

diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index 9ed92ef..1d39864 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -84,40 +84,33 @@ static struct _intel_private {
 #define IS_IRONLAKE	intel_private.driver->is_ironlake
 #define HAS_PGTBL_EN	intel_private.driver->has_pgtbl_enable
 
-int intel_gtt_map_memory(struct page **pages, unsigned int num_entries,
-			 struct scatterlist **sg_list, int *num_sg)
+static int intel_gtt_map_memory(struct page **pages,
+				unsigned int num_entries,
+				struct sg_table *st)
 {
-	struct sg_table st;
 	struct scatterlist *sg;
 	int i;
 
-	if (*sg_list)
-		return 0; /* already mapped (for e.g. resume */
-
 	DBG("try mapping %lu pages\n", (unsigned long)num_entries);
 
-	if (sg_alloc_table(&st, num_entries, GFP_KERNEL))
+	if (sg_alloc_table(st, num_entries, GFP_KERNEL))
 		goto err;
 
-	*sg_list = sg = st.sgl;
-
-	for (i = 0 ; i < num_entries; i++, sg = sg_next(sg))
+	for_each_sg(st->sgl, sg, num_entries, i)
 		sg_set_page(sg, pages[i], PAGE_SIZE, 0);
 
-	*num_sg = pci_map_sg(intel_private.pcidev, *sg_list,
-				 num_entries, PCI_DMA_BIDIRECTIONAL);
-	if (unlikely(!*num_sg))
+	if (!pci_map_sg(intel_private.pcidev,
+			st->sgl, st->nents, PCI_DMA_BIDIRECTIONAL))
 		goto err;
 
 	return 0;
 
 err:
-	sg_free_table(&st);
+	sg_free_table(st);
 	return -ENOMEM;
 }
-EXPORT_SYMBOL(intel_gtt_map_memory);
 
-void intel_gtt_unmap_memory(struct scatterlist *sg_list, int num_sg)
+static void intel_gtt_unmap_memory(struct scatterlist *sg_list, int num_sg)
 {
 	struct sg_table st;
 	DBG("try unmapping %lu pages\n", (unsigned long)mem->page_count);
@@ -130,7 +123,6 @@ void intel_gtt_unmap_memory(struct scatterlist *sg_list, int num_sg)
 
 	sg_free_table(&st);
 }
-EXPORT_SYMBOL(intel_gtt_unmap_memory);
 
 static void intel_fake_agp_enable(struct agp_bridge_data *bridge, u32 mode)
 {
@@ -879,8 +871,7 @@ static bool i830_check_flags(unsigned int flags)
 	return false;
 }
 
-void intel_gtt_insert_sg_entries(struct scatterlist *sg_list,
-				 unsigned int sg_len,
+void intel_gtt_insert_sg_entries(struct sg_table *st,
 				 unsigned int pg_start,
 				 unsigned int flags)
 {
@@ -892,12 +883,11 @@ void intel_gtt_insert_sg_entries(struct scatterlist *sg_list,
 
 	/* sg may merge pages, but we have to separate
 	 * per-page addr for GTT */
-	for_each_sg(sg_list, sg, sg_len, i) {
+	for_each_sg(st->sgl, sg, st->nents, i) {
 		len = sg_dma_len(sg) >> PAGE_SHIFT;
 		for (m = 0; m < len; m++) {
 			dma_addr_t addr = sg_dma_address(sg) + (m << PAGE_SHIFT);
-			intel_private.driver->write_entry(addr,
-							  j, flags);
+			intel_private.driver->write_entry(addr, j, flags);
 			j++;
 		}
 	}
@@ -905,8 +895,10 @@ void intel_gtt_insert_sg_entries(struct scatterlist *sg_list,
 }
 EXPORT_SYMBOL(intel_gtt_insert_sg_entries);
 
-void intel_gtt_insert_pages(unsigned int first_entry, unsigned int num_entries,
-			    struct page **pages, unsigned int flags)
+static void intel_gtt_insert_pages(unsigned int first_entry,
+				   unsigned int num_entries,
+				   struct page **pages,
+				   unsigned int flags)
 {
 	int i, j;
 
@@ -917,7 +909,6 @@ void intel_gtt_insert_pages(unsigned int first_entry, unsigned int num_entries,
 	}
 	readl(intel_private.gtt+j-1);
 }
-EXPORT_SYMBOL(intel_gtt_insert_pages);
 
 static int intel_fake_agp_insert_entries(struct agp_memory *mem,
 					 off_t pg_start, int type)
@@ -953,13 +944,15 @@ static int intel_fake_agp_insert_entries(struct agp_memory *mem,
 		global_cache_flush();
 
 	if (intel_private.base.needs_dmar) {
-		ret = intel_gtt_map_memory(mem->pages, mem->page_count,
-					   &mem->sg_list, &mem->num_sg);
+		struct sg_table st;
+
+		ret = intel_gtt_map_memory(mem->pages, mem->page_count, &st);
 		if (ret != 0)
 			return ret;
 
-		intel_gtt_insert_sg_entries(mem->sg_list, mem->num_sg,
-					    pg_start, type);
+		intel_gtt_insert_sg_entries(&st, pg_start, type);
+		mem->sg_list = st.sgl;
+		mem->num_sg = st.nents;
 	} else
 		intel_gtt_insert_pages(pg_start, mem->page_count, mem->pages,
 				       type);
diff --git a/drivers/gpu/drm/drm_cache.c b/drivers/gpu/drm/drm_cache.c
index 08758e0..628a2e0 100644
--- a/drivers/gpu/drm/drm_cache.c
+++ b/drivers/gpu/drm/drm_cache.c
@@ -100,6 +100,29 @@ drm_clflush_pages(struct page *pages[], unsigned long num_pages)
 EXPORT_SYMBOL(drm_clflush_pages);
 
 void
+drm_clflush_sg(struct sg_table *st)
+{
+#if defined(CONFIG_X86)
+	if (cpu_has_clflush) {
+		struct scatterlist *sg;
+		int i;
+
+		mb();
+		for_each_sg(st->sgl, sg, st->nents, i)
+			drm_clflush_page(sg_page(sg));
+		mb();
+	}
+
+	if (on_each_cpu(drm_clflush_ipi_handler, NULL, 1) != 0)
+		printk(KERN_ERR "Timed out waiting for cache flush.\n");
+#else
+	printk(KERN_ERR "Architecture has no drm_cache.c support\n");
+	WARN_ON_ONCE(1);
+#endif
+}
+EXPORT_SYMBOL(drm_clflush_sg);
+
+void
 drm_clflush_virt_range(char *addr, unsigned long length)
 {
 #if defined(CONFIG_X86)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0805040f..d4d3b2a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -987,16 +987,11 @@ struct drm_i915_gem_object {
 
 	unsigned int has_aliasing_ppgtt_mapping:1;
 	unsigned int has_global_gtt_mapping:1;
+	unsigned int has_dma_mapping:1;
 
-	struct page **pages;
+	struct sg_table *pages;
 	int pages_pin_count;
 
-	/**
-	 * DMAR support
-	 */
-	struct scatterlist *sg_list;
-	int num_sg;
-
 	/* prime dma-buf support */
 	struct sg_table *sg_table;
 	void *dma_buf_vmapping;
@@ -1323,6 +1318,15 @@ void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
 void i915_gem_lastclose(struct drm_device *dev);
 
 int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
+static inline struct page *i915_gem_object_get_page(struct drm_i915_gem_object *obj, int n)
+{
+	struct scatterlist *sg = obj->pages->sgl;
+	while (n >= SG_MAX_SINGLE_ALLOC) {
+		sg = sg_chain_ptr(sg + SG_MAX_SINGLE_ALLOC - 1);
+		n -= SG_MAX_SINGLE_ALLOC - 1;
+	}
+	return sg_page(sg+n);
+}
 static inline void i915_gem_object_pin_pages(struct drm_i915_gem_object *obj)
 {
 	BUG_ON(obj->pages == NULL);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index cd17434..20e05c2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -411,6 +411,8 @@ i915_gem_shmem_pread(struct drm_device *dev,
 	int hit_slowpath = 0;
 	int prefaulted = 0;
 	int needs_clflush = 0;
+	struct scatterlist *sg;
+	int i;
 
 	user_data = (char __user *) (uintptr_t) args->data_ptr;
 	remain = args->size;
@@ -439,9 +441,15 @@ i915_gem_shmem_pread(struct drm_device *dev,
 
 	offset = args->offset;
 
-	while (remain > 0) {
+	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
 		struct page *page;
 
+		if (i < offset >> PAGE_SHIFT)
+			continue;
+
+		if (remain <= 0)
+			break;
+
 		/* Operation in this page
 		 *
 		 * shmem_page_offset = offset within page in shmem file
@@ -452,7 +460,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		if ((shmem_page_offset + page_length) > PAGE_SIZE)
 			page_length = PAGE_SIZE - shmem_page_offset;
 
-		page = obj->pages[offset >> PAGE_SHIFT];
+		page = sg_page(sg);
 		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
 			(page_to_phys(page) & (1 << 17)) != 0;
 
@@ -731,6 +739,8 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 	int hit_slowpath = 0;
 	int needs_clflush_after = 0;
 	int needs_clflush_before = 0;
+	int i;
+	struct scatterlist *sg;
 
 	user_data = (char __user *) (uintptr_t) args->data_ptr;
 	remain = args->size;
@@ -765,10 +775,16 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 	offset = args->offset;
 	obj->dirty = 1;
 
-	while (remain > 0) {
+	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
 		struct page *page;
 		int partial_cacheline_write;
 
+		if (i < offset >> PAGE_SHIFT)
+			continue;
+
+		if (remain <= 0)
+			break;
+
 		/* Operation in this page
 		 *
 		 * shmem_page_offset = offset within page in shmem file
@@ -787,7 +803,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 			((shmem_page_offset | page_length)
 				& (boot_cpu_data.x86_clflush_size - 1));
 
-		page = obj->pages[offset >> PAGE_SHIFT];
+		page = sg_page(sg);
 		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
 			(page_to_phys(page) & (1 << 17)) != 0;
 
@@ -1390,6 +1406,7 @@ static void
 i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 {
 	int page_count = obj->base.size / PAGE_SIZE;
+	struct scatterlist *sg;
 	int ret, i;
 
 	BUG_ON(obj->gtt_space);
@@ -1411,19 +1428,21 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 	if (obj->madv == I915_MADV_DONTNEED)
 		obj->dirty = 0;
 
-	for (i = 0; i < page_count; i++) {
+	for_each_sg(obj->pages->sgl, sg, page_count, i) {
+		struct page *page = sg_page(sg);
+
 		if (obj->dirty)
-			set_page_dirty(obj->pages[i]);
+			set_page_dirty(page);
 
 		if (obj->madv == I915_MADV_WILLNEED)
-			mark_page_accessed(obj->pages[i]);
+			mark_page_accessed(page);
 
-		page_cache_release(obj->pages[i]);
+		page_cache_release(page);
 	}
 	obj->dirty = 0;
 
-	drm_free_large(obj->pages);
-	obj->pages = NULL;
+	sg_free_table(obj->pages);
+	kfree(obj->pages);
 }
 
 static int
@@ -1438,6 +1457,7 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 		return -EBUSY;
 
 	ops->put_pages(obj);
+	obj->pages = NULL;
 
 	list_del(&obj->gtt_list);
 	if (i915_gem_object_is_purgeable(obj))
@@ -1495,6 +1515,8 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	int page_count, i;
 	struct address_space *mapping;
+	struct sg_table *st;
+	struct scatterlist *sg;
 	struct page *page;
 	gfp_t gfp;
 
@@ -1505,20 +1527,27 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 	BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
 	BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
 
-	/* Get the list of pages out of our struct file.  They'll be pinned
-	 * at this point until we release them.
-	 */
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (st == NULL)
+		return -ENOMEM;
+
 	page_count = obj->base.size / PAGE_SIZE;
-	obj->pages = kmalloc(page_count*sizeof(struct page *), GFP_KERNEL);
-	if (obj->pages == NULL)
+	if (sg_alloc_table(st, page_count, GFP_KERNEL)) {
+		sg_free_table(st);
+		kfree(st);
 		return -ENOMEM;
+	}
 
-	/* Fail silently without starting the shrinker */
+	/* Get the list of pages out of our struct file.  They'll be pinned
+	 * at this point until we release them.
+	 *
+	 * Fail silently without starting the shrinker
+	 */
 	mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
 	gfp = mapping_gfp_mask(mapping);
 	gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD;
 	gfp &= ~(__GFP_IO | __GFP_WAIT);
-	for (i = 0; i < page_count; i++) {
+	for_each_sg(st->sgl, sg, page_count, i) {
 		page = shmem_read_mapping_page_gfp(mapping, i, gfp);
 		if (IS_ERR(page)) {
 			i915_gem_purge(dev_priv, page_count);
@@ -1541,20 +1570,20 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 			gfp &= ~(__GFP_IO | __GFP_WAIT);
 		}
 
-		obj->pages[i] = page;
+		sg_set_page(sg, page, PAGE_SIZE, 0);
 	}
 
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_do_bit_17_swizzle(obj);
 
+	obj->pages = st;
 	return 0;
 
 err_pages:
-	while (i--)
-		page_cache_release(obj->pages[i]);
-
-	drm_free_large(obj->pages);
-	obj->pages = NULL;
+	for_each_sg(st->sgl, sg, i, page_count)
+		page_cache_release(sg_page(sg));
+	sg_free_table(st);
+	kfree(st);
 	return PTR_ERR(page);
 }
 
@@ -2923,7 +2952,7 @@ i915_gem_clflush_object(struct drm_i915_gem_object *obj)
 
 	trace_i915_gem_object_clflush(obj);
 
-	drm_clflush_pages(obj->pages, obj->base.size / PAGE_SIZE);
+	drm_clflush_sg(obj->pages);
 }
 
 /** Flushes the GTT write domain for the object if it's dirty. */
@@ -3673,6 +3702,8 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	i915_gem_object_put_pages(obj);
 	i915_gem_object_free_mmap_offset(obj);
 
+	BUG_ON(obj->pages);
+
 	drm_gem_object_release(&obj->base);
 	i915_gem_info_remove_obj(dev_priv, obj->base.size);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 4a6982e..b48c2a4 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -28,33 +28,57 @@
 #include <linux/dma-buf.h>
 
 static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachment,
-				      enum dma_data_direction dir)
+					     enum dma_data_direction dir)
 {
 	struct drm_i915_gem_object *obj = attachment->dmabuf->priv;
-	struct drm_device *dev = obj->base.dev;
-	int npages = obj->base.size / PAGE_SIZE;
-	struct sg_table *sg;
-	int ret;
-	int nents;
+	struct sg_table *st;
+	struct scatterlist *src, *dst;
+	int ret, i;
 
-	ret = i915_mutex_lock_interruptible(dev);
+	ret = i915_mutex_lock_interruptible(obj->base.dev);
 	if (ret)
 		return ERR_PTR(ret);
 
 	ret = i915_gem_object_get_pages(obj);
 	if (ret) {
-		sg = ERR_PTR(ret);
+		st = ERR_PTR(ret);
+		goto out;
+	}
+
+	/* Copy sg so that we make an independent mapping */
+	st = kmalloc(sizeof(struct sg_table), GFP_KERNEL);
+	if (st == NULL) {
+		st = ERR_PTR(-ENOMEM);
+		goto out;
+	}
+
+	ret = sg_alloc_table(st, obj->pages->nents, GFP_KERNEL);
+	if (ret) {
+		kfree(st);
+		st = ERR_PTR(ret);
+		goto out;
+	}
+
+	src = obj->pages->sgl;
+	dst = st->sgl;
+	for (i = 0; i < obj->pages->nents; i++) {
+		sg_set_page(dst, sg_page(src), PAGE_SIZE, 0);
+		dst = sg_next(dst);
+		src = sg_next(src);
+	}
+
+	if (!dma_map_sg(attachment->dev, st->sgl, st->nents, dir)) {
+		sg_free_table(st);
+		kfree(st);
+		st = ERR_PTR(-ENOMEM);
 		goto out;
 	}
 
-	/* link the pages into an SG then map the sg */
-	sg = drm_prime_pages_to_sg(obj->pages, npages);
-	nents = dma_map_sg(attachment->dev, sg->sgl, sg->nents, dir);
 	i915_gem_object_pin_pages(obj);
 
 out:
-	mutex_unlock(&dev->struct_mutex);
-	return sg;
+	mutex_unlock(&obj->base.dev->struct_mutex);
+	return st;
 }
 
 static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
@@ -81,7 +105,9 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 {
 	struct drm_i915_gem_object *obj = dma_buf->priv;
 	struct drm_device *dev = obj->base.dev;
-	int ret;
+	struct scatterlist *sg;
+	struct page **pages;
+	int ret, i;
 
 	ret = i915_mutex_lock_interruptible(dev);
 	if (ret)
@@ -93,22 +119,33 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 	}
 
 	ret = i915_gem_object_get_pages(obj);
-	if (ret) {
-		mutex_unlock(&dev->struct_mutex);
-		return ERR_PTR(ret);
-	}
+	if (ret)
+		goto error;
 
-	obj->dma_buf_vmapping = vmap(obj->pages, obj->base.size / PAGE_SIZE, 0, PAGE_KERNEL);
-	if (!obj->dma_buf_vmapping) {
-		DRM_ERROR("failed to vmap object\n");
-		goto out_unlock;
-	}
+	ret = -ENOMEM;
+
+	pages = drm_malloc_ab(obj->pages->nents, sizeof(struct page *));
+	if (pages == NULL)
+		goto error;
+
+	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i)
+		pages[i] = sg_page(sg);
+
+	obj->dma_buf_vmapping = vmap(pages, obj->pages->nents, 0, PAGE_KERNEL);
+	drm_free_large(pages);
+
+	if (!obj->dma_buf_vmapping)
+		goto error;
 
 	obj->vmapping_count = 1;
 	i915_gem_object_pin_pages(obj);
 out_unlock:
 	mutex_unlock(&dev->struct_mutex);
 	return obj->dma_buf_vmapping;
+
+error:
+	mutex_unlock(&dev->struct_mutex);
+	return ERR_PTR(ret);
 }
 
 static void i915_gem_dmabuf_vunmap(struct dma_buf *dma_buf, void *vaddr)
@@ -168,22 +205,19 @@ static const struct dma_buf_ops i915_dmabuf_ops =  {
 };
 
 struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
-				struct drm_gem_object *gem_obj, int flags)
+				      struct drm_gem_object *gem_obj, int flags)
 {
 	struct drm_i915_gem_object *obj = to_intel_bo(gem_obj);
 
-	return dma_buf_export(obj, &i915_dmabuf_ops,
-						  obj->base.size, 0600);
+	return dma_buf_export(obj, &i915_dmabuf_ops, obj->base.size, 0600);
 }
 
 struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
-				struct dma_buf *dma_buf)
+					     struct dma_buf *dma_buf)
 {
 	struct dma_buf_attachment *attach;
 	struct sg_table *sg;
 	struct drm_i915_gem_object *obj;
-	int npages;
-	int size;
 	int ret;
 
 	/* is this one of own objects? */
@@ -207,21 +241,19 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 		goto fail_detach;
 	}
 
-	size = dma_buf->size;
-	npages = size / PAGE_SIZE;
-
 	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
 	if (obj == NULL) {
 		ret = -ENOMEM;
 		goto fail_unmap;
 	}
 
-	ret = drm_gem_private_object_init(dev, &obj->base, size);
+	ret = drm_gem_private_object_init(dev, &obj->base, dma_buf->size);
 	if (ret) {
 		kfree(obj);
 		goto fail_unmap;
 	}
 
+	obj->has_dma_mapping = true;
 	obj->sg_table = sg;
 	obj->base.import_attach = attach;
 
@@ -233,3 +265,18 @@ fail_detach:
 	dma_buf_detach(dma_buf, attach);
 	return ERR_PTR(ret);
 }
+
+void i915_gem_object_release_dmabuf(struct drm_i915_gem_object *obj)
+{
+	struct dma_buf_attachment *attach;
+
+	if (obj->base.import_attach == NULL)
+		return;
+
+	attach = obj->base.import_attach;
+
+	dma_buf_unmap_attachment(attach, obj->pages, DMA_BIDIRECTIONAL);
+
+	dma_buf_detach(attach->dmabuf, attach);
+	dma_buf_put(attach->dmabuf);
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index dbb003d..50bcc7e 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -199,7 +199,8 @@ i915_gem_execbuffer_relocate_entry(struct drm_i915_gem_object *obj,
 		if (ret)
 			return ret;
 
-		vaddr = kmap_atomic(obj->pages[reloc->offset >> PAGE_SHIFT]);
+		vaddr = kmap_atomic(i915_gem_object_get_page(obj,
+							     reloc->offset >> PAGE_SHIFT));
 		*(uint32_t *)(vaddr + page_offset) = reloc->delta;
 		kunmap_atomic(vaddr);
 	} else {
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3c36d3b..410f883 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -167,8 +167,7 @@ void i915_gem_cleanup_aliasing_ppgtt(struct drm_device *dev)
 }
 
 static void i915_ppgtt_insert_sg_entries(struct i915_hw_ppgtt *ppgtt,
-					 struct scatterlist *sg_list,
-					 unsigned sg_len,
+					 const struct sg_table *pages,
 					 unsigned first_entry,
 					 uint32_t pte_flags)
 {
@@ -180,12 +179,12 @@ static void i915_ppgtt_insert_sg_entries(struct i915_hw_ppgtt *ppgtt,
 	struct scatterlist *sg;
 
 	/* init sg walking */
-	sg = sg_list;
+	sg = pages->sgl;
 	i = 0;
 	segment_len = sg_dma_len(sg) >> PAGE_SHIFT;
 	m = 0;
 
-	while (i < sg_len) {
+	while (i < pages->nents) {
 		pt_vaddr = kmap_atomic(ppgtt->pt_pages[act_pd]);
 
 		for (j = first_pte; j < I915_PPGTT_PT_ENTRIES; j++) {
@@ -194,13 +193,11 @@ static void i915_ppgtt_insert_sg_entries(struct i915_hw_ppgtt *ppgtt,
 			pt_vaddr[j] = pte | pte_flags;
 
 			/* grab the next page */
-			m++;
-			if (m == segment_len) {
-				sg = sg_next(sg);
-				i++;
-				if (i == sg_len)
+			if (++m == segment_len) {
+				if (++i == pages->nents)
 					break;
 
+				sg = sg_next(sg);
 				segment_len = sg_dma_len(sg) >> PAGE_SHIFT;
 				m = 0;
 			}
@@ -213,44 +210,10 @@ static void i915_ppgtt_insert_sg_entries(struct i915_hw_ppgtt *ppgtt,
 	}
 }
 
-static void i915_ppgtt_insert_pages(struct i915_hw_ppgtt *ppgtt,
-				    unsigned first_entry, unsigned num_entries,
-				    struct page **pages, uint32_t pte_flags)
-{
-	uint32_t *pt_vaddr, pte;
-	unsigned act_pd = first_entry / I915_PPGTT_PT_ENTRIES;
-	unsigned first_pte = first_entry % I915_PPGTT_PT_ENTRIES;
-	unsigned last_pte, i;
-	dma_addr_t page_addr;
-
-	while (num_entries) {
-		last_pte = first_pte + num_entries;
-		last_pte = min_t(unsigned, last_pte, I915_PPGTT_PT_ENTRIES);
-
-		pt_vaddr = kmap_atomic(ppgtt->pt_pages[act_pd]);
-
-		for (i = first_pte; i < last_pte; i++) {
-			page_addr = page_to_phys(*pages);
-			pte = GEN6_PTE_ADDR_ENCODE(page_addr);
-			pt_vaddr[i] = pte | pte_flags;
-
-			pages++;
-		}
-
-		kunmap_atomic(pt_vaddr);
-
-		num_entries -= last_pte - first_pte;
-		first_pte = 0;
-		act_pd++;
-	}
-}
-
 void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 			    struct drm_i915_gem_object *obj,
 			    enum i915_cache_level cache_level)
 {
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	uint32_t pte_flags = GEN6_PTE_VALID;
 
 	switch (cache_level) {
@@ -267,26 +230,10 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 		BUG();
 	}
 
-	if (obj->sg_table) {
-		i915_ppgtt_insert_sg_entries(ppgtt,
-					     obj->sg_table->sgl,
-					     obj->sg_table->nents,
-					     obj->gtt_space->start >> PAGE_SHIFT,
-					     pte_flags);
-	} else if (dev_priv->mm.gtt->needs_dmar) {
-		BUG_ON(!obj->sg_list);
-
-		i915_ppgtt_insert_sg_entries(ppgtt,
-					     obj->sg_list,
-					     obj->num_sg,
-					     obj->gtt_space->start >> PAGE_SHIFT,
-					     pte_flags);
-	} else
-		i915_ppgtt_insert_pages(ppgtt,
-					obj->gtt_space->start >> PAGE_SHIFT,
-					obj->base.size >> PAGE_SHIFT,
-					obj->pages,
-					pte_flags);
+	i915_ppgtt_insert_sg_entries(ppgtt,
+				     obj->sg_table ?: obj->pages,
+				     obj->gtt_space->start >> PAGE_SHIFT,
+				     pte_flags);
 }
 
 void i915_ppgtt_unbind_object(struct i915_hw_ppgtt *ppgtt,
@@ -358,43 +305,26 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 
 int i915_gem_gtt_prepare_object(struct drm_i915_gem_object *obj)
 {
-	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-
-	if (dev_priv->mm.gtt->needs_dmar)
-		return intel_gtt_map_memory(obj->pages,
-					    obj->base.size >> PAGE_SHIFT,
-					    &obj->sg_list,
-					    &obj->num_sg);
-	else
+	if (obj->has_dma_mapping)
 		return 0;
+
+	if (!dma_map_sg(&obj->base.dev->pdev->dev,
+			obj->pages->sgl, obj->pages->nents,
+			PCI_DMA_BIDIRECTIONAL))
+		return -ENOSPC;
+
+	return 0;
 }
 
 void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 			      enum i915_cache_level cache_level)
 {
 	struct drm_device *dev = obj->base.dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	unsigned int agp_type = cache_level_to_agp_type(dev, cache_level);
 
-	if (obj->sg_table) {
-		intel_gtt_insert_sg_entries(obj->sg_table->sgl,
-					    obj->sg_table->nents,
-					    obj->gtt_space->start >> PAGE_SHIFT,
-					    agp_type);
-	} else if (dev_priv->mm.gtt->needs_dmar) {
-		BUG_ON(!obj->sg_list);
-
-		intel_gtt_insert_sg_entries(obj->sg_list,
-					    obj->num_sg,
-					    obj->gtt_space->start >> PAGE_SHIFT,
-					    agp_type);
-	} else
-		intel_gtt_insert_pages(obj->gtt_space->start >> PAGE_SHIFT,
-				       obj->base.size >> PAGE_SHIFT,
-				       obj->pages,
-				       agp_type);
-
+	intel_gtt_insert_sg_entries(obj->sg_table ?: obj->pages,
+				    obj->gtt_space->start >> PAGE_SHIFT,
+				    agp_type);
 	obj->has_global_gtt_mapping = 1;
 }
 
@@ -414,10 +344,10 @@ void i915_gem_gtt_finish_object(struct drm_i915_gem_object *obj)
 
 	interruptible = do_idling(dev_priv);
 
-	if (obj->sg_list) {
-		intel_gtt_unmap_memory(obj->sg_list, obj->num_sg);
-		obj->sg_list = NULL;
-	}
+	if (!obj->has_dma_mapping)
+		dma_unmap_sg(&dev->pdev->dev,
+			     obj->pages->sgl, obj->pages->nents,
+			     PCI_DMA_BIDIRECTIONAL);
 
 	undo_idling(dev_priv, interruptible);
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_tiling.c b/drivers/gpu/drm/i915/i915_gem_tiling.c
index b964df5..8093ecd 100644
--- a/drivers/gpu/drm/i915/i915_gem_tiling.c
+++ b/drivers/gpu/drm/i915/i915_gem_tiling.c
@@ -470,18 +470,20 @@ i915_gem_swizzle_page(struct page *page)
 void
 i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj)
 {
+	struct scatterlist *sg;
 	int page_count = obj->base.size >> PAGE_SHIFT;
 	int i;
 
 	if (obj->bit_17 == NULL)
 		return;
 
-	for (i = 0; i < page_count; i++) {
-		char new_bit_17 = page_to_phys(obj->pages[i]) >> 17;
+	for_each_sg(obj->pages->sgl, sg, page_count, i) {
+		struct page *page = sg_page(sg);
+		char new_bit_17 = page_to_phys(page) >> 17;
 		if ((new_bit_17 & 0x1) !=
 		    (test_bit(i, obj->bit_17) != 0)) {
-			i915_gem_swizzle_page(obj->pages[i]);
-			set_page_dirty(obj->pages[i]);
+			i915_gem_swizzle_page(page);
+			set_page_dirty(page);
 		}
 	}
 }
@@ -489,6 +491,7 @@ i915_gem_object_do_bit_17_swizzle(struct drm_i915_gem_object *obj)
 void
 i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj)
 {
+	struct scatterlist *sg;
 	int page_count = obj->base.size >> PAGE_SHIFT;
 	int i;
 
@@ -502,8 +505,9 @@ i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj)
 		}
 	}
 
-	for (i = 0; i < page_count; i++) {
-		if (page_to_phys(obj->pages[i]) & (1 << 17))
+	for_each_sg(obj->pages->sgl, sg, page_count, i) {
+		struct page *page = sg_page(sg);
+		if (page_to_phys(page) & (1 << 17))
 			__set_bit(i, obj->bit_17);
 		else
 			__clear_bit(i, obj->bit_17);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4153c75..2ec8ad7 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -860,20 +860,20 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 			 struct drm_i915_gem_object *src)
 {
 	struct drm_i915_error_object *dst;
-	int page, page_count;
+	int i, count;
 	u32 reloc_offset;
 
 	if (src == NULL || src->pages == NULL)
 		return NULL;
 
-	page_count = src->base.size / PAGE_SIZE;
+	count = src->base.size / PAGE_SIZE;
 
-	dst = kmalloc(sizeof(*dst) + page_count * sizeof(u32 *), GFP_ATOMIC);
+	dst = kmalloc(sizeof(*dst) + count * sizeof(u32 *), GFP_ATOMIC);
 	if (dst == NULL)
 		return NULL;
 
 	reloc_offset = src->gtt_offset;
-	for (page = 0; page < page_count; page++) {
+	for (i = 0; i < count; i++) {
 		unsigned long flags;
 		void *d;
 
@@ -896,30 +896,33 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 			memcpy_fromio(d, s, PAGE_SIZE);
 			io_mapping_unmap_atomic(s);
 		} else {
+			struct page *page;
 			void *s;
 
-			drm_clflush_pages(&src->pages[page], 1);
+			page = i915_gem_object_get_page(src, i);
+
+			drm_clflush_pages(&page, 1);
 
-			s = kmap_atomic(src->pages[page]);
+			s = kmap_atomic(page);
 			memcpy(d, s, PAGE_SIZE);
 			kunmap_atomic(s);
 
-			drm_clflush_pages(&src->pages[page], 1);
+			drm_clflush_pages(&page, 1);
 		}
 		local_irq_restore(flags);
 
-		dst->pages[page] = d;
+		dst->pages[i] = d;
 
 		reloc_offset += PAGE_SIZE;
 	}
-	dst->page_count = page_count;
+	dst->page_count = count;
 	dst->gtt_offset = src->gtt_offset;
 
 	return dst;
 
 unwind:
-	while (page--)
-		kfree(dst->pages[page]);
+	while (i--)
+		kfree(dst->pages[i]);
 	kfree(dst);
 	return NULL;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 80d8791..5fdd297 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -384,7 +384,7 @@ init_pipe_control(struct intel_ring_buffer *ring)
 		goto err_unref;
 
 	pc->gtt_offset = obj->gtt_offset;
-	pc->cpu_page =  kmap(obj->pages[0]);
+	pc->cpu_page =  kmap(sg_page(obj->pages->sgl));
 	if (pc->cpu_page == NULL)
 		goto err_unpin;
 
@@ -411,7 +411,8 @@ cleanup_pipe_control(struct intel_ring_buffer *ring)
 		return;
 
 	obj = pc->obj;
-	kunmap(obj->pages[0]);
+
+	kunmap(sg_page(obj->pages->sgl));
 	i915_gem_object_unpin(obj);
 	drm_gem_object_unreference(&obj->base);
 
@@ -946,7 +947,7 @@ static void cleanup_status_page(struct intel_ring_buffer *ring)
 	if (obj == NULL)
 		return;
 
-	kunmap(obj->pages[0]);
+	kunmap(sg_page(obj->pages->sgl));
 	i915_gem_object_unpin(obj);
 	drm_gem_object_unreference(&obj->base);
 	ring->status_page.obj = NULL;
@@ -973,7 +974,7 @@ static int init_status_page(struct intel_ring_buffer *ring)
 	}
 
 	ring->status_page.gfx_addr = obj->gtt_offset;
-	ring->status_page.page_addr = kmap(obj->pages[0]);
+	ring->status_page.page_addr = kmap(sg_page(obj->pages->sgl));
 	if (ring->status_page.page_addr == NULL) {
 		ret = -ENOMEM;
 		goto err_unpin;
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index d6b67bb..d5f0c16 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -1367,6 +1367,7 @@ extern int drm_remove_magic(struct drm_master *master, drm_magic_t magic);
 
 /* Cache management (drm_cache.c) */
 void drm_clflush_pages(struct page *pages[], unsigned long num_pages);
+void drm_clflush_sg(struct sg_table *st);
 void drm_clflush_virt_range(char *addr, unsigned long length);
 
 				/* Locking IOCTL support (drm_lock.h) */
diff --git a/include/drm/intel-gtt.h b/include/drm/intel-gtt.h
index 8e29d55..2e37e9f 100644
--- a/include/drm/intel-gtt.h
+++ b/include/drm/intel-gtt.h
@@ -30,16 +30,10 @@ void intel_gmch_remove(void);
 bool intel_enable_gtt(void);
 
 void intel_gtt_chipset_flush(void);
-void intel_gtt_unmap_memory(struct scatterlist *sg_list, int num_sg);
-void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries);
-int intel_gtt_map_memory(struct page **pages, unsigned int num_entries,
-			 struct scatterlist **sg_list, int *num_sg);
-void intel_gtt_insert_sg_entries(struct scatterlist *sg_list,
-				 unsigned int sg_len,
+void intel_gtt_insert_sg_entries(struct sg_table *st,
 				 unsigned int pg_start,
 				 unsigned int flags);
-void intel_gtt_insert_pages(unsigned int first_entry, unsigned int num_entries,
-			    struct page **pages, unsigned int flags);
+void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries);
 
 /* Special gtt memory types */
 #define AGP_DCACHE_MEMORY	1
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 13/29] drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (11 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 12/29] drm/i915: Replace the array of pages with a scatterlist Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 14/29] drm: Introduce drm_mm_create_block() Chris Wilson
                   ` (15 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

By providing a callback for when we need to bind the pages, and another
to release them again later, we can shorten the amount of time we hold the
foreign pages mapped and pinned, and importantly the dmabuf objects then
behave like any other normal object with respect to the shrinker and
memory management.
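
In outline, the two callbacks pair the map/unmap of the attachment with
the generic get_pages/put_pages hooks. A stand-alone sketch
(map_attachment/unmap_attachment are made-up stand-ins for the real
dma-buf calls used in the patch):

#include <stdio.h>

struct sg_table { int nents; };
struct attachment { struct sg_table sg; };

/* Stand-ins for dma_buf_map_attachment()/dma_buf_unmap_attachment(). */
static struct sg_table *map_attachment(struct attachment *a)
{
	printf("exporter maps %d entries for us\n", a->sg.nents);
	return &a->sg;
}

static void unmap_attachment(struct attachment *a, struct sg_table *sg)
{
	(void)a;
	(void)sg;
	puts("exporter mapping released");
}

struct object {
	struct attachment *import_attach;
	struct sg_table *pages;
};

/* The callback pair plugged into the object ops: foreign pages are only
 * mapped between get_pages and put_pages, i.e. while the object has users. */
static int dmabuf_get_pages(struct object *obj)
{
	obj->pages = map_attachment(obj->import_attach);
	return obj->pages ? 0 : -1;
}

static void dmabuf_put_pages(struct object *obj)
{
	unmap_attachment(obj->import_attach, obj->pages);
	obj->pages = NULL;
}

int main(void)
{
	struct attachment a = { { 8 } };
	struct object obj = { &a, NULL };

	dmabuf_get_pages(&obj);	/* first user appears */
	dmabuf_put_pages(&obj);	/* last user gone; nothing stays pinned */
	return 0;
}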

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |    1 -
 drivers/gpu/drm/i915/i915_gem.c        |   10 +++----
 drivers/gpu/drm/i915/i915_gem_dmabuf.c |   45 +++++++++++++++++++++-----------
 drivers/gpu/drm/i915/i915_gem_gtt.c    |    4 +--
 4 files changed, 37 insertions(+), 23 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d4d3b2a..74a3c6c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -993,7 +993,6 @@ struct drm_i915_gem_object {
 	int pages_pin_count;
 
 	/* prime dma-buf support */
-	struct sg_table *sg_table;
 	void *dma_buf_vmapping;
 	int vmapping_count;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 20e05c2..d0fcb61 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1450,7 +1450,7 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 {
 	const struct drm_i915_gem_object_ops *ops = obj->base.driver_private;
 
-	if (obj->sg_table || obj->pages == NULL)
+	if (obj->pages == NULL)
 		return 0;
 
 	if (obj->pages_pin_count)
@@ -1594,7 +1594,7 @@ i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
 	const struct drm_i915_gem_object_ops *ops = obj->base.driver_private;
 	int ret;
 
-	if (obj->sg_table || obj->pages)
+	if (obj->pages)
 		return 0;
 
 	BUG_ON(obj->pages_pin_count);
@@ -3680,9 +3680,6 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 
 	trace_i915_gem_object_destroy(obj);
 
-	if (gem_obj->import_attach)
-		drm_prime_gem_destroy(gem_obj, obj->sg_table);
-
 	if (obj->phys_obj)
 		i915_gem_detach_phys_object(dev, obj);
 
@@ -3704,6 +3701,9 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 
 	BUG_ON(obj->pages);
 
+	if (obj->base.import_attach)
+		drm_prime_gem_destroy(&obj->base, NULL);
+
 	drm_gem_object_release(&obj->base);
 	i915_gem_info_remove_obj(dev_priv, obj->base.size);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index b48c2a4..5e72e95 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -82,7 +82,8 @@ out:
 }
 
 static void i915_gem_unmap_dma_buf(struct dma_buf_attachment *attachment,
-			    struct sg_table *sg, enum dma_data_direction dir)
+				   struct sg_table *sg,
+				   enum dma_data_direction dir)
 {
 	dma_unmap_sg(attachment->dev, sg->sgl, sg->nents, dir);
 	sg_free_table(sg);
@@ -212,11 +213,35 @@ struct dma_buf *i915_gem_prime_export(struct drm_device *dev,
 	return dma_buf_export(obj, &i915_dmabuf_ops, obj->base.size, 0600);
 }
 
+static int i915_gem_object_get_pages_dmabuf(struct drm_i915_gem_object *obj)
+{
+	struct sg_table *sg;
+
+	sg = dma_buf_map_attachment(obj->base.import_attach, DMA_BIDIRECTIONAL);
+	if (IS_ERR(sg))
+		return PTR_ERR(sg);
+
+	obj->pages = sg;
+	obj->has_dma_mapping = true;
+	return 0;
+}
+
+static void i915_gem_object_put_pages_dmabuf(struct drm_i915_gem_object *obj)
+{
+	dma_buf_unmap_attachment(obj->base.import_attach,
+				 obj->pages, DMA_BIDIRECTIONAL);
+	obj->has_dma_mapping = false;
+}
+
+static const struct drm_i915_gem_object_ops i915_gem_object_dmabuf_ops = {
+	.get_pages = i915_gem_object_get_pages_dmabuf,
+	.put_pages = i915_gem_object_put_pages_dmabuf,
+};
+
 struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 					     struct dma_buf *dma_buf)
 {
 	struct dma_buf_attachment *attach;
-	struct sg_table *sg;
 	struct drm_i915_gem_object *obj;
 	int ret;
 
@@ -235,32 +260,24 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 	if (IS_ERR(attach))
 		return ERR_CAST(attach);
 
-	sg = dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
-	if (IS_ERR(sg)) {
-		ret = PTR_ERR(sg);
-		goto fail_detach;
-	}
 
 	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
 	if (obj == NULL) {
 		ret = -ENOMEM;
-		goto fail_unmap;
+		goto fail_detach;
 	}
 
 	ret = drm_gem_private_object_init(dev, &obj->base, dma_buf->size);
 	if (ret) {
 		kfree(obj);
-		goto fail_unmap;
+		goto fail_detach;
 	}
 
-	obj->has_dma_mapping = true;
-	obj->sg_table = sg;
+	i915_gem_object_init(obj, &i915_gem_object_dmabuf_ops);
 	obj->base.import_attach = attach;
 
 	return &obj->base;
 
-fail_unmap:
-	dma_buf_unmap_attachment(attach, sg, DMA_BIDIRECTIONAL);
 fail_detach:
 	dma_buf_detach(dma_buf, attach);
 	return ERR_PTR(ret);
@@ -275,8 +292,6 @@ void i915_gem_object_release_dmabuf(struct drm_i915_gem_object *obj)
 
 	attach = obj->base.import_attach;
 
-	dma_buf_unmap_attachment(attach, obj->pages, DMA_BIDIRECTIONAL);
-
 	dma_buf_detach(attach->dmabuf, attach);
 	dma_buf_put(attach->dmabuf);
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 410f883..ea4fc20 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -231,7 +231,7 @@ void i915_ppgtt_bind_object(struct i915_hw_ppgtt *ppgtt,
 	}
 
 	i915_ppgtt_insert_sg_entries(ppgtt,
-				     obj->sg_table ?: obj->pages,
+				     obj->pages,
 				     obj->gtt_space->start >> PAGE_SHIFT,
 				     pte_flags);
 }
@@ -322,7 +322,7 @@ void i915_gem_gtt_bind_object(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	unsigned int agp_type = cache_level_to_agp_type(dev, cache_level);
 
-	intel_gtt_insert_sg_entries(obj->sg_table ?: obj->pages,
+	intel_gtt_insert_sg_entries(obj->pages,
 				    obj->gtt_space->start >> PAGE_SHIFT,
 				    agp_type);
 	obj->has_global_gtt_mapping = 1;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 14/29] drm: Introduce drm_mm_create_block()
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (12 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 13/29] drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 15/29] drm/i915: Fix detection of stolen base for gen2 Chris Wilson
                   ` (14 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: Dave Airlie

To be used later by i915 to preallocate exact blocks of space from the
range manager.
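
As an illustration only (not part of this patch), a caller wanting to
claim a fixed range that the BIOS has already populated might use the
new helper along these lines; error handling is elided and the object
fields are the ones used later in this series:

    struct drm_mm_node *node;

    /* Claim the exact [gtt_offset, gtt_offset + size) range so that the
     * range manager never hands it out again; drm_mm_create_block()
     * returns NULL if that span is not currently a free hole.
     */
    node = drm_mm_create_block(&dev_priv->mm.gtt_space,
                               obj->gtt_offset,
                               obj->base.size,
                               false /* may sleep */);
    if (node == NULL)
        return -ENOSPC;
    obj->gtt_space = node;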

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Dave Airlie <airlied@redhat.com>
---
 drivers/gpu/drm/drm_mm.c |   49 ++++++++++++++++++++++++++++++++++++++++++++++
 include/drm/drm_mm.h     |    4 ++++
 2 files changed, 53 insertions(+)

diff --git a/drivers/gpu/drm/drm_mm.c b/drivers/gpu/drm/drm_mm.c
index 9bb82f7..5db8c20 100644
--- a/drivers/gpu/drm/drm_mm.c
+++ b/drivers/gpu/drm/drm_mm.c
@@ -161,6 +161,55 @@ static void drm_mm_insert_helper(struct drm_mm_node *hole_node,
 	}
 }
 
+struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
+					unsigned long start,
+					unsigned long size,
+					bool atomic)
+{
+	struct drm_mm_node *hole, *node;
+	unsigned long end = start + size;
+
+	list_for_each_entry(hole, &mm->hole_stack, hole_stack) {
+		unsigned long hole_start;
+		unsigned long hole_end;
+
+		BUG_ON(!hole->hole_follows);
+		hole_start = drm_mm_hole_node_start(hole);
+		hole_end = drm_mm_hole_node_end(hole);
+
+		if (hole_start > start || hole_end < end)
+			continue;
+
+		node = drm_mm_kmalloc(mm, atomic);
+		if (unlikely(node == NULL))
+			return NULL;
+
+		node->start = start;
+		node->size = size;
+		node->mm = mm;
+		node->allocated = 1;
+
+		INIT_LIST_HEAD(&node->hole_stack);
+		list_add(&node->node_list, &hole->node_list);
+
+		if (start == hole_start) {
+			hole->hole_follows = 0;
+			list_del_init(&hole->hole_stack);
+		}
+
+		node->hole_follows = 0;
+		if (end != hole_end) {
+			list_add(&node->hole_stack, &mm->hole_stack);
+			node->hole_follows = 1;
+		}
+
+		return node;
+	}
+
+	return NULL;
+}
+EXPORT_SYMBOL(drm_mm_create_block);
+
 struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *hole_node,
 					     unsigned long size,
 					     unsigned alignment,
diff --git a/include/drm/drm_mm.h b/include/drm/drm_mm.h
index 06d7f79..4020f96 100644
--- a/include/drm/drm_mm.h
+++ b/include/drm/drm_mm.h
@@ -102,6 +102,10 @@ static inline bool drm_mm_initialized(struct drm_mm *mm)
 /*
  * Basic range manager support (drm_mm.c)
  */
+extern struct drm_mm_node *drm_mm_create_block(struct drm_mm *mm,
+					       unsigned long start,
+					       unsigned long size,
+					       bool atomic);
 extern struct drm_mm_node *drm_mm_get_block_generic(struct drm_mm_node *node,
 						    unsigned long size,
 						    unsigned alignment,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 15/29] drm/i915: Fix detection of stolen base for gen2
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (13 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 14/29] drm: Introduce drm_mm_create_block() Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 16/29] drm/i915: Fix location of stolen memory register for SandyBridge+ Chris Wilson
                   ` (13 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

It was not until the G33 refresh that a PCI config register was
introduced which explicitly said where the stolen memory was. Prior to
865G there was not even a register that said where the end of usable
low memory was and where the stolen memory began (or ended, depending
upon the chipset). Before then, one had to look at the BIOS memory
maps to find the Top of Memory. Alas, that information is not exported
by arch/x86, and so we have to resort to disabling stolen memory on
gen2 for the time being.
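
To make the register arithmetic concrete (illustrative numbers, not
part of the patch): in the currently-disabled gen3 fallback below, the
TOLUD byte at config offset 0x9c carries bits 27-31 of the Top of Low
Usable DRAM in its bits 3-7, so a raw value of 0x10 decodes as
(0x10 >> 3) << 27 = 0x10000000 (256MiB); with an 8MiB stolen region
sitting immediately below that, the stolen base would be
0x10000000 - 0x800000 = 0x0f800000.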

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |    1 +
 drivers/gpu/drm/i915/i915_gem_stolen.c |   69 ++++++++++++++------------------
 2 files changed, 31 insertions(+), 39 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 74a3c6c..a25edcc 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -704,6 +704,7 @@ typedef struct drm_i915_private {
 		unsigned long gtt_start;
 		unsigned long gtt_mappable_end;
 		unsigned long gtt_end;
+		unsigned long stolen_base; /* limited to low memory (32-bit) */
 
 		struct io_mapping *gtt_mapping;
 		phys_addr_t gtt_base_addr;
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index ada2e90..a01ff74 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -43,56 +43,43 @@
  * for is a boon.
  */
 
-#define PTE_ADDRESS_MASK		0xfffff000
-#define PTE_ADDRESS_MASK_HIGH		0x000000f0 /* i915+ */
-#define PTE_MAPPING_TYPE_UNCACHED	(0 << 1)
-#define PTE_MAPPING_TYPE_DCACHE		(1 << 1) /* i830 only */
-#define PTE_MAPPING_TYPE_CACHED		(3 << 1)
-#define PTE_MAPPING_TYPE_MASK		(3 << 1)
-#define PTE_VALID			(1 << 0)
-
-/**
- * i915_stolen_to_phys - take an offset into stolen memory and turn it into
- *                       a physical one
- * @dev: drm device
- * @offset: address to translate
- *
- * Some chip functions require allocations from stolen space and need the
- * physical address of the memory in question.
- */
-static unsigned long i915_stolen_to_phys(struct drm_device *dev, u32 offset)
+static unsigned long i915_stolen_to_physical(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct pci_dev *pdev = dev_priv->bridge_dev;
 	u32 base;
 
-#if 0
 	/* On the machines I have tested the Graphics Base of Stolen Memory
-	 * is unreliable, so compute the base by subtracting the stolen memory
-	 * from the Top of Low Usable DRAM which is where the BIOS places
-	 * the graphics stolen memory.
+	 * is unreliable, so on those compute the base by subtracting the
+	 * stolen memory from the Top of Low Usable DRAM which is where the
+	 * BIOS places the graphics stolen memory.
+	 *
+	 * On gen2, the layout is slightly different with the Graphics Segment
+	 * immediately following Top of Memory (or Top of Usable DRAM). Note
+	 * it appears that TOUD is only reported by 865g, so we just use the
+	 * top of memory as determined by the e820 probe.
+	 *
+	 * XXX gen2 requires an unavailable symbol and 945gm fails with
+	 * its value of TOLUD.
 	 */
+	base = 0;
 	if (INTEL_INFO(dev)->gen > 3 || IS_G33(dev)) {
-		/* top 32bits are reserved = 0 */
+		/* Read Graphics Base of Stolen Memory directly */
 		pci_read_config_dword(pdev, 0xA4, &base);
-	} else {
-		/* XXX presume 8xx is the same as i915 */
-		pci_bus_read_config_dword(pdev->bus, 2, 0x5C, &base);
-	}
-#else
-	if (INTEL_INFO(dev)->gen > 3 || IS_G33(dev)) {
-		u16 val;
-		pci_read_config_word(pdev, 0xb0, &val);
-		base = val >> 4 << 20;
-	} else {
+#if 0
+	} else if (IS_GEN3(dev)) {
 		u8 val;
+		/* Stolen is immediately below Top of Low Usable DRAM */
 		pci_read_config_byte(pdev, 0x9c, &val);
 		base = val >> 3 << 27;
-	}
-	base -= dev_priv->mm.gtt->stolen_size;
+		base -= dev_priv->mm.gtt->stolen_size;
+	} else {
+		/* Stolen is immediately above Top of Memory */
+		base = max_low_pfn_mapped << PAGE_SHIFT;
 #endif
+	}
 
-	return base + offset;
+	return base;
 }
 
 static void i915_warn_stolen(struct drm_device *dev)
@@ -117,7 +104,7 @@ static void i915_setup_compression(struct drm_device *dev, int size)
 	if (!compressed_fb)
 		goto err;
 
-	cfb_base = i915_stolen_to_phys(dev, compressed_fb->start);
+	cfb_base = dev_priv->mm.stolen_base + compressed_fb->start;
 	if (!cfb_base)
 		goto err_fb;
 
@@ -130,7 +117,7 @@ static void i915_setup_compression(struct drm_device *dev, int size)
 		if (!compressed_llb)
 			goto err_fb;
 
-		ll_base = i915_stolen_to_phys(dev, compressed_llb->start);
+		ll_base = dev_priv->mm.stolen_base + compressed_llb->start;
 		if (!ll_base)
 			goto err_llb;
 	}
@@ -149,7 +136,7 @@ static void i915_setup_compression(struct drm_device *dev, int size)
 	}
 
 	DRM_DEBUG_KMS("FBC base 0x%08lx, ll base 0x%08lx, size %dM\n",
-		      cfb_base, ll_base, size >> 20);
+		      (long)cfb_base, (long)ll_base, size >> 20);
 	return;
 
 err_llb:
@@ -181,6 +168,10 @@ int i915_gem_init_stolen(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	unsigned long prealloc_size = dev_priv->mm.gtt->stolen_size;
 
+	dev_priv->mm.stolen_base = i915_stolen_to_physical(dev);
+	if (dev_priv->mm.stolen_base == 0)
+		return 0;
+
 	/* Basic memrange allocator for stolen space */
 	drm_mm_init(&dev_priv->mm.stolen, 0, prealloc_size);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 16/29] drm/i915: Fix location of stolen memory register for SandyBridge+
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (14 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 15/29] drm/i915: Fix detection of stolen base for gen2 Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-20 19:38   ` Daniel Vetter
  2012-08-11 14:41 ` [PATCH 17/29] drm/i915: Avoid clearing preallocated regions from the GTT Chris Wilson
                   ` (12 subsequent siblings)
  28 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

A few of the earlier registers were enlarged and so the Base Data of
Stolen Memory Register (BDSM) was pushed to 0xb0.
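
For example (illustrative value only): the low 12 bits of BDSM are used
for locking, so a raw read of, say, 0xcb800001 masks down via
base &= ~4095 to a stolen base of 0xcb800000.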

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_stolen.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index a01ff74..a528e4a 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -63,7 +63,11 @@ static unsigned long i915_stolen_to_physical(struct drm_device *dev)
 	 * its value of TOLUD.
 	 */
 	base = 0;
-	if (INTEL_INFO(dev)->gen > 3 || IS_G33(dev)) {
+	if (INTEL_INFO(dev)->gen >= 6) {
+		/* Read Base Data of Stolen Memory Register (BDSM) directly */
+		pci_read_config_dword(pdev, 0xB0, &base);
+		base &= ~4095; /* lower bits used for locking register */
+	} else if (INTEL_INFO(dev)->gen > 3 || IS_G33(dev)) {
 		/* Read Graphics Base of Stolen Memory directly */
 		pci_read_config_dword(pdev, 0xA4, &base);
 #if 0
@@ -172,6 +176,9 @@ int i915_gem_init_stolen(struct drm_device *dev)
 	if (dev_priv->mm.stolen_base == 0)
 		return 0;
 
+	DRM_DEBUG_KMS("found %d bytes of stolen memory at %08lx\n",
+		      dev_priv->mm.gtt->stolen_size, dev_priv->mm.stolen_base);
+
 	/* Basic memrange allocator for stolen space */
 	drm_mm_init(&dev_priv->mm.stolen, 0, prealloc_size);
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 17/29] drm/i915: Avoid clearing preallocated regions from the GTT
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (15 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 16/29] drm/i915: Fix location of stolen memory register for SandyBridge+ Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 18/29] drm/i915: Delay allocation of stolen space for FBC Chris Wilson
                   ` (11 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h     |    2 ++
 drivers/gpu/drm/i915/i915_gem_gtt.c |   35 ++++++++++++++++++++++++++++++++---
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a25edcc..0ce410a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -896,6 +896,8 @@ enum i915_cache_level {
 	I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */
 };
 
+#define I915_GTT_RESERVED ((struct drm_mm_node *)0x1)
+
 struct drm_i915_gem_object_ops {
 	int (*get_pages)(struct drm_i915_gem_object *);
 	void (*put_pages)(struct drm_i915_gem_object *);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index ea4fc20..528fd43 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -375,18 +375,47 @@ void i915_gem_init_global_gtt(struct drm_device *dev,
 			      unsigned long end)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	struct drm_mm_node *entry;
+	struct drm_i915_gem_object *obj;
 
-	/* Substract the guard page ... */
+	/* Subtract the guard page ... */
 	drm_mm_init(&dev_priv->mm.gtt_space, start, end - start - PAGE_SIZE);
 	if (!HAS_LLC(dev))
 		dev_priv->mm.gtt_space.color_adjust = i915_gtt_color_adjust;
 
+	/* Mark any preallocated objects as occupied */
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list) {
+		DRM_DEBUG_KMS("reserving preallocated space: %x + %zx\n",
+			      obj->gtt_offset, obj->base.size);
+
+		BUG_ON(obj->gtt_space != I915_GTT_RESERVED);
+		obj->gtt_space = drm_mm_create_block(&dev_priv->mm.gtt_space,
+						     obj->gtt_offset,
+						     obj->base.size,
+						     false);
+		obj->has_global_gtt_mapping = 1;
+	}
+
 	dev_priv->mm.gtt_start = start;
 	dev_priv->mm.gtt_mappable_end = mappable_end;
 	dev_priv->mm.gtt_end = end;
 	dev_priv->mm.gtt_total = end - start;
 	dev_priv->mm.mappable_gtt_total = min(end, mappable_end) - start;
 
-	/* ... but ensure that we clear the entire range. */
-	intel_gtt_clear_range(start / PAGE_SIZE, (end-start) / PAGE_SIZE);
+	/* Clear any non-preallocated blocks */
+	list_for_each_entry(entry, &dev_priv->mm.gtt_space.hole_stack, hole_stack) {
+		unsigned long hole_start = entry->start + entry->size;
+		unsigned long hole_end = list_entry(entry->node_list.next,
+						    struct drm_mm_node,
+						    node_list)->start;
+
+		DRM_DEBUG_KMS("clearing unused GTT space: [%lx, %lx]\n",
+			      hole_start, hole_end);
+
+		intel_gtt_clear_range(hole_start / PAGE_SIZE,
+				      (hole_end-hole_start) / PAGE_SIZE);
+	}
+
+	/* And finally clear the reserved guard page */
+	intel_gtt_clear_range(end / PAGE_SIZE - 1, 1);
 }
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 18/29] drm/i915: Delay allocation of stolen space for FBC
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (16 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 17/29] drm/i915: Avoid clearing preallocated regions from the GTT Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-20 19:51   ` Daniel Vetter
  2012-08-11 14:41 ` [PATCH 19/29] drm/i915: Allow objects to be created with no backing pages, but stolen space Chris Wilson
                   ` (10 subsequent siblings)
  28 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

As we may wish to wrap regions preallocated by the BIOS, we need to do
that before carving out contiguous chunks of stolen space for FBC.
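
Roughly, the resulting ordering is (simplified sketch, error paths
omitted, wrap_bios_preallocated() is a hypothetical placeholder):

    /* driver load: */
    i915_gem_init_stolen(dev);    /* just drm_mm_init() the stolen range */
    wrap_bios_preallocated(dev);  /* claim any BIOS framebuffer etc.     */

    /* later, once FBC is actually considered, intel_update_fbc() calls
     * i915_gem_stolen_setup_compression(), which only now carves the
     * CFB (and, if needed, the LL buffer) out of the remaining stolen
     * space. */
    intel_update_fbc(dev);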

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |    1 +
 drivers/gpu/drm/i915/i915_gem_stolen.c |  114 +++++++++++++++++---------------
 drivers/gpu/drm/i915/intel_display.c   |    3 +
 drivers/gpu/drm/i915/intel_pm.c        |   13 ++--
 4 files changed, 70 insertions(+), 61 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0ce410a..0fec169 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1486,6 +1486,7 @@ int i915_gem_evict_everything(struct drm_device *dev);
 
 /* i915_gem_stolen.c */
 int i915_gem_init_stolen(struct drm_device *dev);
+int i915_gem_stolen_setup_compression(struct drm_device *dev);
 void i915_gem_cleanup_stolen(struct drm_device *dev);
 
 /* i915_gem_tiling.c */
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index a528e4a..00b1c1d 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -86,21 +86,13 @@ static unsigned long i915_stolen_to_physical(struct drm_device *dev)
 	return base;
 }
 
-static void i915_warn_stolen(struct drm_device *dev)
-{
-	DRM_INFO("not enough stolen space for compressed buffer, disabling\n");
-	DRM_INFO("hint: you may be able to increase stolen memory size in the BIOS to avoid this\n");
-}
-
-static void i915_setup_compression(struct drm_device *dev, int size)
+static int i915_setup_compression(struct drm_device *dev, int size)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_mm_node *compressed_fb, *uninitialized_var(compressed_llb);
-	unsigned long cfb_base;
-	unsigned long ll_base = 0;
 
-	/* Just in case the BIOS is doing something questionable. */
-	intel_disable_fbc(dev);
+	DRM_DEBUG_KMS("reserving %d bytes of contiguous stolen space for FBC\n",
+		      size);
 
 	compressed_fb = drm_mm_search_free(&dev_priv->mm.stolen, size, 4096, 0);
 	if (compressed_fb)
@@ -108,11 +100,11 @@ static void i915_setup_compression(struct drm_device *dev, int size)
 	if (!compressed_fb)
 		goto err;
 
-	cfb_base = dev_priv->mm.stolen_base + compressed_fb->start;
-	if (!cfb_base)
-		goto err_fb;
-
-	if (!(IS_GM45(dev) || HAS_PCH_SPLIT(dev))) {
+	if (HAS_PCH_SPLIT(dev))
+		I915_WRITE(ILK_DPFC_CB_BASE, compressed_fb->start);
+	else if (IS_GM45(dev)) {
+		I915_WRITE(DPFC_CB_BASE, compressed_fb->start);
+	} else {
 		compressed_llb = drm_mm_search_free(&dev_priv->mm.stolen,
 						    4096, 4096, 0);
 		if (compressed_llb)
@@ -121,56 +113,82 @@ static void i915_setup_compression(struct drm_device *dev, int size)
 		if (!compressed_llb)
 			goto err_fb;
 
-		ll_base = dev_priv->mm.stolen_base + compressed_llb->start;
-		if (!ll_base)
-			goto err_llb;
-	}
+		dev_priv->compressed_llb = compressed_llb;
 
-	dev_priv->cfb_size = size;
+		I915_WRITE(FBC_CFB_BASE,
+			   dev_priv->mm.stolen_base + compressed_fb->start);
+		I915_WRITE(FBC_LL_BASE,
+			   dev_priv->mm.stolen_base + compressed_llb->start);
+	}
 
 	dev_priv->compressed_fb = compressed_fb;
-	if (HAS_PCH_SPLIT(dev))
-		I915_WRITE(ILK_DPFC_CB_BASE, compressed_fb->start);
-	else if (IS_GM45(dev)) {
-		I915_WRITE(DPFC_CB_BASE, compressed_fb->start);
-	} else {
-		I915_WRITE(FBC_CFB_BASE, cfb_base);
-		I915_WRITE(FBC_LL_BASE, ll_base);
-		dev_priv->compressed_llb = compressed_llb;
-	}
+	dev_priv->cfb_size = size;
 
-	DRM_DEBUG_KMS("FBC base 0x%08lx, ll base 0x%08lx, size %dM\n",
-		      (long)cfb_base, (long)ll_base, size >> 20);
-	return;
+	return size;
 
-err_llb:
-	drm_mm_put_block(compressed_llb);
 err_fb:
 	drm_mm_put_block(compressed_fb);
 err:
 	dev_priv->no_fbc_reason = FBC_STOLEN_TOO_SMALL;
-	i915_warn_stolen(dev);
+	DRM_INFO("not enough stolen space for compressed buffer (need %d bytes), disabling\n", size);
+	DRM_INFO("hint: you may be able to increase stolen memory size in the BIOS to avoid this\n");
+	return 0;
+}
+
+int i915_gem_stolen_setup_compression(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_mm_node *entry;
+	unsigned long size;
+
+	if (dev_priv->mm.stolen_base == 0)
+		return 0;
+
+	if (dev_priv->cfb_size)
+		return dev_priv->cfb_size;
+
+	/* Try to set up FBC with a reasonable compressed buffer size */
+	size = 0;
+	list_for_each_entry(entry, &dev_priv->mm.stolen.hole_stack, hole_stack) {
+		unsigned long hole_start = entry->start + entry->size;
+		unsigned long hole_end = list_entry(entry->node_list.next,
+						    struct drm_mm_node,
+						    node_list)->start;
+		unsigned long hole_size = hole_end - hole_start;
+		if (hole_size > size)
+			size = hole_size;
+	}
+
+	/* Try to get a 32M buffer... */
+	if (size > (36*1024*1024))
+		size = 32*1024*1024;
+	else /* fall back to 3/4 of the stolen space */
+		size = size * 3 / 4;
+
+	return i915_setup_compression(dev, size);
 }
 
 static void i915_cleanup_compression(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	drm_mm_put_block(dev_priv->compressed_fb);
+	if (dev_priv->compressed_fb)
+		drm_mm_put_block(dev_priv->compressed_fb);
+
 	if (dev_priv->compressed_llb)
 		drm_mm_put_block(dev_priv->compressed_llb);
+
+	dev_priv->cfb_size = 0;
 }
 
 void i915_gem_cleanup_stolen(struct drm_device *dev)
 {
-	if (I915_HAS_FBC(dev) && i915_powersave)
-		i915_cleanup_compression(dev);
+	i915_cleanup_compression(dev);
 }
 
 int i915_gem_init_stolen(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long prealloc_size = dev_priv->mm.gtt->stolen_size;
 
 	dev_priv->mm.stolen_base = i915_stolen_to_physical(dev);
 	if (dev_priv->mm.stolen_base == 0)
@@ -180,21 +198,7 @@ int i915_gem_init_stolen(struct drm_device *dev)
 		      dev_priv->mm.gtt->stolen_size, dev_priv->mm.stolen_base);
 
 	/* Basic memrange allocator for stolen space */
-	drm_mm_init(&dev_priv->mm.stolen, 0, prealloc_size);
-
-	/* Try to set up FBC with a reasonable compressed buffer size */
-	if (I915_HAS_FBC(dev) && i915_powersave) {
-		int cfb_size;
-
-		/* Leave 1M for line length buffer & misc. */
-
-		/* Try to get a 32M buffer... */
-		if (prealloc_size > (36*1024*1024))
-			cfb_size = 32*1024*1024;
-		else /* fall back to 7/8 of the stolen space */
-			cfb_size = prealloc_size * 7 / 8;
-		i915_setup_compression(dev, cfb_size);
-	}
+	drm_mm_init(&dev_priv->mm.stolen, 0, dev_priv->mm.gtt->stolen_size);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index e3afe96..47cec72 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -7173,6 +7173,9 @@ void intel_modeset_init(struct drm_device *dev)
 	/* Just disable it once at startup */
 	i915_disable_vga(dev);
 	intel_setup_outputs(dev);
+
+	/* Just in case the BIOS is doing something questionable. */
+	intel_disable_fbc(dev);
 }
 
 void intel_modeset_gem_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 3021c18..6f0f498 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -438,12 +438,6 @@ void intel_update_fbc(struct drm_device *dev)
 		dev_priv->no_fbc_reason = FBC_MODULE_PARAM;
 		goto out_disable;
 	}
-	if (intel_fb->obj->base.size > dev_priv->cfb_size) {
-		DRM_DEBUG_KMS("framebuffer too large, disabling "
-			      "compression\n");
-		dev_priv->no_fbc_reason = FBC_STOLEN_TOO_SMALL;
-		goto out_disable;
-	}
 	if ((crtc->mode.flags & DRM_MODE_FLAG_INTERLACE) ||
 	    (crtc->mode.flags & DRM_MODE_FLAG_DBLSCAN)) {
 		DRM_DEBUG_KMS("mode incompatible with compression, "
@@ -477,6 +471,13 @@ void intel_update_fbc(struct drm_device *dev)
 	if (in_dbg_master())
 		goto out_disable;
 
+	if (intel_fb->obj->base.size > i915_gem_stolen_setup_compression(dev)) {
+		DRM_DEBUG_KMS("framebuffer too large, disabling "
+			      "compression\n");
+		dev_priv->no_fbc_reason = FBC_STOLEN_TOO_SMALL;
+		goto out_disable;
+	}
+
 	/* If the scanout has not changed, don't modify the FBC settings.
 	 * Note that we make the fundamental assumption that the fb->obj
 	 * cannot be unpinned (and have its GTT offset and fence revoked)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 19/29] drm/i915: Allow objects to be created with no backing pages, but stolen space
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (17 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 18/29] drm/i915: Delay allocation of stolen space for FBC Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 20/29] drm/i915: Differentiate between prime and stolen objects Chris Wilson
                   ` (9 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

In order to accommodate objects that are not backed by struct pages, but
instead point into a contiguous region of stolen space, we need to make
various changes to avoid dereferencing obj->pages or obj->base.filp.

First, introduce a marker for the stolen object that specifies its
offset into the stolen region and implies that it has no backing pages.
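
A sketch of how consumers are expected to use the marker (the two
helpers named here are hypothetical, purely for illustration):

    if (obj->stolen) {
        /* contiguous allocation: address the backing storage through
         * the stolen base plus the node offset, not struct pages */
        unsigned long phys = dev_priv->mm.stolen_base +
                             obj->stolen->start;

        access_stolen_range(phys, obj->base.size);  /* hypothetical */
    } else {
        access_shmem_pages(obj);                    /* hypothetical */
    }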

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c |    2 ++
 drivers/gpu/drm/i915/i915_drv.h     |    2 ++
 2 files changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 229cf27..fd0ca3e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -125,6 +125,8 @@ describe_obj(struct seq_file *m, struct drm_i915_gem_object *obj)
 	if (obj->gtt_space != NULL)
 		seq_printf(m, " (gtt offset: %08x, size: %08x)",
 			   obj->gtt_offset, (unsigned int)obj->gtt_space->size);
+	if (obj->stolen)
+		seq_printf(m, " (stolen: %08lx)", obj->stolen->start);
 	if (obj->pin_mappable || obj->fault_mappable) {
 		char s[3], *t = s;
 		if (obj->pin_mappable)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 0fec169..d6cf758 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -908,6 +908,8 @@ struct drm_i915_gem_object {
 
 	/** Current space allocated to this object in the GTT, if any. */
 	struct drm_mm_node *gtt_space;
+	/** Stolen memory for this object, instead of being backed by shmem. */
+	struct drm_mm_node *stolen;
 	struct list_head gtt_list;
 
 	/** This object's place on the active/inactive lists */
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 20/29] drm/i915: Differentiate between prime and stolen objects
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (18 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 19/29] drm/i915: Allow objects to be created with no backing pages, but stolen space Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 21/29] drm/i915: Support readback of stolen objects upon error Chris Wilson
                   ` (8 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Stolen objects also share the property that they have no backing shmemfs
filp, but they can be used with pwrite/pread/gtt-mapping.
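
In other words (a sketch of the reasoning rather than a literal hunk):
the old !obj->base.filp test in pread/pwrite would now also reject
stolen objects, so the check becomes an explicit test of the dma-buf
import attachment:

    /* before: "no filp" was taken to mean "imported via prime"; stolen
     * objects have no filp either, yet do support pread/pwrite */
    if (!obj->base.filp)
        return -EINVAL;

    /* after: only genuine prime imports are rejected */
    if (i915_gem_object_is_prime(obj))
        return -EINVAL;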

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h |    5 +++++
 drivers/gpu/drm/i915/i915_gem.c |    4 ++--
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d6cf758..02de587 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1044,6 +1044,11 @@ struct drm_i915_gem_object {
 	atomic_t pending_flip;
 };
 
+inline static bool i915_gem_object_is_prime(struct drm_i915_gem_object *obj)
+{
+	return obj->base.import_attach != NULL;
+}
+
 #define to_intel_bo(x) container_of(x, struct drm_i915_gem_object, base)
 
 /**
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d0fcb61..552f95b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -553,7 +553,7 @@ i915_gem_pread_ioctl(struct drm_device *dev, void *data,
 	/* prime objects have no backing filp to GEM pread/pwrite
 	 * pages from.
 	 */
-	if (!obj->base.filp) {
+	if (i915_gem_object_is_prime(obj)) {
 		ret = -EINVAL;
 		goto out;
 	}
@@ -902,7 +902,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data,
 	/* prime objects have no backing filp to GEM pread/pwrite
 	 * pages from.
 	 */
-	if (!obj->base.filp) {
+	if (i915_gem_object_is_prime(obj)) {
 		ret = -EINVAL;
 		goto out;
 	}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 21/29] drm/i915: Support readback of stolen objects upon error
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (19 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 20/29] drm/i915: Differentiate between prime and stolen objects Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 22/29] drm/i915: Handle stolen objects in pwrite Chris Wilson
                   ` (7 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_irq.c |    8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 2ec8ad7..f59eace 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -895,6 +895,14 @@ i915_error_object_create(struct drm_i915_private *dev_priv,
 						     reloc_offset);
 			memcpy_fromio(d, s, PAGE_SIZE);
 			io_mapping_unmap_atomic(s);
+		} else if (src->stolen) {
+			unsigned long offset;
+
+			offset = dev_priv->mm.stolen_base;
+			offset += src->stolen->start;
+			offset += i << PAGE_SHIFT;
+
+			memcpy_fromio(d, (void *)offset, PAGE_SIZE);
 		} else {
 			struct page *page;
 			void *s;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 22/29] drm/i915: Handle stolen objects in pwrite
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (20 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 21/29] drm/i915: Support readback of stolen objects upon error Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-20 19:56   ` Daniel Vetter
  2012-08-11 14:41 ` [PATCH 23/29] drm/i915: Handle stolen objects for pread Chris Wilson
                   ` (6 subsequent siblings)
  28 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |  169 +++++++++++++++++++++++++--------------
 1 file changed, 111 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 552f95b..a2fb2aa 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -664,19 +664,17 @@ out:
  * needs_clflush_before is set and flushes out any written cachelines after
  * writing if needs_clflush is set. */
 static int
-shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
+shmem_pwrite_fast(char *vaddr, int shmem_page_offset, int page_length,
 		  char __user *user_data,
 		  bool page_do_bit17_swizzling,
 		  bool needs_clflush_before,
 		  bool needs_clflush_after)
 {
-	char *vaddr;
 	int ret;
 
 	if (unlikely(page_do_bit17_swizzling))
 		return -EINVAL;
 
-	vaddr = kmap_atomic(page);
 	if (needs_clflush_before)
 		drm_clflush_virt_range(vaddr + shmem_page_offset,
 				       page_length);
@@ -686,7 +684,6 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
 	if (needs_clflush_after)
 		drm_clflush_virt_range(vaddr + shmem_page_offset,
 				       page_length);
-	kunmap_atomic(vaddr);
 
 	return ret ? -EFAULT : 0;
 }
@@ -694,16 +691,14 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
 /* Only difference to the fast-path function is that this can handle bit17
  * and uses non-atomic copy and kmap functions. */
 static int
-shmem_pwrite_slow(struct page *page, int shmem_page_offset, int page_length,
+shmem_pwrite_slow(char *vaddr, int shmem_page_offset, int page_length,
 		  char __user *user_data,
 		  bool page_do_bit17_swizzling,
 		  bool needs_clflush_before,
 		  bool needs_clflush_after)
 {
-	char *vaddr;
 	int ret;
 
-	vaddr = kmap(page);
 	if (unlikely(needs_clflush_before || page_do_bit17_swizzling))
 		shmem_clflush_swizzled_range(vaddr + shmem_page_offset,
 					     page_length,
@@ -720,7 +715,6 @@ shmem_pwrite_slow(struct page *page, int shmem_page_offset, int page_length,
 		shmem_clflush_swizzled_range(vaddr + shmem_page_offset,
 					     page_length,
 					     page_do_bit17_swizzling);
-	kunmap(page);
 
 	return ret ? -EFAULT : 0;
 }
@@ -731,6 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		      struct drm_i915_gem_pwrite *args,
 		      struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	ssize_t remain;
 	loff_t offset;
 	char __user *user_data;
@@ -770,74 +765,132 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 	if (ret)
 		return ret;
 
-	i915_gem_object_pin_pages(obj);
-
 	offset = args->offset;
 	obj->dirty = 1;
 
-	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
-		struct page *page;
-		int partial_cacheline_write;
+	if (obj->stolen) {
+		while (remain > 0) {
+			char *vaddr;
+			int partial_cacheline_write;
 
-		if (i < offset >> PAGE_SHIFT)
-			continue;
+			/* Operation in this page
+			 *
+			 * shmem_page_offset = offset within page in shmem file
+			 * page_length = bytes to copy for this page
+			 */
+			shmem_page_offset = offset_in_page(offset);
 
-		if (remain <= 0)
-			break;
+			page_length = remain;
+			if ((shmem_page_offset + page_length) > PAGE_SIZE)
+				page_length = PAGE_SIZE - shmem_page_offset;
 
-		/* Operation in this page
-		 *
-		 * shmem_page_offset = offset within page in shmem file
-		 * page_length = bytes to copy for this page
-		 */
-		shmem_page_offset = offset_in_page(offset);
+			/* If we don't overwrite a cacheline completely we need to be
+			 * careful to have up-to-date data by first clflushing. Don't
+			 * overcomplicate things and flush the entire patch. */
+			partial_cacheline_write = needs_clflush_before &&
+				((shmem_page_offset | page_length)
+				 & (boot_cpu_data.x86_clflush_size - 1));
 
-		page_length = remain;
-		if ((shmem_page_offset + page_length) > PAGE_SIZE)
-			page_length = PAGE_SIZE - shmem_page_offset;
+			vaddr = (char *)(dev_priv->mm.stolen_base + obj->stolen->start + offset);
+			page_do_bit17_swizzling = obj_do_bit17_swizzling &&
+				((uintptr_t)vaddr & (1 << 17)) != 0;
 
-		/* If we don't overwrite a cacheline completely we need to be
-		 * careful to have up-to-date data by first clflushing. Don't
-		 * overcomplicate things and flush the entire patch. */
-		partial_cacheline_write = needs_clflush_before &&
-			((shmem_page_offset | page_length)
-				& (boot_cpu_data.x86_clflush_size - 1));
+			ret = shmem_pwrite_fast(vaddr, shmem_page_offset, page_length,
+						user_data, page_do_bit17_swizzling,
+						partial_cacheline_write,
+						needs_clflush_after);
 
-		page = sg_page(sg);
-		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
-			(page_to_phys(page) & (1 << 17)) != 0;
+			if (ret == 0)
+				goto next_stolen;
 
-		ret = shmem_pwrite_fast(page, shmem_page_offset, page_length,
-					user_data, page_do_bit17_swizzling,
-					partial_cacheline_write,
-					needs_clflush_after);
-		if (ret == 0)
-			goto next_page;
+			hit_slowpath = 1;
+			mutex_unlock(&dev->struct_mutex);
 
-		hit_slowpath = 1;
-		mutex_unlock(&dev->struct_mutex);
-		ret = shmem_pwrite_slow(page, shmem_page_offset, page_length,
-					user_data, page_do_bit17_swizzling,
-					partial_cacheline_write,
-					needs_clflush_after);
+			ret = shmem_pwrite_slow(vaddr, shmem_page_offset, page_length,
+						user_data, page_do_bit17_swizzling,
+						partial_cacheline_write,
+						needs_clflush_after);
 
-		mutex_lock(&dev->struct_mutex);
+			mutex_lock(&dev->struct_mutex);
+			if (ret)
+				goto out;
 
-next_page:
-		set_page_dirty(page);
-		mark_page_accessed(page);
+next_stolen:
+			remain -= page_length;
+			user_data += page_length;
+			offset += page_length;
+		}
+	} else {
+		i915_gem_object_pin_pages(obj);
 
-		if (ret)
-			goto out;
+		for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
+			struct page *page;
+			char *vaddr;
+			int partial_cacheline_write;
 
-		remain -= page_length;
-		user_data += page_length;
-		offset += page_length;
+			if (i < offset >> PAGE_SHIFT)
+				continue;
+
+			if (remain <= 0)
+				break;
+
+			/* Operation in this page
+			 *
+			 * shmem_page_offset = offset within page in shmem file
+			 * page_length = bytes to copy for this page
+			 */
+			shmem_page_offset = offset_in_page(offset);
+
+			page_length = remain;
+			if ((shmem_page_offset + page_length) > PAGE_SIZE)
+				page_length = PAGE_SIZE - shmem_page_offset;
+
+			/* If we don't overwrite a cacheline completely we need to be
+			 * careful to have up-to-date data by first clflushing. Don't
+			 * overcomplicate things and flush the entire patch. */
+			partial_cacheline_write = needs_clflush_before &&
+				((shmem_page_offset | page_length)
+				 & (boot_cpu_data.x86_clflush_size - 1));
+
+			page = sg_page(sg);
+			page_do_bit17_swizzling = obj_do_bit17_swizzling &&
+				(page_to_phys(page) & (1 << 17)) != 0;
+
+			vaddr = kmap_atomic(page);
+			ret = shmem_pwrite_fast(vaddr, shmem_page_offset, page_length,
+						user_data, page_do_bit17_swizzling,
+						partial_cacheline_write,
+						needs_clflush_after);
+
+			kunmap_atomic(vaddr);
+
+			if (ret == 0)
+				goto next_page;
+
+			hit_slowpath = 1;
+			mutex_unlock(&dev->struct_mutex);
+
+			vaddr = kmap(page);
+			ret = shmem_pwrite_slow(vaddr, shmem_page_offset, page_length,
+						user_data, page_do_bit17_swizzling,
+						partial_cacheline_write,
+						needs_clflush_after);
+			kunmap(page);
+
+			mutex_lock(&dev->struct_mutex);
+			if (ret)
+				goto out_unpin;
+
+next_page:
+			remain -= page_length;
+			user_data += page_length;
+			offset += page_length;
+		}
+out_unpin:
+		i915_gem_object_unpin_pages(obj);
 	}
 
 out:
-	i915_gem_object_unpin_pages(obj);
-
 	if (hit_slowpath) {
 		/* Fixup: Kill any reinstated backing storage pages */
 		if (obj->madv == __I915_MADV_PURGED)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 23/29] drm/i915: Handle stolen objects for pread
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (21 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 22/29] drm/i915: Handle stolen objects in pwrite Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 24/29] drm/i915: Introduce i915_gem_object_create_stolen() Chris Wilson
                   ` (5 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c |  175 ++++++++++++++++++++++++++-------------
 1 file changed, 116 insertions(+), 59 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a2fb2aa..a29c259 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -324,24 +324,21 @@ __copy_from_user_swizzled(char *gpu_vaddr, int gpu_offset,
  * Flushes invalid cachelines before reading the target if
  * needs_clflush is set. */
 static int
-shmem_pread_fast(struct page *page, int shmem_page_offset, int page_length,
+shmem_pread_fast(char *vaddr, int shmem_page_offset, int page_length,
 		 char __user *user_data,
 		 bool page_do_bit17_swizzling, bool needs_clflush)
 {
-	char *vaddr;
 	int ret;
 
 	if (unlikely(page_do_bit17_swizzling))
 		return -EINVAL;
 
-	vaddr = kmap_atomic(page);
 	if (needs_clflush)
 		drm_clflush_virt_range(vaddr + shmem_page_offset,
 				       page_length);
 	ret = __copy_to_user_inatomic(user_data,
 				      vaddr + shmem_page_offset,
 				      page_length);
-	kunmap_atomic(vaddr);
 
 	return ret ? -EFAULT : 0;
 }
@@ -371,14 +368,12 @@ shmem_clflush_swizzled_range(char *addr, unsigned long length,
 /* Only difference to the fast-path function is that this can handle bit17
  * and uses non-atomic copy and kmap functions. */
 static int
-shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
+shmem_pread_slow(char *vaddr, int shmem_page_offset, int page_length,
 		 char __user *user_data,
 		 bool page_do_bit17_swizzling, bool needs_clflush)
 {
-	char *vaddr;
 	int ret;
 
-	vaddr = kmap(page);
 	if (needs_clflush)
 		shmem_clflush_swizzled_range(vaddr + shmem_page_offset,
 					     page_length,
@@ -392,7 +387,6 @@ shmem_pread_slow(struct page *page, int shmem_page_offset, int page_length,
 		ret = __copy_to_user(user_data,
 				     vaddr + shmem_page_offset,
 				     page_length);
-	kunmap(page);
 
 	return ret ? - EFAULT : 0;
 }
@@ -403,6 +397,7 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		     struct drm_i915_gem_pread *args,
 		     struct drm_file *file)
 {
+	struct drm_i915_private *dev_priv = dev->dev_private;
 	char __user *user_data;
 	ssize_t remain;
 	loff_t offset;
@@ -433,76 +428,138 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		}
 	}
 
-	ret = i915_gem_object_get_pages(obj);
-	if (ret)
-		return ret;
+	offset = args->offset;
 
-	i915_gem_object_pin_pages(obj);
+	if (obj->stolen) {
+		char *vaddr;
 
-	offset = args->offset;
+		vaddr = (char *)dev_priv->mm.stolen_base;
+		vaddr += obj->stolen->start + offset;
 
-	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
-		struct page *page;
+		shmem_page_offset = offset_in_page(offset);
+		while (remain > 0) {
+			/* Operation in this page
+			 *
+			 * shmem_page_offset = offset within page in shmem file
+			 * page_length = bytes to copy for this page
+			 */
+			page_length = remain;
+			if ((shmem_page_offset + page_length) > PAGE_SIZE)
+				page_length = PAGE_SIZE - shmem_page_offset;
 
-		if (i < offset >> PAGE_SHIFT)
-			continue;
+			page_do_bit17_swizzling = obj_do_bit17_swizzling &&
+				((uintptr_t)vaddr & (1 << 17)) != 0;
 
-		if (remain <= 0)
-			break;
+			ret = shmem_pread_fast(vaddr, shmem_page_offset, page_length,
+					       user_data, page_do_bit17_swizzling,
+					       needs_clflush);
+			if (ret == 0)
+				goto next_stolen;
 
-		/* Operation in this page
-		 *
-		 * shmem_page_offset = offset within page in shmem file
-		 * page_length = bytes to copy for this page
-		 */
-		shmem_page_offset = offset_in_page(offset);
-		page_length = remain;
-		if ((shmem_page_offset + page_length) > PAGE_SIZE)
-			page_length = PAGE_SIZE - shmem_page_offset;
+			hit_slowpath = 1;
+			mutex_unlock(&dev->struct_mutex);
 
-		page = sg_page(sg);
-		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
-			(page_to_phys(page) & (1 << 17)) != 0;
+			if (!prefaulted) {
+				ret = fault_in_multipages_writeable(user_data, remain);
+				/* Userspace is tricking us, but we've already clobbered
+				 * its pages with the prefault and promised to write the
+				 * data up to the first fault. Hence ignore any errors
+				 * and just continue. */
+				(void)ret;
+				prefaulted = 1;
+			}
 
-		ret = shmem_pread_fast(page, shmem_page_offset, page_length,
-				       user_data, page_do_bit17_swizzling,
-				       needs_clflush);
-		if (ret == 0)
-			goto next_page;
+			ret = shmem_pread_slow(vaddr, shmem_page_offset, page_length,
+					       user_data, page_do_bit17_swizzling,
+					       needs_clflush);
 
-		hit_slowpath = 1;
-		mutex_unlock(&dev->struct_mutex);
+			mutex_lock(&dev->struct_mutex);
+			if (ret)
+				goto out;
 
-		if (!prefaulted) {
-			ret = fault_in_multipages_writeable(user_data, remain);
-			/* Userspace is tricking us, but we've already clobbered
-			 * its pages with the prefault and promised to write the
-			 * data up to the first fault. Hence ignore any errors
-			 * and just continue. */
-			(void)ret;
-			prefaulted = 1;
+next_stolen:
+			remain -= page_length;
+			user_data += page_length;
+			vaddr += page_length;
+			shmem_page_offset = 0;
 		}
+	} else {
+		ret = i915_gem_object_get_pages(obj);
+		if (ret)
+			return ret;
 
-		ret = shmem_pread_slow(page, shmem_page_offset, page_length,
-				       user_data, page_do_bit17_swizzling,
-				       needs_clflush);
+		i915_gem_object_pin_pages(obj);
 
-		mutex_lock(&dev->struct_mutex);
+		for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
+			char *vaddr;
+			struct page *page;
 
-next_page:
-		mark_page_accessed(page);
+			if (i < offset >> PAGE_SHIFT)
+				continue;
 
-		if (ret)
-			goto out;
+			if (remain <= 0)
+				break;
 
-		remain -= page_length;
-		user_data += page_length;
-		offset += page_length;
+			/* Operation in this page
+			 *
+			 * shmem_page_offset = offset within page in shmem file
+			 * page_length = bytes to copy for this page
+			 */
+			shmem_page_offset = offset_in_page(offset);
+			page_length = remain;
+			if ((shmem_page_offset + page_length) > PAGE_SIZE)
+				page_length = PAGE_SIZE - shmem_page_offset;
+
+			page = sg_page(sg);
+			mark_page_accessed(page);
+
+			page_do_bit17_swizzling = obj_do_bit17_swizzling &&
+				(page_to_phys(page) & (1 << 17)) != 0;
+
+			vaddr = kmap_atomic(page);
+			ret = shmem_pread_fast(vaddr, shmem_page_offset, page_length,
+					       user_data, page_do_bit17_swizzling,
+					       needs_clflush);
+			kunmap_atomic(vaddr);
+
+			if (ret == 0)
+				goto next_page;
+
+			hit_slowpath = 1;
+			mutex_unlock(&dev->struct_mutex);
+
+			if (!prefaulted) {
+				ret = fault_in_multipages_writeable(user_data, remain);
+				/* Userspace is tricking us, but we've already clobbered
+				 * its pages with the prefault and promised to write the
+				 * data up to the first fault. Hence ignore any errors
+				 * and just continue. */
+				(void)ret;
+				prefaulted = 1;
+			}
+
+			vaddr = kmap(page);
+			ret = shmem_pread_slow(vaddr, shmem_page_offset, page_length,
+					       user_data, page_do_bit17_swizzling,
+					       needs_clflush);
+			kunmap(page);
+
+			mutex_lock(&dev->struct_mutex);
+
+			if (ret)
+				goto out_unpin;
+
+next_page:
+			remain -= page_length;
+			user_data += page_length;
+			offset += page_length;
+		}
+out_unpin:
+		i915_gem_object_unpin_pages(obj);
 	}
 
-out:
-	i915_gem_object_unpin_pages(obj);
 
+out:
 	if (hit_slowpath) {
 		/* Fixup: Kill any reinstated backing storage pages */
 		if (obj->madv == __I915_MADV_PURGED)
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 24/29] drm/i915: Introduce i915_gem_object_create_stolen()
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (22 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 23/29] drm/i915: Handle stolen objects for pread Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 25/29] drm/i915: Allocate fbcon from stolen memory Chris Wilson
                   ` (4 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Allow for the creation of GEM objects backed by stolen memory. As these
are not backed by ordinary pages, we create a fake dma mapping and store
the address in the scatterlist rather than obj->pages.
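
A typical caller (sketch; it mirrors how later patches in this series
handle the fbcon, ringbuffer and overlay allocations) simply falls back
to a normal shmemfs-backed object when stolen space is unavailable:

    struct drm_i915_gem_object *obj;

    obj = i915_gem_object_create_stolen(dev, size);
    if (obj == NULL)
        obj = i915_gem_alloc_object(dev, size);
    if (obj == NULL)
        return -ENOMEM;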

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h        |    3 +
 drivers/gpu/drm/i915/i915_gem.c        |    1 +
 drivers/gpu/drm/i915/i915_gem_stolen.c |  122 ++++++++++++++++++++++++++++++++
 3 files changed, 126 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 02de587..03cd1d6 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1495,6 +1495,9 @@ int i915_gem_evict_everything(struct drm_device *dev);
 int i915_gem_init_stolen(struct drm_device *dev);
 int i915_gem_stolen_setup_compression(struct drm_device *dev);
 void i915_gem_cleanup_stolen(struct drm_device *dev);
+struct drm_i915_gem_object *
+i915_gem_object_create_stolen(struct drm_device *dev, u32 size);
+void i915_gem_object_release_stolen(struct drm_i915_gem_object *obj);
 
 /* i915_gem_tiling.c */
 void i915_gem_detect_bit_6_swizzle(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a29c259..df73e02 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3808,6 +3808,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	obj->pages_pin_count = 0;
 	i915_gem_object_put_pages(obj);
 	i915_gem_object_free_mmap_offset(obj);
+	i915_gem_object_release_stolen(obj);
 
 	BUG_ON(obj->pages);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index 00b1c1d..eca3af1 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -202,3 +202,125 @@ int i915_gem_init_stolen(struct drm_device *dev)
 
 	return 0;
 }
+
+static struct sg_table *
+i915_pages_create_for_stolen(struct drm_device *dev,
+			     u32 offset, u32 size)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct sg_table *st;
+	struct scatterlist *sg;
+
+	/* We hide that we have no struct page backing our stolen object
+	 * by wrapping the contiguous physical allocation with a fake
+	 * dma mapping in a single scatterlist.
+	 */
+
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (st == NULL)
+		return NULL;
+
+	if (sg_alloc_table(st, 1, GFP_KERNEL)) {
+		kfree(st);
+		return NULL;
+	}
+
+	sg = st->sgl;
+	sg->offset = offset;
+	sg->length = size;
+
+	sg_dma_address(sg) = dev_priv->mm.stolen_base + offset;
+	sg_dma_len(sg) = size;
+
+	return st;
+}
+
+static int i915_gem_object_get_pages_stolen(struct drm_i915_gem_object *obj)
+{
+	BUG();
+	return -EINVAL;
+}
+
+static void i915_gem_object_put_pages_stolen(struct drm_i915_gem_object *obj)
+{
+	/* Should only be called during free */
+	sg_free_table(obj->pages);
+	kfree(obj->pages);
+}
+
+static const struct drm_i915_gem_object_ops i915_gem_object_stolen_ops = {
+	.get_pages = i915_gem_object_get_pages_stolen,
+	.put_pages = i915_gem_object_put_pages_stolen,
+};
+
+struct drm_i915_gem_object *
+_i915_gem_object_create_stolen(struct drm_device *dev,
+			       struct drm_mm_node *stolen)
+{
+	struct drm_i915_gem_object *obj;
+
+	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	if (obj == NULL)
+		return NULL;
+
+	if (drm_gem_private_object_init(dev, &obj->base, stolen->size))
+		goto cleanup;
+
+	i915_gem_object_init(obj, &i915_gem_object_stolen_ops);
+
+	obj->pages = i915_pages_create_for_stolen(dev,
+						  stolen->start, stolen->size);
+	if (obj->pages == NULL)
+		goto cleanup;
+
+	obj->has_dma_mapping = true;
+	obj->pages_pin_count = 1;
+	obj->stolen = stolen;
+
+	obj->base.write_domain = I915_GEM_DOMAIN_GTT;
+	obj->base.read_domains = I915_GEM_DOMAIN_GTT;
+	obj->cache_level = I915_CACHE_NONE;
+
+	return obj;
+
+cleanup:
+	kfree(obj);
+	return NULL;
+}
+
+struct drm_i915_gem_object *
+i915_gem_object_create_stolen(struct drm_device *dev, u32 size)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_object *obj;
+	struct drm_mm_node *stolen;
+
+	if (dev_priv->mm.stolen_base == 0)
+		return NULL;
+
+	DRM_DEBUG_KMS("creating stolen object: size=%x\n", size);
+	if (size == 0)
+		return NULL;
+
+	stolen = drm_mm_search_free(&dev_priv->mm.stolen, size, 4096, 0);
+	if (stolen)
+		stolen = drm_mm_get_block(stolen, size, 4096);
+	if (stolen == NULL)
+		return NULL;
+
+	obj = _i915_gem_object_create_stolen(dev, stolen);
+	if (obj)
+		return obj;
+
+	drm_mm_put_block(stolen);
+	return NULL;
+}
+
+void
+i915_gem_object_release_stolen(struct drm_i915_gem_object *obj)
+{
+	if (obj->stolen) {
+		drm_mm_put_block(obj->stolen);
+		obj->stolen = NULL;
+	}
+}
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 25/29] drm/i915: Allocate fbcon from stolen memory
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (23 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 24/29] drm/i915: Introduce i915_gem_object_create_stolen() Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 26/29] drm/i915: Allocate ringbuffers " Chris Wilson
                   ` (3 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_fb.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_fb.c b/drivers/gpu/drm/i915/intel_fb.c
index 97f6735..9de9cd9 100644
--- a/drivers/gpu/drm/i915/intel_fb.c
+++ b/drivers/gpu/drm/i915/intel_fb.c
@@ -84,7 +84,9 @@ static int intelfb_create(struct intel_fbdev *ifbdev,
 
 	size = mode_cmd.pitches[0] * mode_cmd.height;
 	size = ALIGN(size, PAGE_SIZE);
-	obj = i915_gem_alloc_object(dev, size);
+	obj = i915_gem_object_create_stolen(dev, size);
+	if (obj == NULL)
+		obj = i915_gem_alloc_object(dev, size);
 	if (!obj) {
 		DRM_ERROR("failed to allocate framebuffer\n");
 		ret = -ENOMEM;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 26/29] drm/i915: Allocate ringbuffers from stolen memory
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (24 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 25/29] drm/i915: Allocate fbcon from stolen memory Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 27/29] drm/i915: Allocate overlay registers " Chris Wilson
                   ` (2 subsequent siblings)
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5fdd297..7e6f8f4 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1016,7 +1016,11 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 			return ret;
 	}
 
-	obj = i915_gem_alloc_object(dev, ring->size);
+	obj = NULL;
+	if (!HAS_LLC(dev))
+		obj = i915_gem_object_create_stolen(dev, ring->size);
+	if (obj == NULL)
+		obj = i915_gem_alloc_object(dev, ring->size);
 	if (obj == NULL) {
 		DRM_ERROR("Failed to allocate ringbuffer\n");
 		ret = -ENOMEM;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 27/29] drm/i915: Allocate overlay registers from stolen memory
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (25 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 26/29] drm/i915: Allocate ringbuffers " Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-20 21:17   ` Daniel Vetter
  2012-08-11 14:41 ` [PATCH 28/29] drm/i915: Use a slab for object allocation Chris Wilson
  2012-08-11 14:41 ` [PATCH 29/29] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl Chris Wilson
  28 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_overlay.c |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 7a98459..6982191 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -1424,8 +1424,10 @@ void intel_setup_overlay(struct drm_device *dev)
 
 	overlay->dev = dev;
 
-	reg_bo = i915_gem_alloc_object(dev, PAGE_SIZE);
-	if (!reg_bo)
+	reg_bo = i915_gem_object_create_stolen(dev, PAGE_SIZE);
+	if (reg_bo == NULL)
+		reg_bo = i915_gem_alloc_object(dev, PAGE_SIZE);
+	if (reg_bo == NULL)
 		goto out_free;
 	overlay->reg_bo = reg_bo;
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 28/29] drm/i915: Use a slab for object allocation
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (26 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 27/29] drm/i915: Allocate overlay registers " Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  2012-08-11 14:41 ` [PATCH 29/29] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl Chris Wilson
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

The primary purpose of this was to debug some use-after-free memory
corruption that was causing an OOPS inside drm/i915. As it turned out,
the corruption was being caused elsewhere, and i915.ko, as a major user
of many objects, was simply being hit hardest.

Indeed, as we make frequent use of the generic kmalloc caches,
dedicating one to ourselves (or at least naming one for us, depending
upon the slab core) aids debugging of our own slab usage (e.g. via
/proc/slabinfo).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_dma.c        |    3 +++
 drivers/gpu/drm/i915/i915_drv.h        |    4 ++++
 drivers/gpu/drm/i915/i915_gem.c        |   28 +++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_dmabuf.c |    5 ++---
 drivers/gpu/drm/i915/i915_gem_stolen.c |    4 ++--
 5 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index df6490b..07f5b2e 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1759,6 +1759,9 @@ int i915_driver_unload(struct drm_device *dev)
 
 	destroy_workqueue(dev_priv->wq);
 
+	if (dev_priv->slab)
+		kmem_cache_destroy(dev_priv->slab);
+
 	pci_dev_put(dev_priv->bridge_dev);
 	kfree(dev->dev_private);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 03cd1d6..f0bf78f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -390,6 +390,7 @@ struct intel_gmbus {
 
 typedef struct drm_i915_private {
 	struct drm_device *dev;
+	struct kmem_cache *slab;
 
 	const struct intel_device_info *info;
 
@@ -1311,12 +1312,15 @@ int i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 int i915_gem_wait_ioctl(struct drm_device *dev, void *data,
 			struct drm_file *file_priv);
 void i915_gem_load(struct drm_device *dev);
+void *i915_gem_object_alloc(struct drm_device *dev);
+void i915_gem_object_free(struct drm_i915_gem_object *obj);
 int i915_gem_init_object(struct drm_gem_object *obj);
 void i915_gem_object_init(struct drm_i915_gem_object *obj,
 			 const struct drm_i915_gem_object_ops *ops);
 struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 						  size_t size);
 void i915_gem_free_object(struct drm_gem_object *obj);
+
 int __must_check i915_gem_object_pin(struct drm_i915_gem_object *obj,
 				     uint32_t alignment,
 				     bool map_and_fenceable,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index df73e02..2d7adbb 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -193,6 +193,18 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 	return 0;
 }
 
+void *i915_gem_object_alloc(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	return kmem_cache_alloc(dev_priv->slab, GFP_KERNEL | __GFP_ZERO);
+}
+
+void i915_gem_object_free(struct drm_i915_gem_object *obj)
+{
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
+	kmem_cache_free(dev_priv->slab, obj);
+}
+
 static int
 i915_gem_create(struct drm_file *file,
 		struct drm_device *dev,
@@ -216,7 +228,7 @@ i915_gem_create(struct drm_file *file,
 	if (ret) {
 		drm_gem_object_release(&obj->base);
 		i915_gem_info_remove_obj(dev->dev_private, obj->base.size);
-		kfree(obj);
+		i915_gem_object_free(obj);
 		return ret;
 	}
 
@@ -3731,12 +3743,12 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
 	struct address_space *mapping;
 	u32 mask;
 
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	obj = i915_gem_object_alloc(dev);
 	if (obj == NULL)
 		return NULL;
 
 	if (drm_gem_object_init(dev, &obj->base, size) != 0) {
-		kfree(obj);
+		i915_gem_object_free(obj);
 		return NULL;
 	}
 
@@ -3819,7 +3831,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 	i915_gem_info_remove_obj(dev_priv, obj->base.size);
 
 	kfree(obj->bit_17);
-	kfree(obj);
+	i915_gem_object_free(obj);
 }
 
 int
@@ -4197,8 +4209,14 @@ init_ring_lists(struct intel_ring_buffer *ring)
 void
 i915_gem_load(struct drm_device *dev)
 {
-	int i;
 	drm_i915_private_t *dev_priv = dev->dev_private;
+	int i;
+
+	dev_priv->slab =
+		kmem_cache_create("i915_gem_object",
+				  sizeof(struct drm_i915_gem_object), 0,
+				  SLAB_HWCACHE_ALIGN,
+				  NULL);
 
 	INIT_LIST_HEAD(&dev_priv->mm.active_list);
 	INIT_LIST_HEAD(&dev_priv->mm.inactive_list);
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index 5e72e95..9ba5e7f 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -260,8 +260,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 	if (IS_ERR(attach))
 		return ERR_CAST(attach);
 
-
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	obj = i915_gem_object_alloc(dev);
 	if (obj == NULL) {
 		ret = -ENOMEM;
 		goto fail_detach;
@@ -269,7 +268,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
 
 	ret = drm_gem_private_object_init(dev, &obj->base, dma_buf->size);
 	if (ret) {
-		kfree(obj);
+		i915_gem_object_free(obj);
 		goto fail_detach;
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index eca3af1..4cdd6cc 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -259,7 +259,7 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
 {
 	struct drm_i915_gem_object *obj;
 
-	obj = kzalloc(sizeof(*obj), GFP_KERNEL);
+	obj = i915_gem_object_alloc(dev);
 	if (obj == NULL)
 		return NULL;
 
@@ -284,7 +284,7 @@ _i915_gem_object_create_stolen(struct drm_device *dev,
 	return obj;
 
 cleanup:
-	kfree(obj);
+	i915_gem_object_free(obj);
 	return NULL;
 }
 
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 29/29] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl
  2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
                   ` (27 preceding siblings ...)
  2012-08-11 14:41 ` [PATCH 28/29] drm/i915: Use a slab for object allocation Chris Wilson
@ 2012-08-11 14:41 ` Chris Wilson
  28 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-11 14:41 UTC (permalink / raw)
  To: intel-gfx

By exporting the ability to map user addresses and insert PTEs
representing their backing pages into the GTT, we can exploit UMA in
order to use normal application data as a texture source or even as a
render target (depending upon the capabilities of the chipset). This
has a number of uses, with zero-copy downloads to the GPU and efficient
readback making the intermixed streaming of CPU and GPU operations
fairly efficient. The implications are widespread, from faster
rendering of partial software fallbacks (xterm!) to faster pipelining
of texture data (such as pixel buffer objects in GL).
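
For illustration, a minimal userspace sketch of the intended usage
might look like the following; only the ioctl, struct and flag come
from this patch, while the wrapper name, includes and error handling
are assumptions:

	#include <stdint.h>
	#include <sys/ioctl.h>
	#include "i915_drm.h"	/* assumes the updated header is available */

	/* Wrap a range of user memory in a GEM handle. Object handles
	 * are nonzero, so 0 signals failure here.
	 */
	static uint32_t gem_userptr(int fd, void *ptr, uint32_t size)
	{
		struct drm_i915_gem_userptr arg = {
			.user_ptr = (uintptr_t)ptr,
			.user_size = size,
			.flags = 0,	/* or I915_USERPTR_READ_ONLY */
		};

		if (ioctl(fd, DRM_IOCTL_I915_GEM_USERPTR, &arg))
			return 0;

		return arg.handle;
	}

The returned handle can then be used like any other GEM handle, e.g.
referenced from an execbuffer.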

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/Makefile           |    1 +
 drivers/gpu/drm/i915/i915_dma.c         |    1 +
 drivers/gpu/drm/i915/i915_drv.h         |   14 +++
 drivers/gpu/drm/i915/i915_gem.c         |    6 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c |  209 +++++++++++++++++++++++++++++++
 include/drm/i915_drm.h                  |   15 +++
 6 files changed, 243 insertions(+), 3 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gem_userptr.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 0f2c549..754d665 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -14,6 +14,7 @@ i915-y := i915_drv.o i915_dma.o i915_irq.o \
 	  i915_gem_gtt.o \
 	  i915_gem_stolen.o \
 	  i915_gem_tiling.o \
+	  i915_gem_userptr.o \
 	  i915_sysfs.o \
 	  i915_trace_points.o \
 	  intel_display.o \
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index 07f5b2e..62928fc 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -1884,6 +1884,7 @@ struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_CREATE, i915_gem_context_create_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_DESTROY, i915_gem_context_destroy_ioctl, DRM_UNLOCKED),
 	DRM_IOCTL_DEF_DRV(I915_REG_READ, i915_reg_read_ioctl, DRM_UNLOCKED),
+	DRM_IOCTL_DEF_DRV(I915_GEM_USERPTR, i915_gem_userptr_ioctl, DRM_MASTER|DRM_UNLOCKED),
 };
 
 int i915_max_ioctl = DRM_ARRAY_SIZE(i915_ioctls);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f0bf78f..50c635f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1045,6 +1045,18 @@ struct drm_i915_gem_object {
 	atomic_t pending_flip;
 };
 
+struct i915_gem_userptr_object {
+	struct drm_i915_gem_object gem;
+	uintptr_t user_ptr;
+	size_t user_size;
+	int read_only;
+};
+
+union drm_i915_gem_objects {
+	struct drm_i915_gem_object base;
+	struct i915_gem_userptr_object userptr;
+};
+
 inline static bool i915_gem_object_is_prime(struct drm_i915_gem_object *obj)
 {
 	return obj->base.import_attach != NULL;
@@ -1303,6 +1315,8 @@ int i915_gem_entervt_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file_priv);
 int i915_gem_leavevt_ioctl(struct drm_device *dev, void *data,
 			   struct drm_file *file_priv);
+int i915_gem_userptr_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file);
 int i915_gem_set_tiling(struct drm_device *dev, void *data,
 			struct drm_file *file_priv);
 int i915_gem_get_tiling(struct drm_device *dev, void *data,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2d7adbb..1c8904a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2517,9 +2517,9 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
+	obj->gtt_offset -= obj->gtt_space->start;
 	drm_mm_put_block(obj->gtt_space);
 	obj->gtt_space = NULL;
-	obj->gtt_offset = 0;
 
 	return 0;
 }
@@ -3035,7 +3035,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	list_move_tail(&obj->gtt_list, &dev_priv->mm.bound_list);
 	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
 
-	obj->gtt_offset = obj->gtt_space->start;
+	obj->gtt_offset += obj->gtt_space->start;
 
 	fenceable =
 		obj->gtt_space->size == fence_size &&
@@ -4214,7 +4214,7 @@ i915_gem_load(struct drm_device *dev)
 
 	dev_priv->slab =
 		kmem_cache_create("i915_gem_object",
-				  sizeof(struct drm_i915_gem_object), 0,
+				  sizeof(union drm_i915_gem_objects), 0,
 				  SLAB_HWCACHE_ALIGN,
 				  NULL);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
new file mode 100644
index 0000000..8604dad
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -0,0 +1,209 @@
+/*
+ * Copyright © 2012 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "drmP.h"
+#include "drm.h"
+#include "i915_drm.h"
+#include "i915_drv.h"
+#include "i915_trace.h"
+#include "intel_drv.h"
+#include <linux/swap.h>
+
+static struct i915_gem_userptr_object *to_userptr_object(struct drm_i915_gem_object *obj)
+{
+	return container_of(obj, struct i915_gem_userptr_object, gem);
+}
+
+static int
+i915_gem_userptr_get_pages(struct drm_i915_gem_object *obj)
+{
+	struct i915_gem_userptr_object *vmap = to_userptr_object(obj);
+	int num_pages = obj->base.size >> PAGE_SHIFT;
+	struct sg_table *st;
+	struct scatterlist *sg;
+	struct page **pvec;
+	int n, pinned, ret;
+
+	if (!access_ok(vmap->read_only ? VERIFY_READ : VERIFY_WRITE,
+		       (char __user *)vmap->user_ptr, vmap->user_size))
+		return -EFAULT;
+
+	/* If userspace should engineer that these pages are replaced in
+	 * the vma between us binding this page into the GTT and completion
+	 * of rendering... Their loss. If they change the mapping of their
+	 * pages they need to create a new bo to point to the new vma.
+	 *
+	 * However, that still leaves open the possibility of the vma
+	 * being copied upon fork. Which falls under the same userspace
+	 * synchronisation issue as a regular bo, except that this time
+	 * the process may not be expecting that a particular piece of
+	 * memory is tied to the GPU.
+	 */
+
+	pvec = kmalloc(num_pages*sizeof(struct page *),
+		       GFP_KERNEL | __GFP_NOWARN | __GFP_NORETRY);
+	if (pvec == NULL) {
+		pvec = drm_malloc_ab(num_pages, sizeof(struct page *));
+		if (pvec == NULL)
+			return -ENOMEM;
+	}
+
+	pinned = __get_user_pages_fast(vmap->user_ptr, num_pages,
+				       !vmap->read_only, pvec);
+	if (pinned < num_pages) {
+		struct mm_struct *mm = current->mm;
+
+		mutex_unlock(&obj->base.dev->struct_mutex);
+		down_read(&mm->mmap_sem);
+		ret = get_user_pages(current, mm,
+				     vmap->user_ptr + (pinned << PAGE_SHIFT),
+				     num_pages - pinned,
+				     !vmap->read_only, 0,
+				     pvec + pinned,
+				     NULL);
+		up_read(&mm->mmap_sem);
+		mutex_lock(&obj->base.dev->struct_mutex);
+		if (ret > 0)
+			pinned += ret;
+
+		if (obj->pages || pinned < num_pages) {
+			ret = obj->pages ? 0 : -EFAULT;
+			goto cleanup_pinned;
+		}
+	}
+
+	st = kmalloc(sizeof(*st), GFP_KERNEL);
+	if (st == NULL) {
+		ret = -ENOMEM;
+		goto cleanup_pinned;
+	}
+
+	if (sg_alloc_table(st, num_pages, GFP_KERNEL)) {
+		ret = -ENOMEM;
+		goto cleanup_st;
+	}
+
+	for_each_sg(st->sgl, sg, num_pages, n)
+		sg_set_page(sg, pvec[n], PAGE_SIZE, 0);
+	drm_free_large(pvec);
+
+	obj->pages = st;
+	return 0;
+
+cleanup_st:
+	kfree(st);
+cleanup_pinned:
+	release_pages(pvec, pinned, 0);
+	drm_free_large(pvec);
+	return ret;
+}
+
+static void
+i915_gem_userptr_put_pages(struct drm_i915_gem_object *obj)
+{
+	struct scatterlist *sg;
+	int i;
+
+	if (obj->madv != I915_MADV_WILLNEED)
+		obj->dirty = 0;
+
+	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
+		struct page *page = sg_page(sg);
+
+		if (obj->dirty)
+			set_page_dirty(page);
+
+		mark_page_accessed(page);
+		page_cache_release(page);
+	}
+	obj->dirty = 0;
+
+	sg_free_table(obj->pages);
+	kfree(obj->pages);
+}
+
+static const struct drm_i915_gem_object_ops i915_gem_userptr_ops = {
+	.get_pages = i915_gem_userptr_get_pages,
+	.put_pages = i915_gem_userptr_put_pages,
+};
+
+/**
+ * Creates a new mm object that wraps some user memory.
+ */
+int
+i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_gem_userptr *args = data;
+	struct i915_gem_userptr_object *obj;
+	loff_t first_data_page, last_data_page;
+	int num_pages;
+	int ret;
+	u32 handle;
+
+	first_data_page = args->user_ptr / PAGE_SIZE;
+	last_data_page = (args->user_ptr + args->user_size - 1) / PAGE_SIZE;
+	num_pages = last_data_page - first_data_page + 1;
+	if (num_pages * PAGE_SIZE > dev_priv->mm.gtt_total)
+		return -E2BIG;
+
+	ret = fault_in_multipages_readable((char __user *)(uintptr_t)args->user_ptr,
+					   args->user_size);
+	if (ret)
+		return ret;
+
+	/* Allocate the new object */
+	obj = i915_gem_object_alloc(dev);
+	if (obj == NULL)
+		return -ENOMEM;
+
+	if (drm_gem_private_object_init(dev, &obj->gem.base,
+					num_pages * PAGE_SIZE)) {
+		i915_gem_object_free(&obj->gem);
+		return -ENOMEM;
+	}
+
+	i915_gem_object_init(&obj->gem, &i915_gem_userptr_ops);
+	obj->gem.cache_level = I915_CACHE_LLC_MLC;
+
+	obj->gem.gtt_offset = offset_in_page(args->user_ptr);
+	obj->user_ptr = args->user_ptr;
+	obj->user_size = args->user_size;
+	obj->read_only = args->flags & I915_USERPTR_READ_ONLY;
+
+	ret = drm_gem_handle_create(file, &obj->gem.base, &handle);
+	if (ret) {
+		drm_gem_object_release(&obj->gem.base);
+		dev_priv->mm.object_count--;
+		dev_priv->mm.object_memory -= obj->gem.base.size;
+		i915_gem_object_free(&obj->gem);
+		return ret;
+	}
+
+	/* drop reference from allocate - handle holds it now */
+	drm_gem_object_unreference(&obj->gem.base);
+
+	args->handle = handle;
+	return 0;
+}
diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index d8a79bf..6e31a4f 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -206,6 +206,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_SET_CACHEING	0x2f
 #define DRM_I915_GEM_GET_CACHEING	0x30
 #define DRM_I915_REG_READ		0x31
+#define DRM_I915_GEM_USERPTR		0x32
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
 #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -255,6 +256,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_CONTEXT_CREATE	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_CREATE, struct drm_i915_gem_context_create)
 #define DRM_IOCTL_I915_GEM_CONTEXT_DESTROY	DRM_IOW (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_DESTROY, struct drm_i915_gem_context_destroy)
 #define DRM_IOCTL_I915_REG_READ			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_REG_READ, struct drm_i915_reg_read)
+#define DRM_IOCTL_I915_GEM_USERPTR			DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_USERPTR, struct drm_i915_gem_userptr)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -473,6 +475,19 @@ struct drm_i915_gem_mmap_gtt {
 	__u64 offset;
 };
 
+struct drm_i915_gem_userptr {
+	__u64 user_ptr;
+	__u32 user_size;
+	__u32 flags;
+#define I915_USERPTR_READ_ONLY 0x1
+	/**
+	 * Returned handle for the object.
+	 *
+	 * Object handles are nonzero.
+	 */
+	__u32 handle;
+};
+
 struct drm_i915_gem_set_domain {
 	/** Handle for the object */
 	__u32 handle;
-- 
1.7.10.4

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 1/2] drm/i915: move functions around
  2012-08-11 14:41 ` [PATCH 01/29] drm/i915: Track unbound pages Chris Wilson
@ 2012-08-20  9:00   ` Daniel Vetter
  2012-08-20  9:00     ` [PATCH 2/2] drm/i915: Track unbound pages Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2012-08-20  9:00 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

Prep work to make Chris Wilson's unbound tracking patch a bit easier
to read. Alas, I'd have preferred that moving the page allocation
retry loop from bind to get_pages had been a separate patch, too.
But that looks like real work ;-)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c |  116 +++++++++++++++++++--------------------
 1 file changed, 58 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0514593..0f70c2a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1340,6 +1340,64 @@ i915_gem_mmap_gtt_ioctl(struct drm_device *dev, void *data,
 	return i915_gem_mmap_gtt(file, dev, args->handle, &args->offset);
 }
 
+/* Immediately discard the backing storage */
+static void
+i915_gem_object_truncate(struct drm_i915_gem_object *obj)
+{
+	struct inode *inode;
+
+	/* Our goal here is to return as much of the memory as
+	 * is possible back to the system as we are called from OOM.
+	 * To do this we must instruct the shmfs to drop all of its
+	 * backing pages, *now*.
+	 */
+	inode = obj->base.filp->f_path.dentry->d_inode;
+	shmem_truncate_range(inode, 0, (loff_t)-1);
+
+	if (obj->base.map_list.map)
+		drm_gem_free_mmap_offset(&obj->base);
+
+	obj->madv = __I915_MADV_PURGED;
+}
+
+static inline int
+i915_gem_object_is_purgeable(struct drm_i915_gem_object *obj)
+{
+	return obj->madv == I915_MADV_DONTNEED;
+}
+
+static void
+i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
+{
+	int page_count = obj->base.size / PAGE_SIZE;
+	int i;
+
+	if (!obj->pages)
+		return;
+
+	BUG_ON(obj->madv == __I915_MADV_PURGED);
+
+	if (i915_gem_object_needs_bit17_swizzle(obj))
+		i915_gem_object_save_bit_17_swizzle(obj);
+
+	if (obj->madv == I915_MADV_DONTNEED)
+		obj->dirty = 0;
+
+	for (i = 0; i < page_count; i++) {
+		if (obj->dirty)
+			set_page_dirty(obj->pages[i]);
+
+		if (obj->madv == I915_MADV_WILLNEED)
+			mark_page_accessed(obj->pages[i]);
+
+		page_cache_release(obj->pages[i]);
+	}
+	obj->dirty = 0;
+
+	drm_free_large(obj->pages);
+	obj->pages = NULL;
+}
+
 int
 i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj,
 			      gfp_t gfpmask)
@@ -1387,38 +1445,6 @@ err_pages:
 	return PTR_ERR(page);
 }
 
-static void
-i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
-{
-	int page_count = obj->base.size / PAGE_SIZE;
-	int i;
-
-	if (!obj->pages)
-		return;
-
-	BUG_ON(obj->madv == __I915_MADV_PURGED);
-
-	if (i915_gem_object_needs_bit17_swizzle(obj))
-		i915_gem_object_save_bit_17_swizzle(obj);
-
-	if (obj->madv == I915_MADV_DONTNEED)
-		obj->dirty = 0;
-
-	for (i = 0; i < page_count; i++) {
-		if (obj->dirty)
-			set_page_dirty(obj->pages[i]);
-
-		if (obj->madv == I915_MADV_WILLNEED)
-			mark_page_accessed(obj->pages[i]);
-
-		page_cache_release(obj->pages[i]);
-	}
-	obj->dirty = 0;
-
-	drm_free_large(obj->pages);
-	obj->pages = NULL;
-}
-
 void
 i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
 			       struct intel_ring_buffer *ring,
@@ -1486,32 +1512,6 @@ i915_gem_object_move_to_inactive(struct drm_i915_gem_object *obj)
 	WARN_ON(i915_verify_lists(dev));
 }
 
-/* Immediately discard the backing storage */
-static void
-i915_gem_object_truncate(struct drm_i915_gem_object *obj)
-{
-	struct inode *inode;
-
-	/* Our goal here is to return as much of the memory as
-	 * is possible back to the system as we are called from OOM.
-	 * To do this we must instruct the shmfs to drop all of its
-	 * backing pages, *now*.
-	 */
-	inode = obj->base.filp->f_path.dentry->d_inode;
-	shmem_truncate_range(inode, 0, (loff_t)-1);
-
-	if (obj->base.map_list.map)
-		drm_gem_free_mmap_offset(&obj->base);
-
-	obj->madv = __I915_MADV_PURGED;
-}
-
-static inline int
-i915_gem_object_is_purgeable(struct drm_i915_gem_object *obj)
-{
-	return obj->madv == I915_MADV_DONTNEED;
-}
-
 static u32
 i915_gem_get_seqno(struct drm_device *dev)
 {
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* [PATCH 2/2] drm/i915: Track unbound pages
  2012-08-20  9:00   ` [PATCH 1/2] drm/i915: move functions around Daniel Vetter
@ 2012-08-20  9:00     ` Daniel Vetter
  2012-08-20  9:23       ` [PATCH] Add some sanity checks to unbound tracking Chris Wilson
  2012-08-20  9:36       ` [PATCH 2/2] drm/i915: Track unbound pages Chris Wilson
  0 siblings, 2 replies; 51+ messages in thread
From: Daniel Vetter @ 2012-08-20  9:00 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

When dealing with a working set larger than the GATT, or even the
mappable aperture when touching through the GTT, we end up with evicting
objects only to rebind them at a new offset again later. Moving an
object into and out of the GTT requires clflushing the pages, thus
causing a double-clflush penalty for rebinding.

To avoid having to clflush on rebinding, we can track the pages as they
are evicted from the GTT and only relinquish those pages on memory
pressure.
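
In outline, the new object lifecycle is roughly as follows (an
illustrative sketch with abbreviated names, not code from the patch):

    get_pages_gtt()  -> pages allocated and pinned, object placed on
                        the new unbound_list
    bind_to_gtt()    -> object moved from unbound_list to bound_list
    unbind()         -> GTT space released, pages kept, object moved
                        back to unbound_list
    put_pages_gtt()  -> clflush and release the pages; now only done
                        under memory pressure (purge/shrink) or on free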

As usual, if it were not for the handling of out-of-memory condition and
having to manually shrink our own bo caches, it would be a net reduction
of code. Alas.

Note: The patch also contains a few changes to the last-hope
evict_everything logic in i915_gem_execbuffer.c - we no longer try to
evict only the purgeable stuff on the first attempt (since that's
superfluous and only helps in OOM corner cases, not fragmented-gtt
thrashing situations).

Also, the extraction of the get_pages retry loop from bind_to_gtt (and
other callsites) to get_pages should imo have been a separate patch.

v2: Ditch the newly added put_pages (for unbound objects only) in
i915_gem_reset. A quick irc discussion hasn't revealed any important
reason for this, so if we need this, I'd like to have a git blame'able
explanation for it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
[danvet: Split out code movements and rant a bit in the commit message
with a few Notes. Done v2]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |   14 +-
 drivers/gpu/drm/i915/i915_drv.h            |   13 +-
 drivers/gpu/drm/i915/i915_gem.c            |  292 ++++++++++++++--------------
 drivers/gpu/drm/i915/i915_gem_dmabuf.c     |   20 +-
 drivers/gpu/drm/i915/i915_gem_evict.c      |   13 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |    9 +-
 drivers/gpu/drm/i915/i915_gem_gtt.c        |    2 +-
 drivers/gpu/drm/i915/i915_irq.c            |    4 +-
 drivers/gpu/drm/i915/i915_trace.h          |   10 +-
 9 files changed, 186 insertions(+), 191 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a18e936..608d3ae 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -211,7 +211,7 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 		   dev_priv->mm.object_memory);
 
 	size = count = mappable_size = mappable_count = 0;
-	count_objects(&dev_priv->mm.gtt_list, gtt_list);
+	count_objects(&dev_priv->mm.bound_list, gtt_list);
 	seq_printf(m, "%u [%u] objects, %zu [%zu] bytes in gtt\n",
 		   count, mappable_count, size, mappable_size);
 
@@ -225,8 +225,13 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
 	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
 		   count, mappable_count, size, mappable_size);
 
+	size = count = 0;
+	list_for_each_entry(obj, &dev_priv->mm.unbound_list, gtt_list)
+		size += obj->base.size, ++count;
+	seq_printf(m, "%u unbound objects, %zu bytes\n", count, size);
+
 	size = count = mappable_size = mappable_count = 0;
-	list_for_each_entry(obj, &dev_priv->mm.gtt_list, gtt_list) {
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list) {
 		if (obj->fault_mappable) {
 			size += obj->gtt_space->size;
 			++count;
@@ -264,7 +269,7 @@ static int i915_gem_gtt_info(struct seq_file *m, void* data)
 		return ret;
 
 	total_obj_size = total_gtt_size = count = 0;
-	list_for_each_entry(obj, &dev_priv->mm.gtt_list, gtt_list) {
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list) {
 		if (list == PINNED_LIST && obj->pin_count == 0)
 			continue;
 
@@ -526,7 +531,8 @@ static int i915_gem_fence_regs_info(struct seq_file *m, void *data)
 	for (i = 0; i < dev_priv->num_fence_regs; i++) {
 		struct drm_i915_gem_object *obj = dev_priv->fence_regs[i].obj;
 
-		seq_printf(m, "Fenced object[%2d] = ", i);
+		seq_printf(m, "Fence %d, pin count = %d, object = ",
+			   i, dev_priv->fence_regs[i].pin_count);
 		if (obj == NULL)
 			seq_printf(m, "unused");
 		else
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ed3ba70..a2382a1 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -685,7 +685,13 @@ typedef struct drm_i915_private {
 		struct drm_mm gtt_space;
 		/** List of all objects in gtt_space. Used to restore gtt
 		 * mappings on resume */
-		struct list_head gtt_list;
+		struct list_head bound_list;
+		/**
+		 * List of objects which are not bound to the GTT (thus
+		 * are idle and not used by the GPU) but still have
+		 * (presumably uncached) pages still attached.
+		 */
+		struct list_head unbound_list;
 
 		/** Usable portion of the GTT for GEM */
 		unsigned long gtt_start;
@@ -1306,8 +1312,7 @@ int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
 void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
 void i915_gem_lastclose(struct drm_device *dev);
 
-int i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj,
-				  gfp_t gfpmask);
+int __must_check i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj);
 int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
 int i915_gem_object_sync(struct drm_i915_gem_object *obj,
 			 struct intel_ring_buffer *to);
@@ -1449,7 +1454,7 @@ int __must_check i915_gem_evict_something(struct drm_device *dev, int min_size,
 					  unsigned alignment,
 					  unsigned cache_level,
 					  bool mappable);
-int i915_gem_evict_everything(struct drm_device *dev, bool purgeable_only);
+int i915_gem_evict_everything(struct drm_device *dev);
 
 /* i915_gem_stolen.c */
 int i915_gem_init_stolen(struct drm_device *dev);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0f70c2a..8eacff8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -55,6 +55,8 @@ static void i915_gem_object_update_fence(struct drm_i915_gem_object *obj,
 
 static int i915_gem_inactive_shrink(struct shrinker *shrinker,
 				    struct shrink_control *sc);
+static long i915_gem_purge(struct drm_i915_private *dev_priv, long target);
+static void i915_gem_shrink_all(struct drm_i915_private *dev_priv);
 static void i915_gem_object_truncate(struct drm_i915_gem_object *obj);
 
 static inline void i915_gem_object_fence_lost(struct drm_i915_gem_object *obj)
@@ -140,7 +142,7 @@ int i915_mutex_lock_interruptible(struct drm_device *dev)
 static inline bool
 i915_gem_object_is_inactive(struct drm_i915_gem_object *obj)
 {
-	return !obj->active;
+	return obj->gtt_space && !obj->active;
 }
 
 int
@@ -179,7 +181,7 @@ i915_gem_get_aperture_ioctl(struct drm_device *dev, void *data,
 
 	pinned = 0;
 	mutex_lock(&dev->struct_mutex);
-	list_for_each_entry(obj, &dev_priv->mm.gtt_list, gtt_list)
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list)
 		if (obj->pin_count)
 			pinned += obj->gtt_space->size;
 	mutex_unlock(&dev->struct_mutex);
@@ -423,9 +425,11 @@ i915_gem_shmem_pread(struct drm_device *dev,
 		 * anyway again before the next pread happens. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush = 1;
-		ret = i915_gem_object_set_to_gtt_domain(obj, false);
-		if (ret)
-			return ret;
+		if (obj->gtt_space) {
+			ret = i915_gem_object_set_to_gtt_domain(obj, false);
+			if (ret)
+				return ret;
+		}
 	}
 
 	offset = args->offset;
@@ -751,9 +755,11 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
 		 * right away and we therefore have to clflush anyway. */
 		if (obj->cache_level == I915_CACHE_NONE)
 			needs_clflush_after = 1;
-		ret = i915_gem_object_set_to_gtt_domain(obj, true);
-		if (ret)
-			return ret;
+		if (obj->gtt_space) {
+			ret = i915_gem_object_set_to_gtt_domain(obj, true);
+			if (ret)
+				return ret;
+		}
 	}
 	/* Same trick applies for invalidate partially written cachelines before
 	 * writing.  */
@@ -1366,17 +1372,28 @@ i915_gem_object_is_purgeable(struct drm_i915_gem_object *obj)
 	return obj->madv == I915_MADV_DONTNEED;
 }
 
-static void
+static int
 i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 {
 	int page_count = obj->base.size / PAGE_SIZE;
-	int i;
+	int ret, i;
 
-	if (!obj->pages)
-		return;
+	if (obj->pages == NULL)
+		return 0;
 
+	BUG_ON(obj->gtt_space);
 	BUG_ON(obj->madv == __I915_MADV_PURGED);
 
+	ret = i915_gem_object_set_to_cpu_domain(obj, true);
+	if (ret) {
+		/* In the event of a disaster, abandon all caches and
+		 * hope for the best.
+		 */
+		WARN_ON(ret != -EIO);
+		i915_gem_clflush_object(obj);
+		obj->base.read_domains = obj->base.write_domain = I915_GEM_DOMAIN_CPU;
+	}
+
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_save_bit_17_swizzle(obj);
 
@@ -1396,37 +1413,112 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 
 	drm_free_large(obj->pages);
 	obj->pages = NULL;
+
+	list_del(&obj->gtt_list);
+
+	if (i915_gem_object_is_purgeable(obj))
+		i915_gem_object_truncate(obj);
+
+	return 0;
+}
+
+static long
+i915_gem_purge(struct drm_i915_private *dev_priv, long target)
+{
+	struct drm_i915_gem_object *obj, *next;
+	long count = 0;
+
+	list_for_each_entry_safe(obj, next,
+				 &dev_priv->mm.unbound_list,
+				 gtt_list) {
+		if (i915_gem_object_is_purgeable(obj) &&
+		    i915_gem_object_put_pages_gtt(obj) == 0) {
+			count += obj->base.size >> PAGE_SHIFT;
+			if (count >= target)
+				return count;
+		}
+	}
+
+	list_for_each_entry_safe(obj, next,
+				 &dev_priv->mm.inactive_list,
+				 mm_list) {
+		if (i915_gem_object_is_purgeable(obj) &&
+		    i915_gem_object_unbind(obj) == 0 &&
+		    i915_gem_object_put_pages_gtt(obj) == 0) {
+			count += obj->base.size >> PAGE_SHIFT;
+			if (count >= target)
+				return count;
+		}
+	}
+
+	return count;
+}
+
+static void
+i915_gem_shrink_all(struct drm_i915_private *dev_priv)
+{
+	struct drm_i915_gem_object *obj, *next;
+
+	i915_gem_evict_everything(dev_priv->dev);
+
+	list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, gtt_list)
+		i915_gem_object_put_pages_gtt(obj);
 }
 
 int
-i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj,
-			      gfp_t gfpmask)
+i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
 {
+	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
 	int page_count, i;
 	struct address_space *mapping;
-	struct inode *inode;
 	struct page *page;
+	gfp_t gfp;
 
 	if (obj->pages || obj->sg_table)
 		return 0;
 
+	/* Assert that the object is not currently in any GPU domain. As it
+	 * wasn't in the GTT, there shouldn't be any way it could have been in
+	 * a GPU cache
+	 */
+	BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
+	BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
+
 	/* Get the list of pages out of our struct file.  They'll be pinned
 	 * at this point until we release them.
 	 */
 	page_count = obj->base.size / PAGE_SIZE;
-	BUG_ON(obj->pages != NULL);
-	obj->pages = drm_malloc_ab(page_count, sizeof(struct page *));
+	obj->pages = kmalloc(page_count*sizeof(struct page *), GFP_KERNEL);
 	if (obj->pages == NULL)
 		return -ENOMEM;
 
-	inode = obj->base.filp->f_path.dentry->d_inode;
-	mapping = inode->i_mapping;
-	gfpmask |= mapping_gfp_mask(mapping);
-
+	/* Fail silently without starting the shrinker */
+	mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
+	gfp = mapping_gfp_mask(mapping);
+	gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD;
+	gfp &= ~(__GFP_IO | __GFP_WAIT);
 	for (i = 0; i < page_count; i++) {
-		page = shmem_read_mapping_page_gfp(mapping, i, gfpmask);
-		if (IS_ERR(page))
-			goto err_pages;
+		page = shmem_read_mapping_page_gfp(mapping, i, gfp);
+		if (IS_ERR(page)) {
+			i915_gem_purge(dev_priv, page_count);
+			page = shmem_read_mapping_page_gfp(mapping, i, gfp);
+		}
+		if (IS_ERR(page)) {
+			/* We've tried hard to allocate the memory by reaping
+			 * our own buffer, now let the real VM do its job and
+			 * go down in flames if truly OOM.
+			 */
+			gfp &= ~(__GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD);
+			gfp |= __GFP_IO | __GFP_WAIT;
+
+			i915_gem_shrink_all(dev_priv);
+			page = shmem_read_mapping_page_gfp(mapping, i, gfp);
+			if (IS_ERR(page))
+				goto err_pages;
+
+			gfp |= __GFP_NORETRY | __GFP_NOWARN | __GFP_NO_KSWAPD;
+			gfp &= ~(__GFP_IO | __GFP_WAIT);
+		}
 
 		obj->pages[i] = page;
 	}
@@ -1434,6 +1526,7 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj,
 	if (i915_gem_object_needs_bit17_swizzle(obj))
 		i915_gem_object_do_bit_17_swizzle(obj);
 
+	list_add_tail(&obj->gtt_list, &dev_priv->mm.unbound_list);
 	return 0;
 
 err_pages:
@@ -1681,7 +1774,7 @@ static void i915_gem_reset_fences(struct drm_device *dev)
 void i915_gem_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	struct drm_i915_gem_object *obj;
+	struct drm_i915_gem_object *obj, *next;
 	struct intel_ring_buffer *ring;
 	int i;
 
@@ -1698,6 +1791,7 @@ void i915_gem_reset(struct drm_device *dev)
 		obj->base.read_domains &= ~I915_GEM_GPU_DOMAINS;
 	}
 
+
 	/* The fence registers are invalidated so clear them out */
 	i915_gem_reset_fences(dev);
 }
@@ -2209,22 +2303,6 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 
 	i915_gem_object_finish_gtt(obj);
 
-	/* Move the object to the CPU domain to ensure that
-	 * any possible CPU writes while it's not in the GTT
-	 * are flushed when we go to remap it.
-	 */
-	if (ret == 0)
-		ret = i915_gem_object_set_to_cpu_domain(obj, 1);
-	if (ret == -ERESTARTSYS)
-		return ret;
-	if (ret) {
-		/* In the event of a disaster, abandon all caches and
-		 * hope for the best.
-		 */
-		i915_gem_clflush_object(obj);
-		obj->base.read_domains = obj->base.write_domain = I915_GEM_DOMAIN_CPU;
-	}
-
 	/* release the fence reg _after_ flushing */
 	ret = i915_gem_object_put_fence(obj);
 	if (ret)
@@ -2240,10 +2318,8 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	}
 	i915_gem_gtt_finish_object(obj);
 
-	i915_gem_object_put_pages_gtt(obj);
-
-	list_del_init(&obj->gtt_list);
-	list_del_init(&obj->mm_list);
+	list_del(&obj->mm_list);
+	list_move_tail(&obj->gtt_list, &dev_priv->mm.unbound_list);
 	/* Avoid an unnecessary call to unbind on rebind. */
 	obj->map_and_fenceable = true;
 
@@ -2251,10 +2327,7 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	obj->gtt_space = NULL;
 	obj->gtt_offset = 0;
 
-	if (i915_gem_object_is_purgeable(obj))
-		i915_gem_object_truncate(obj);
-
-	return ret;
+	return 0;
 }
 
 static int i915_ring_idle(struct intel_ring_buffer *ring)
@@ -2667,7 +2740,6 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_mm_node *free_space;
-	gfp_t gfpmask = __GFP_NORETRY | __GFP_NOWARN;
 	u32 size, fence_size, fence_alignment, unfenced_alignment;
 	bool mappable, fenceable;
 	int ret;
@@ -2707,6 +2779,10 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 		return -E2BIG;
 	}
 
+	ret = i915_gem_object_get_pages_gtt(obj);
+	if (ret)
+		return ret;
+
  search_free:
 	if (map_and_fenceable)
 		free_space =
@@ -2733,9 +2809,6 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 							 false);
 	}
 	if (obj->gtt_space == NULL) {
-		/* If the gtt is empty and we're still having trouble
-		 * fitting our object in, we're out of memory.
-		 */
 		ret = i915_gem_evict_something(dev, size, alignment,
 					       obj->cache_level,
 					       map_and_fenceable);
@@ -2752,55 +2825,20 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
 		return -EINVAL;
 	}
 
-	ret = i915_gem_object_get_pages_gtt(obj, gfpmask);
-	if (ret) {
-		drm_mm_put_block(obj->gtt_space);
-		obj->gtt_space = NULL;
-
-		if (ret == -ENOMEM) {
-			/* first try to reclaim some memory by clearing the GTT */
-			ret = i915_gem_evict_everything(dev, false);
-			if (ret) {
-				/* now try to shrink everyone else */
-				if (gfpmask) {
-					gfpmask = 0;
-					goto search_free;
-				}
-
-				return -ENOMEM;
-			}
-
-			goto search_free;
-		}
-
-		return ret;
-	}
 
 	ret = i915_gem_gtt_prepare_object(obj);
 	if (ret) {
-		i915_gem_object_put_pages_gtt(obj);
 		drm_mm_put_block(obj->gtt_space);
 		obj->gtt_space = NULL;
-
-		if (i915_gem_evict_everything(dev, false))
-			return ret;
-
-		goto search_free;
+		return ret;
 	}
 
 	if (!dev_priv->mm.aliasing_ppgtt)
 		i915_gem_gtt_bind_object(obj, obj->cache_level);
 
-	list_add_tail(&obj->gtt_list, &dev_priv->mm.gtt_list);
+	list_move_tail(&obj->gtt_list, &dev_priv->mm.bound_list);
 	list_add_tail(&obj->mm_list, &dev_priv->mm.inactive_list);
 
-	/* Assert that the object is not currently in any GPU domain. As it
-	 * wasn't in the GTT, there shouldn't be any way it could have been in
-	 * a GPU cache
-	 */
-	BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
-	BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
-
 	obj->gtt_offset = obj->gtt_space->start;
 
 	fenceable =
@@ -3464,9 +3502,8 @@ i915_gem_madvise_ioctl(struct drm_device *dev, void *data,
 	if (obj->madv != __I915_MADV_PURGED)
 		obj->madv = args->madv;
 
-	/* if the object is no longer bound, discard its backing storage */
-	if (i915_gem_object_is_purgeable(obj) &&
-	    obj->gtt_space == NULL)
+	/* if the object is no longer attached, discard its backing storage */
+	if (i915_gem_object_is_purgeable(obj) && obj->pages == NULL)
 		i915_gem_object_truncate(obj);
 
 	args->retained = obj->madv != __I915_MADV_PURGED;
@@ -3573,6 +3610,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
 		dev_priv->mm.interruptible = was_interruptible;
 	}
 
+	i915_gem_object_put_pages_gtt(obj);
 	if (obj->base.map_list.map)
 		drm_gem_free_mmap_offset(&obj->base);
 
@@ -3605,7 +3643,7 @@ i915_gem_idle(struct drm_device *dev)
 
 	/* Under UMS, be paranoid and evict. */
 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
-		i915_gem_evict_everything(dev, false);
+		i915_gem_evict_everything(dev);
 
 	i915_gem_reset_fences(dev);
 
@@ -3963,8 +4001,9 @@ i915_gem_load(struct drm_device *dev)
 
 	INIT_LIST_HEAD(&dev_priv->mm.active_list);
 	INIT_LIST_HEAD(&dev_priv->mm.inactive_list);
+	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
+	INIT_LIST_HEAD(&dev_priv->mm.bound_list);
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
-	INIT_LIST_HEAD(&dev_priv->mm.gtt_list);
 	for (i = 0; i < I915_NUM_RINGS; i++)
 		init_ring_lists(&dev_priv->ring[i]);
 	for (i = 0; i < I915_MAX_NUM_FENCES; i++)
@@ -4209,13 +4248,6 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 }
 
 static int
-i915_gpu_is_active(struct drm_device *dev)
-{
-	drm_i915_private_t *dev_priv = dev->dev_private;
-	return !list_empty(&dev_priv->mm.active_list);
-}
-
-static int
 i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 {
 	struct drm_i915_private *dev_priv =
@@ -4223,60 +4255,26 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 			     struct drm_i915_private,
 			     mm.inactive_shrinker);
 	struct drm_device *dev = dev_priv->dev;
-	struct drm_i915_gem_object *obj, *next;
+	struct drm_i915_gem_object *obj;
 	int nr_to_scan = sc->nr_to_scan;
 	int cnt;
 
 	if (!mutex_trylock(&dev->struct_mutex))
 		return 0;
 
-	/* "fast-path" to count number of available objects */
-	if (nr_to_scan == 0) {
-		cnt = 0;
-		list_for_each_entry(obj,
-				    &dev_priv->mm.inactive_list,
-				    mm_list)
-			cnt++;
-		mutex_unlock(&dev->struct_mutex);
-		return cnt / 100 * sysctl_vfs_cache_pressure;
+	if (nr_to_scan) {
+		nr_to_scan -= i915_gem_purge(dev_priv, nr_to_scan);
+		if (nr_to_scan > 0)
+			i915_gem_shrink_all(dev_priv);
 	}
 
-rescan:
-	/* first scan for clean buffers */
-	i915_gem_retire_requests(dev);
-
-	list_for_each_entry_safe(obj, next,
-				 &dev_priv->mm.inactive_list,
-				 mm_list) {
-		if (i915_gem_object_is_purgeable(obj)) {
-			if (i915_gem_object_unbind(obj) == 0 &&
-			    --nr_to_scan == 0)
-				break;
-		}
-	}
-
-	/* second pass, evict/count anything still on the inactive list */
 	cnt = 0;
-	list_for_each_entry_safe(obj, next,
-				 &dev_priv->mm.inactive_list,
-				 mm_list) {
-		if (nr_to_scan &&
-		    i915_gem_object_unbind(obj) == 0)
-			nr_to_scan--;
-		else
-			cnt++;
-	}
+	list_for_each_entry(obj, &dev_priv->mm.unbound_list, gtt_list)
+		cnt += obj->base.size >> PAGE_SHIFT;
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list)
+		if (obj->pin_count == 0)
+			cnt += obj->base.size >> PAGE_SHIFT;
 
-	if (nr_to_scan && i915_gpu_is_active(dev)) {
-		/*
-		 * We are desperate for pages, so as a last resort, wait
-		 * for the GPU to finish and discard whatever we can.
-		 * This has a dramatic impact to reduce the number of
-		 * OOM-killer events whilst running the GPU aggressively.
-		 */
-		if (i915_gpu_idle(dev) == 0)
-			goto rescan;
-	}
 	mutex_unlock(&dev->struct_mutex);
-	return cnt / 100 * sysctl_vfs_cache_pressure;
+	return cnt;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
index ceaad5a..43c9530 100644
--- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
+++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
@@ -33,7 +33,7 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
 	struct drm_i915_gem_object *obj = attachment->dmabuf->priv;
 	struct drm_device *dev = obj->base.dev;
 	int npages = obj->base.size / PAGE_SIZE;
-	struct sg_table *sg = NULL;
+	struct sg_table *sg;
 	int ret;
 	int nents;
 
@@ -41,10 +41,10 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
 	if (ret)
 		return ERR_PTR(ret);
 
-	if (!obj->pages) {
-		ret = i915_gem_object_get_pages_gtt(obj, __GFP_NORETRY | __GFP_NOWARN);
-		if (ret)
-			goto out;
+	ret = i915_gem_object_get_pages_gtt(obj);
+	if (ret) {
+		sg = ERR_PTR(ret);
+		goto out;
 	}
 
 	/* link the pages into an SG then map the sg */
@@ -89,12 +89,10 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
 		goto out_unlock;
 	}
 
-	if (!obj->pages) {
-		ret = i915_gem_object_get_pages_gtt(obj, __GFP_NORETRY | __GFP_NOWARN);
-		if (ret) {
-			mutex_unlock(&dev->struct_mutex);
-			return ERR_PTR(ret);
-		}
+	ret = i915_gem_object_get_pages_gtt(obj);
+	if (ret) {
+		mutex_unlock(&dev->struct_mutex);
+		return ERR_PTR(ret);
 	}
 
 	obj->dma_buf_vmapping = vmap(obj->pages, obj->base.size / PAGE_SIZE, 0, PAGE_KERNEL);
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 7279c31..74635da 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -148,7 +148,7 @@ found:
 }
 
 int
-i915_gem_evict_everything(struct drm_device *dev, bool purgeable_only)
+i915_gem_evict_everything(struct drm_device *dev)
 {
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	struct drm_i915_gem_object *obj, *next;
@@ -160,7 +160,7 @@ i915_gem_evict_everything(struct drm_device *dev, bool purgeable_only)
 	if (lists_empty)
 		return -ENOSPC;
 
-	trace_i915_gem_evict_everything(dev, purgeable_only);
+	trace_i915_gem_evict_everything(dev);
 
 	/* The gpu_idle will flush everything in the write domain to the
 	 * active list. Then we must move everything off the active list
@@ -174,12 +174,9 @@ i915_gem_evict_everything(struct drm_device *dev, bool purgeable_only)
 
 	/* Having flushed everything, unbind() should never raise an error */
 	list_for_each_entry_safe(obj, next,
-				 &dev_priv->mm.inactive_list, mm_list) {
-		if (!purgeable_only || obj->madv != I915_MADV_WILLNEED) {
-			if (obj->pin_count == 0)
-				WARN_ON(i915_gem_object_unbind(obj));
-		}
-	}
+				 &dev_priv->mm.inactive_list, mm_list)
+		if (obj->pin_count == 0)
+			WARN_ON(i915_gem_object_unbind(obj));
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index afb312e..834a636 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -502,17 +502,12 @@ i915_gem_execbuffer_reserve(struct intel_ring_buffer *ring,
 			}
 		}
 
-		if (ret != -ENOSPC || retry > 1)
+		if (ret != -ENOSPC || retry++)
 			return ret;
 
-		/* First attempt, just clear anything that is purgeable.
-		 * Second attempt, clear the entire GTT.
-		 */
-		ret = i915_gem_evict_everything(ring->dev, retry == 0);
+		ret = i915_gem_evict_everything(ring->dev);
 		if (ret)
 			return ret;
-
-		retry++;
 	} while (1);
 
 err:
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 3b3b731..8329a14 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -348,7 +348,7 @@ void i915_gem_restore_gtt_mappings(struct drm_device *dev)
 	intel_gtt_clear_range(dev_priv->mm.gtt_start / PAGE_SIZE,
 			      (dev_priv->mm.gtt_end - dev_priv->mm.gtt_start) / PAGE_SIZE);
 
-	list_for_each_entry(obj, &dev_priv->mm.gtt_list, gtt_list) {
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list) {
 		i915_gem_clflush_object(obj);
 		i915_gem_gtt_bind_object(obj, obj->cache_level);
 	}
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index a61b41a..002dcee 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -1221,7 +1221,7 @@ static void i915_capture_error_state(struct drm_device *dev)
 	list_for_each_entry(obj, &dev_priv->mm.active_list, mm_list)
 		i++;
 	error->active_bo_count = i;
-	list_for_each_entry(obj, &dev_priv->mm.gtt_list, gtt_list)
+	list_for_each_entry(obj, &dev_priv->mm.bound_list, gtt_list)
 		if (obj->pin_count)
 			i++;
 	error->pinned_bo_count = i - error->active_bo_count;
@@ -1246,7 +1246,7 @@ static void i915_capture_error_state(struct drm_device *dev)
 		error->pinned_bo_count =
 			capture_pinned_bo(error->pinned_bo,
 					  error->pinned_bo_count,
-					  &dev_priv->mm.gtt_list);
+					  &dev_priv->mm.bound_list);
 
 	do_gettimeofday(&error->time);
 
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index fe90b3a..3c4093d 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -214,22 +214,18 @@ TRACE_EVENT(i915_gem_evict,
 );
 
 TRACE_EVENT(i915_gem_evict_everything,
-	    TP_PROTO(struct drm_device *dev, bool purgeable),
-	    TP_ARGS(dev, purgeable),
+	    TP_PROTO(struct drm_device *dev),
+	    TP_ARGS(dev),
 
 	    TP_STRUCT__entry(
 			     __field(u32, dev)
-			     __field(bool, purgeable)
 			    ),
 
 	    TP_fast_assign(
 			   __entry->dev = dev->primary->index;
-			   __entry->purgeable = purgeable;
 			  ),
 
-	    TP_printk("dev=%d%s",
-		      __entry->dev,
-		      __entry->purgeable ? ", purgeable only" : "")
+	    TP_printk("dev=%d", __entry->dev)
 );
 
 TRACE_EVENT(i915_gem_ring_dispatch,
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/29] drm/i915: Show (count, size) of purgeable objects in i915_gem_objects
  2012-08-11 14:41 ` [PATCH 02/29] drm/i915: Show (count, size) of purgeable objects in i915_gem_objects Chris Wilson
@ 2012-08-20  9:04   ` Daniel Vetter
  2012-08-20  9:17     ` Chris Wilson
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2012-08-20  9:04 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Aug 11, 2012 at 03:41:01PM +0100, Chris Wilson wrote:
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c |   19 ++++++++++++++-----
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index a7eb093..16e8701 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -197,8 +197,8 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
>  	struct drm_info_node *node = (struct drm_info_node *) m->private;
>  	struct drm_device *dev = node->minor->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	u32 count, mappable_count;
> -	size_t size, mappable_size;
> +	u32 count, mappable_count, purgeable_count;
> +	size_t size, mappable_size, purgeable_size;
>  	struct drm_i915_gem_object *obj;
>  	int ret;
>  
> @@ -225,9 +225,12 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
>  	seq_printf(m, "  %u [%u] inactive objects, %zu [%zu] bytes\n",
>  		   count, mappable_count, size, mappable_size);
>  
> -	size = count = 0;
> -	list_for_each_entry(obj, &dev_priv->mm.unbound_list, gtt_list)
> +	size = count = purgeable_size = purgeable_count = 0;
> +	list_for_each_entry(obj, &dev_priv->mm.unbound_list, gtt_list) {
>  		size += obj->base.size, ++count;
> +		if (obj->madv == I915_MADV_DONTNEED)
> +			purgeable_size += obj->base.size, ++purgeable_count;
> +	}
>  	seq_printf(m, "%u unbound objects, %zu bytes\n", count, size);
>  
>  	size = count = mappable_size = mappable_count = 0;
> @@ -237,10 +240,16 @@ static int i915_gem_object_info(struct seq_file *m, void* data)
>  			++count;
>  		}
>  		if (obj->pin_mappable) {
> -			mappable_size += obj->gtt_space->size;
> +			mappable_size += obj->gtt_space->size,

That s/;/,/ here looks fishy. Shall I kill it when applying?
-Daniel

>  			++mappable_count;
>  		}
> +		if (obj->madv == I915_MADV_DONTNEED) {
> +			purgeable_size += obj->base.size;
> +			++purgeable_count;
> +		}
>  	}
> +	seq_printf(m, "%u purgeable objects, %zu bytes\n",
> +		   purgeable_count, purgeable_size);
>  	seq_printf(m, "%u pinned mappable objects, %zu bytes\n",
>  		   mappable_count, mappable_size);
>  	seq_printf(m, "%u fault mappable objects, %zu bytes\n",
> -- 
> 1.7.10.4
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 02/29] drm/i915: Show (count, size) of purgeable objects in i915_gem_objects
  2012-08-20  9:04   ` Daniel Vetter
@ 2012-08-20  9:17     ` Chris Wilson
  0 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-20  9:17 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, 20 Aug 2012 11:04:52 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Sat, Aug 11, 2012 at 03:41:01PM +0100, Chris Wilson wrote:
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >  		if (obj->pin_mappable) {
> > -			mappable_size += obj->gtt_space->size;
> > +			mappable_size += obj->gtt_space->size,
> 
> That s/;/,/ here looks fishy. Shall I kill it when applying?

Right, that is an inoffensive cut'n'paste. It can be killed with glee.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 51+ messages in thread

* [PATCH] Add some sanity checks to unbound tracking
  2012-08-20  9:00     ` [PATCH 2/2] drm/i915: Track unbound pages Daniel Vetter
@ 2012-08-20  9:23       ` Chris Wilson
  2012-08-20  9:36       ` [PATCH 2/2] drm/i915: Track unbound pages Chris Wilson
  1 sibling, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-20  9:23 UTC (permalink / raw)
  To: intel-gfx

A pair of universally true checks that just need to be put in the right
place depending on where in the patch sequence you go. Note that
i915_gem_object_put_pages_gtt() already gains the
BUG_ON(obj->gtt_space), but on reflection that needed to migrate to
put_pages().

---
 drivers/gpu/drm/i915/i915_gem.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0608d95..f90390a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1806,6 +1806,8 @@ i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
 {
 	const struct drm_i915_gem_object_ops *ops = obj->base.driver_private;
 
+	BUG_ON(obj->gtt_space);
+
 	if (obj->pages == NULL)
 		return 0;
 
@@ -2530,6 +2532,8 @@ i915_gem_object_unbind(struct drm_i915_gem_object *obj)
 	if (obj->pin_count)
 		return -EBUSY;
 
+	BUG_ON(obj->pages == NULL);
+
 	ret = i915_gem_object_finish_gpu(obj);
 	if (ret)
 		return ret;
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] drm/i915: Track unbound pages
  2012-08-20  9:00     ` [PATCH 2/2] drm/i915: Track unbound pages Daniel Vetter
  2012-08-20  9:23       ` [PATCH] Add some sanity checks to unbound tracking Chris Wilson
@ 2012-08-20  9:36       ` Chris Wilson
  2012-08-20  9:42         ` Daniel Vetter
  1 sibling, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-20  9:36 UTC (permalink / raw)
  To: Intel Graphics Development; +Cc: Daniel Vetter

On Mon, 20 Aug 2012 11:00:39 +0200, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
>  int
> -i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj,
> -			      gfp_t gfpmask)
> +i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>  {
> +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
>  	int page_count, i;
>  	struct address_space *mapping;
> -	struct inode *inode;
>  	struct page *page;
> +	gfp_t gfp;
>  
>  	if (obj->pages || obj->sg_table)
>  		return 0;
>  
> +	/* Assert that the object is not currently in any GPU domain. As it
> +	 * wasn't in the GTT, there shouldn't be any way it could have been in
> +	 * a GPU cache
> +	 */
> +	BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
> +	BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
> +
>  	/* Get the list of pages out of our struct file.  They'll be pinned
>  	 * at this point until we release them.
>  	 */
>  	page_count = obj->base.size / PAGE_SIZE;
> -	BUG_ON(obj->pages != NULL);
> -	obj->pages = drm_malloc_ab(page_count, sizeof(struct page *));
> +	obj->pages = kmalloc(page_count*sizeof(struct page *), GFP_KERNEL);

This is a silly one (by me). At one point the patch introduced
i915_malloc() and replaced drm_malloc_ab() with it and then I reverted
that after transitioning to using the sg_table everywhere.

It needs to still be drm_malloc_ab() in this and the follow-on patches
until it is replaced by sg_alloc_table().
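
For context, the reason drm_malloc_ab() matters here is that it checks for
multiplication overflow and falls back to vmalloc once the page array no
longer fits in a single page. Roughly (a simplified sketch of the helper in
include/drm/drm_mem_util.h, not a verbatim copy):

#include <linux/mm.h>
#include <linux/slab.h>
#include <linux/vmalloc.h>

static inline void *drm_malloc_ab(size_t nmemb, size_t size)
{
	/* Reject multiplication overflow up front. */
	if (size != 0 && nmemb > ULONG_MAX / size)
		return NULL;

	/* Small arrays come straight from the slab... */
	if (size * nmemb <= PAGE_SIZE)
		return kmalloc(nmemb * size, GFP_KERNEL);

	/* ...but a big struct page * array would need a high-order
	 * allocation, which can fail under fragmentation, so use
	 * vmalloc instead. */
	return __vmalloc(size * nmemb,
			 GFP_KERNEL | __GFP_HIGHMEM | __GFP_NOWARN,
			 PAGE_KERNEL);
}

A bare kmalloc() of page_count * sizeof(struct page *) loses both of those
safety nets for large objects.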
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 04/29] drm/i915: Try harder to allocate an mmap_offset
  2012-08-11 14:41 ` [PATCH 04/29] drm/i915: Try harder to allocate an mmap_offset Chris Wilson
@ 2012-08-20  9:37   ` Daniel Vetter
  2012-08-20 11:31     ` Chris Wilson
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2012-08-20  9:37 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Aug 11, 2012 at 03:41:03PM +0100, Chris Wilson wrote:
> Given the persistence of an offset for the lifetime of an object, it is
> easy to contemplate how the mmap space becomes badly fragmented to the
> point that further allocations fail with ENOSPC. Our only recourse at
> this point is to try to purge the objects to release some space and
> reattempt the allocation.
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=39552
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Ok, I've picked up everything up to this patch, with the few changes applied
as discussed (plus ditching an unused var that I'd forgotten to kill when
purging the put_pages from gem_reset). I'll look at the others later
today.

Thanks for the patches,
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 2/2] drm/i915: Track unbound pages
  2012-08-20  9:36       ` [PATCH 2/2] drm/i915: Track unbound pages Chris Wilson
@ 2012-08-20  9:42         ` Daniel Vetter
  0 siblings, 0 replies; 51+ messages in thread
From: Daniel Vetter @ 2012-08-20  9:42 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, Intel Graphics Development

On Mon, Aug 20, 2012 at 10:36:09AM +0100, Chris Wilson wrote:
> On Mon, 20 Aug 2012 11:00:39 +0200, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> >  int
> > -i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj,
> > -			      gfp_t gfpmask)
> > +i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
> >  {
> > +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> >  	int page_count, i;
> >  	struct address_space *mapping;
> > -	struct inode *inode;
> >  	struct page *page;
> > +	gfp_t gfp;
> >  
> >  	if (obj->pages || obj->sg_table)
> >  		return 0;
> >  
> > +	/* Assert that the object is not currently in any GPU domain. As it
> > +	 * wasn't in the GTT, there shouldn't be any way it could have been in
> > +	 * a GPU cache
> > +	 */
> > +	BUG_ON(obj->base.read_domains & I915_GEM_GPU_DOMAINS);
> > +	BUG_ON(obj->base.write_domain & I915_GEM_GPU_DOMAINS);
> > +
> >  	/* Get the list of pages out of our struct file.  They'll be pinned
> >  	 * at this point until we release them.
> >  	 */
> >  	page_count = obj->base.size / PAGE_SIZE;
> > -	BUG_ON(obj->pages != NULL);
> > -	obj->pages = drm_malloc_ab(page_count, sizeof(struct page *));
> > +	obj->pages = kmalloc(page_count*sizeof(struct page *), GFP_KERNEL);
> 
> This is a silly one (by me). At one point the patch introduced
> i915_malloc() and replaced drm_malloc_ab() with it and then I reverted
> that after transitioning to using the sg_table everywhere.
> 
> It needs to still be drm_malloc_ab() in this and the follow-on patches
> until it is replaced by sg_alloc_table().

Ok, history fixed. Can you please check whether I haven't fumbled it?

Thanks, Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 04/29] drm/i915: Try harder to allocate an mmap_offset
  2012-08-20  9:37   ` Daniel Vetter
@ 2012-08-20 11:31     ` Chris Wilson
  0 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-20 11:31 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, 20 Aug 2012 11:37:30 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> Ok, I've picked up everything up to this patch, with the few changes applied
> as discussed (plus ditching an unused var that I'd forgotten to kill when
> purging the put_pages from gem_reset). I'll look at the others later
> today.

I think that's a good point to hand over to QA for the first round of
testing; especially with the swap thrashing tests and sporadic failures
afterwards.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 08/29] drm/i915: Introduce drm_i915_gem_object_ops
  2012-08-11 14:41 ` [PATCH 08/29] drm/i915: Introduce drm_i915_gem_object_ops Chris Wilson
@ 2012-08-20 19:35   ` Daniel Vetter
  0 siblings, 0 replies; 51+ messages in thread
From: Daniel Vetter @ 2012-08-20 19:35 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Aug 11, 2012 at 03:41:07PM +0100, Chris Wilson wrote:
> In order to specialise functions depending upon the type of object, we
> can attach vfuncs to each object via their obj->driver_private pointer,
> bringing it back to life!
> 
> For instance, this will be used in future patches to only bind pages from
> a dma-buf for the duration that the object is used by the GPU - and so
> prevent them from pinning those pages for the entire lifetime of the object.

Tbh I'd prefer adding a pointer with the right type over reusing that
untyped pointer ... Otherwise I like this, and I'll follow up with
suggestions for other functions we could add when reviewing the other
patches ;-)
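
A minimal sketch of what the typed-pointer variant would look like
(illustrative only; the ops struct is what the patch already adds, the
extra field is not part of the posted series):

struct drm_i915_gem_object;

struct drm_i915_gem_object_ops {
	int (*get_pages)(struct drm_i915_gem_object *);
	void (*put_pages)(struct drm_i915_gem_object *);
};

struct drm_i915_gem_object {
	struct drm_gem_object base;

	/* Typed vfunc table; no cast through base.driver_private. */
	const struct drm_i915_gem_object_ops *ops;

	/* ... remaining fields unchanged ... */
};

so that call sites become

	ret = obj->ops->get_pages(obj);

with the compiler checking the type, while base.driver_private stays free
for whatever the drm core might want it for.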
-Daniel

> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_drv.h        |   10 ++++-
>  drivers/gpu/drm/i915/i915_gem.c        |   65 ++++++++++++++++++++++----------
>  drivers/gpu/drm/i915/i915_gem_dmabuf.c |    4 +-
>  3 files changed, 56 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bbc51ef..c42190b 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -895,6 +895,11 @@ enum i915_cache_level {
>  	I915_CACHE_LLC_MLC, /* gen6+, in docs at least! */
>  };
>  
> +struct drm_i915_gem_object_ops {
> +	int (*get_pages)(struct drm_i915_gem_object *);
> +	void (*put_pages)(struct drm_i915_gem_object *);
> +};
> +
>  struct drm_i915_gem_object {
>  	struct drm_gem_object base;
>  
> @@ -1302,7 +1307,8 @@ int i915_gem_wait_ioctl(struct drm_device *dev, void *data,
>  			struct drm_file *file_priv);
>  void i915_gem_load(struct drm_device *dev);
>  int i915_gem_init_object(struct drm_gem_object *obj);
> -void i915_gem_object_init(struct drm_i915_gem_object *obj);
> +void i915_gem_object_init(struct drm_i915_gem_object *obj,
> +			 const struct drm_i915_gem_object_ops *ops);
>  struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
>  						  size_t size);
>  void i915_gem_free_object(struct drm_gem_object *obj);
> @@ -1315,7 +1321,7 @@ int __must_check i915_gem_object_unbind(struct drm_i915_gem_object *obj);
>  void i915_gem_release_mmap(struct drm_i915_gem_object *obj);
>  void i915_gem_lastclose(struct drm_device *dev);
>  
> -int __must_check i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj);
> +int __must_check i915_gem_object_get_pages(struct drm_i915_gem_object *obj);
>  int __must_check i915_mutex_lock_interruptible(struct drm_device *dev);
>  int i915_gem_object_sync(struct drm_i915_gem_object *obj,
>  			 struct intel_ring_buffer *to);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 9c8787e..ed6a1ec 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1407,15 +1407,12 @@ i915_gem_object_is_purgeable(struct drm_i915_gem_object *obj)
>  	return obj->madv == I915_MADV_DONTNEED;
>  }
>  
> -static int
> +static void
>  i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
>  {
>  	int page_count = obj->base.size / PAGE_SIZE;
>  	int ret, i;
>  
> -	if (obj->pages == NULL)
> -		return 0;
> -
>  	BUG_ON(obj->gtt_space);
>  	BUG_ON(obj->madv == __I915_MADV_PURGED);
>  
> @@ -1448,9 +1445,19 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
>  
>  	drm_free_large(obj->pages);
>  	obj->pages = NULL;
> +}
>  
> -	list_del(&obj->gtt_list);
> +static int
> +i915_gem_object_put_pages(struct drm_i915_gem_object *obj)
> +{
> +	const struct drm_i915_gem_object_ops *ops = obj->base.driver_private;
>  
> +	if (obj->sg_table || obj->pages == NULL)
> +		return 0;
> +
> +	ops->put_pages(obj);
> +
> +	list_del(&obj->gtt_list);
>  	if (i915_gem_object_is_purgeable(obj))
>  		i915_gem_object_truncate(obj);
>  
> @@ -1467,7 +1474,7 @@ i915_gem_purge(struct drm_i915_private *dev_priv, long target)
>  				 &dev_priv->mm.unbound_list,
>  				 gtt_list) {
>  		if (i915_gem_object_is_purgeable(obj) &&
> -		    i915_gem_object_put_pages_gtt(obj) == 0) {
> +		    i915_gem_object_put_pages(obj) == 0) {
>  			count += obj->base.size >> PAGE_SHIFT;
>  			if (count >= target)
>  				return count;
> @@ -1479,7 +1486,7 @@ i915_gem_purge(struct drm_i915_private *dev_priv, long target)
>  				 mm_list) {
>  		if (i915_gem_object_is_purgeable(obj) &&
>  		    i915_gem_object_unbind(obj) == 0 &&
> -		    i915_gem_object_put_pages_gtt(obj) == 0) {
> +		    i915_gem_object_put_pages(obj) == 0) {
>  			count += obj->base.size >> PAGE_SHIFT;
>  			if (count >= target)
>  				return count;
> @@ -1497,10 +1504,10 @@ i915_gem_shrink_all(struct drm_i915_private *dev_priv)
>  	i915_gem_evict_everything(dev_priv->dev);
>  
>  	list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, gtt_list)
> -		i915_gem_object_put_pages_gtt(obj);
> +		i915_gem_object_put_pages(obj);
>  }
>  
> -int
> +static int
>  i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>  {
>  	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> @@ -1509,9 +1516,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>  	struct page *page;
>  	gfp_t gfp;
>  
> -	if (obj->pages || obj->sg_table)
> -		return 0;
> -
>  	/* Assert that the object is not currently in any GPU domain. As it
>  	 * wasn't in the GTT, there shouldn't be any way it could have been in
>  	 * a GPU cache
> @@ -1561,7 +1565,6 @@ i915_gem_object_get_pages_gtt(struct drm_i915_gem_object *obj)
>  	if (i915_gem_object_needs_bit17_swizzle(obj))
>  		i915_gem_object_do_bit_17_swizzle(obj);
>  
> -	list_add_tail(&obj->gtt_list, &dev_priv->mm.unbound_list);
>  	return 0;
>  
>  err_pages:
> @@ -1573,6 +1576,24 @@ err_pages:
>  	return PTR_ERR(page);
>  }
>  
> +int
> +i915_gem_object_get_pages(struct drm_i915_gem_object *obj)
> +{
> +	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;
> +	const struct drm_i915_gem_object_ops *ops = obj->base.driver_private;
> +	int ret;
> +
> +	if (obj->sg_table || obj->pages)
> +		return 0;
> +
> +	ret = ops->get_pages(obj);
> +	if (ret)
> +		return ret;
> +
> +	list_add_tail(&obj->gtt_list, &dev_priv->mm.unbound_list);
> +	return 0;
> +}
> +
>  void
>  i915_gem_object_move_to_active(struct drm_i915_gem_object *obj,
>  			       struct intel_ring_buffer *ring,
> @@ -1828,7 +1849,7 @@ void i915_gem_reset(struct drm_device *dev)
>  
>  
>  	list_for_each_entry_safe(obj, next, &dev_priv->mm.unbound_list, gtt_list)
> -		i915_gem_object_put_pages_gtt(obj);
> +		i915_gem_object_put_pages(obj);
>  
>  	/* The fence registers are invalidated so clear them out */
>  	i915_gem_reset_fences(dev);
> @@ -2818,7 +2839,7 @@ i915_gem_object_bind_to_gtt(struct drm_i915_gem_object *obj,
>  		return -E2BIG;
>  	}
>  
> -	ret = i915_gem_object_get_pages_gtt(obj);
> +	ret = i915_gem_object_get_pages(obj);
>  	if (ret)
>  		return ret;
>  
> @@ -3557,9 +3578,10 @@ unlock:
>  	return ret;
>  }
>  
> -void i915_gem_object_init(struct drm_i915_gem_object *obj)
> +void i915_gem_object_init(struct drm_i915_gem_object *obj,
> +			  const struct drm_i915_gem_object_ops *ops)
>  {
> -	obj->base.driver_private = NULL;
> +	obj->base.driver_private = (void *)ops;
>  
>  	INIT_LIST_HEAD(&obj->mm_list);
>  	INIT_LIST_HEAD(&obj->gtt_list);
> @@ -3574,6 +3596,11 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj)
>  	i915_gem_info_add_obj(obj->base.dev->dev_private, obj->base.size);
>  }
>  
> +static const struct drm_i915_gem_object_ops i915_gem_object_ops = {
> +	.get_pages = i915_gem_object_get_pages_gtt,
> +	.put_pages = i915_gem_object_put_pages_gtt,
> +};
> +
>  struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
>  						  size_t size)
>  {
> @@ -3600,7 +3627,7 @@ struct drm_i915_gem_object *i915_gem_alloc_object(struct drm_device *dev,
>  	mapping = obj->base.filp->f_path.dentry->d_inode->i_mapping;
>  	mapping_set_gfp_mask(mapping, mask);
>  
> -	i915_gem_object_init(obj);
> +	i915_gem_object_init(obj, &i915_gem_object_ops);
>  
>  	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
>  	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
> @@ -3658,7 +3685,7 @@ void i915_gem_free_object(struct drm_gem_object *gem_obj)
>  		dev_priv->mm.interruptible = was_interruptible;
>  	}
>  
> -	i915_gem_object_put_pages_gtt(obj);
> +	i915_gem_object_put_pages(obj);
>  	i915_gem_object_free_mmap_offset(obj);
>  
>  	drm_gem_object_release(&obj->base);
> diff --git a/drivers/gpu/drm/i915/i915_gem_dmabuf.c b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> index e5f0375..1203460 100644
> --- a/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> +++ b/drivers/gpu/drm/i915/i915_gem_dmabuf.c
> @@ -41,7 +41,7 @@ static struct sg_table *i915_gem_map_dma_buf(struct dma_buf_attachment *attachme
>  	if (ret)
>  		return ERR_PTR(ret);
>  
> -	ret = i915_gem_object_get_pages_gtt(obj);
> +	ret = i915_gem_object_get_pages(obj);
>  	if (ret) {
>  		sg = ERR_PTR(ret);
>  		goto out;
> @@ -89,7 +89,7 @@ static void *i915_gem_dmabuf_vmap(struct dma_buf *dma_buf)
>  		goto out_unlock;
>  	}
>  
> -	ret = i915_gem_object_get_pages_gtt(obj);
> +	ret = i915_gem_object_get_pages(obj);
>  	if (ret) {
>  		mutex_unlock(&dev->struct_mutex);
>  		return ERR_PTR(ret);
> -- 
> 1.7.10.4
> 

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 16/29] drm/i915: Fix location of stolen memory register for SandyBridge+
  2012-08-11 14:41 ` [PATCH 16/29] drm/i915: Fix location of stolen memory register for SandyBridge+ Chris Wilson
@ 2012-08-20 19:38   ` Daniel Vetter
  2012-08-22 15:54     ` Chris Wilson
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2012-08-20 19:38 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Aug 11, 2012 at 03:41:15PM +0100, Chris Wilson wrote:
> A few of the earlier registers were enlarged and so the Base Data of
> Stolen Memory Register (BDSM) was pushed to 0xb0.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_gem_stolen.c |    9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> index a01ff74..a528e4a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> @@ -63,7 +63,11 @@ static unsigned long i915_stolen_to_physical(struct drm_device *dev)
>  	 * its value of TOLUD.
>  	 */
>  	base = 0;
> -	if (INTEL_INFO(dev)->gen > 3 || IS_G33(dev)) {
> +	if (INTEL_INFO(dev)->gen >= 6) {
> +		/* Read Base Data of Stolen Memory Register (BDSM) directly */
> +		pci_read_config_dword(pdev, 0xB0, &base);

Wishlist (i.e. feel free to ignore): Can we have #defines instead of magic
numbers here, please?
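
Something along these lines, say (the names are invented here, not lifted
from any spec):

#define SNB_BDSM	0xb0		/* Base Data of Stolen Memory */
#define SNB_BDSM_MASK	0xfffff000	/* lower bits used for locking */

pci_read_config_dword(pdev, SNB_BDSM, &base);
base &= SNB_BDSM_MASK;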
-Daniel

> +		base &= ~4095; /* lower bits used for locking register */
> +	} else if (INTEL_INFO(dev)->gen > 3 || IS_G33(dev)) {
>  		/* Read Graphics Base of Stolen Memory directly */
>  		pci_read_config_dword(pdev, 0xA4, &base);
>  #if 0
> @@ -172,6 +176,9 @@ int i915_gem_init_stolen(struct drm_device *dev)
>  	if (dev_priv->mm.stolen_base == 0)
>  		return 0;
>  
> +	DRM_DEBUG_KMS("found %d bytes of stolen memory at %08lx\n",
> +		      dev_priv->mm.gtt->stolen_size, dev_priv->mm.stolen_base);
> +
>  	/* Basic memrange allocator for stolen space */
>  	drm_mm_init(&dev_priv->mm.stolen, 0, prealloc_size);
>  
> -- 
> 1.7.10.4
> 

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 18/29] drm/i915: Delay allocation of stolen space for FBC
  2012-08-11 14:41 ` [PATCH 18/29] drm/i915: Delay allocation of stolen space for FBC Chris Wilson
@ 2012-08-20 19:51   ` Daniel Vetter
  2012-08-22 15:51     ` Chris Wilson
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2012-08-20 19:51 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Aug 11, 2012 at 03:41:17PM +0100, Chris Wilson wrote:
> As we may wish to wrap regions preallocated by the BIOS, we need to do
> that before carving out contiguous chunks of stolen space for FBC.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Some comments inline below.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.h        |    1 +
>  drivers/gpu/drm/i915/i915_gem_stolen.c |  114 +++++++++++++++++---------------
>  drivers/gpu/drm/i915/intel_display.c   |    3 +
>  drivers/gpu/drm/i915/intel_pm.c        |   13 ++--
>  4 files changed, 70 insertions(+), 61 deletions(-)
> 

[snip]

> +int i915_gem_stolen_setup_compression(struct drm_device *dev)
> +{
> +	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct drm_mm_node *entry;
> +	unsigned long size;
> +
> +	if (dev_priv->mm.stolen_base == 0)
> +		return 0;
> +
> +	if (dev_priv->cfb_size)
> +		return dev_priv->cfb_size;
> +
> +	/* Try to set up FBC with a reasonable compressed buffer size */
> +	size = 0;
> +	list_for_each_entry(entry, &dev_priv->mm.stolen.hole_stack, hole_stack) {
> +		unsigned long hole_start = entry->start + entry->size;
> +		unsigned long hole_end = list_entry(entry->node_list.next,
> +						    struct drm_mm_node,
> +						    node_list)->start;
> +		unsigned long hole_size = hole_end - hole_start;

This feels a bit too much munging around in drm_mm internals. What about a
drm_mm_for_each_hole(entry, mm, hole_start, hole_end)
#define helper?
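
Something like this, perhaps (just a sketch of the idea; the name and exact
form can be bikeshedded):

/* Hypothetical helper: walk every hole without the caller poking at
 * drm_mm internals. */
#define drm_mm_for_each_hole(entry, mm, hole_start, hole_end)		\
	list_for_each_entry(entry, &(mm)->hole_stack, hole_stack)	\
		if (((hole_start) = (entry)->start + (entry)->size,	\
		     (hole_end) = list_entry((entry)->node_list.next,	\
					     struct drm_mm_node,	\
					     node_list)->start, 1))

so the open-coded loop in the patch would collapse to

	drm_mm_for_each_hole(entry, &dev_priv->mm.stolen, hole_start, hole_end)
		size = max(size, hole_end - hole_start);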

> +		if (hole_size > size)
> +			size = hole_size;
> +	}
> +
> +	/* Try to get a 32M buffer... */
> +	if (size > (36*1024*1024))
> +		size = 32*1024*1024;
> +	else /* fall back to 3/4 of the stolen space */
> +		size = size * 3 / 4;
> +
> +	return i915_setup_compression(dev, size);
>  }
>  
>  void intel_modeset_gem_init(struct drm_device *dev)

[snap]

> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index 3021c18..6f0f498 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -438,12 +438,6 @@ void intel_update_fbc(struct drm_device *dev)
>  		dev_priv->no_fbc_reason = FBC_MODULE_PARAM;
>  		goto out_disable;
>  	}
> -	if (intel_fb->obj->base.size > dev_priv->cfb_size) {
> -		DRM_DEBUG_KMS("framebuffer too large, disabling "
> -			      "compression\n");
> -		dev_priv->no_fbc_reason = FBC_STOLEN_TOO_SMALL;
> -		goto out_disable;
> -	}
>  	if ((crtc->mode.flags & DRM_MODE_FLAG_INTERLACE) ||
>  	    (crtc->mode.flags & DRM_MODE_FLAG_DBLSCAN)) {
>  		DRM_DEBUG_KMS("mode incompatible with compression, "
> @@ -477,6 +471,13 @@ void intel_update_fbc(struct drm_device *dev)
>  	if (in_dbg_master())
>  		goto out_disable;
>  
> +	if (intel_fb->obj->base.size > i915_gem_stolen_setup_compression(dev)) {
> +		DRM_DEBUG_KMS("framebuffer too large, disabling "
> +			      "compression\n");
> +		dev_priv->no_fbc_reason = FBC_STOLEN_TOO_SMALL;
> +		goto out_disable;
> +	}

I couldn't figure out why this block here had to move ... please enlighten
me.

> +
>  	/* If the scanout has not changed, don't modify the FBC settings.
>  	 * Note that we make the fundamental assumption that the fb->obj
>  	 * cannot be unpinned (and have its GTT offset and fence revoked)
> -- 
> 1.7.10.4
> 

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 22/29] drm/i915: Handle stolen objects in pwrite
  2012-08-11 14:41 ` [PATCH 22/29] drm/i915: Handle stolen objects in pwrite Chris Wilson
@ 2012-08-20 19:56   ` Daniel Vetter
  2012-08-22 15:47     ` Chris Wilson
  2012-08-30 15:09     ` Chris Wilson
  0 siblings, 2 replies; 51+ messages in thread
From: Daniel Vetter @ 2012-08-20 19:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Aug 11, 2012 at 03:41:21PM +0100, Chris Wilson wrote:
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

What about putting kmap/unmap abstractions into obj->ops (like the dma_buf
interface already has)? Since the pwrite/pread code is already rather
branch heave I hope we don't see the overhead of the indirect call even
in microbenchmarks (haven't checked). And this way we would also neatly
wrap up dma_bufs for pwrite (if anyone ever really wants that ...).

The kmap(_atomic) for stolen mem backed objects would boil down to doing
the pointer arithmetic, kunmap would be just a noop.
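
Roughly, for a stolen object the pair would reduce to something like this
(a sketch with made-up hook names, mirroring the pointer arithmetic the
patch below already does in pwrite):

/* Hypothetical obj->ops hooks, names invented for illustration. */
static void *i915_gem_object_kmap_stolen(struct drm_i915_gem_object *obj,
					 unsigned long page_num)
{
	struct drm_i915_private *dev_priv = obj->base.dev->dev_private;

	/* Stolen memory is one linear block, so "mapping" a page is just
	 * pointer arithmetic from the stolen base. */
	return (void *)(dev_priv->mm.stolen_base + obj->stolen->start +
			(page_num << PAGE_SHIFT));
}

static void i915_gem_object_kunmap_stolen(struct drm_i915_gem_object *obj,
					  void *vaddr)
{
	/* Nothing was mapped, so nothing to undo. */
}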

Cheers, Daniel
> ---
>  drivers/gpu/drm/i915/i915_gem.c |  169 +++++++++++++++++++++++++--------------
>  1 file changed, 111 insertions(+), 58 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 552f95b..a2fb2aa 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -664,19 +664,17 @@ out:
>   * needs_clflush_before is set and flushes out any written cachelines after
>   * writing if needs_clflush is set. */
>  static int
> -shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
> +shmem_pwrite_fast(char *vaddr, int shmem_page_offset, int page_length,
>  		  char __user *user_data,
>  		  bool page_do_bit17_swizzling,
>  		  bool needs_clflush_before,
>  		  bool needs_clflush_after)
>  {
> -	char *vaddr;
>  	int ret;
>  
>  	if (unlikely(page_do_bit17_swizzling))
>  		return -EINVAL;
>  
> -	vaddr = kmap_atomic(page);
>  	if (needs_clflush_before)
>  		drm_clflush_virt_range(vaddr + shmem_page_offset,
>  				       page_length);
> @@ -686,7 +684,6 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
>  	if (needs_clflush_after)
>  		drm_clflush_virt_range(vaddr + shmem_page_offset,
>  				       page_length);
> -	kunmap_atomic(vaddr);
>  
>  	return ret ? -EFAULT : 0;
>  }
> @@ -694,16 +691,14 @@ shmem_pwrite_fast(struct page *page, int shmem_page_offset, int page_length,
>  /* Only difference to the fast-path function is that this can handle bit17
>   * and uses non-atomic copy and kmap functions. */
>  static int
> -shmem_pwrite_slow(struct page *page, int shmem_page_offset, int page_length,
> +shmem_pwrite_slow(char *vaddr, int shmem_page_offset, int page_length,
>  		  char __user *user_data,
>  		  bool page_do_bit17_swizzling,
>  		  bool needs_clflush_before,
>  		  bool needs_clflush_after)
>  {
> -	char *vaddr;
>  	int ret;
>  
> -	vaddr = kmap(page);
>  	if (unlikely(needs_clflush_before || page_do_bit17_swizzling))
>  		shmem_clflush_swizzled_range(vaddr + shmem_page_offset,
>  					     page_length,
> @@ -720,7 +715,6 @@ shmem_pwrite_slow(struct page *page, int shmem_page_offset, int page_length,
>  		shmem_clflush_swizzled_range(vaddr + shmem_page_offset,
>  					     page_length,
>  					     page_do_bit17_swizzling);
> -	kunmap(page);
>  
>  	return ret ? -EFAULT : 0;
>  }
> @@ -731,6 +725,7 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
>  		      struct drm_i915_gem_pwrite *args,
>  		      struct drm_file *file)
>  {
> +	struct drm_i915_private *dev_priv = dev->dev_private;
>  	ssize_t remain;
>  	loff_t offset;
>  	char __user *user_data;
> @@ -770,74 +765,132 @@ i915_gem_shmem_pwrite(struct drm_device *dev,
>  	if (ret)
>  		return ret;
>  
> -	i915_gem_object_pin_pages(obj);
> -
>  	offset = args->offset;
>  	obj->dirty = 1;
>  
> -	for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
> -		struct page *page;
> -		int partial_cacheline_write;
> +	if (obj->stolen) {
> +		while (remain > 0) {
> +			char *vaddr;
> +			int partial_cacheline_write;
>  
> -		if (i < offset >> PAGE_SHIFT)
> -			continue;
> +			/* Operation in this page
> +			 *
> +			 * shmem_page_offset = offset within page in shmem file
> +			 * page_length = bytes to copy for this page
> +			 */
> +			shmem_page_offset = offset_in_page(offset);
>  
> -		if (remain <= 0)
> -			break;
> +			page_length = remain;
> +			if ((shmem_page_offset + page_length) > PAGE_SIZE)
> +				page_length = PAGE_SIZE - shmem_page_offset;
>  
> -		/* Operation in this page
> -		 *
> -		 * shmem_page_offset = offset within page in shmem file
> -		 * page_length = bytes to copy for this page
> -		 */
> -		shmem_page_offset = offset_in_page(offset);
> +			/* If we don't overwrite a cacheline completely we need to be
> +			 * careful to have up-to-date data by first clflushing. Don't
> +			 * overcomplicate things and flush the entire patch. */
> +			partial_cacheline_write = needs_clflush_before &&
> +				((shmem_page_offset | page_length)
> +				 & (boot_cpu_data.x86_clflush_size - 1));
>  
> -		page_length = remain;
> -		if ((shmem_page_offset + page_length) > PAGE_SIZE)
> -			page_length = PAGE_SIZE - shmem_page_offset;
> +			vaddr = (char *)(dev_priv->mm.stolen_base + obj->stolen->start + offset);
> +			page_do_bit17_swizzling = obj_do_bit17_swizzling &&
> +				((uintptr_t)vaddr & (1 << 17)) != 0;
>  
> -		/* If we don't overwrite a cacheline completely we need to be
> -		 * careful to have up-to-date data by first clflushing. Don't
> -		 * overcomplicate things and flush the entire patch. */
> -		partial_cacheline_write = needs_clflush_before &&
> -			((shmem_page_offset | page_length)
> -				& (boot_cpu_data.x86_clflush_size - 1));
> +			ret = shmem_pwrite_fast(vaddr, shmem_page_offset, page_length,
> +						user_data, page_do_bit17_swizzling,
> +						partial_cacheline_write,
> +						needs_clflush_after);
>  
> -		page = sg_page(sg);
> -		page_do_bit17_swizzling = obj_do_bit17_swizzling &&
> -			(page_to_phys(page) & (1 << 17)) != 0;
> +			if (ret == 0)
> +				goto next_stolen;
>  
> -		ret = shmem_pwrite_fast(page, shmem_page_offset, page_length,
> -					user_data, page_do_bit17_swizzling,
> -					partial_cacheline_write,
> -					needs_clflush_after);
> -		if (ret == 0)
> -			goto next_page;
> +			hit_slowpath = 1;
> +			mutex_unlock(&dev->struct_mutex);
>  
> -		hit_slowpath = 1;
> -		mutex_unlock(&dev->struct_mutex);
> -		ret = shmem_pwrite_slow(page, shmem_page_offset, page_length,
> -					user_data, page_do_bit17_swizzling,
> -					partial_cacheline_write,
> -					needs_clflush_after);
> +			ret = shmem_pwrite_slow(vaddr, shmem_page_offset, page_length,
> +						user_data, page_do_bit17_swizzling,
> +						partial_cacheline_write,
> +						needs_clflush_after);
>  
> -		mutex_lock(&dev->struct_mutex);
> +			mutex_lock(&dev->struct_mutex);
> +			if (ret)
> +				goto out;
>  
> -next_page:
> -		set_page_dirty(page);
> -		mark_page_accessed(page);
> +next_stolen:
> +			remain -= page_length;
> +			user_data += page_length;
> +			offset += page_length;
> +		}
> +	} else {
> +		i915_gem_object_pin_pages(obj);
>  
> -		if (ret)
> -			goto out;
> +		for_each_sg(obj->pages->sgl, sg, obj->pages->nents, i) {
> +			struct page *page;
> +			char *vaddr;
> +			int partial_cacheline_write;
>  
> -		remain -= page_length;
> -		user_data += page_length;
> -		offset += page_length;
> +			if (i < offset >> PAGE_SHIFT)
> +				continue;
> +
> +			if (remain <= 0)
> +				break;
> +
> +			/* Operation in this page
> +			 *
> +			 * shmem_page_offset = offset within page in shmem file
> +			 * page_length = bytes to copy for this page
> +			 */
> +			shmem_page_offset = offset_in_page(offset);
> +
> +			page_length = remain;
> +			if ((shmem_page_offset + page_length) > PAGE_SIZE)
> +				page_length = PAGE_SIZE - shmem_page_offset;
> +
> +			/* If we don't overwrite a cacheline completely we need to be
> +			 * careful to have up-to-date data by first clflushing. Don't
> +			 * overcomplicate things and flush the entire patch. */
> +			partial_cacheline_write = needs_clflush_before &&
> +				((shmem_page_offset | page_length)
> +				 & (boot_cpu_data.x86_clflush_size - 1));
> +
> +			page = sg_page(sg);
> +			page_do_bit17_swizzling = obj_do_bit17_swizzling &&
> +				(page_to_phys(page) & (1 << 17)) != 0;
> +
> +			vaddr = kmap_atomic(page);
> +			ret = shmem_pwrite_fast(vaddr, shmem_page_offset, page_length,
> +						user_data, page_do_bit17_swizzling,
> +						partial_cacheline_write,
> +						needs_clflush_after);
> +
> +			kunmap_atomic(vaddr);
> +
> +			if (ret == 0)
> +				goto next_page;
> +
> +			hit_slowpath = 1;
> +			mutex_unlock(&dev->struct_mutex);
> +
> +			vaddr = kmap(page);
> +			ret = shmem_pwrite_slow(vaddr, shmem_page_offset, page_length,
> +						user_data, page_do_bit17_swizzling,
> +						partial_cacheline_write,
> +						needs_clflush_after);
> +			kunmap(page);
> +
> +			mutex_lock(&dev->struct_mutex);
> +			if (ret)
> +				goto out_unpin;
> +
> +next_page:
> +			remain -= page_length;
> +			user_data += page_length;
> +			offset += page_length;
> +		}
> +out_unpin:
> +		i915_gem_object_unpin_pages(obj);
>  	}
>  
>  out:
> -	i915_gem_object_unpin_pages(obj);
> -
>  	if (hit_slowpath) {
>  		/* Fixup: Kill any reinstated backing storage pages */
>  		if (obj->madv == __I915_MADV_PURGED)
> -- 
> 1.7.10.4
> 

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 27/29] drm/i915: Allocate overlay registers from stolen memory
  2012-08-11 14:41 ` [PATCH 27/29] drm/i915: Allocate overlay registers " Chris Wilson
@ 2012-08-20 21:17   ` Daniel Vetter
  2012-08-22 15:45     ` Chris Wilson
  0 siblings, 1 reply; 51+ messages in thread
From: Daniel Vetter @ 2012-08-20 21:17 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Aug 11, 2012 at 03:41:26PM +0100, Chris Wilson wrote:
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Since most of the overlay-supporting hw uses physical mem for the overlay
I think this isn't much worth it: The additional frobbery in
attach/detach_phys object is likely more work than anything we'll
ever gain from using stolen mem here. Especially since we'll use stolen
mem already for the rings.
-Daniel
> ---
>  drivers/gpu/drm/i915/intel_overlay.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index 7a98459..6982191 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -1424,8 +1424,10 @@ void intel_setup_overlay(struct drm_device *dev)
>  
>  	overlay->dev = dev;
>  
> -	reg_bo = i915_gem_alloc_object(dev, PAGE_SIZE);
> -	if (!reg_bo)
> +	reg_bo = i915_gem_object_create_stolen(dev, PAGE_SIZE);
> +	if (reg_bo == NULL)
> +		reg_bo = i915_gem_alloc_object(dev, PAGE_SIZE);
> +	if (reg_bo == NULL)
>  		goto out_free;
>  	overlay->reg_bo = reg_bo;
>  
> -- 
> 1.7.10.4
> 

-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 27/29] drm/i915: Allocate overlay registers from stolen memory
  2012-08-20 21:17   ` Daniel Vetter
@ 2012-08-22 15:45     ` Chris Wilson
  2012-08-22 16:26       ` Daniel Vetter
  0 siblings, 1 reply; 51+ messages in thread
From: Chris Wilson @ 2012-08-22 15:45 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, 20 Aug 2012 23:17:06 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Sat, Aug 11, 2012 at 03:41:26PM +0100, Chris Wilson wrote:
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Since most of the overlay-supporting hw uses physical mem for the overlay
> I think this isn't much worth it: The additional frobbery in
> attach/detach_phys object is likely more work than anything we'll
> ever gain from using stolen mem here. Especially since we'll use stolen
> mem already for the rings.

In a straw poll of the machines on my desk, non-physical machines outnumber the physical overlay machines. :-p

However, hooking up the physical to use stolen is also a good idea. Too
bad, I haven't found a way to detect the base of stolen memory on gen2
devices without arch specific internals. It worked nicely right up until
I tried to build i915.ko as a module.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 22/29] drm/i915: Handle stolen objects in pwrite
  2012-08-20 19:56   ` Daniel Vetter
@ 2012-08-22 15:47     ` Chris Wilson
  2012-08-30 15:09     ` Chris Wilson
  1 sibling, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-22 15:47 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, 20 Aug 2012 21:56:08 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Sat, Aug 11, 2012 at 03:41:21PM +0100, Chris Wilson wrote:
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> What about putting kmap/unmap abstractions into obj->ops (like the dma_buf
> interface already has)? Since the pwrite/pread code is already rather
> branch heavy, I hope we don't see the overhead of the indirect call even
> in microbenchmarks (haven't checked). And this way we would also neatly
> wrap up dma_bufs for pwrite (if anyone ever really wants that ...).
> 
> The kmap(_atomic) for stolen mem backed objects would boil down to doing
> the pointer arithmetic, kunmap would be just a noop.

Sounds nice, I'll cook something up and allocate yet another pointer in
drm_i915_gem_object for the typed ops. I wonder if we can unify the phys
and the dma_buf...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 18/29] drm/i915: Delay allocation of stolen space for FBC
  2012-08-20 19:51   ` Daniel Vetter
@ 2012-08-22 15:51     ` Chris Wilson
  0 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-22 15:51 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, 20 Aug 2012 21:51:42 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Sat, Aug 11, 2012 at 03:41:17PM +0100, Chris Wilson wrote:
> > As we may wish to wrap regions preallocated by the BIOS, we need to do
> > that before carving out contiguous chunks of stolen space for FBC.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Some comments inline below.

I split the pm chunk into its own commit for a standalone justification
(do not allocate stolen until we need FBC).

The drm_mm_for_each_hole merits a slightly bigger patch, as there are a
couple of sites in drm_mm.c that can take advantage of it, and so it will be
a good test of its palatability.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 16/29] drm/i915: Fix location of stolen memory register for SandyBridge+
  2012-08-20 19:38   ` Daniel Vetter
@ 2012-08-22 15:54     ` Chris Wilson
  0 siblings, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-22 15:54 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, 20 Aug 2012 21:38:04 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Sat, Aug 11, 2012 at 03:41:15PM +0100, Chris Wilson wrote:
> > A few of the earlier registers were enlarged and so the Base Data of
> > Stolen Memory Register (BDSM) was pushed to 0xb0.
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/i915_gem_stolen.c |    9 ++++++++-
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > index a01ff74..a528e4a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_stolen.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
> > @@ -63,7 +63,11 @@ static unsigned long i915_stolen_to_physical(struct drm_device *dev)
> >  	 * its value of TOLUD.
> >  	 */
> >  	base = 0;
> > -	if (INTEL_INFO(dev)->gen > 3 || IS_G33(dev)) {
> > +	if (INTEL_INFO(dev)->gen >= 6) {
> > +		/* Read Base Data of Stolen Memory Register (BDSM) directly */
> > +		pci_read_config_dword(pdev, 0xB0, &base);
> 
> Wishlist (i.e. feel free to ignore): Can we have #defines instead of magic
> numbers here, please?

Shrug, I'm not sure in this instance. Each chipset generation seems to
move it about and give it a different name and rationale, so sticking with
a verbose comment made sense.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 27/29] drm/i915: Allocate overlay registers from stolen memory
  2012-08-22 15:45     ` Chris Wilson
@ 2012-08-22 16:26       ` Daniel Vetter
  0 siblings, 0 replies; 51+ messages in thread
From: Daniel Vetter @ 2012-08-22 16:26 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Wed, Aug 22, 2012 at 04:45:45PM +0100, Chris Wilson wrote:
> On Mon, 20 Aug 2012 23:17:06 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Sat, Aug 11, 2012 at 03:41:26PM +0100, Chris Wilson wrote:
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > Since most of the overlay-supporting hw uses physical mem for the overlay
> > I think this isn't much worth it: The additional frobbery in
> > attach/detach_phys object is likely more work than anything we'll
> > ever gain from using stolen mem here. Especially since we'll use stolen
> > mem already for the rings.
> 
> In a straw poll of the machines on my desk, non-physical machines outnumber the physical overlay machines. :-p
> 
> However, hooking up the physical to use stolen is also a good idea. Too
> bad, I haven't found a way to detect the base of stolen memory on gen2
> devices without arch specific internals. It worked nicely right up until
> I tried to build i915.ko as a module.

Hm, can't we make a case to EXPORT_GPL that memmap?
-Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 07/29] drm/i915: Extract general object init routine
  2012-08-11 14:41 ` [PATCH 07/29] drm/i915: Extract general object init routine Chris Wilson
@ 2012-08-24  0:05   ` Daniel Vetter
  0 siblings, 0 replies; 51+ messages in thread
From: Daniel Vetter @ 2012-08-24  0:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sat, Aug 11, 2012 at 03:41:06PM +0100, Chris Wilson wrote:
> As we wish to create specialised object constructions in the near
> future that share the same basic GEM object struct, export the default
> initializer.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
I've slurped in the 3 patches up to here to -next, too. For the remaining
ones I'll wait for the new colours ;-)

Thanks, Daniel
-- 
Daniel Vetter
Mail: daniel@ffwll.ch
Mobile: +41 (0)79 365 57 48

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [PATCH 22/29] drm/i915: Handle stolen objects in pwrite
  2012-08-20 19:56   ` Daniel Vetter
  2012-08-22 15:47     ` Chris Wilson
@ 2012-08-30 15:09     ` Chris Wilson
  1 sibling, 0 replies; 51+ messages in thread
From: Chris Wilson @ 2012-08-30 15:09 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Mon, 20 Aug 2012 21:56:08 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Sat, Aug 11, 2012 at 03:41:21PM +0100, Chris Wilson wrote:
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> What about putting kmap/unmap abstractions into obj->ops (like the dma_buf
> interface already has)? Since the pwrite/pread code is already rather
> branch heavy, I hope we don't see the overhead of the indirect call even
> in microbenchmarks (haven't checked). And this way we would also neatly
> wrap up dma_bufs for pwrite (if anyone ever really wants that ...).
> 
> The kmap(_atomic) for stolen mem backed objects would boil down to doing
> the pointer arithmetic, kunmap would be just a noop.

Tried doing so. The lack of struct page for the stolen makes it more
cumbersome than it is worth and, worse, confusing.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2012-08-30 15:09 UTC | newest]

Thread overview: 51+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-08-11 14:40 Stolen pages, with a little surprise Chris Wilson
2012-08-11 14:41 ` [PATCH 01/29] drm/i915: Track unbound pages Chris Wilson
2012-08-20  9:00   ` [PATCH 1/2] drm/i915: move functions around Daniel Vetter
2012-08-20  9:00     ` [PATCH 2/2] drm/i915: Track unbound pages Daniel Vetter
2012-08-20  9:23       ` [PATCH] Add some sanity checks to unbound tracking Chris Wilson
2012-08-20  9:36       ` [PATCH 2/2] drm/i915: Track unbound pages Chris Wilson
2012-08-20  9:42         ` Daniel Vetter
2012-08-11 14:41 ` [PATCH 02/29] drm/i915: Show (count, size) of purgeable objects in i915_gem_objects Chris Wilson
2012-08-20  9:04   ` Daniel Vetter
2012-08-20  9:17     ` Chris Wilson
2012-08-11 14:41 ` [PATCH 03/29] drm/i915: Show pin count in debugfs Chris Wilson
2012-08-11 14:41 ` [PATCH 04/29] drm/i915: Try harder to allocate an mmap_offset Chris Wilson
2012-08-20  9:37   ` Daniel Vetter
2012-08-20 11:31     ` Chris Wilson
2012-08-11 14:41 ` [PATCH 05/29] drm/i915: Only pwrite through the GTT if there is space in the aperture Chris Wilson
2012-08-11 14:41 ` [PATCH 06/29] drm/i915: Protect private gem objects from truncate (such as imported dmabuf) Chris Wilson
2012-08-11 14:41 ` [PATCH 07/29] drm/i915: Extract general object init routine Chris Wilson
2012-08-24  0:05   ` Daniel Vetter
2012-08-11 14:41 ` [PATCH 08/29] drm/i915: Introduce drm_i915_gem_object_ops Chris Wilson
2012-08-20 19:35   ` Daniel Vetter
2012-08-11 14:41 ` [PATCH 09/29] drm/i915: Pin backing pages whilst exporting through a dmabuf vmap Chris Wilson
2012-08-11 14:41 ` [PATCH 10/29] drm/i915: Pin backing pages for pwrite Chris Wilson
2012-08-11 14:41 ` [PATCH 11/29] drm/i915: Pin backing pages for pread Chris Wilson
2012-08-11 14:41 ` [PATCH 12/29] drm/i915: Replace the array of pages with a scatterlist Chris Wilson
2012-08-11 14:41 ` [PATCH 13/29] drm/i915: Convert the dmabuf object to use the new i915_gem_object_ops Chris Wilson
2012-08-11 14:41 ` [PATCH 14/29] drm: Introduce drm_mm_create_block() Chris Wilson
2012-08-11 14:41 ` [PATCH 15/29] drm/i915: Fix detection of stolen base for gen2 Chris Wilson
2012-08-11 14:41 ` [PATCH 16/29] drm/i915: Fix location of stolen memory register for SandyBridge+ Chris Wilson
2012-08-20 19:38   ` Daniel Vetter
2012-08-22 15:54     ` Chris Wilson
2012-08-11 14:41 ` [PATCH 17/29] drm/i915: Avoid clearing preallocated regions from the GTT Chris Wilson
2012-08-11 14:41 ` [PATCH 18/29] drm/i915: Delay allocation of stolen space for FBC Chris Wilson
2012-08-20 19:51   ` Daniel Vetter
2012-08-22 15:51     ` Chris Wilson
2012-08-11 14:41 ` [PATCH 19/29] drm/i915: Allow objects to be created with no backing pages, but stolen space Chris Wilson
2012-08-11 14:41 ` [PATCH 20/29] drm/i915: Differentiate between prime and stolen objects Chris Wilson
2012-08-11 14:41 ` [PATCH 21/29] drm/i915: Support readback of stolen objects upon error Chris Wilson
2012-08-11 14:41 ` [PATCH 22/29] drm/i915: Handle stolen objects in pwrite Chris Wilson
2012-08-20 19:56   ` Daniel Vetter
2012-08-22 15:47     ` Chris Wilson
2012-08-30 15:09     ` Chris Wilson
2012-08-11 14:41 ` [PATCH 23/29] drm/i915: Handle stolen objects for pread Chris Wilson
2012-08-11 14:41 ` [PATCH 24/29] drm/i915: Introduce i915_gem_object_create_stolen() Chris Wilson
2012-08-11 14:41 ` [PATCH 25/29] drm/i915: Allocate fbcon from stolen memory Chris Wilson
2012-08-11 14:41 ` [PATCH 26/29] drm/i915: Allocate ringbuffers " Chris Wilson
2012-08-11 14:41 ` [PATCH 27/29] drm/i915: Allocate overlay registers " Chris Wilson
2012-08-20 21:17   ` Daniel Vetter
2012-08-22 15:45     ` Chris Wilson
2012-08-22 16:26       ` Daniel Vetter
2012-08-11 14:41 ` [PATCH 28/29] drm/i915: Use a slab for object allocation Chris Wilson
2012-08-11 14:41 ` [PATCH 29/29] drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl Chris Wilson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).