* [PATCH 1/8] drm/i915: Try harder to finish the idle-worker
From: Chris Wilson @ 2017-09-03 12:47 UTC
  To: intel-gfx; +Cc: Mika Kuoppala

If a worker requeues itself, it may switch to a different kworker pool,
which flush_work() treats as complete. To be strict, we then need to
keep flushing the work until it is no longer pending.
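
For illustration, a minimal sketch of the race (the self-requeueing
worker here is hypothetical, not code from this series): a single
flush_delayed_work() only waits for the instance of the work it
observed, so a worker that reschedules itself may still be pending
after the flush returns:

	static void example_work_fn(struct work_struct *wrk)
	{
		/* assumes: struct example { struct delayed_work work; ... }; */
		struct example *e = container_of(wrk, typeof(*e), work.work);

		if (!example_is_idle(e)) { /* hypothetical idle check */
			/* Requeue; this may land on a different kworker
			 * pool, at which point a concurrent flush_work()
			 * reports completion prematurely.
			 */
			schedule_delayed_work(&e->work, HZ);
		}
	}

Hence drain_delayed_work() below loops until the work has been both
flushed and is no longer pending.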

References: https://bugs.freedesktop.org/show_bug.cgi?id=102456
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  3 +--
 drivers/gpu/drm/i915/i915_gem.c     |  3 +--
 drivers/gpu/drm/i915/i915_utils.h   | 13 +++++++++++++
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 48572b157222..1dd687a74879 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4202,8 +4202,7 @@ fault_irq_set(struct drm_i915_private *i915,
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	/* Flush idle worker to disarm irq */
-	while (flush_delayed_work(&i915->gt.idle_work))
-		;
+	drain_delayed_work(&i915->gt.idle_work);
 
 	return 0;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e4cc08bc518c..1829d3fa15d1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4582,8 +4582,7 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 	/* As the idle_work is rearming if it detects a race, play safe and
 	 * repeat the flush until it is definitely idle.
 	 */
-	while (flush_delayed_work(&dev_priv->gt.idle_work))
-		;
+	drain_delayed_work(&dev_priv->gt.idle_work);
 
 	/* Assert that we sucessfully flushed all the work and
 	 * reset the GPU back to its idle, low power state.
diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 12fc250b47b9..4f7ffa0976b1 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -119,4 +119,17 @@ static inline void __list_del_many(struct list_head *head,
 	WRITE_ONCE(head->next, first);
 }
 
+/*
+ * Wait until the work is finally complete, even if it tries to postpone
+ * by requeueing itself. Note, that if the worker never cancels itself,
+ * we will spin forever.
+ */
+static inline void drain_delayed_work(struct delayed_work *dw)
+{
+	do {
+		while (flush_delayed_work(dw))
+			;
+	} while (delayed_work_pending(dw));
+}
+
 #endif /* !__I915_UTILS_H */
-- 
2.14.1


* [PATCH 2/8] drm/i915: Keep the device awake whilst in the GTT domain
From: Chris Wilson @ 2017-09-03 12:47 UTC
  To: intel-gfx

Since runtime suspend is very harsh on GTT mmappings (they all get
zapped on suspend), keep the device awake while the buffer remains in
the GTT domain. However, userspace can control the domain, and although
there is a soft contract that writes must be flushed (e.g. for flushing
scanouts and fbc), we are intentionally lax with respect to read
domains, allowing them to persist for as long as is feasible.

We acquire a wakeref when moving the buffer into the GTT domain so that
all subsequent operations on that object are fast, trying to avoid
suspending while the GTT is still in active use by userspace. To ensure
that the device can eventually suspend, we install a timer to expire
the GTT wakeref. So in effect we have just a fancy pm autosuspend that
tries to estimate the cost of restoring actively used GTT mmappings.
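
In outline, the lifecycle looks like this (a sketch using the fields
added below; a count of -1 is the sentinel for "no wakeref held, no
expiry timer armed"):

	/* First object entering the GTT domain: count goes -1 -> 0 */
	if (atomic_inc_and_test(&i915->mm.gtt_wakeref.count)) {
		intel_runtime_pm_get(i915);
		schedule_delayed_work(&i915->mm.gtt_wakeref.work,
				      round_jiffies_up_relative(HZ));
	}

	/* On timer expiry, the worker flushes any lingering GTT writes;
	 * if no new users appeared since the last tick, it drops the
	 * wakeref and restores the sentinel so the device may suspend.
	 */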
---
 drivers/gpu/drm/i915/i915_debugfs.c              |   2 +
 drivers/gpu/drm/i915/i915_drv.h                  |  12 +++
 drivers/gpu/drm/i915/i915_gem.c                  | 116 +++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem_object.h           |   5 +
 drivers/gpu/drm/i915/i915_gem_shrinker.c         |   4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_object.c |   2 +-
 6 files changed, 122 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 1dd687a74879..50fba3afec10 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2809,6 +2809,8 @@ static int i915_runtime_pm_status(struct seq_file *m, void *unused)
 	if (!HAS_RUNTIME_PM(dev_priv))
 		seq_puts(m, "Runtime power management not supported\n");
 
+	seq_printf(m, "GTT wakeref count: %d\n",
+		   atomic_read(&dev_priv->mm.gtt_wakeref.count));
 	seq_printf(m, "GPU idle: %s\n", yesno(!dev_priv->gt.awake));
 	seq_printf(m, "IRQs disabled: %s\n",
 		   yesno(!intel_irqs_enabled(dev_priv)));
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 38bd08f2539b..2deed4a899b3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1479,6 +1479,18 @@ struct i915_gem_mm {
 	 */
 	struct list_head userfault_list;
 
+	/* List of all objects in gtt domain, holding a wakeref.
+	 * The list is reaped periodically, and protected by its own mutex.
+	 * Entries on this list do not carry a reference.
+	 */
+	struct {
+		struct mutex lock;
+		struct list_head list;
+		atomic_t count; /* -1 == no timeout scheduled */
+
+		struct delayed_work work;
+	} gtt_wakeref;
+
 	/**
 	 * List of objects which are pending destruction.
 	 */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1829d3fa15d1..924f1de2fadf 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -235,6 +235,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 
 static void __start_cpu_write(struct drm_i915_gem_object *obj)
 {
+	GEM_BUG_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	if (cpu_write_needs_clflush(obj))
@@ -667,11 +668,14 @@ fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain)
 		obj->frontbuffer_ggtt_origin : ORIGIN_CPU);
 }
 
-static void
-flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
+void
+i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
+				   unsigned int flush_domains)
 {
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
 
+	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+
 	if (!(obj->base.write_domain & flush_domains))
 		return;
 
@@ -694,16 +698,19 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
 
 	switch (obj->base.write_domain) {
 	case I915_GEM_DOMAIN_GTT:
-		if (INTEL_GEN(dev_priv) >= 6 && !HAS_LLC(dev_priv)) {
-			intel_runtime_pm_get(dev_priv);
-			spin_lock_irq(&dev_priv->uncore.lock);
-			POSTING_READ_FW(RING_ACTHD(dev_priv->engine[RCS]->mmio_base));
-			spin_unlock_irq(&dev_priv->uncore.lock);
-			intel_runtime_pm_put(dev_priv);
-		}
+		mutex_lock(&dev_priv->mm.gtt_wakeref.lock);
+		if (!list_empty(&obj->mm.gtt_wakeref_link)) {
+			if (INTEL_GEN(dev_priv) >= 6 && !HAS_LLC(dev_priv)) {
+				spin_lock_irq(&dev_priv->uncore.lock);
+				POSTING_READ_FW(RING_ACTHD(dev_priv->engine[RCS]->mmio_base));
+				spin_unlock_irq(&dev_priv->uncore.lock);
+			}
 
-		intel_fb_obj_flush(obj,
-				   fb_write_origin(obj, I915_GEM_DOMAIN_GTT));
+			intel_fb_obj_flush(obj, fb_write_origin(obj, I915_GEM_DOMAIN_GTT));
+
+			list_del_init(&obj->mm.gtt_wakeref_link);
+		}
+		mutex_unlock(&dev_priv->mm.gtt_wakeref.lock);
 		break;
 
 	case I915_GEM_DOMAIN_CPU:
@@ -808,7 +815,7 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
 			goto out;
 	}
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* If we're not in the cpu read domain, set ourself into the gtt
 	 * read domain and manually flush cachelines (if required). This
@@ -861,7 +868,7 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj,
 			goto out;
 	}
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* If we're not in the cpu write domain, set ourself into the
 	 * gtt write domain and manually flush cachelines (as required).
@@ -3422,9 +3429,10 @@ static void __i915_gem_object_flush_for_display(struct drm_i915_gem_object *obj)
 	 * We manually flush the CPU domain so that we can override and
 	 * force the flush for the display, and perform it asyncrhonously.
 	 */
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 	if (obj->cache_dirty)
 		i915_gem_clflush_object(obj, I915_CLFLUSH_FORCE);
+	GEM_BUG_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
 	obj->base.write_domain = 0;
 }
 
@@ -3477,7 +3485,7 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_WC);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_WC);
 
 	/* Serialise direct access to this object with the barriers for
 	 * coherent writes from the GPU, by effectively invalidating the
@@ -3501,6 +3509,49 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 	return 0;
 }
 
+static void wakeref_timeout(struct work_struct *work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(work, typeof(*dev_priv), mm.gtt_wakeref.work.work);
+	struct drm_i915_gem_object *obj, *on;
+	unsigned int count;
+
+	mutex_lock(&dev_priv->mm.gtt_wakeref.lock);
+	count = atomic_xchg(&dev_priv->mm.gtt_wakeref.count, 0);
+	if (count) {
+		unsigned long timeout;
+
+		GEM_BUG_ON(count == -1);
+
+		if (INTEL_GEN(dev_priv) >= 6 && !HAS_LLC(dev_priv)) {
+			spin_lock_irq(&dev_priv->uncore.lock);
+			POSTING_READ_FW(RING_ACTHD(dev_priv->engine[RCS]->mmio_base));
+			spin_unlock_irq(&dev_priv->uncore.lock);
+		}
+
+		timeout = 0;
+		list_for_each_entry_safe(obj, on,
+					 &dev_priv->mm.gtt_wakeref.list,
+					 mm.gtt_wakeref_link) {
+			list_del_init(&obj->mm.gtt_wakeref_link);
+			if (obj->base.write_domain & I915_GEM_DOMAIN_GTT) {
+				intel_fb_obj_flush(obj, fb_write_origin(obj, I915_GEM_DOMAIN_GTT));
+				obj->base.write_domain = 0;
+				timeout++;
+			}
+			timeout++;
+		}
+
+		timeout = HZ * (ilog2(timeout) + 1) / 2;
+		schedule_delayed_work(&dev_priv->mm.gtt_wakeref.work,
+				      round_jiffies_up_relative(timeout));
+	} else {
+		intel_runtime_pm_put(dev_priv);
+		atomic_set(&dev_priv->mm.gtt_wakeref.count, -1);
+	}
+	mutex_unlock(&dev_priv->mm.gtt_wakeref.lock);
+}
+
 /**
  * Moves a single object to the GTT read, and possibly write domain.
  * @obj: object to act on
@@ -3512,6 +3563,7 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 int
 i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 {
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	int ret;
 
 	lockdep_assert_held(&obj->base.dev->struct_mutex);
@@ -3525,6 +3577,17 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
+	if (list_empty_careful(&obj->mm.gtt_wakeref_link)) {
+		mutex_lock(&i915->mm.gtt_wakeref.lock);
+		if (atomic_inc_and_test(&i915->mm.gtt_wakeref.count)) {
+			intel_runtime_pm_get(i915);
+			schedule_delayed_work(&i915->mm.gtt_wakeref.work,
+					      round_jiffies_up_relative(HZ));
+		}
+		list_add(&obj->mm.gtt_wakeref_link, &i915->mm.gtt_wakeref.list);
+		mutex_unlock(&i915->mm.gtt_wakeref.lock);
+	}
+
 	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
 		return 0;
 
@@ -3540,7 +3603,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT);
 
 	/* Serialise direct access to this object with the barriers for
 	 * coherent writes from the GPU, by effectively invalidating the
@@ -3908,7 +3971,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* Flush the CPU cache if it's still invalid. */
 	if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0) {
@@ -4257,6 +4320,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 
 	INIT_LIST_HEAD(&obj->global_link);
 	INIT_LIST_HEAD(&obj->userfault_link);
+	INIT_LIST_HEAD(&obj->mm.gtt_wakeref_link);
 	INIT_LIST_HEAD(&obj->vma_list);
 	INIT_LIST_HEAD(&obj->lut_list);
 	INIT_LIST_HEAD(&obj->batch_pool_link);
@@ -4394,6 +4458,8 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 
 		trace_i915_gem_object_destroy(obj);
 
+		i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+
 		GEM_BUG_ON(i915_gem_object_is_active(obj));
 		list_for_each_entry_safe(vma, vn,
 					 &obj->vma_list, obj_link) {
@@ -4416,6 +4482,12 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 		GEM_BUG_ON(atomic_read(&obj->frontbuffer_bits));
 		GEM_BUG_ON(!list_empty(&obj->lut_list));
 
+		if (!list_empty_careful(&obj->mm.gtt_wakeref_link)) {
+			mutex_lock(&i915->mm.gtt_wakeref.lock);
+			list_del(&obj->mm.gtt_wakeref_link);
+			mutex_unlock(&i915->mm.gtt_wakeref.lock);
+		}
+
 		if (obj->ops->release)
 			obj->ops->release(obj);
 
@@ -4548,6 +4620,8 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 	int ret;
 
 	intel_runtime_pm_get(dev_priv);
+	drain_delayed_work(&dev_priv->mm.gtt_wakeref.work);
+	WARN_ON(atomic_read(&dev_priv->mm.gtt_wakeref.count) != -1);
 	intel_suspend_gt_powersave(dev_priv);
 
 	mutex_lock(&dev->struct_mutex);
@@ -4931,6 +5005,11 @@ i915_gem_load_init(struct drm_i915_private *dev_priv)
 	if (err)
 		goto err_priorities;
 
+	INIT_LIST_HEAD(&dev_priv->mm.gtt_wakeref.list);
+	INIT_DELAYED_WORK(&dev_priv->mm.gtt_wakeref.work, wakeref_timeout);
+	mutex_init(&dev_priv->mm.gtt_wakeref.lock);
+	atomic_set(&dev_priv->mm.gtt_wakeref.count, -1);
+
 	INIT_WORK(&dev_priv->mm.free_work, __i915_gem_free_work);
 	init_llist_head(&dev_priv->mm.free_list);
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
@@ -4977,6 +5056,9 @@ void i915_gem_load_cleanup(struct drm_i915_private *dev_priv)
 	WARN_ON(!list_empty(&dev_priv->gt.timelines));
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
+	drain_delayed_work(&dev_priv->mm.gtt_wakeref.work);
+	WARN_ON(atomic_read(&dev_priv->mm.gtt_wakeref.count) != -1);
+
 	kmem_cache_destroy(dev_priv->priorities);
 	kmem_cache_destroy(dev_priv->dependencies);
 	kmem_cache_destroy(dev_priv->requests);
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index c30d8f808185..5797b75a804b 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -177,6 +177,8 @@ struct drm_i915_gem_object {
 			struct mutex lock; /* protects this cache */
 		} get_page;
 
+		struct list_head gtt_wakeref_link;
+
 		/**
 		 * Advice: are the backing pages purgeable?
 		 */
@@ -421,5 +423,8 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 					 unsigned int cache_level);
 void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
 
+void i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
+					unsigned int flush_domains);
+
 #endif
 
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 77fb39808131..8d146584ee49 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -127,8 +127,10 @@ static bool can_release_pages(struct drm_i915_gem_object *obj)
 
 static bool unsafe_drop_pages(struct drm_i915_gem_object *obj)
 {
-	if (i915_gem_object_unbind(obj) == 0)
+	if (i915_gem_object_unbind(obj) == 0) {
+		i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 		__i915_gem_object_put_pages(obj, I915_MM_SHRINKER);
+	}
 	return !READ_ONCE(obj->mm.pages);
 }
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index 8f011c447e41..f3f6588afbdb 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -266,7 +266,7 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
 		if (offset >= obj->base.size)
 			continue;
 
-		flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+		i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 		p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
 		cpu = kmap(p) + offset_in_page(offset);
-- 
2.14.1


* [PATCH 3/8] drm/i915: Pin fence for iomap
From: Chris Wilson @ 2017-09-03 12:47 UTC
  To: intel-gfx

Acquire the fence register for the iomap in i915_vma_pin_iomap() on
behalf of the caller.

We probably want to let the caller specify whether the fence should be
pinned for their usage, but at the moment all callers do want the
associated fence, or none, so take it on their behalf.
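
With this in place, a caller's fast path collapses to the pin/unpin
pair (a minimal sketch, assuming an already-bound vma; 'value' and
'offset' are placeholders):

	void __iomem *map;

	map = i915_vma_pin_iomap(vma);	/* now also pins the fence */
	if (IS_ERR(map))
		return PTR_ERR(map);

	iowrite32(value, map + offset);

	i915_vma_unpin_iomap(vma);	/* drops the fence pin as well */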

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_vma.c                  | 19 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_vma.h                  |  7 +------
 drivers/gpu/drm/i915/selftests/i915_gem_object.c |  8 --------
 3 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 02d1a5eacb00..f15542e64808 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -278,6 +278,7 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
 void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 {
 	void __iomem *ptr;
+	int ret;
 
 	/* Access through the GTT requires the device to be awake. */
 	assert_rpm_wakelock_held(vma->vm->i915);
@@ -301,9 +302,27 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 	}
 
 	__i915_vma_pin(vma);
+
+	ret = i915_vma_get_fence(vma);
+	if (ret) {
+		__i915_vma_unpin(vma);
+		return IO_ERR_PTR(ret);
+	}
+	i915_vma_pin_fence(vma);
+
 	return ptr;
 }
 
+void i915_vma_unpin_iomap(struct i915_vma *vma)
+{
+	lockdep_assert_held(&vma->obj->base.dev->struct_mutex);
+
+	GEM_BUG_ON(vma->iomap == NULL);
+
+	i915_vma_unpin_fence(vma);
+	i915_vma_unpin(vma);
+}
+
 void i915_vma_unpin_and_release(struct i915_vma **p_vma)
 {
 	struct i915_vma *vma;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index e811067c7724..347a40d08b9b 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -321,12 +321,7 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma);
  * Callers must hold the struct_mutex. This function is only valid to be
  * called on a VMA previously iomapped by the caller with i915_vma_pin_iomap().
  */
-static inline void i915_vma_unpin_iomap(struct i915_vma *vma)
-{
-	lockdep_assert_held(&vma->obj->base.dev->struct_mutex);
-	GEM_BUG_ON(vma->iomap == NULL);
-	i915_vma_unpin(vma);
-}
+void i915_vma_unpin_iomap(struct i915_vma *vma);
 
 static inline struct page *i915_vma_first_page(struct i915_vma *vma)
 {
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index f3f6588afbdb..5f9330b4094a 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -251,14 +251,6 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
 			return PTR_ERR(io);
 		}
 
-		err = i915_vma_get_fence(vma);
-		if (err) {
-			pr_err("Failed to get fence for partial view: offset=%lu\n",
-			       page);
-			i915_vma_unpin_iomap(vma);
-			return err;
-		}
-
 		iowrite32(page, io + n * PAGE_SIZE/sizeof(*io));
 		i915_vma_unpin_iomap(vma);
 
-- 
2.14.1


* [PATCH 4/8] drm/i915: Consolidate get_fence with pin_fence
From: Chris Wilson @ 2017-09-03 12:47 UTC
  To: intel-gfx

Following the pattern now used for obj->mm.pages, use just pin_fence and
unpin_fence to control access to the fence registers. That is, instead
of calling get_fence() followed by pin_fence(), we now just need to call
pin_fence(). This will make it easier to reduce the locking requirements
around fence registers.
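
The change to the calling convention, side by side (a sketch; error
handling abbreviated, and HAS_FENCE stands in for the caller's own
flag):

	/* Before: acquire a fence, then separately mark it as pinned */
	ret = i915_vma_get_fence(vma);
	if (ret == 0 && i915_vma_pin_fence(vma)) /* bool: fence exists */
		flags |= HAS_FENCE;

	/* After: a single call both acquires and pins; vma->fence
	 * tells the caller whether a register was actually assigned.
	 */
	ret = i915_vma_pin_fence(vma);
	if (ret == 0 && vma->fence)
		flags |= HAS_FENCE;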

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h            |  4 ----
 drivers/gpu/drm/i915/i915_gem.c            |  3 ++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 10 ++++-----
 drivers/gpu/drm/i915/i915_gem_fence_reg.c  | 33 +++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_vma.c            |  3 +--
 drivers/gpu/drm/i915/i915_vma.h            | 20 ++++++++----------
 drivers/gpu/drm/i915/intel_display.c       |  3 +--
 7 files changed, 45 insertions(+), 31 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2deed4a899b3..d24b7a68a78c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3682,10 +3682,6 @@ i915_vm_to_ppgtt(struct i915_address_space *vm)
 	return container_of(vm, struct i915_hw_ppgtt, base);
 }
 
-/* i915_gem_fence_reg.c */
-int __must_check i915_vma_get_fence(struct i915_vma *vma);
-int __must_check i915_vma_put_fence(struct i915_vma *vma);
-
 void i915_gem_revoke_fences(struct drm_i915_private *dev_priv);
 void i915_gem_restore_fences(struct drm_i915_private *dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 924f1de2fadf..1ac44c9e4581 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1909,7 +1909,7 @@ int i915_gem_fault(struct vm_fault *vmf)
 	if (ret)
 		goto err_unpin;
 
-	ret = i915_vma_get_fence(vma);
+	ret = i915_vma_pin_fence(vma);
 	if (ret)
 		goto err_unpin;
 
@@ -1925,6 +1925,7 @@ int i915_gem_fault(struct vm_fault *vmf)
 			       min_t(u64, vma->size, area->vm_end - area->vm_start),
 			       &ggtt->mappable);
 
+	i915_vma_unpin_fence(vma);
 err_unpin:
 	__i915_vma_unpin(vma);
 err_unlock:
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 8a9d37ac16d4..b22307372157 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -366,12 +366,12 @@ eb_pin_vma(struct i915_execbuffer *eb,
 		return false;
 
 	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		if (unlikely(i915_vma_get_fence(vma))) {
+		if (unlikely(i915_vma_pin_fence(vma))) {
 			i915_vma_unpin(vma);
 			return false;
 		}
 
-		if (i915_vma_pin_fence(vma))
+		if (vma->fence)
 			exec_flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
 
@@ -384,7 +384,7 @@ static inline void __eb_unreserve_vma(struct i915_vma *vma, unsigned int flags)
 	GEM_BUG_ON(!(flags & __EXEC_OBJECT_HAS_PIN));
 
 	if (unlikely(flags & __EXEC_OBJECT_HAS_FENCE))
-		i915_vma_unpin_fence(vma);
+		__i915_vma_unpin_fence(vma);
 
 	__i915_vma_unpin(vma);
 }
@@ -562,13 +562,13 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb,
 	}
 
 	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		err = i915_vma_get_fence(vma);
+		err = i915_vma_pin_fence(vma);
 		if (unlikely(err)) {
 			i915_vma_unpin(vma);
 			return err;
 		}
 
-		if (i915_vma_pin_fence(vma))
+		if (vma->fence)
 			exec_flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index 5fe2cd8c8f28..38fd077941d3 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -280,8 +280,7 @@ static int fence_update(struct drm_i915_fence_reg *fence,
  *
  * 0 on success, negative error code on failure.
  */
-int
-i915_vma_put_fence(struct i915_vma *vma)
+int i915_vma_put_fence(struct i915_vma *vma)
 {
 	struct drm_i915_fence_reg *fence = vma->fence;
 
@@ -299,6 +298,8 @@ static struct drm_i915_fence_reg *fence_find(struct drm_i915_private *dev_priv)
 	struct drm_i915_fence_reg *fence;
 
 	list_for_each_entry(fence, &dev_priv->mm.fence_list, link) {
+		GEM_BUG_ON(fence->vma && fence->vma->fence != fence);
+
 		if (fence->pin_count)
 			continue;
 
@@ -313,7 +314,7 @@ static struct drm_i915_fence_reg *fence_find(struct drm_i915_private *dev_priv)
 }
 
 /**
- * i915_vma_get_fence - set up fencing for a vma
+ * i915_vma_pin_fence - set up fencing for a vma
  * @vma: vma to map through a fence reg
  *
  * When mapping objects through the GTT, userspace wants to be able to write
@@ -331,10 +332,11 @@ static struct drm_i915_fence_reg *fence_find(struct drm_i915_private *dev_priv)
  * 0 on success, negative error code on failure.
  */
 int
-i915_vma_get_fence(struct i915_vma *vma)
+i915_vma_pin_fence(struct i915_vma *vma)
 {
 	struct drm_i915_fence_reg *fence;
 	struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL;
+	int err;
 
 	/* Note that we revoke fences on runtime suspend. Therefore the user
 	 * must keep the device awake whilst using the fence.
@@ -344,6 +346,8 @@ i915_vma_get_fence(struct i915_vma *vma)
 	/* Just update our place in the LRU if our fence is getting reused. */
 	if (vma->fence) {
 		fence = vma->fence;
+		GEM_BUG_ON(fence->vma != vma);
+		fence->pin_count++;
 		if (!fence->dirty) {
 			list_move_tail(&fence->link,
 				       &fence->i915->mm.fence_list);
@@ -353,10 +357,25 @@ i915_vma_get_fence(struct i915_vma *vma)
 		fence = fence_find(vma->vm->i915);
 		if (IS_ERR(fence))
 			return PTR_ERR(fence);
+
+		GEM_BUG_ON(fence->pin_count);
+		fence->pin_count++;
 	} else
 		return 0;
 
-	return fence_update(fence, set);
+	err = fence_update(fence, set);
+	if (err)
+		goto out_unpin;
+
+	GEM_BUG_ON(fence->vma != set);
+	GEM_BUG_ON(vma->fence != (set ? fence : NULL));
+
+	if (set)
+		return 0;
+
+out_unpin:
+	fence->pin_count--;
+	return err;
 }
 
 /**
@@ -378,6 +397,8 @@ void i915_gem_revoke_fences(struct drm_i915_private *dev_priv)
 	for (i = 0; i < dev_priv->num_fence_regs; i++) {
 		struct drm_i915_fence_reg *fence = &dev_priv->fence_regs[i];
 
+		GEM_BUG_ON(fence->vma && fence->vma->fence != fence);
+
 		if (fence->vma)
 			i915_gem_release_mmap(fence->vma->obj);
 	}
@@ -399,6 +420,8 @@ void i915_gem_restore_fences(struct drm_i915_private *dev_priv)
 		struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
 		struct i915_vma *vma = reg->vma;
 
+		GEM_BUG_ON(vma && vma->fence != reg);
+
 		/*
 		 * Commit delayed tiling changes if we have an object still
 		 * attached to the fence, otherwise just clear the fence.
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index f15542e64808..c7d5604b4489 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -303,12 +303,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 
 	__i915_vma_pin(vma);
 
-	ret = i915_vma_get_fence(vma);
+	ret = i915_vma_pin_fence(vma);
 	if (ret) {
 		__i915_vma_unpin(vma);
 		return IO_ERR_PTR(ret);
 	}
-	i915_vma_pin_fence(vma);
 
 	return ptr;
 }
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 347a40d08b9b..ab4a3c0b1526 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -344,15 +344,13 @@ static inline struct page *i915_vma_first_page(struct i915_vma *vma)
  *
  * True if the vma has a fence, false otherwise.
  */
-static inline bool
-i915_vma_pin_fence(struct i915_vma *vma)
+int i915_vma_pin_fence(struct i915_vma *vma);
+int __must_check i915_vma_put_fence(struct i915_vma *vma);
+
+static inline void __i915_vma_unpin_fence(struct i915_vma *vma)
 {
-	lockdep_assert_held(&vma->obj->base.dev->struct_mutex);
-	if (vma->fence) {
-		vma->fence->pin_count++;
-		return true;
-	} else
-		return false;
+	GEM_BUG_ON(vma->fence->pin_count <= 0);
+	vma->fence->pin_count--;
 }
 
 /**
@@ -367,10 +365,8 @@ static inline void
 i915_vma_unpin_fence(struct i915_vma *vma)
 {
 	lockdep_assert_held(&vma->obj->base.dev->struct_mutex);
-	if (vma->fence) {
-		GEM_BUG_ON(vma->fence->pin_count <= 0);
-		vma->fence->pin_count--;
-	}
+	if (vma->fence)
+		__i915_vma_unpin_fence(vma);
 }
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 216cd9e0e08f..6f4df6bee362 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2219,8 +2219,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
 		 * something and try to run the system in a "less than optimal"
 		 * mode that matches the user configuration.
 		 */
-		if (i915_vma_get_fence(vma) == 0)
-			i915_vma_pin_fence(vma);
+		i915_vma_pin_fence(vma);
 	}
 
 	i915_vma_get(vma);
-- 
2.14.1


* [PATCH 5/8] drm/i915: Emit pipelined fence changes
From: Chris Wilson @ 2017-09-03 12:47 UTC
  To: intel-gfx

Many years ago, long before requests, we tried doing this. We never
quite got it right, but now with requests we have the tracking to do the
job properly!

One of the stall points for gen2/gen3 is the use of fence registers for
GPU operations. There are only a few available, and currently if we
exhaust the available fence registers we must stall the GPU between
batches. By pipelining the fence register writes, we can avoid the stall
and continuously emit the batches. The challenge is remembering to wait
for those pipelined LRI writes before accessing the fence with the CPU,
and that is what our request tracking makes easy.
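
The ordering rule, in outline (a conceptual sketch using the helpers
introduced below): the fence register is rewritten from the ring via
MI_LOAD_REGISTER_IMM, and any later CPU access must first retire the
request that carried that write:

	/* GPU path: queue the register update within the request */
	err = i915_vma_emit_pipelined_fence(vma, rq);

	/* CPU path: wait for the pipelined LRI to complete before
	 * trusting the fence register contents.
	 */
	err = i915_gem_active_retire(&fence->pipelined,
				     &i915->drm.struct_mutex);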

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/gvt/aperture_gm.c     |   2 +-
 drivers/gpu/drm/i915/i915_debugfs.c        |   2 +-
 drivers/gpu/drm/i915/i915_gem.c            |   1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  15 ++-
 drivers/gpu/drm/i915/i915_gem_fence_reg.c  | 206 ++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_fence_reg.h  |   3 +-
 drivers/gpu/drm/i915/i915_vma.c            |   1 +
 drivers/gpu/drm/i915/i915_vma.h            |  13 +-
 8 files changed, 202 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/aperture_gm.c b/drivers/gpu/drm/i915/gvt/aperture_gm.c
index ca3d1925beda..3c835cb59d95 100644
--- a/drivers/gpu/drm/i915/gvt/aperture_gm.c
+++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c
@@ -196,7 +196,7 @@ static int alloc_vgpu_fence(struct intel_vgpu *vgpu)
 	i = 0;
 	list_for_each_safe(pos, q, &dev_priv->mm.fence_list) {
 		reg = list_entry(pos, struct drm_i915_fence_reg, link);
-		if (reg->pin_count || reg->vma)
+		if (atomic_read(&reg->pin_count) || reg->vma)
 			continue;
 		list_del(pos);
 		vgpu->fence.regs[i] = reg;
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 50fba3afec10..23438ee92177 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -896,7 +896,7 @@ static int i915_gem_fence_regs_info(struct seq_file *m, void *data)
 		struct i915_vma *vma = dev_priv->fence_regs[i].vma;
 
 		seq_printf(m, "Fence %d, pin count = %d, object = ",
-			   i, dev_priv->fence_regs[i].pin_count);
+			   i, atomic_read(&dev_priv->fence_regs[i].pin_count));
 		if (!vma)
 			seq_puts(m, "unused");
 		else
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 1ac44c9e4581..bcfb04e8fecc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4959,6 +4959,7 @@ i915_gem_load_init_fences(struct drm_i915_private *dev_priv)
 		fence->i915 = dev_priv;
 		fence->id = i;
 		list_add_tail(&fence->link, &dev_priv->mm.fence_list);
+		init_request_active(&fence->pipelined, NULL);
 	}
 	i915_gem_restore_fences(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index b22307372157..90fe7ab1a920 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -366,11 +366,12 @@ eb_pin_vma(struct i915_execbuffer *eb,
 		return false;
 
 	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		if (unlikely(i915_vma_pin_fence(vma))) {
+		if (unlikely(i915_vma_reserve_fence(vma))) {
 			i915_vma_unpin(vma);
 			return false;
 		}
 
+		exec_flags &= ~EXEC_OBJECT_ASYNC;
 		if (vma->fence)
 			exec_flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
@@ -562,12 +563,13 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb,
 	}
 
 	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		err = i915_vma_pin_fence(vma);
+		err = i915_vma_reserve_fence(vma);
 		if (unlikely(err)) {
 			i915_vma_unpin(vma);
 			return err;
 		}
 
+		exec_flags &= ~EXEC_OBJECT_ASYNC;
 		if (vma->fence)
 			exec_flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
@@ -1789,6 +1791,12 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 		if (flags & EXEC_OBJECT_ASYNC)
 			continue;
 
+		if (unlikely(flags & EXEC_OBJECT_NEEDS_FENCE)) {
+			err = i915_vma_emit_pipelined_fence(vma, eb->request);
+			if (err)
+				return err;
+		}
+
 		err = i915_gem_request_await_object
 			(eb->request, obj, flags & EXEC_OBJECT_WRITE);
 		if (err)
@@ -1875,9 +1883,6 @@ void i915_vma_move_to_active(struct i915_vma *vma,
 		obj->base.read_domains = 0;
 	}
 	obj->base.read_domains |= I915_GEM_GPU_DOMAINS;
-
-	if (flags & EXEC_OBJECT_NEEDS_FENCE)
-		i915_gem_active_set(&vma->last_fence, req);
 }
 
 static int i915_reset_gen7_sol_offsets(struct drm_i915_gem_request *req)
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index 38fd077941d3..c5a00ff44786 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -55,10 +55,9 @@
  * CPU ptes into GTT mmaps (not the GTT ptes themselves) as needed.
  */
 
-#define pipelined 0
-
-static void i965_write_fence_reg(struct drm_i915_fence_reg *fence,
-				 struct i915_vma *vma)
+static int i965_write_fence_reg(struct drm_i915_fence_reg *fence,
+				struct i915_vma *vma,
+				struct drm_i915_gem_request *pipelined)
 {
 	i915_reg_t fence_reg_lo, fence_reg_hi;
 	int fence_pitch_shift;
@@ -110,11 +109,30 @@ static void i965_write_fence_reg(struct drm_i915_fence_reg *fence,
 		I915_WRITE(fence_reg_hi, upper_32_bits(val));
 		I915_WRITE(fence_reg_lo, lower_32_bits(val));
 		POSTING_READ(fence_reg_lo);
+	} else {
+		u32 *cs;
+
+		cs = intel_ring_begin(pipelined, 8);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		*cs++ = MI_LOAD_REGISTER_IMM(3);
+		*cs++ = i915_mmio_reg_offset(fence_reg_lo);
+		*cs++ = 0;
+		*cs++ = i915_mmio_reg_offset(fence_reg_hi);
+		*cs++ = upper_32_bits(val);
+		*cs++ = i915_mmio_reg_offset(fence_reg_lo);
+		*cs++ = lower_32_bits(val);
+		*cs++ = MI_NOOP;
+		intel_ring_advance(pipelined, cs);
 	}
+
+	return 0;
 }
 
-static void i915_write_fence_reg(struct drm_i915_fence_reg *fence,
-				 struct i915_vma *vma)
+static int i915_write_fence_reg(struct drm_i915_fence_reg *fence,
+				struct i915_vma *vma,
+				struct drm_i915_gem_request *pipelined)
 {
 	u32 val;
 
@@ -150,11 +168,26 @@ static void i915_write_fence_reg(struct drm_i915_fence_reg *fence,
 
 		I915_WRITE(reg, val);
 		POSTING_READ(reg);
+	} else {
+		u32 *cs;
+
+		cs = intel_ring_begin(pipelined, 4);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		*cs++ = MI_LOAD_REGISTER_IMM(1);
+		*cs++ = i915_mmio_reg_offset(FENCE_REG(fence->id));
+		*cs++ = val;
+		*cs++ = MI_NOOP;
+		intel_ring_advance(pipelined, cs);
 	}
+
+	return 0;
 }
 
-static void i830_write_fence_reg(struct drm_i915_fence_reg *fence,
-				 struct i915_vma *vma)
+static int i830_write_fence_reg(struct drm_i915_fence_reg *fence,
+				struct i915_vma *vma,
+				struct drm_i915_gem_request *pipelined)
 {
 	u32 val;
 
@@ -182,29 +215,49 @@ static void i830_write_fence_reg(struct drm_i915_fence_reg *fence,
 
 		I915_WRITE(reg, val);
 		POSTING_READ(reg);
+	} else {
+		u32 *cs;
+
+		cs = intel_ring_begin(pipelined, 4);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		*cs++ = MI_LOAD_REGISTER_IMM(1);
+		*cs++ = i915_mmio_reg_offset(FENCE_REG(fence->id));
+		*cs++ = val;
+		*cs++ = MI_NOOP;
+		intel_ring_advance(pipelined, cs);
 	}
+
+	return 0;
 }
 
-static void fence_write(struct drm_i915_fence_reg *fence,
-			struct i915_vma *vma)
+static int fence_write(struct drm_i915_fence_reg *fence,
+		       struct i915_vma *vma,
+		       struct drm_i915_gem_request *rq)
 {
+	int err;
+
 	/* Previous access through the fence register is marshalled by
 	 * the mb() inside the fault handlers (i915_gem_release_mmaps)
 	 * and explicitly managed for internal users.
 	 */
 
 	if (IS_GEN2(fence->i915))
-		i830_write_fence_reg(fence, vma);
+		err = i830_write_fence_reg(fence, vma, rq);
 	else if (IS_GEN3(fence->i915))
-		i915_write_fence_reg(fence, vma);
+		err = i915_write_fence_reg(fence, vma, rq);
 	else
-		i965_write_fence_reg(fence, vma);
+		err = i965_write_fence_reg(fence, vma, rq);
+	if (err)
+		return err;
 
 	/* Access through the fenced region afterwards is
 	 * ordered by the posting reads whilst writing the registers.
 	 */
 
 	fence->dirty = false;
+	return 0;
 }
 
 static int fence_update(struct drm_i915_fence_reg *fence,
@@ -212,17 +265,15 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 {
 	int ret;
 
+	ret = i915_gem_active_retire(&fence->pipelined,
+				     &fence->i915->drm.struct_mutex);
+	if (ret)
+		return ret;
+
 	if (vma) {
 		if (!i915_vma_is_map_and_fenceable(vma))
 			return -EINVAL;
 
-		if (WARN(!i915_gem_object_get_stride(vma->obj) ||
-			 !i915_gem_object_get_tiling(vma->obj),
-			 "bogus fence setup with stride: 0x%x, tiling mode: %i\n",
-			 i915_gem_object_get_stride(vma->obj),
-			 i915_gem_object_get_tiling(vma->obj)))
-			return -EINVAL;
-
 		ret = i915_gem_active_retire(&vma->last_fence,
 					     &vma->obj->base.dev->struct_mutex);
 		if (ret)
@@ -253,7 +304,7 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 	 * to the runtime resume, see i915_gem_restore_fences().
 	 */
 	if (intel_runtime_pm_get_if_in_use(fence->i915)) {
-		fence_write(fence, vma);
+		fence_write(fence, vma, NULL);
 		intel_runtime_pm_put(fence->i915);
 	}
 
@@ -287,7 +338,9 @@ int i915_vma_put_fence(struct i915_vma *vma)
 	if (!fence)
 		return 0;
 
-	if (fence->pin_count)
+	GEM_BUG_ON(fence->vma != vma);
+
+	if (atomic_read(&fence->pin_count))
 		return -EBUSY;
 
 	return fence_update(fence, NULL);
@@ -300,7 +353,7 @@ static struct drm_i915_fence_reg *fence_find(struct drm_i915_private *dev_priv)
 	list_for_each_entry(fence, &dev_priv->mm.fence_list, link) {
 		GEM_BUG_ON(fence->vma && fence->vma->fence != fence);
 
-		if (fence->pin_count)
+		if (atomic_read(&fence->pin_count))
 			continue;
 
 		return fence;
@@ -344,11 +397,16 @@ i915_vma_pin_fence(struct i915_vma *vma)
 	assert_rpm_wakelock_held(vma->vm->i915);
 
 	/* Just update our place in the LRU if our fence is getting reused. */
-	if (vma->fence) {
-		fence = vma->fence;
+	fence = vma->fence;
+	if (fence) {
 		GEM_BUG_ON(fence->vma != vma);
-		fence->pin_count++;
+		atomic_inc(&fence->pin_count);
 		if (!fence->dirty) {
+			err = i915_gem_active_retire(&fence->pipelined,
+						     &fence->i915->drm.struct_mutex);
+			if (err)
+				goto out_unpin;
+
 			list_move_tail(&fence->link,
 				       &fence->i915->mm.fence_list);
 			return 0;
@@ -358,8 +416,8 @@ i915_vma_pin_fence(struct i915_vma *vma)
 		if (IS_ERR(fence))
 			return PTR_ERR(fence);
 
-		GEM_BUG_ON(fence->pin_count);
-		fence->pin_count++;
+		GEM_BUG_ON(atomic_read(&fence->pin_count));
+		atomic_inc(&fence->pin_count);
 	} else
 		return 0;
 
@@ -374,10 +432,97 @@ i915_vma_pin_fence(struct i915_vma *vma)
 		return 0;
 
 out_unpin:
-	fence->pin_count--;
+	atomic_dec(&fence->pin_count);
 	return err;
 }
 
+int i915_vma_reserve_fence(struct i915_vma *vma)
+{
+	struct drm_i915_fence_reg *fence;
+
+	lockdep_assert_held(&vma->vm->i915->drm.struct_mutex);
+	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
+	GEM_BUG_ON(!i915_vma_is_pinned(vma));
+
+	fence = vma->fence;
+	if (!fence) {
+		if (!i915_gem_object_is_tiled(vma->obj))
+			return 0;
+
+		if (!i915_vma_is_map_and_fenceable(vma))
+			return -EINVAL;
+
+		fence = fence_find(vma->vm->i915);
+		if (IS_ERR(fence))
+			return PTR_ERR(fence);
+
+		vma->fence = fence;
+
+		if (fence->vma) {
+			i915_gem_release_mmap(fence->vma->obj);
+			fence->vma->fence = NULL;
+		}
+		fence->vma = vma;
+		fence->dirty = true;
+	}
+	atomic_inc(&fence->pin_count);
+	list_move_tail(&fence->link, &fence->i915->mm.fence_list);
+
+	GEM_BUG_ON(!i915_gem_object_is_tiled(vma->obj));
+	GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+	GEM_BUG_ON(vma->node.size != vma->fence_size);
+	GEM_BUG_ON(!IS_ALIGNED(vma->node.start, vma->fence_alignment));
+
+	return 0;
+}
+
+int i915_vma_emit_pipelined_fence(struct i915_vma *vma,
+				  struct drm_i915_gem_request *rq)
+{
+	struct drm_i915_fence_reg *fence = vma->fence;
+	struct drm_i915_gem_request *prev;
+	int err;
+
+	lockdep_assert_held(&vma->vm->i915->drm.struct_mutex);
+	GEM_BUG_ON(fence && !atomic_read(&fence->pin_count));
+
+	if (!fence)
+		goto out;
+
+	prev = i915_gem_active_raw(&fence->pipelined,
+				   &fence->i915->drm.struct_mutex);
+	if (prev) {
+		err = i915_gem_request_await_dma_fence(rq, &prev->fence);
+		if (err)
+			return err;
+	}
+
+	if (!fence->dirty)
+		goto out;
+
+	GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+
+	if (fence->vma) {
+		prev = i915_gem_active_raw(&fence->vma->last_fence,
+					   &fence->i915->drm.struct_mutex);
+		if (prev) {
+			err = i915_gem_request_await_dma_fence(rq,
+							       &prev->fence);
+			if (err)
+				return err;
+		}
+	}
+
+	err = fence_write(fence, vma, rq);
+	if (err)
+		return err;
+
+	i915_gem_active_set(&fence->pipelined, rq);
+out:
+	i915_gem_active_set(&vma->last_fence, rq);
+	return 0;
+}
+
 /**
  * i915_gem_revoke_fences - revoke fence state
  * @dev_priv: i915 device private
@@ -428,6 +573,7 @@ void i915_gem_restore_fences(struct drm_i915_private *dev_priv)
 		 */
 		if (vma && !i915_gem_object_is_tiled(vma->obj)) {
 			GEM_BUG_ON(!reg->dirty);
+			GEM_BUG_ON(atomic_read(&reg->pin_count));
 			GEM_BUG_ON(!list_empty(&vma->obj->userfault_link));
 
 			list_move(&reg->link, &dev_priv->mm.fence_list);
@@ -435,7 +581,7 @@ void i915_gem_restore_fences(struct drm_i915_private *dev_priv)
 			vma = NULL;
 		}
 
-		fence_write(reg, vma);
+		fence_write(reg, vma, NULL);
 		reg->vma = vma;
 	}
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.h b/drivers/gpu/drm/i915/i915_gem_fence_reg.h
index 99a31ded4dfd..f69d894997fb 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.h
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.h
@@ -36,7 +36,7 @@ struct drm_i915_fence_reg {
 	struct list_head link;
 	struct drm_i915_private *i915;
 	struct i915_vma *vma;
-	int pin_count;
+	atomic_t pin_count;
 	int id;
 	/**
 	 * Whether the tiling parameters for the currently
@@ -47,6 +47,7 @@ struct drm_i915_fence_reg {
 	 * command (such as BLT on gen2/3), as a "fence".
 	 */
 	bool dirty;
+	struct i915_gem_active pipelined;
 };
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index c7d5604b4489..34bb50582563 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -706,6 +706,7 @@ int i915_vma_unbind(struct i915_vma *vma)
 		__i915_vma_iounmap(vma);
 		vma->flags &= ~I915_VMA_CAN_FENCE;
 	}
+	GEM_BUG_ON(vma->fence);
 
 	if (likely(!vma->vm->closed)) {
 		trace_i915_vma_unbind(vma);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index ab4a3c0b1526..1c0f9def81ab 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -349,8 +349,8 @@ int __must_check i915_vma_put_fence(struct i915_vma *vma);
 
 static inline void __i915_vma_unpin_fence(struct i915_vma *vma)
 {
-	GEM_BUG_ON(vma->fence->pin_count <= 0);
-	vma->fence->pin_count--;
+	GEM_BUG_ON(atomic_read(&vma->fence->pin_count) <= 0);
+	atomic_dec(&vma->fence->pin_count);
 }
 
 /**
@@ -364,10 +364,17 @@ static inline void __i915_vma_unpin_fence(struct i915_vma *vma)
 static inline void
 i915_vma_unpin_fence(struct i915_vma *vma)
 {
-	lockdep_assert_held(&vma->obj->base.dev->struct_mutex);
+	/* The assumption is if the caller has a fence, the caller owns a pin
+	 * on that fence, i.e. that vma->fence cannot become NULL prior to us
+	 * releasing our pin.
+	 */
 	if (vma->fence)
 		__i915_vma_unpin_fence(vma);
 }
 
+int __must_check i915_vma_reserve_fence(struct i915_vma *vma);
+int i915_vma_emit_pipelined_fence(struct i915_vma *vma,
+				  struct drm_i915_gem_request *rq);
+
 #endif
 
-- 
2.14.1


* [PATCH 6/8] drm/i915: Track user GTT faulting per-vma
From: Chris Wilson @ 2017-09-03 12:47 UTC
  To: intel-gfx

We don't wish to refault the entire object (i.e. all of its other vma)
when unbinding one partial vma. To do this, track which vma have been
faulted into the user's address space.
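
The bookkeeping reduces to a flag bit on the vma plus a count on the
object (a condensed sketch of the hunks below):

	/* On fault: mark the vma; add the object on its first vma */
	if (!i915_vma_set_userfault(vma) && !obj->userfault_count++)
		list_add(&obj->userfault_link, &i915->mm.userfault_list);

	/* On revoke: unmap just this vma's range, then drop the object
	 * from the userfault list once no vma remain faulted.
	 */
	i915_vma_unset_userfault(vma);
	if (!--obj->userfault_count)
		list_del(&obj->userfault_link);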

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c       |  2 +-
 drivers/gpu/drm/i915/i915_gem.c           | 50 +++++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_evict.c     |  2 +-
 drivers/gpu/drm/i915/i915_gem_fence_reg.c | 10 ++++---
 drivers/gpu/drm/i915/i915_gem_object.h    |  1 +
 drivers/gpu/drm/i915/i915_vma.c           | 26 +++++++++++++++-
 drivers/gpu/drm/i915/i915_vma.h           | 21 ++++++++++++-
 7 files changed, 88 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 23438ee92177..8705620f11f3 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -96,7 +96,7 @@ static char get_tiling_flag(struct drm_i915_gem_object *obj)
 
 static char get_global_flag(struct drm_i915_gem_object *obj)
 {
-	return !list_empty(&obj->userfault_link) ? 'g' : ' ';
+	return obj->userfault_count ? 'g' : ' ';
 }
 
 static char get_pin_mapped_flag(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bcfb04e8fecc..5e6436f39140 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1913,18 +1913,22 @@ int i915_gem_fault(struct vm_fault *vmf)
 	if (ret)
 		goto err_unpin;
 
-	/* Mark as being mmapped into userspace for later revocation */
-	assert_rpm_wakelock_held(dev_priv);
-	if (list_empty(&obj->userfault_link))
-		list_add(&obj->userfault_link, &dev_priv->mm.userfault_list);
-
 	/* Finally, remap it using the new GTT offset */
 	ret = remap_io_mapping(area,
 			       area->vm_start + (vma->ggtt_view.partial.offset << PAGE_SHIFT),
 			       (ggtt->mappable_base + vma->node.start) >> PAGE_SHIFT,
 			       min_t(u64, vma->size, area->vm_end - area->vm_start),
 			       &ggtt->mappable);
+	if (ret)
+		goto err_fence;
 
+	/* Mark as being mmapped into userspace for later revocation */
+	assert_rpm_wakelock_held(dev_priv);
+	if (!i915_vma_set_userfault(vma) && !obj->userfault_count++)
+		list_add(&obj->userfault_link, &dev_priv->mm.userfault_list);
+	GEM_BUG_ON(!obj->userfault_count);
+
+err_fence:
 	i915_vma_unpin_fence(vma);
 err_unpin:
 	__i915_vma_unpin(vma);
@@ -1977,6 +1981,25 @@ int i915_gem_fault(struct vm_fault *vmf)
 	return ret;
 }
 
+static void __i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
+{
+	struct i915_vma *vma;
+
+	GEM_BUG_ON(!obj->userfault_count);
+
+	obj->userfault_count = 0;
+	list_del(&obj->userfault_link);
+	drm_vma_node_unmap(&obj->base.vma_node,
+			   obj->base.dev->anon_inode->i_mapping);
+
+	list_for_each_entry(vma, &obj->vma_list, obj_link) {
+		if (!i915_vma_is_ggtt(vma))
+			break;
+
+		i915_vma_unset_userfault(vma);
+	}
+}
+
 /**
  * i915_gem_release_mmap - remove physical page mappings
  * @obj: obj in question
@@ -2007,12 +2030,10 @@ i915_gem_release_mmap(struct drm_i915_gem_object *obj)
 	lockdep_assert_held(&i915->drm.struct_mutex);
 	intel_runtime_pm_get(i915);
 
-	if (list_empty(&obj->userfault_link))
+	if (!obj->userfault_count)
 		goto out;
 
-	list_del_init(&obj->userfault_link);
-	drm_vma_node_unmap(&obj->base.vma_node,
-			   obj->base.dev->anon_inode->i_mapping);
+	__i915_gem_object_release_mmap(obj);
 
 	/* Ensure that the CPU's PTE are revoked and there are not outstanding
 	 * memory transactions from userspace before we return. The TLB
@@ -2040,11 +2061,8 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv)
 	 */
 
 	list_for_each_entry_safe(obj, on,
-				 &dev_priv->mm.userfault_list, userfault_link) {
-		list_del_init(&obj->userfault_link);
-		drm_vma_node_unmap(&obj->base.vma_node,
-				   obj->base.dev->anon_inode->i_mapping);
-	}
+				 &dev_priv->mm.userfault_list, userfault_link)
+		__i915_gem_object_release_mmap(obj);
 
 	/* The fence will be lost when the device powers down. If any were
 	 * in use by hardware (i.e. they are pinned), we should not be powering
@@ -2067,7 +2085,7 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv)
 		if (!reg->vma)
 			continue;
 
-		GEM_BUG_ON(!list_empty(&reg->vma->obj->userfault_link));
+		GEM_BUG_ON(i915_vma_has_userfault(reg->vma));
 		reg->dirty = true;
 	}
 }
@@ -4320,7 +4338,6 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	mutex_init(&obj->mm.lock);
 
 	INIT_LIST_HEAD(&obj->global_link);
-	INIT_LIST_HEAD(&obj->userfault_link);
 	INIT_LIST_HEAD(&obj->mm.gtt_wakeref_link);
 	INIT_LIST_HEAD(&obj->vma_list);
 	INIT_LIST_HEAD(&obj->lut_list);
@@ -4480,6 +4497,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 
 	llist_for_each_entry_safe(obj, on, freed, freed) {
 		GEM_BUG_ON(obj->bind_count);
+		GEM_BUG_ON(obj->userfault_count);
 		GEM_BUG_ON(atomic_read(&obj->frontbuffer_bits));
 		GEM_BUG_ON(!list_empty(&obj->lut_list));
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 4df039ef2ce3..933ee8ecfa54 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -82,7 +82,7 @@ mark_free(struct drm_mm_scan *scan,
 	if (i915_vma_is_pinned(vma))
 		return false;
 
-	if (flags & PIN_NONFAULT && !list_empty(&vma->obj->userfault_link))
+	if (flags & PIN_NONFAULT && i915_vma_has_userfault(vma))
 		return false;
 
 	list_add(&vma->evict_link, unwind);
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index c5a00ff44786..ec5c42c3dc91 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -291,7 +291,8 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 		/* Ensure that all userspace CPU access is completed before
 		 * stealing the fence.
 		 */
-		i915_gem_release_mmap(fence->vma->obj);
+		GEM_BUG_ON(fence->vma->fence != fence);
+		i915_vma_revoke_mmap(fence->vma);
 
 		fence->vma->fence = NULL;
 		fence->vma = NULL;
@@ -459,7 +460,8 @@ int i915_vma_reserve_fence(struct i915_vma *vma)
 		vma->fence = fence;
 
 		if (fence->vma) {
-			i915_gem_release_mmap(fence->vma->obj);
+			GEM_BUG_ON(fence->vma->fence != fence);
+			i915_vma_revoke_mmap(fence->vma);
 			fence->vma->fence = NULL;
 		}
 		fence->vma = vma;
@@ -545,7 +547,7 @@ void i915_gem_revoke_fences(struct drm_i915_private *dev_priv)
 		GEM_BUG_ON(fence->vma && fence->vma->fence != fence);
 
 		if (fence->vma)
-			i915_gem_release_mmap(fence->vma->obj);
+			i915_vma_revoke_mmap(fence->vma);
 	}
 }
 
@@ -574,7 +576,7 @@ void i915_gem_restore_fences(struct drm_i915_private *dev_priv)
 		if (vma && !i915_gem_object_is_tiled(vma->obj)) {
 			GEM_BUG_ON(!reg->dirty);
 			GEM_BUG_ON(atomic_read(&reg->pin_count));
-			GEM_BUG_ON(!list_empty(&vma->obj->userfault_link));
+			GEM_BUG_ON(i915_vma_has_userfault(vma));
 
 			list_move(&reg->link, &dev_priv->mm.fence_list);
 			vma->fence = NULL;
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 5797b75a804b..26648355d92b 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -123,6 +123,7 @@ struct drm_i915_gem_object {
 	/**
 	 * Whether the object is currently in the GGTT mmap.
 	 */
+	unsigned int userfault_count;
 	struct list_head userfault_link;
 
 	struct list_head batch_pool_link;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 34bb50582563..220a7064f466 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -638,6 +638,29 @@ static void __i915_vma_iounmap(struct i915_vma *vma)
 	vma->iomap = NULL;
 }
 
+void i915_vma_revoke_mmap(struct i915_vma *vma)
+{
+	struct drm_vma_offset_node *node = &vma->obj->base.vma_node;
+
+	lockdep_assert_held(&vma->vm->i915->drm.struct_mutex);
+
+	if (!i915_vma_has_userfault(vma))
+		return;
+
+	GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+	GEM_BUG_ON(!vma->obj->userfault_count);
+
+	unmap_mapping_range(vma->vm->i915->drm.anon_inode->i_mapping,
+			    drm_vma_node_offset_addr(node) +
+			    (vma->ggtt_view.partial.offset << PAGE_SHIFT),
+			    vma->size,
+			    1);
+
+	i915_vma_unset_userfault(vma);
+	if (!--vma->obj->userfault_count)
+		list_del(&vma->obj->userfault_link);
+}
+
 int i915_vma_unbind(struct i915_vma *vma)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
@@ -701,12 +724,13 @@ int i915_vma_unbind(struct i915_vma *vma)
 			return ret;
 
 		/* Force a pagefault for domain tracking on next user access */
-		i915_gem_release_mmap(obj);
+		i915_vma_revoke_mmap(vma);
 
 		__i915_vma_iounmap(vma);
 		vma->flags &= ~I915_VMA_CAN_FENCE;
 	}
 	GEM_BUG_ON(vma->fence);
+	GEM_BUG_ON(i915_vma_has_userfault(vma));
 
 	if (likely(!vma->vm->closed)) {
 		trace_i915_vma_unbind(vma);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 1c0f9def81ab..21cecf492568 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -65,7 +65,7 @@ struct i915_vma {
 	 * that exist in the ctx->handle_vmas LUT for this vma.
 	 */
 	unsigned int open_count;
-	unsigned int flags;
+	unsigned long flags;
 	/**
 	 * How many users have pinned this object in GTT space. The following
 	 * users can each hold at most one reference: pwrite/pread, execbuffer
@@ -87,6 +87,8 @@ struct i915_vma {
 #define I915_VMA_GGTT		BIT(8)
 #define I915_VMA_CAN_FENCE	BIT(9)
 #define I915_VMA_CLOSED		BIT(10)
+#define I915_VMA_USERFAULT_BIT	11
+#define I915_VMA_USERFAULT	BIT(I915_VMA_USERFAULT_BIT)
 
 	unsigned int active;
 	struct i915_gem_active last_read[I915_NUM_ENGINES];
@@ -145,6 +147,22 @@ static inline bool i915_vma_is_closed(const struct i915_vma *vma)
 	return vma->flags & I915_VMA_CLOSED;
 }
 
+static inline bool i915_vma_set_userfault(struct i915_vma *vma)
+{
+	GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+	return __test_and_set_bit(I915_VMA_USERFAULT_BIT, &vma->flags);
+}
+
+static inline void i915_vma_unset_userfault(struct i915_vma *vma)
+{
+	return __clear_bit(I915_VMA_USERFAULT_BIT, &vma->flags);
+}
+
+static inline bool i915_vma_has_userfault(const struct i915_vma *vma)
+{
+	return test_bit(I915_VMA_USERFAULT_BIT, &vma->flags);
+}
+
 static inline unsigned int i915_vma_get_active(const struct i915_vma *vma)
 {
 	return vma->active;
@@ -243,6 +261,7 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long cache_level);
 bool i915_vma_misplaced(const struct i915_vma *vma,
 			u64 size, u64 alignment, u64 flags);
 void __i915_vma_set_map_and_fenceable(struct i915_vma *vma);
+void i915_vma_revoke_mmap(struct i915_vma *vma);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 void i915_vma_unlink_ctx(struct i915_vma *vma);
 void i915_vma_close(struct i915_vma *vma);
-- 
2.14.1
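
Two details of the diff above are worth spelling out. First, vma->flags
is widened from unsigned int to unsigned long because test_bit() and
the other bitops operate on unsigned long words. Second, the setter
side of the tracking belongs in the i915_gem_fault() hunk of this
patch, which is not reproduced above; as a hedged sketch (assuming the
existing dev_priv->mm.userfault_list remains the list head that the
per-object userfault_link chains onto), it is expected to look roughly
like:

	static void mark_userfault(struct drm_i915_private *i915,
				   struct i915_vma *vma)
	{
		lockdep_assert_held(&i915->drm.struct_mutex);

		/*
		 * Count each vma with live userspace PTEs; the object
		 * joins the global userfault list on its first fault.
		 */
		if (!i915_vma_set_userfault(vma) &&
		    !vma->obj->userfault_count++)
			list_add(&vma->obj->userfault_link,
				 &i915->mm.userfault_list);
	}

This pairs with the decrement in i915_vma_revoke_mmap() above: the
object leaves the list only when its last faulted vma is revoked, which
is why userfault_count sits alongside userfault_link in the object.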


* [PATCH 7/8] drm/i915: Check PIN_NONFAULT overlaps in evict_for_node
  2017-09-03 12:47 [PATCH 1/8] drm/i915: Try harder to finish the idle-worker Chris Wilson
                   ` (4 preceding siblings ...)
  2017-09-03 12:47 ` [PATCH 6/8] drm/i915: Track user GTT faulting per-vma Chris Wilson
@ 2017-09-03 12:47 ` Chris Wilson
  2017-09-03 12:47 ` [PATCH 8/8] drm/i915: Try a minimal attempt to insert the whole object for relocations Chris Wilson
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-09-03 12:47 UTC (permalink / raw)
  To: intel-gfx

If the caller indicates that it does not want to evict any other
faulting vma, honour that flag. The logic was already applied in
evict_something(), but not in the more specific evict_for_node(), which
has been used as a preliminary probe since commit 606fec956c0e
("drm/i915: Prefer random replacement before eviction search").

Fixes: 606fec956c0e ("drm/i915: Prefer random replacement before eviction search")
Fixes: 821188778b9b ("drm/i915: Choose not to evict faultable objects from the GGTT")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_evict.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 933ee8ecfa54..a5a5b7e6daae 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -315,6 +315,11 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
 			break;
 		}
 
+		if (flags & PIN_NONFAULT && i915_vma_has_userfault(vma)) {
+			ret = -ENOSPC;
+			break;
+		}
+
 		/* Overlap of objects in the same batch? */
 		if (i915_vma_is_pinned(vma)) {
 			ret = -ENOSPC;
-- 
2.14.1
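
To see how the flag now behaves end to end, consider a PIN_NONFAULT
caller; a minimal sketch, mirroring the reloc_iomap() change in patch
8/8 rather than introducing anything new:

	/*
	 * With this fix, i915_gem_evict_for_node() refuses to unbind
	 * any vma that still backs userspace PTEs, so the pin is
	 * expected to fail with -ENOSPC rather than zap a live mmap.
	 */
	vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
				       PIN_MAPPABLE |
				       PIN_NONBLOCK |
				       PIN_NONFAULT);
	if (IS_ERR(vma)) {
		/* fall back to a cheaper, per-page strategy */
	}

Previously only the evict_something() search honoured PIN_NONFAULT, so
the preliminary evict_for_node() probe could still evict a recently
faulted vma behind the caller's back.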


* [PATCH 8/8] drm/i915: Try a minimal attempt to insert the whole object for relocations
  2017-09-03 12:47 [PATCH 1/8] drm/i915: Try harder to finish the idle-worker Chris Wilson
                   ` (5 preceding siblings ...)
  2017-09-03 12:47 ` [PATCH 7/8] drm/i915: Check PIN_NONFAULT overlaps in evict_for_node Chris Wilson
@ 2017-09-03 12:47 ` Chris Wilson
  2017-09-03 13:08 ` ✓ Fi.CI.BAT: success for series starting with [1/8] drm/i915: Try harder to finish the idle-worker Patchwork
  2017-09-03 14:17 ` ✓ Fi.CI.IGT: " Patchwork
  8 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-09-03 12:47 UTC (permalink / raw)
  To: intel-gfx

Since we have a lightweight fallback that inserts just a single page
into the aperture, avoid any heavier evictions when first attempting to
insert the entire object.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 90fe7ab1a920..b527f22cf8d0 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -978,7 +978,9 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj,
 			return ERR_PTR(err);
 
 		vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
-					       PIN_MAPPABLE | PIN_NONBLOCK);
+					       PIN_MAPPABLE |
+					       PIN_NONBLOCK |
+					       PIN_NONFAULT);
 		if (IS_ERR(vma)) {
 			memset(&cache->node, 0, sizeof(cache->node));
 			err = drm_mm_insert_node_in_range
-- 
2.14.1
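
The "lightweight fallback" the commit message refers to is the
single-page path in the truncated hunk above. For reference, a sketch
of that fallback, with parameter values assumed from the in-tree
reloc_iomap() (this patch does not change them):

	/*
	 * Reserve a single page low in the mappable aperture instead
	 * of evicting enough room for the whole object; each
	 * relocation then maps just that one page at a time.
	 */
	err = drm_mm_insert_node_in_range(&ggtt->base.mm, &cache->node,
					  PAGE_SIZE, 0,
					  I915_COLOR_UNEVICTABLE,
					  0, ggtt->mappable_end,
					  DRM_MM_INSERT_LOW);

Because drm_mm_insert_node_in_range() only searches for a free hole and
never evicts, honouring PIN_NONFAULT on the full-object pin trades at
worst some extra per-page mapping for leaving user mmaps intact.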


* ✓ Fi.CI.BAT: success for series starting with [1/8] drm/i915: Try harder to finish the idle-worker
  2017-09-03 12:47 [PATCH 1/8] drm/i915: Try harder to finish the idle-worker Chris Wilson
                   ` (6 preceding siblings ...)
  2017-09-03 12:47 ` [PATCH 8/8] drm/i915: Try a minimal attempt to insert the whole object for relocations Chris Wilson
@ 2017-09-03 13:08 ` Patchwork
  2017-09-03 14:17 ` ✓ Fi.CI.IGT: " Patchwork
  8 siblings, 0 replies; 12+ messages in thread
From: Patchwork @ 2017-09-03 13:08 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/8] drm/i915: Try harder to finish the idle-worker
URL   : https://patchwork.freedesktop.org/series/29764/
State : success

== Summary ==

Series 29764v1 series starting with [1/8] drm/i915: Try harder to finish the idle-worker
https://patchwork.freedesktop.org/api/1.0/series/29764/revisions/1/mbox/

Test kms_cursor_legacy:
        Subgroup basic-busy-flip-before-cursor-atomic:
                fail       -> PASS       (fi-snb-2600) fdo#100215 +1
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                skip       -> PASS       (fi-skl-x1585l) fdo#101781

fdo#100215 https://bugs.freedesktop.org/show_bug.cgi?id=100215
fdo#101781 https://bugs.freedesktop.org/show_bug.cgi?id=101781

fi-bdw-5557u     total:288  pass:268  dwarn:0   dfail:0   fail:0   skip:20  time:459s
fi-bdw-gvtdvm    total:288  pass:265  dwarn:0   dfail:0   fail:0   skip:23  time:442s
fi-blb-e6850     total:288  pass:224  dwarn:1   dfail:0   fail:0   skip:63  time:359s
fi-bsw-n3050     total:288  pass:243  dwarn:0   dfail:0   fail:0   skip:45  time:563s
fi-bwr-2160      total:288  pass:184  dwarn:0   dfail:0   fail:0   skip:104 time:254s
fi-bxt-j4205     total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:532s
fi-byt-j1900     total:288  pass:254  dwarn:1   dfail:0   fail:0   skip:33  time:523s
fi-byt-n2820     total:288  pass:250  dwarn:1   dfail:0   fail:0   skip:37  time:520s
fi-elk-e7500     total:288  pass:230  dwarn:0   dfail:0   fail:0   skip:58  time:438s
fi-glk-2a        total:288  pass:260  dwarn:0   dfail:0   fail:0   skip:28  time:616s
fi-hsw-4770      total:288  pass:263  dwarn:0   dfail:0   fail:0   skip:25  time:451s
fi-hsw-4770r     total:288  pass:263  dwarn:0   dfail:0   fail:0   skip:25  time:431s
fi-ilk-650       total:288  pass:229  dwarn:0   dfail:0   fail:0   skip:59  time:422s
fi-ivb-3520m     total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:508s
fi-ivb-3770      total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:472s
fi-kbl-7500u     total:288  pass:264  dwarn:1   dfail:0   fail:0   skip:23  time:524s
fi-kbl-7560u     total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:599s
fi-kbl-r         total:288  pass:261  dwarn:0   dfail:0   fail:0   skip:27  time:595s
fi-pnv-d510      total:288  pass:223  dwarn:1   dfail:0   fail:0   skip:64  time:521s
fi-skl-6260u     total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:474s
fi-skl-6700k     total:288  pass:265  dwarn:0   dfail:0   fail:0   skip:23  time:540s
fi-skl-6770hq    total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:495s
fi-skl-gvtdvm    total:288  pass:266  dwarn:0   dfail:0   fail:0   skip:22  time:446s
fi-skl-x1585l    total:288  pass:269  dwarn:0   dfail:0   fail:0   skip:19  time:522s
fi-snb-2520m     total:288  pass:251  dwarn:0   dfail:0   fail:0   skip:37  time:551s
fi-snb-2600      total:288  pass:249  dwarn:0   dfail:0   fail:1   skip:38  time:405s

69c8e7a57ac940f8b392b23a23dcdd61e727efd3 drm-tip: 2017y-09m-02d-20h-25m-44s UTC integration manifest
b91df4c60483 drm/i915: Try a minimal attempt to insert the whole object for relocations
cdb8fda2ef00 drm/i915: Check PIN_NONFAULT overlaps in evict_for_node
645bb218df64 drm/i915: Track user GTT faulting per-vma
cc987056c1ef drm/i915: Emit pipelined fence changes
9f7db7abd260 drm/i915: Consolidate get_fence with pin_fence
e499deaecdff drm/i915: Pin fence for iomap
e79c7f0a1e98 drm/i915: Keep the device awake whilst in the GTT domain
f2c293109261 drm/i915: Try harder to finish the idle-worker

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5572/

* ✓ Fi.CI.IGT: success for series starting with [1/8] drm/i915: Try harder to finish the idle-worker
  2017-09-03 12:47 [PATCH 1/8] drm/i915: Try harder to finish the idle-worker Chris Wilson
                   ` (7 preceding siblings ...)
  2017-09-03 13:08 ` ✓ Fi.CI.BAT: success for series starting with [1/8] drm/i915: Try harder to finish the idle-worker Patchwork
@ 2017-09-03 14:17 ` Patchwork
  2017-09-04 12:02   ` Chris Wilson
  8 siblings, 1 reply; 12+ messages in thread
From: Patchwork @ 2017-09-03 14:17 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/8] drm/i915: Try harder to finish the idle-worker
URL   : https://patchwork.freedesktop.org/series/29764/
State : success

== Summary ==

Test kms_sysfs_edid_timing:
                warn       -> PASS       (shard-hsw) fdo#100047
Test perf:
        Subgroup polling:
                fail       -> PASS       (shard-hsw) fdo#102252

fdo#100047 https://bugs.freedesktop.org/show_bug.cgi?id=100047
fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252

shard-hsw        total:2263 pass:1233 dwarn:0   dfail:0   fail:14  skip:1015 time:9262s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5572/shards.html

* Re: ✓ Fi.CI.IGT: success for series starting with [1/8] drm/i915: Try harder to finish the idle-worker
  2017-09-03 14:17 ` ✓ Fi.CI.IGT: " Patchwork
@ 2017-09-04 12:02   ` Chris Wilson
  2017-09-07 22:03     ` Chris Wilson
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Wilson @ 2017-09-04 12:02 UTC (permalink / raw)
  To: Patchwork; +Cc: intel-gfx

Quoting Patchwork (2017-09-03 15:17:57)
> == Series Details ==
> 
> Series: series starting with [1/8] drm/i915: Try harder to finish the idle-worker
> URL   : https://patchwork.freedesktop.org/series/29764/
> State : success
> 
> == Summary ==
> 
> Test kms_sysfs_edid_timing:
>                 warn       -> PASS       (shard-hsw) fdo#100047
> Test perf:
>         Subgroup polling:
>                 fail       -> PASS       (shard-hsw) fdo#102252
> 
> fdo#100047 https://bugs.freedesktop.org/show_bug.cgi?id=100047
> fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
> 
> shard-hsw        total:2263 pass:1233 dwarn:0   dfail:0   fail:14  skip:1015 time:9262s
> 
> == Logs ==
> 
> For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5572/shards.html

Fwiw, finally we bring the incompletes on apl under control.
-Chris

* Re: ✓ Fi.CI.IGT: success for series starting with [1/8] drm/i915: Try harder to finish the idle-worker
  2017-09-04 12:02   ` Chris Wilson
@ 2017-09-07 22:03     ` Chris Wilson
  0 siblings, 0 replies; 12+ messages in thread
From: Chris Wilson @ 2017-09-07 22:03 UTC (permalink / raw)
  To: Patchwork; +Cc: intel-gfx

Quoting Chris Wilson (2017-09-04 13:02:27)
> Quoting Patchwork (2017-09-03 15:17:57)
> > == Series Details ==
> > 
> > Series: series starting with [1/8] drm/i915: Try harder to finish the idle-worker
> > URL   : https://patchwork.freedesktop.org/series/29764/
> > State : success
> > 
> > == Summary ==
> > 
> > Test kms_sysfs_edid_timing:
> >                 warn       -> PASS       (shard-hsw) fdo#100047
> > Test perf:
> >         Subgroup polling:
> >                 fail       -> PASS       (shard-hsw) fdo#102252
> > 
> > fdo#100047 https://bugs.freedesktop.org/show_bug.cgi?id=100047
> > fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
> > 
> > shard-hsw        total:2263 pass:1233 dwarn:0   dfail:0   fail:14  skip:1015 time:9262s
> > 
> > == Logs ==
> > 
> > For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5572/shards.html
> 
> Fwiw, finally we bring the incompletes on apl^W bxt under control.

Ping.
-Chris
