All of lore.kernel.org
* Fixes for CI/bxt
@ 2017-09-11  8:41 Chris Wilson
  2017-09-11  8:41 ` [PATCH 01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM Chris Wilson
                   ` (15 more replies)
  0 siblings, 16 replies; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

A collection of fixes targeting the timeouts (INCOMPLETES) and failures
on CI/bxt. You've seen most of these patches before...
-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-11 11:33   ` Michał Winiarski
  2017-09-11  8:41 ` [PATCH 02/11] drm/i915/fence: Avoid del_timer_sync() from inside a timer Chris Wilson
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

To silence the critics:

[56532.161115] workqueue: PF_MEMALLOC task 36(khugepaged) is flushing !WQ_MEM_RECLAIM i915-userptr-release:          (null)
[56532.161138] ------------[ cut here ]------------
[56532.161144] WARNING: CPU: 1 PID: 36 at kernel/workqueue.c:2418 check_flush_dependency+0xe8/0xf0
[56532.161145] Modules linked in: wmi_bmof
[56532.161148] CPU: 1 PID: 36 Comm: khugepaged Not tainted 4.13.0-krejzi #1
[56532.161149] Hardware name: HP HP ProBook 470 G3/8102, BIOS N78 Ver. 01.17 06/08/2017
[56532.161150] task: ffff8802371ee200 task.stack: ffffc90000174000
[56532.161152] RIP: 0010:check_flush_dependency+0xe8/0xf0
[56532.161152] RSP: 0018:ffffc900001777b8 EFLAGS: 00010286
[56532.161153] RAX: 000000000000006c RBX: ffff88022fc5a000 RCX: 0000000000000001
[56532.161154] RDX: 0000000000000000 RSI: 0000000000000086 RDI: 00000000ffffffff
[56532.161155] RBP: 0000000000000000 R08: 14f038bb55f6dae0 R09: 0000000000000516
[56532.161155] R10: ffffc900001778a0 R11: 000000006c756e28 R12: ffff8802371ee200
[56532.161156] R13: 0000000000000000 R14: 000000000000000b R15: ffffc90000177810
[56532.161157] FS:  0000000000000000(0000) GS:ffff880240480000(0000) knlGS:0000000000000000
[56532.161158] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[56532.161158] CR2: 0000000004795ff8 CR3: 000000000220a000 CR4: 00000000003406e0
[56532.161159] Call Trace:
[56532.161161]  ? flush_workqueue+0x136/0x3e0
[56532.161178]  ? _raw_spin_unlock_irqrestore+0xf/0x30
[56532.161179]  ? try_to_wake_up+0x1ce/0x3b0
[56532.161183]  ? i915_gem_userptr_mn_invalidate_range_start+0x13f/0x150
[56532.161184]  ? _raw_spin_unlock+0xd/0x20
[56532.161186]  ? i915_gem_userptr_mn_invalidate_range_start+0x13f/0x150
[56532.161189]  ? __mmu_notifier_invalidate_range_start+0x4a/0x70
[56532.161191]  ? try_to_unmap_one+0x5e5/0x660
[56532.161193]  ? rmap_walk_file+0xe4/0x240
[56532.161195]  ? __ClearPageMovable+0x10/0x10
[56532.161196]  ? try_to_unmap+0x8c/0xe0
[56532.161197]  ? page_remove_rmap+0x280/0x280
[56532.161199]  ? page_not_mapped+0x10/0x10
[56532.161200]  ? page_get_anon_vma+0x90/0x90
[56532.161202]  ? migrate_pages+0x6a5/0x940
[56532.161203]  ? isolate_freepages_block+0x330/0x330
[56532.161205]  ? compact_zone+0x593/0x6a0
[56532.161206]  ? enqueue_task_fair+0xc3/0x1180
[56532.161208]  ? compact_zone_order+0x9b/0xc0
[56532.161210]  ? get_page_from_freelist+0x24a/0x900
[56532.161212]  ? try_to_compact_pages+0xc8/0x240
[56532.161213]  ? try_to_compact_pages+0xc8/0x240
[56532.161215]  ? __alloc_pages_direct_compact+0x45/0xe0
[56532.161216]  ? __alloc_pages_slowpath+0x845/0xb90
[56532.161218]  ? __alloc_pages_nodemask+0x176/0x1f0
[56532.161220]  ? wait_woken+0x80/0x80
[56532.161222]  ? khugepaged+0x29e/0x17d0
[56532.161223]  ? wait_woken+0x80/0x80
[56532.161225]  ? collapse_shmem.isra.39+0xa60/0xa60
[56532.161226]  ? kthread+0x10d/0x130
[56532.161227]  ? kthread_create_on_node+0x60/0x60
[56532.161228]  ? ret_from_fork+0x22/0x30
[56532.161229] Code: 00 8b b0 10 05 00 00 48 8d 8b b0 00 00 00 48 8d 90 b8 06 00 00 49 89 e8 48 c7 c7 38 55 09 82 c6 05 f9 c6 1d 01 01 e8 0e a1 03 00 <0f> ff e9 6b ff ff ff 90 48 8b 37 40 f6 c6 04 75 1b 48 c1 ee 05
[56532.161251] ---[ end trace 2ce2b4f5f69b803b ]---
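The warning comes from the kernel's flush-dependency check: a task in the memory-reclaim path (PF_MEMALLOC, here khugepaged) may only wait on workqueues that are themselves guaranteed to make forward progress under reclaim, i.e. ones created with WQ_MEM_RECLAIM (which gives them a rescuer thread). The rule can be modelled as below; this is an illustrative sketch, not the kernel's check_flush_dependency() itself, and the names are invented for the example:

```c
#include <stdbool.h>

/* Illustrative model of the invariant behind the WARN above: flushing a
 * workqueue from reclaim context is only safe if the target workqueue
 * has a rescuer (WQ_MEM_RECLAIM) and so cannot itself block on memory. */
struct wq { bool mem_reclaim; };

static bool flush_allowed(bool flusher_in_reclaim, const struct wq *target)
{
	if (flusher_in_reclaim && !target->mem_reclaim)
		return false;	/* the kernel emits the WARN seen above */
	return true;
}
```

Marking the userptr workqueue WQ_MEM_RECLAIM, as the patch below does, makes the target side of this check pass.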

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_userptr.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 2ce078e8c763..15547c8f7d3f 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -813,7 +813,9 @@ int i915_gem_init_userptr(struct drm_i915_private *dev_priv)
 	hash_init(dev_priv->mm_structs);
 
 	dev_priv->mm.userptr_wq =
-		alloc_workqueue("i915-userptr-acquire", WQ_HIGHPRI, 0);
+		alloc_workqueue("i915-userptr-acquire",
+				WQ_HIGHPRI | WQ_MEM_RECLAIM,
+				0);
 	if (!dev_priv->mm.userptr_wq)
 		return -ENOMEM;
 
-- 
2.14.1


* [PATCH 02/11] drm/i915/fence: Avoid del_timer_sync() from inside a timer
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
  2017-09-11  8:41 ` [PATCH 01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-19  9:24   ` Chris Wilson
  2017-09-19 10:19   ` Tvrtko Ursulin
  2017-09-11  8:41 ` [PATCH 03/11] drm/i915: Include fence-hint for timeout warning Chris Wilson
                   ` (13 subsequent siblings)
  15 siblings, 2 replies; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

A fence may be signaled from any context, including from inside a timer.
One example is timer_i915_sw_fence_wake(), which provides a safety-net
when waiting on an external fence: if the external fence is not signaled
in a timely fashion, we signal our fence on its behalf, and so may then
process subsequent fences in the chain from within that timer context.

Given that dma_i915_sw_fence_cb() may be called from inside a timer, we
cannot then use del_timer_sync(), as that requires taking the timer lock
for itself. To circumvent this, while keeping signal propagation as low
latency as possible, move the completion into a worker and use a bit of
atomic switcheroo to serialise the timer callback and the dma callback.
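The switcheroo boils down to a single atomic exchange on the fence pointer: whichever of the two callbacks swaps it out first performs the completion, and the loser sees NULL and does nothing. A minimal model of that idea with C11 atomics (illustrative only; struct and function names are assumptions for the sketch, not the driver's):

```c
#include <stdatomic.h>
#include <stddef.h>

struct fence { int completed; };

/* One shared slot, claimed at most once via atomic exchange. */
static struct fence *_Atomic slot;

static void arm(struct fence *f)
{
	atomic_store(&slot, f);
}

/* Called from either the timer callback or the dma callback; only the
 * first caller to swap the pointer out performs the completion
 * (i915_sw_fence_complete() in the driver). */
static int try_complete(void)
{
	struct fence *f = atomic_exchange(&slot, NULL);

	if (!f)
		return 0;	/* the other callback got there first */
	f->completed = 1;
	return 1;
}
```

Because the exchange is atomic, the two callbacks need no lock between them, which is what lets the del_timer_sync() move out into the irq_work.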

Testcase: igt/gem_eio/in-flight-external
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/Kconfig         |  1 +
 drivers/gpu/drm/i915/i915_sw_fence.c | 27 +++++++++++++++++++++------
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index e9e64e8e9765..dfd95889f4b7 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -12,6 +12,7 @@ config DRM_I915
 	select DRM_PANEL
 	select DRM_MIPI_DSI
 	select RELAY
+	select IRQ_WORK
 	# i915 depends on ACPI_VIDEO when ACPI is enabled
 	# but for select to work, need to select ACPI_VIDEO's dependencies, ick
 	select BACKLIGHT_LCD_SUPPORT if ACPI
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index f29540f922af..808ea4d5b962 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -9,6 +9,7 @@
 
 #include <linux/slab.h>
 #include <linux/dma-fence.h>
+#include <linux/irq_work.h>
 #include <linux/reservation.h>
 
 #include "i915_sw_fence.h"
@@ -356,31 +357,44 @@ struct i915_sw_dma_fence_cb {
 	struct i915_sw_fence *fence;
 	struct dma_fence *dma;
 	struct timer_list timer;
+	struct irq_work work;
 };
 
 static void timer_i915_sw_fence_wake(unsigned long data)
 {
 	struct i915_sw_dma_fence_cb *cb = (struct i915_sw_dma_fence_cb *)data;
+	struct i915_sw_fence *fence;
+
+	fence = xchg(&cb->fence, NULL);
+	if (!fence)
+		return;
 
 	pr_warn("asynchronous wait on fence %s:%s:%x timed out\n",
 		cb->dma->ops->get_driver_name(cb->dma),
 		cb->dma->ops->get_timeline_name(cb->dma),
 		cb->dma->seqno);
-	dma_fence_put(cb->dma);
-	cb->dma = NULL;
 
-	i915_sw_fence_complete(cb->fence);
-	cb->timer.function = NULL;
+	i915_sw_fence_complete(fence);
 }
 
 static void dma_i915_sw_fence_wake(struct dma_fence *dma,
 				   struct dma_fence_cb *data)
 {
 	struct i915_sw_dma_fence_cb *cb = container_of(data, typeof(*cb), base);
+	struct i915_sw_fence *fence;
+
+	fence = xchg(&cb->fence, NULL);
+	if (fence)
+		i915_sw_fence_complete(fence);
+
+	irq_work_queue(&cb->work);
+}
+
+static void irq_i915_sw_fence_work(struct irq_work *wrk)
+{
+	struct i915_sw_dma_fence_cb *cb = container_of(wrk, typeof(*cb), work);
 
 	del_timer_sync(&cb->timer);
-	if (cb->timer.function)
-		i915_sw_fence_complete(cb->fence);
 	dma_fence_put(cb->dma);
 
 	kfree(cb);
@@ -414,6 +428,7 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
 	__setup_timer(&cb->timer,
 		      timer_i915_sw_fence_wake, (unsigned long)cb,
 		      TIMER_IRQSAFE);
+	init_irq_work(&cb->work, irq_i915_sw_fence_work);
 	if (timeout) {
 		cb->dma = dma_fence_get(dma);
 		mod_timer(&cb->timer, round_jiffies_up(jiffies + timeout));
-- 
2.14.1


* [PATCH 03/11] drm/i915: Include fence-hint for timeout warning
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
  2017-09-11  8:41 ` [PATCH 01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM Chris Wilson
  2017-09-11  8:41 ` [PATCH 02/11] drm/i915/fence: Avoid del_timer_sync() from inside a timer Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-28 10:14   ` Joonas Lahtinen
  2017-09-11  8:41 ` [PATCH 04/11] drm/i915: Try harder to finish the idle-worker Chris Wilson
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

If an asynchronous wait on a foreign fence times out, we print a warning
indicating which fence was not signaled. As i915_sw_fences become more
common, include the debug hint (the symbol name of the target) to help
identify the waiter. E.g.

[   31.968144] Asynchronous wait on fence sw_sync:gem_eio:1 timed out for hint:submit_notify [i915]
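The hint works because i915_sw_fence_debug_hint() recovers a pointer from the fence's ->flags word, where the low bits are reused as state bits; masking them off yields the notify target that %pF then resolves to a symbol name. A self-contained sketch of that pointer-tagging trick (illustrative; the 2-bit mask and names here are assumptions, not the driver's exact constants):

```c
#include <stdint.h>

/* Sketch of low-bit pointer tagging: a naturally aligned pointer has
 * zero low bits, so they can carry state and be masked off on read. */
#define FENCE_FLAG_MASK 0x3ul		/* low bits reused for state */
#define FENCE_FN_MASK	(~FENCE_FLAG_MASK)

struct notify { const char *name; };

static const struct notify example_hint = { "submit_notify" };

static unsigned long pack(const struct notify *n, unsigned long state)
{
	return (unsigned long)(uintptr_t)n | (state & FENCE_FLAG_MASK);
}

/* Analogue of i915_sw_fence_debug_hint(): strip the state bits. */
static const struct notify *debug_hint(unsigned long flags)
{
	return (const struct notify *)(uintptr_t)(flags & FENCE_FN_MASK);
}
```

The patch below simply moves the (already existing) hint helper out of the debug-objects #ifdef so the timeout path can use it unconditionally.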

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_sw_fence.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 808ea4d5b962..e63cda8adc45 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -24,13 +24,13 @@ enum {
 	DEBUG_FENCE_NOTIFY,
 };
 
-#ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
-
 static void *i915_sw_fence_debug_hint(void *addr)
 {
 	return (void *)(((struct i915_sw_fence *)addr)->flags & I915_SW_FENCE_MASK);
 }
 
+#ifdef CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS
+
 static struct debug_obj_descr i915_sw_fence_debug_descr = {
 	.name = "i915_sw_fence",
 	.debug_hint = i915_sw_fence_debug_hint,
@@ -369,10 +369,11 @@ static void timer_i915_sw_fence_wake(unsigned long data)
 	if (!fence)
 		return;
 
-	pr_warn("asynchronous wait on fence %s:%s:%x timed out\n",
+	pr_warn("Asynchronous wait on fence %s:%s:%x timed out for hint:%pF\n",
 		cb->dma->ops->get_driver_name(cb->dma),
 		cb->dma->ops->get_timeline_name(cb->dma),
-		cb->dma->seqno);
+		cb->dma->seqno,
+		i915_sw_fence_debug_hint(fence));
 
 	i915_sw_fence_complete(fence);
 }
-- 
2.14.1


* [PATCH 04/11] drm/i915: Try harder to finish the idle-worker
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (2 preceding siblings ...)
  2017-09-11  8:41 ` [PATCH 03/11] drm/i915: Include fence-hint for timeout warning Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-28 10:16   ` Joonas Lahtinen
  2017-09-11  8:41 ` [PATCH 05/11] drm/i915: Keep the device awake whilst in the GTT domain Chris Wilson
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

If a worker requeues itself, it may switch to a different kworker pool,
in which case flush_work() considers the original work complete. To be
strict, we then need to keep flushing the work until it is no longer
pending.
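The drain loop added below keeps flushing until a flush reports nothing was pending, then re-checks pending in case the worker rearmed itself in the meantime. A small simulation of a self-requeueing work item, showing why a single flush_delayed_work() is not enough (illustrative model, not the workqueue implementation):

```c
#include <stdbool.h>

/* Simulated delayed work that requeues itself a fixed number of times
 * before going idle, mimicking the i915 idle-worker. */
struct dwork {
	int requeues_left;
	bool pending;
};

/* Model of flush_delayed_work(): run the work if pending; the work may
 * requeue itself, leaving it pending again on return. */
static bool flush(struct dwork *dw)
{
	if (!dw->pending)
		return false;
	dw->pending = false;
	if (dw->requeues_left-- > 0)
		dw->pending = true;	/* worker requeued itself */
	return true;
}

/* The drain loop from the patch: flush until definitely idle. */
static int drain(struct dwork *dw)
{
	int flushes = 0;

	do {
		while (flush(dw))
			flushes++;
	} while (dw->pending);
	return flushes;
}
```

Note the caveat from the patch itself: if the worker never stops requeueing, the drain spins forever, so this is only safe for workers that are guaranteed to cancel themselves eventually.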

References: https://bugs.freedesktop.org/show_bug.cgi?id=102456
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  3 +--
 drivers/gpu/drm/i915/i915_gem.c     |  3 +--
 drivers/gpu/drm/i915/i915_utils.h   | 13 +++++++++++++
 3 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 6338018f655d..d65558f650d6 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4256,8 +4256,7 @@ fault_irq_set(struct drm_i915_private *i915,
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	/* Flush idle worker to disarm irq */
-	while (flush_delayed_work(&i915->gt.idle_work))
-		;
+	drain_delayed_work(&i915->gt.idle_work);
 
 	return 0;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f445587c1a4b..d98f7b25d395 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4583,8 +4583,7 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 	/* As the idle_work is rearming if it detects a race, play safe and
 	 * repeat the flush until it is definitely idle.
 	 */
-	while (flush_delayed_work(&dev_priv->gt.idle_work))
-		;
+	drain_delayed_work(&dev_priv->gt.idle_work);
 
 	/* Assert that we sucessfully flushed all the work and
 	 * reset the GPU back to its idle, low power state.
diff --git a/drivers/gpu/drm/i915/i915_utils.h b/drivers/gpu/drm/i915/i915_utils.h
index 12fc250b47b9..4f7ffa0976b1 100644
--- a/drivers/gpu/drm/i915/i915_utils.h
+++ b/drivers/gpu/drm/i915/i915_utils.h
@@ -119,4 +119,17 @@ static inline void __list_del_many(struct list_head *head,
 	WRITE_ONCE(head->next, first);
 }
 
+/*
+ * Wait until the work is finally complete, even if it tries to postpone
+ * by requeueing itself. Note, that if the worker never cancels itself,
+ * we will spin forever.
+ */
+static inline void drain_delayed_work(struct delayed_work *dw)
+{
+	do {
+		while (flush_delayed_work(dw))
+			;
+	} while (delayed_work_pending(dw));
+}
+
 #endif /* !__I915_UTILS_H */
-- 
2.14.1


* [PATCH 05/11] drm/i915: Keep the device awake whilst in the GTT domain
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (3 preceding siblings ...)
  2017-09-11  8:41 ` [PATCH 04/11] drm/i915: Try harder to finish the idle-worker Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-14 14:58   ` [PATCH v2] " Chris Wilson
  2017-09-11  8:41 ` [PATCH 06/11] drm/i915: Pin fence for iomap Chris Wilson
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

Since runtime suspend is very harsh on GTT mmappings (they all get
zapped on suspend), keep the device awake while the buffer remains in
the GTT domain. However, userspace can control the domain, and although
there is a soft contract that writes must be flushed (e.g. for flushing
scanouts and fbc), we are intentionally lax with respect to read
domains, allowing them to persist for as long as is feasible.

We acquire a wakeref when using the buffer in the GTT domain so that all
subsequent operations on that object are fast, trying to avoid
suspending while the GTT is still in active use by userspace. To ensure
that the device can eventually suspend, we install a timer to expire the
GTT wakeref. So in effect we have just a fancy pm autosuspend that tries
to estimate the cost of restoring actively used GTT mmappings.
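The counter in the patch uses -1 as the "no wakeref held" sentinel: atomic_inc_and_test() returns true only when the increment reaches zero, so exactly the first object entering the GTT domain takes the runtime-pm wakeref and schedules the expiry work; the expiry work swaps the count for zero and only releases the wakeref once it finds the count untouched. A model of that counter protocol with C11 atomics (illustrative; the real code additionally holds gtt_wakeref.lock around these steps):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* -1 means no wakeref is held and no timeout is scheduled. */
static atomic_int wakeref_count = -1;

/* Model of atomic_inc_and_test(): true iff the increment hit zero,
 * i.e. we are the first user and must take the wakeref ourselves. */
static bool get_gtt_wakeref(void)
{
	return atomic_fetch_add(&wakeref_count, 1) + 1 == 0;
}

/* Model of the expiry work: swap the count for 0; if it was non-zero
 * there was activity since the last expiry, so the caller would rearm
 * the timer instead of releasing the wakeref. */
static bool timer_expired_drop(void)
{
	int count = atomic_exchange(&wakeref_count, 0);

	if (count)
		return false;		/* still busy: rearm the timer */
	atomic_store(&wakeref_count, -1); /* back to the idle sentinel */
	return true;			/* wakeref released */
}
```

Two idle expiries in a row are needed after the last user, which is what gives recently active mmappings a grace period before suspend.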

References: https://bugs.freedesktop.org/show_bug.cgi?id=102490
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c              |   2 +
 drivers/gpu/drm/i915/i915_drv.h                  |  12 +++
 drivers/gpu/drm/i915/i915_gem.c                  | 116 +++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem_object.h           |   5 +
 drivers/gpu/drm/i915/i915_gem_shrinker.c         |   4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_object.c |   2 +-
 6 files changed, 122 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d65558f650d6..9e961ba9e099 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2812,6 +2812,8 @@ static int i915_runtime_pm_status(struct seq_file *m, void *unused)
 	if (!HAS_RUNTIME_PM(dev_priv))
 		seq_puts(m, "Runtime power management not supported\n");
 
+	seq_printf(m, "GTT wakeref count: %d\n",
+		   atomic_read(&dev_priv->mm.gtt_wakeref.count));
 	seq_printf(m, "GPU idle: %s\n", yesno(!dev_priv->gt.awake));
 	seq_printf(m, "IRQs disabled: %s\n",
 		   yesno(!intel_irqs_enabled(dev_priv)));
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d07d1109e784..a4b49016593e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1479,6 +1479,18 @@ struct i915_gem_mm {
 	 */
 	struct list_head userfault_list;
 
+	/* List of all objects in gtt domain, holding a wakeref.
+	 * The list is reaped periodically, and protected by its own mutex.
+	 * Entries on this list do not carry a reference.
+	 */
+	struct {
+		struct mutex lock;
+		struct list_head list;
+		atomic_t count; /* -1 == no timeout scheduled */
+
+		struct delayed_work work;
+	} gtt_wakeref;
+
 	/**
 	 * List of objects which are pending destruction.
 	 */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d98f7b25d395..a72376007b64 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -235,6 +235,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 
 static void __start_cpu_write(struct drm_i915_gem_object *obj)
 {
+	GEM_BUG_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	if (cpu_write_needs_clflush(obj))
@@ -667,11 +668,14 @@ fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain)
 		obj->frontbuffer_ggtt_origin : ORIGIN_CPU);
 }
 
-static void
-flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
+void
+i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
+				   unsigned int flush_domains)
 {
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
 
+	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+
 	if (!(obj->base.write_domain & flush_domains))
 		return;
 
@@ -694,16 +698,19 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
 
 	switch (obj->base.write_domain) {
 	case I915_GEM_DOMAIN_GTT:
-		if (!HAS_LLC(dev_priv)) {
-			intel_runtime_pm_get(dev_priv);
-			spin_lock_irq(&dev_priv->uncore.lock);
-			POSTING_READ_FW(RING_HEAD(dev_priv->engine[RCS]->mmio_base));
-			spin_unlock_irq(&dev_priv->uncore.lock);
-			intel_runtime_pm_put(dev_priv);
-		}
+		mutex_lock(&dev_priv->mm.gtt_wakeref.lock);
+		if (!list_empty(&obj->mm.gtt_wakeref_link)) {
+			if (!HAS_LLC(dev_priv)) {
+				spin_lock_irq(&dev_priv->uncore.lock);
+				POSTING_READ_FW(RING_HEAD(dev_priv->engine[RCS]->mmio_base));
+				spin_unlock_irq(&dev_priv->uncore.lock);
+			}
+
+			intel_fb_obj_flush(obj, fb_write_origin(obj, I915_GEM_DOMAIN_GTT));
 
-		intel_fb_obj_flush(obj,
-				   fb_write_origin(obj, I915_GEM_DOMAIN_GTT));
+			list_del_init(&obj->mm.gtt_wakeref_link);
+		}
+		mutex_unlock(&dev_priv->mm.gtt_wakeref.lock);
 		break;
 
 	case I915_GEM_DOMAIN_CPU:
@@ -808,7 +815,7 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
 			goto out;
 	}
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* If we're not in the cpu read domain, set ourself into the gtt
 	 * read domain and manually flush cachelines (if required). This
@@ -861,7 +868,7 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj,
 			goto out;
 	}
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* If we're not in the cpu write domain, set ourself into the
 	 * gtt write domain and manually flush cachelines (as required).
@@ -3423,9 +3430,10 @@ static void __i915_gem_object_flush_for_display(struct drm_i915_gem_object *obj)
 	 * We manually flush the CPU domain so that we can override and
 	 * force the flush for the display, and perform it asyncrhonously.
 	 */
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 	if (obj->cache_dirty)
 		i915_gem_clflush_object(obj, I915_CLFLUSH_FORCE);
+	GEM_BUG_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
 	obj->base.write_domain = 0;
 }
 
@@ -3478,7 +3486,7 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_WC);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_WC);
 
 	/* Serialise direct access to this object with the barriers for
 	 * coherent writes from the GPU, by effectively invalidating the
@@ -3502,6 +3510,49 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 	return 0;
 }
 
+static void wakeref_timeout(struct work_struct *work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(work, typeof(*dev_priv), mm.gtt_wakeref.work.work);
+	struct drm_i915_gem_object *obj, *on;
+	unsigned int count;
+
+	mutex_lock(&dev_priv->mm.gtt_wakeref.lock);
+	count = atomic_xchg(&dev_priv->mm.gtt_wakeref.count, 0);
+	if (count) {
+		unsigned long timeout;
+
+		GEM_BUG_ON(count == -1);
+
+		if (INTEL_GEN(dev_priv) >= 6 && !HAS_LLC(dev_priv)) {
+			spin_lock_irq(&dev_priv->uncore.lock);
+			POSTING_READ_FW(RING_ACTHD(dev_priv->engine[RCS]->mmio_base));
+			spin_unlock_irq(&dev_priv->uncore.lock);
+		}
+
+		timeout = 0;
+		list_for_each_entry_safe(obj, on,
+					 &dev_priv->mm.gtt_wakeref.list,
+					 mm.gtt_wakeref_link) {
+			list_del_init(&obj->mm.gtt_wakeref_link);
+			if (obj->base.write_domain & I915_GEM_DOMAIN_GTT) {
+				intel_fb_obj_flush(obj, fb_write_origin(obj, I915_GEM_DOMAIN_GTT));
+				obj->base.write_domain = 0;
+				timeout++;
+			}
+			timeout++;
+		}
+
+		timeout = HZ * (ilog2(timeout) + 1) / 2;
+		schedule_delayed_work(&dev_priv->mm.gtt_wakeref.work,
+				      round_jiffies_up_relative(timeout));
+	} else {
+		intel_runtime_pm_put(dev_priv);
+		atomic_set(&dev_priv->mm.gtt_wakeref.count, -1);
+	}
+	mutex_unlock(&dev_priv->mm.gtt_wakeref.lock);
+}
+
 /**
  * Moves a single object to the GTT read, and possibly write domain.
  * @obj: object to act on
@@ -3513,6 +3564,7 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 int
 i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 {
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	int ret;
 
 	lockdep_assert_held(&obj->base.dev->struct_mutex);
@@ -3526,6 +3578,17 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
+	if (list_empty_careful(&obj->mm.gtt_wakeref_link)) {
+		mutex_lock(&i915->mm.gtt_wakeref.lock);
+		if (atomic_inc_and_test(&i915->mm.gtt_wakeref.count)) {
+			intel_runtime_pm_get(i915);
+			schedule_delayed_work(&i915->mm.gtt_wakeref.work,
+					      round_jiffies_up_relative(HZ));
+		}
+		list_add(&obj->mm.gtt_wakeref_link, &i915->mm.gtt_wakeref.list);
+		mutex_unlock(&i915->mm.gtt_wakeref.lock);
+	}
+
 	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
 		return 0;
 
@@ -3541,7 +3604,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT);
 
 	/* Serialise direct access to this object with the barriers for
 	 * coherent writes from the GPU, by effectively invalidating the
@@ -3909,7 +3972,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* Flush the CPU cache if it's still invalid. */
 	if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0) {
@@ -4258,6 +4321,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 
 	INIT_LIST_HEAD(&obj->global_link);
 	INIT_LIST_HEAD(&obj->userfault_link);
+	INIT_LIST_HEAD(&obj->mm.gtt_wakeref_link);
 	INIT_LIST_HEAD(&obj->vma_list);
 	INIT_LIST_HEAD(&obj->lut_list);
 	INIT_LIST_HEAD(&obj->batch_pool_link);
@@ -4395,6 +4459,8 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 
 		trace_i915_gem_object_destroy(obj);
 
+		i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+
 		GEM_BUG_ON(i915_gem_object_is_active(obj));
 		list_for_each_entry_safe(vma, vn,
 					 &obj->vma_list, obj_link) {
@@ -4417,6 +4483,12 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 		GEM_BUG_ON(atomic_read(&obj->frontbuffer_bits));
 		GEM_BUG_ON(!list_empty(&obj->lut_list));
 
+		if (!list_empty_careful(&obj->mm.gtt_wakeref_link)) {
+			mutex_lock(&i915->mm.gtt_wakeref.lock);
+			list_del(&obj->mm.gtt_wakeref_link);
+			mutex_unlock(&i915->mm.gtt_wakeref.lock);
+		}
+
 		if (obj->ops->release)
 			obj->ops->release(obj);
 
@@ -4549,6 +4621,8 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 	int ret;
 
 	intel_runtime_pm_get(dev_priv);
+	drain_delayed_work(&dev_priv->mm.gtt_wakeref.work);
+	WARN_ON(atomic_read(&dev_priv->mm.gtt_wakeref.count) != -1);
 	intel_suspend_gt_powersave(dev_priv);
 
 	mutex_lock(&dev->struct_mutex);
@@ -4932,6 +5006,11 @@ i915_gem_load_init(struct drm_i915_private *dev_priv)
 	if (err)
 		goto err_priorities;
 
+	INIT_LIST_HEAD(&dev_priv->mm.gtt_wakeref.list);
+	INIT_DELAYED_WORK(&dev_priv->mm.gtt_wakeref.work, wakeref_timeout);
+	mutex_init(&dev_priv->mm.gtt_wakeref.lock);
+	atomic_set(&dev_priv->mm.gtt_wakeref.count, -1);
+
 	INIT_WORK(&dev_priv->mm.free_work, __i915_gem_free_work);
 	init_llist_head(&dev_priv->mm.free_list);
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
@@ -4978,6 +5057,9 @@ void i915_gem_load_cleanup(struct drm_i915_private *dev_priv)
 	WARN_ON(!list_empty(&dev_priv->gt.timelines));
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
+	drain_delayed_work(&dev_priv->mm.gtt_wakeref.work);
+	WARN_ON(atomic_read(&dev_priv->mm.gtt_wakeref.count) != -1);
+
 	kmem_cache_destroy(dev_priv->priorities);
 	kmem_cache_destroy(dev_priv->dependencies);
 	kmem_cache_destroy(dev_priv->requests);
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index c30d8f808185..5797b75a804b 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -177,6 +177,8 @@ struct drm_i915_gem_object {
 			struct mutex lock; /* protects this cache */
 		} get_page;
 
+		struct list_head gtt_wakeref_link;
+
 		/**
 		 * Advice: are the backing pages purgeable?
 		 */
@@ -421,5 +423,8 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 					 unsigned int cache_level);
 void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
 
+void i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
+					unsigned int flush_domains);
+
 #endif
 
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 77fb39808131..8d146584ee49 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -127,8 +127,10 @@ static bool can_release_pages(struct drm_i915_gem_object *obj)
 
 static bool unsafe_drop_pages(struct drm_i915_gem_object *obj)
 {
-	if (i915_gem_object_unbind(obj) == 0)
+	if (i915_gem_object_unbind(obj) == 0) {
+		i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 		__i915_gem_object_put_pages(obj, I915_MM_SHRINKER);
+	}
 	return !READ_ONCE(obj->mm.pages);
 }
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index 8f011c447e41..f3f6588afbdb 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -266,7 +266,7 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
 		if (offset >= obj->base.size)
 			continue;
 
-		flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+		i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 		p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
 		cpu = kmap(p) + offset_in_page(offset);
-- 
2.14.1


* [PATCH 06/11] drm/i915: Pin fence for iomap
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (4 preceding siblings ...)
  2017-09-11  8:41 ` [PATCH 05/11] drm/i915: Keep the device awake whilst in the GTT domain Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-28 10:20   ` Joonas Lahtinen
  2017-09-11  8:41 ` [PATCH 07/11] drm/i915: Consolidate get_fence with pin_fence Chris Wilson
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

Acquire the fence register for the iomap in i915_vma_pin_iomap() on
behalf of the caller.

We probably want the caller to specify whether the fence should be
pinned for their usage, but at the moment all callers do want the
associated fence, or none, so take it on their behalf.
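The ordering in the new i915_vma_pin_iomap() matters: the vma is pinned first, and if acquiring the fence register then fails, the pin must be dropped before the error is returned so the pin counts stay balanced; unpin then releases both in reverse. A sketch of that acquire/unwind pairing with invented names and counters standing in for the real pin state (illustrative only):

```c
struct vma { int pin_count; int fence_pinned; };

/* Stand-in for i915_vma_get_fence(); 'fail' simulates an error such
 * as no free fence registers. */
static int get_fence(struct vma *v, int fail)
{
	(void)v;
	return fail ? -1 : 0;
}

static int pin_iomap(struct vma *v, int fence_fail)
{
	v->pin_count++;			/* __i915_vma_pin() */
	if (get_fence(v, fence_fail)) {
		v->pin_count--;		/* unwind before returning error */
		return -1;
	}
	v->fence_pinned = 1;		/* i915_vma_pin_fence() */
	return 0;
}

/* Mirrors the new out-of-line i915_vma_unpin_iomap(). */
static void unpin_iomap(struct vma *v)
{
	v->fence_pinned = 0;
	v->pin_count--;
}
```

With the fence taken inside pin_iomap(), callers such as the selftest below no longer need their own get-fence step, which is exactly the hunk removed from check_partial_mapping().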

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_vma.c                  | 19 +++++++++++++++++++
 drivers/gpu/drm/i915/i915_vma.h                  |  7 +------
 drivers/gpu/drm/i915/selftests/i915_gem_object.c |  8 --------
 3 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 02d1a5eacb00..f15542e64808 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -278,6 +278,7 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
 void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 {
 	void __iomem *ptr;
+	int ret;
 
 	/* Access through the GTT requires the device to be awake. */
 	assert_rpm_wakelock_held(vma->vm->i915);
@@ -301,9 +302,27 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 	}
 
 	__i915_vma_pin(vma);
+
+	ret = i915_vma_get_fence(vma);
+	if (ret) {
+		__i915_vma_unpin(vma);
+		return IO_ERR_PTR(ret);
+	}
+	i915_vma_pin_fence(vma);
+
 	return ptr;
 }
 
+void i915_vma_unpin_iomap(struct i915_vma *vma)
+{
+	lockdep_assert_held(&vma->obj->base.dev->struct_mutex);
+
+	GEM_BUG_ON(vma->iomap == NULL);
+
+	i915_vma_unpin_fence(vma);
+	i915_vma_unpin(vma);
+}
+
 void i915_vma_unpin_and_release(struct i915_vma **p_vma)
 {
 	struct i915_vma *vma;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index e811067c7724..347a40d08b9b 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -321,12 +321,7 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma);
  * Callers must hold the struct_mutex. This function is only valid to be
  * called on a VMA previously iomapped by the caller with i915_vma_pin_iomap().
  */
-static inline void i915_vma_unpin_iomap(struct i915_vma *vma)
-{
-	lockdep_assert_held(&vma->obj->base.dev->struct_mutex);
-	GEM_BUG_ON(vma->iomap == NULL);
-	i915_vma_unpin(vma);
-}
+void i915_vma_unpin_iomap(struct i915_vma *vma);
 
 static inline struct page *i915_vma_first_page(struct i915_vma *vma)
 {
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index f3f6588afbdb..5f9330b4094a 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -251,14 +251,6 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
 			return PTR_ERR(io);
 		}
 
-		err = i915_vma_get_fence(vma);
-		if (err) {
-			pr_err("Failed to get fence for partial view: offset=%lu\n",
-			       page);
-			i915_vma_unpin_iomap(vma);
-			return err;
-		}
-
 		iowrite32(page, io + n * PAGE_SIZE/sizeof(*io));
 		i915_vma_unpin_iomap(vma);
 
-- 
2.14.1


* [PATCH 07/11] drm/i915: Consolidate get_fence with pin_fence
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (5 preceding siblings ...)
  2017-09-11  8:41 ` [PATCH 06/11] drm/i915: Pin fence for iomap Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-11  8:41 ` [PATCH 08/11] drm/i915: Emit pipelined fence changes Chris Wilson
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

Following the pattern now used for obj->mm.pages, use just pin_fence and
unpin_fence to control access to the fence registers. I.e. instead of
calling get_fence(); pin_fence(), we now just need to call pin_fence().
This will make it easier to reduce the locking requirements around
fence registers.
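A minimal sketch of the consolidated interface (hypothetical names; the real functions also juggle register allocation, LRU placement and fence_update): one entry point both ensures a fence is assigned and takes a pin reference, so callers no longer pair get with pin.

```c
#include <assert.h>

/* Illustrative fence register state. */
struct fence_sketch {
	int allocated;	/* stands in for having a register assigned */
	int pin_count;
};

/* One call replaces the old get_fence(); pin_fence() pair: assign a
 * register if needed, then take a pin reference. */
static int pin_fence_sketch(struct fence_sketch *f)
{
	if (!f->allocated)
		f->allocated = 1;	/* the old get_fence() work */
	f->pin_count++;
	return 0;
}

static void unpin_fence_sketch(struct fence_sketch *f)
{
	assert(f->pin_count > 0);
	f->pin_count--;
}
```

Collapsing acquisition and pinning into one operation is what later lets the pin count become atomic without widening the lock coverage.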

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h            |  2 --
 drivers/gpu/drm/i915/i915_gem.c            |  3 ++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 10 ++++-----
 drivers/gpu/drm/i915/i915_gem_fence_reg.c  | 33 +++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_vma.c            |  3 +--
 drivers/gpu/drm/i915/i915_vma.h            | 20 ++++++++----------
 drivers/gpu/drm/i915/intel_display.c       |  3 +--
 7 files changed, 45 insertions(+), 29 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a4b49016593e..f8d98d3736ac 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3714,8 +3714,6 @@ i915_vm_to_ppgtt(struct i915_address_space *vm)
 }
 
 /* i915_gem_fence_reg.c */
-int __must_check i915_vma_get_fence(struct i915_vma *vma);
-int __must_check i915_vma_put_fence(struct i915_vma *vma);
 struct drm_i915_fence_reg *
 i915_reserve_fence(struct drm_i915_private *dev_priv);
 void i915_unreserve_fence(struct drm_i915_fence_reg *fence);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a72376007b64..90bf5cba712f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1912,7 +1912,7 @@ int i915_gem_fault(struct vm_fault *vmf)
 	if (ret)
 		goto err_unpin;
 
-	ret = i915_vma_get_fence(vma);
+	ret = i915_vma_pin_fence(vma);
 	if (ret)
 		goto err_unpin;
 
@@ -1928,6 +1928,7 @@ int i915_gem_fault(struct vm_fault *vmf)
 			       min_t(u64, vma->size, area->vm_end - area->vm_start),
 			       &ggtt->mappable);
 
+	i915_vma_unpin_fence(vma);
 err_unpin:
 	__i915_vma_unpin(vma);
 err_unlock:
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 7687483ff218..da12bba60c40 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -366,12 +366,12 @@ eb_pin_vma(struct i915_execbuffer *eb,
 		return false;
 
 	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		if (unlikely(i915_vma_get_fence(vma))) {
+		if (unlikely(i915_vma_pin_fence(vma))) {
 			i915_vma_unpin(vma);
 			return false;
 		}
 
-		if (i915_vma_pin_fence(vma))
+		if (vma->fence)
 			exec_flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
 
@@ -384,7 +384,7 @@ static inline void __eb_unreserve_vma(struct i915_vma *vma, unsigned int flags)
 	GEM_BUG_ON(!(flags & __EXEC_OBJECT_HAS_PIN));
 
 	if (unlikely(flags & __EXEC_OBJECT_HAS_FENCE))
-		i915_vma_unpin_fence(vma);
+		__i915_vma_unpin_fence(vma);
 
 	__i915_vma_unpin(vma);
 }
@@ -562,13 +562,13 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb,
 	}
 
 	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		err = i915_vma_get_fence(vma);
+		err = i915_vma_pin_fence(vma);
 		if (unlikely(err)) {
 			i915_vma_unpin(vma);
 			return err;
 		}
 
-		if (i915_vma_pin_fence(vma))
+		if (vma->fence)
 			exec_flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index 2783d63bd1ad..af824b8d73ea 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -280,8 +280,7 @@ static int fence_update(struct drm_i915_fence_reg *fence,
  *
  * 0 on success, negative error code on failure.
  */
-int
-i915_vma_put_fence(struct i915_vma *vma)
+int i915_vma_put_fence(struct i915_vma *vma)
 {
 	struct drm_i915_fence_reg *fence = vma->fence;
 
@@ -299,6 +298,8 @@ static struct drm_i915_fence_reg *fence_find(struct drm_i915_private *dev_priv)
 	struct drm_i915_fence_reg *fence;
 
 	list_for_each_entry(fence, &dev_priv->mm.fence_list, link) {
+		GEM_BUG_ON(fence->vma && fence->vma->fence != fence);
+
 		if (fence->pin_count)
 			continue;
 
@@ -313,7 +314,7 @@ static struct drm_i915_fence_reg *fence_find(struct drm_i915_private *dev_priv)
 }
 
 /**
- * i915_vma_get_fence - set up fencing for a vma
+ * i915_vma_pin_fence - set up fencing for a vma
  * @vma: vma to map through a fence reg
  *
  * When mapping objects through the GTT, userspace wants to be able to write
@@ -331,10 +332,11 @@ static struct drm_i915_fence_reg *fence_find(struct drm_i915_private *dev_priv)
  * 0 on success, negative error code on failure.
  */
 int
-i915_vma_get_fence(struct i915_vma *vma)
+i915_vma_pin_fence(struct i915_vma *vma)
 {
 	struct drm_i915_fence_reg *fence;
 	struct i915_vma *set = i915_gem_object_is_tiled(vma->obj) ? vma : NULL;
+	int err;
 
 	/* Note that we revoke fences on runtime suspend. Therefore the user
 	 * must keep the device awake whilst using the fence.
@@ -344,6 +346,8 @@ i915_vma_get_fence(struct i915_vma *vma)
 	/* Just update our place in the LRU if our fence is getting reused. */
 	if (vma->fence) {
 		fence = vma->fence;
+		GEM_BUG_ON(fence->vma != vma);
+		fence->pin_count++;
 		if (!fence->dirty) {
 			list_move_tail(&fence->link,
 				       &fence->i915->mm.fence_list);
@@ -353,10 +357,25 @@ i915_vma_get_fence(struct i915_vma *vma)
 		fence = fence_find(vma->vm->i915);
 		if (IS_ERR(fence))
 			return PTR_ERR(fence);
+
+		GEM_BUG_ON(fence->pin_count);
+		fence->pin_count++;
 	} else
 		return 0;
 
-	return fence_update(fence, set);
+	err = fence_update(fence, set);
+	if (err)
+		goto out_unpin;
+
+	GEM_BUG_ON(fence->vma != set);
+	GEM_BUG_ON(vma->fence != (set ? fence : NULL));
+
+	if (set)
+		return 0;
+
+out_unpin:
+	fence->pin_count--;
+	return err;
 }
 
 /**
@@ -429,6 +448,8 @@ void i915_gem_revoke_fences(struct drm_i915_private *dev_priv)
 	for (i = 0; i < dev_priv->num_fence_regs; i++) {
 		struct drm_i915_fence_reg *fence = &dev_priv->fence_regs[i];
 
+		GEM_BUG_ON(fence->vma && fence->vma->fence != fence);
+
 		if (fence->vma)
 			i915_gem_release_mmap(fence->vma->obj);
 	}
@@ -450,6 +471,8 @@ void i915_gem_restore_fences(struct drm_i915_private *dev_priv)
 		struct drm_i915_fence_reg *reg = &dev_priv->fence_regs[i];
 		struct i915_vma *vma = reg->vma;
 
+		GEM_BUG_ON(vma && vma->fence != reg);
+
 		/*
 		 * Commit delayed tiling changes if we have an object still
 		 * attached to the fence, otherwise just clear the fence.
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index f15542e64808..c7d5604b4489 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -303,12 +303,11 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
 
 	__i915_vma_pin(vma);
 
-	ret = i915_vma_get_fence(vma);
+	ret = i915_vma_pin_fence(vma);
 	if (ret) {
 		__i915_vma_unpin(vma);
 		return IO_ERR_PTR(ret);
 	}
-	i915_vma_pin_fence(vma);
 
 	return ptr;
 }
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 347a40d08b9b..ab4a3c0b1526 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -344,15 +344,13 @@ static inline struct page *i915_vma_first_page(struct i915_vma *vma)
  *
  * True if the vma has a fence, false otherwise.
  */
-static inline bool
-i915_vma_pin_fence(struct i915_vma *vma)
+int i915_vma_pin_fence(struct i915_vma *vma);
+int __must_check i915_vma_put_fence(struct i915_vma *vma);
+
+static inline void __i915_vma_unpin_fence(struct i915_vma *vma)
 {
-	lockdep_assert_held(&vma->obj->base.dev->struct_mutex);
-	if (vma->fence) {
-		vma->fence->pin_count++;
-		return true;
-	} else
-		return false;
+	GEM_BUG_ON(vma->fence->pin_count <= 0);
+	vma->fence->pin_count--;
 }
 
 /**
@@ -367,10 +365,8 @@ static inline void
 i915_vma_unpin_fence(struct i915_vma *vma)
 {
 	lockdep_assert_held(&vma->obj->base.dev->struct_mutex);
-	if (vma->fence) {
-		GEM_BUG_ON(vma->fence->pin_count <= 0);
-		vma->fence->pin_count--;
-	}
+	if (vma->fence)
+		__i915_vma_unpin_fence(vma);
 }
 
 #endif
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index efb8c0875fc8..581cd21065e4 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -2219,8 +2219,7 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
 		 * something and try to run the system in a "less than optimal"
 		 * mode that matches the user configuration.
 		 */
-		if (i915_vma_get_fence(vma) == 0)
-			i915_vma_pin_fence(vma);
+		i915_vma_pin_fence(vma);
 	}
 
 	i915_vma_get(vma);
-- 
2.14.1


* [PATCH 08/11] drm/i915: Emit pipelined fence changes
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (6 preceding siblings ...)
  2017-09-11  8:41 ` [PATCH 07/11] drm/i915: Consolidate get_fence with pin_fence Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-11  8:41 ` [PATCH 09/11] drm/i915: Track user GTT faulting per-vma Chris Wilson
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

Many years ago, long before requests, we tried doing this. We never
quite got it right, but now with requests we have the tracking to do the
job properly!

One of the stall points for gen2/gen3 is the use of fence registers for
GPU operations. There are only a few available, and currently if we
exhaust the available fence registers we must stall the GPU between
batches. By pipelining the fence register writes, we can avoid the stall
and continuously emit the batches. The challenge is remembering to wait
for those pipelined LRI writes before accessing the fence with the CPU,
and that is what our request tracking makes easy.
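As a sketch of what "pipelining the fence register writes" means in practice: instead of an immediate MMIO write, the update is emitted into the ring as MI_LOAD_REGISTER_IMM dwords that the GPU executes in order with the batches. The bit layout below follows the usual MI_LOAD_REGISTER_IMM form (opcode in bits 28:23, payload dword count minus one in the low bits), but treat the exact encoding as an illustrative assumption rather than PRM-authoritative.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative MI command encodings; verify against the PRM before use. */
#define MI_LRI_SKETCH(n)	((0x22u << 23) | (2u * (n) - 1))
#define MI_NOOP_SKETCH		0u

/* Emit a single pipelined register write, padded to an even number of
 * dwords as the ring requires, mirroring the gen3 hunk in this patch. */
static uint32_t *emit_fence_lri(uint32_t *cs, uint32_t reg, uint32_t val)
{
	*cs++ = MI_LRI_SKETCH(1);
	*cs++ = reg;
	*cs++ = val;
	*cs++ = MI_NOOP_SKETCH;
	return cs;
}
```

Because the write now happens at an arbitrary point in the GPU timeline, anyone touching the fence from the CPU must first wait on the request tracked in fence->pipelined, which is exactly the retire calls added above.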

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c        |   2 +-
 drivers/gpu/drm/i915/i915_gem.c            |   1 +
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  15 ++-
 drivers/gpu/drm/i915/i915_gem_fence_reg.c  | 208 ++++++++++++++++++++++++-----
 drivers/gpu/drm/i915/i915_gem_fence_reg.h  |   3 +-
 drivers/gpu/drm/i915/i915_vma.c            |   1 +
 drivers/gpu/drm/i915/i915_vma.h            |  13 +-
 7 files changed, 202 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 9e961ba9e099..c5602e2e9ec3 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -896,7 +896,7 @@ static int i915_gem_fence_regs_info(struct seq_file *m, void *data)
 		struct i915_vma *vma = dev_priv->fence_regs[i].vma;
 
 		seq_printf(m, "Fence %d, pin count = %d, object = ",
-			   i, dev_priv->fence_regs[i].pin_count);
+			   i, atomic_read(&dev_priv->fence_regs[i].pin_count));
 		if (!vma)
 			seq_puts(m, "unused");
 		else
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 90bf5cba712f..32ecf5d9436b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4960,6 +4960,7 @@ i915_gem_load_init_fences(struct drm_i915_private *dev_priv)
 		fence->i915 = dev_priv;
 		fence->id = i;
 		list_add_tail(&fence->link, &dev_priv->mm.fence_list);
+		init_request_active(&fence->pipelined, NULL);
 	}
 	i915_gem_restore_fences(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index da12bba60c40..d3acbd15ce6d 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -366,11 +366,12 @@ eb_pin_vma(struct i915_execbuffer *eb,
 		return false;
 
 	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		if (unlikely(i915_vma_pin_fence(vma))) {
+		if (unlikely(i915_vma_reserve_fence(vma))) {
 			i915_vma_unpin(vma);
 			return false;
 		}
 
+		exec_flags &= ~EXEC_OBJECT_ASYNC;
 		if (vma->fence)
 			exec_flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
@@ -562,12 +563,13 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb,
 	}
 
 	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
-		err = i915_vma_pin_fence(vma);
+		err = i915_vma_reserve_fence(vma);
 		if (unlikely(err)) {
 			i915_vma_unpin(vma);
 			return err;
 		}
 
+		exec_flags &= ~EXEC_OBJECT_ASYNC;
 		if (vma->fence)
 			exec_flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
@@ -1790,6 +1792,12 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 		if (flags & EXEC_OBJECT_ASYNC)
 			continue;
 
+		if (unlikely(flags & EXEC_OBJECT_NEEDS_FENCE)) {
+			err = i915_vma_emit_pipelined_fence(vma, eb->request);
+			if (err)
+				return err;
+		}
+
 		err = i915_gem_request_await_object
 			(eb->request, obj, flags & EXEC_OBJECT_WRITE);
 		if (err)
@@ -1876,9 +1884,6 @@ void i915_vma_move_to_active(struct i915_vma *vma,
 		obj->base.read_domains = 0;
 	}
 	obj->base.read_domains |= I915_GEM_GPU_DOMAINS;
-
-	if (flags & EXEC_OBJECT_NEEDS_FENCE)
-		i915_gem_active_set(&vma->last_fence, req);
 }
 
 static int i915_reset_gen7_sol_offsets(struct drm_i915_gem_request *req)
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index af824b8d73ea..be5b63547105 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -55,10 +55,9 @@
  * CPU ptes into GTT mmaps (not the GTT ptes themselves) as needed.
  */
 
-#define pipelined 0
-
-static void i965_write_fence_reg(struct drm_i915_fence_reg *fence,
-				 struct i915_vma *vma)
+static int i965_write_fence_reg(struct drm_i915_fence_reg *fence,
+				struct i915_vma *vma,
+				struct drm_i915_gem_request *pipelined)
 {
 	i915_reg_t fence_reg_lo, fence_reg_hi;
 	int fence_pitch_shift;
@@ -110,11 +109,30 @@ static void i965_write_fence_reg(struct drm_i915_fence_reg *fence,
 		I915_WRITE(fence_reg_hi, upper_32_bits(val));
 		I915_WRITE(fence_reg_lo, lower_32_bits(val));
 		POSTING_READ(fence_reg_lo);
+	} else {
+		u32 *cs;
+
+		cs = intel_ring_begin(pipelined, 8);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		*cs++ = MI_LOAD_REGISTER_IMM(3);
+		*cs++ = i915_mmio_reg_offset(fence_reg_lo);
+		*cs++ = 0;
+		*cs++ = i915_mmio_reg_offset(fence_reg_hi);
+		*cs++ = upper_32_bits(val);
+		*cs++ = i915_mmio_reg_offset(fence_reg_lo);
+		*cs++ = lower_32_bits(val);
+		*cs++ = MI_NOOP;
+		intel_ring_advance(pipelined, cs);
 	}
+
+	return 0;
 }
 
-static void i915_write_fence_reg(struct drm_i915_fence_reg *fence,
-				 struct i915_vma *vma)
+static int i915_write_fence_reg(struct drm_i915_fence_reg *fence,
+				struct i915_vma *vma,
+				struct drm_i915_gem_request *pipelined)
 {
 	u32 val;
 
@@ -150,11 +168,26 @@ static void i915_write_fence_reg(struct drm_i915_fence_reg *fence,
 
 		I915_WRITE(reg, val);
 		POSTING_READ(reg);
+	} else {
+		u32 *cs;
+
+		cs = intel_ring_begin(pipelined, 4);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		*cs++ = MI_LOAD_REGISTER_IMM(1);
+		*cs++ = i915_mmio_reg_offset(FENCE_REG(fence->id));
+		*cs++ = val;
+		*cs++ = MI_NOOP;
+		intel_ring_advance(pipelined, cs);
 	}
+
+	return 0;
 }
 
-static void i830_write_fence_reg(struct drm_i915_fence_reg *fence,
-				 struct i915_vma *vma)
+static int i830_write_fence_reg(struct drm_i915_fence_reg *fence,
+				struct i915_vma *vma,
+				struct drm_i915_gem_request *pipelined)
 {
 	u32 val;
 
@@ -182,29 +215,49 @@ static void i830_write_fence_reg(struct drm_i915_fence_reg *fence,
 
 		I915_WRITE(reg, val);
 		POSTING_READ(reg);
+	} else {
+		u32 *cs;
+
+		cs = intel_ring_begin(pipelined, 4);
+		if (IS_ERR(cs))
+			return PTR_ERR(cs);
+
+		*cs++ = MI_LOAD_REGISTER_IMM(1);
+		*cs++ = i915_mmio_reg_offset(FENCE_REG(fence->id));
+		*cs++ = val;
+		*cs++ = MI_NOOP;
+		intel_ring_advance(pipelined, cs);
 	}
+
+	return 0;
 }
 
-static void fence_write(struct drm_i915_fence_reg *fence,
-			struct i915_vma *vma)
+static int fence_write(struct drm_i915_fence_reg *fence,
+		       struct i915_vma *vma,
+		       struct drm_i915_gem_request *rq)
 {
+	int err;
+
 	/* Previous access through the fence register is marshalled by
 	 * the mb() inside the fault handlers (i915_gem_release_mmaps)
 	 * and explicitly managed for internal users.
 	 */
 
 	if (IS_GEN2(fence->i915))
-		i830_write_fence_reg(fence, vma);
+		err = i830_write_fence_reg(fence, vma, rq);
 	else if (IS_GEN3(fence->i915))
-		i915_write_fence_reg(fence, vma);
+		err = i915_write_fence_reg(fence, vma, rq);
 	else
-		i965_write_fence_reg(fence, vma);
+		err = i965_write_fence_reg(fence, vma, rq);
+	if (err)
+		return err;
 
 	/* Access through the fenced region afterwards is
 	 * ordered by the posting reads whilst writing the registers.
 	 */
 
 	fence->dirty = false;
+	return 0;
 }
 
 static int fence_update(struct drm_i915_fence_reg *fence,
@@ -212,17 +265,15 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 {
 	int ret;
 
+	ret = i915_gem_active_retire(&fence->pipelined,
+				     &fence->i915->drm.struct_mutex);
+	if (ret)
+		return ret;
+
 	if (vma) {
 		if (!i915_vma_is_map_and_fenceable(vma))
 			return -EINVAL;
 
-		if (WARN(!i915_gem_object_get_stride(vma->obj) ||
-			 !i915_gem_object_get_tiling(vma->obj),
-			 "bogus fence setup with stride: 0x%x, tiling mode: %i\n",
-			 i915_gem_object_get_stride(vma->obj),
-			 i915_gem_object_get_tiling(vma->obj)))
-			return -EINVAL;
-
 		ret = i915_gem_active_retire(&vma->last_fence,
 					     &vma->obj->base.dev->struct_mutex);
 		if (ret)
@@ -253,7 +304,7 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 	 * to the runtime resume, see i915_gem_restore_fences().
 	 */
 	if (intel_runtime_pm_get_if_in_use(fence->i915)) {
-		fence_write(fence, vma);
+		fence_write(fence, vma, NULL);
 		intel_runtime_pm_put(fence->i915);
 	}
 
@@ -287,7 +338,9 @@ int i915_vma_put_fence(struct i915_vma *vma)
 	if (!fence)
 		return 0;
 
-	if (fence->pin_count)
+	GEM_BUG_ON(fence->vma != vma);
+
+	if (atomic_read(&fence->pin_count))
 		return -EBUSY;
 
 	return fence_update(fence, NULL);
@@ -300,7 +353,7 @@ static struct drm_i915_fence_reg *fence_find(struct drm_i915_private *dev_priv)
 	list_for_each_entry(fence, &dev_priv->mm.fence_list, link) {
 		GEM_BUG_ON(fence->vma && fence->vma->fence != fence);
 
-		if (fence->pin_count)
+		if (atomic_read(&fence->pin_count))
 			continue;
 
 		return fence;
@@ -344,11 +397,16 @@ i915_vma_pin_fence(struct i915_vma *vma)
 	assert_rpm_wakelock_held(vma->vm->i915);
 
 	/* Just update our place in the LRU if our fence is getting reused. */
-	if (vma->fence) {
-		fence = vma->fence;
+	fence = vma->fence;
+	if (fence) {
 		GEM_BUG_ON(fence->vma != vma);
-		fence->pin_count++;
+		atomic_inc(&fence->pin_count);
 		if (!fence->dirty) {
+			err = i915_gem_active_retire(&fence->pipelined,
+						     &fence->i915->drm.struct_mutex);
+			if (err)
+				goto out_unpin;
+
 			list_move_tail(&fence->link,
 				       &fence->i915->mm.fence_list);
 			return 0;
@@ -358,8 +416,8 @@ i915_vma_pin_fence(struct i915_vma *vma)
 		if (IS_ERR(fence))
 			return PTR_ERR(fence);
 
-		GEM_BUG_ON(fence->pin_count);
-		fence->pin_count++;
+		GEM_BUG_ON(atomic_read(&fence->pin_count));
+		atomic_inc(&fence->pin_count);
 	} else
 		return 0;
 
@@ -374,10 +432,97 @@ i915_vma_pin_fence(struct i915_vma *vma)
 		return 0;
 
 out_unpin:
-	fence->pin_count--;
+	atomic_dec(&fence->pin_count);
 	return err;
 }
 
+int i915_vma_reserve_fence(struct i915_vma *vma)
+{
+	struct drm_i915_fence_reg *fence;
+
+	lockdep_assert_held(&vma->vm->i915->drm.struct_mutex);
+	GEM_BUG_ON(!drm_mm_node_allocated(&vma->node));
+	GEM_BUG_ON(!i915_vma_is_pinned(vma));
+
+	fence = vma->fence;
+	if (!fence) {
+		if (!i915_gem_object_is_tiled(vma->obj))
+			return 0;
+
+		if (!i915_vma_is_map_and_fenceable(vma))
+			return -EINVAL;
+
+		fence = fence_find(vma->vm->i915);
+		if (IS_ERR(fence))
+			return PTR_ERR(fence);
+
+		vma->fence = fence;
+
+		if (fence->vma) {
+			i915_gem_release_mmap(fence->vma->obj);
+			fence->vma->fence = NULL;
+		}
+		fence->vma = vma;
+		fence->dirty = true;
+	}
+	atomic_inc(&fence->pin_count);
+	list_move_tail(&fence->link, &fence->i915->mm.fence_list);
+
+	GEM_BUG_ON(!i915_gem_object_is_tiled(vma->obj));
+	GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+	GEM_BUG_ON(vma->node.size != vma->fence_size);
+	GEM_BUG_ON(!IS_ALIGNED(vma->node.start, vma->fence_alignment));
+
+	return 0;
+}
+
+int i915_vma_emit_pipelined_fence(struct i915_vma *vma,
+				  struct drm_i915_gem_request *rq)
+{
+	struct drm_i915_fence_reg *fence = vma->fence;
+	struct drm_i915_gem_request *prev;
+	int err;
+
+	lockdep_assert_held(&vma->vm->i915->drm.struct_mutex);
+	GEM_BUG_ON(fence && !atomic_read(&fence->pin_count));
+
+	if (!fence)
+		goto out;
+
+	prev = i915_gem_active_raw(&fence->pipelined,
+				   &fence->i915->drm.struct_mutex);
+	if (prev) {
+		err = i915_gem_request_await_dma_fence(rq, &prev->fence);
+		if (err)
+			return err;
+	}
+
+	if (!fence->dirty)
+		goto out;
+
+	GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+
+	if (fence->vma) {
+		prev = i915_gem_active_raw(&fence->vma->last_fence,
+					   &fence->i915->drm.struct_mutex);
+		if (prev) {
+			err = i915_gem_request_await_dma_fence(rq,
+							       &prev->fence);
+			if (err)
+				return err;
+		}
+	}
+
+	err = fence_write(fence, vma, rq);
+	if (err)
+		return err;
+
+	i915_gem_active_set(&fence->pipelined, rq);
+out:
+	i915_gem_active_set(&vma->last_fence, rq);
+	return 0;
+}
+
 /**
  * i915_reserve_fence - Reserve a fence for vGPU
  * @dev_priv: i915 device private
@@ -397,7 +542,7 @@ i915_reserve_fence(struct drm_i915_private *dev_priv)
 	/* Keep at least one fence available for the display engine. */
 	count = 0;
 	list_for_each_entry(fence, &dev_priv->mm.fence_list, link)
-		count += !fence->pin_count;
+		count += !atomic_read(&fence->pin_count);
 	if (count <= 1)
 		return ERR_PTR(-ENOSPC);
 
@@ -479,6 +624,7 @@ void i915_gem_restore_fences(struct drm_i915_private *dev_priv)
 		 */
 		if (vma && !i915_gem_object_is_tiled(vma->obj)) {
 			GEM_BUG_ON(!reg->dirty);
+			GEM_BUG_ON(atomic_read(&reg->pin_count));
 			GEM_BUG_ON(!list_empty(&vma->obj->userfault_link));
 
 			list_move(&reg->link, &dev_priv->mm.fence_list);
@@ -486,7 +632,7 @@ void i915_gem_restore_fences(struct drm_i915_private *dev_priv)
 			vma = NULL;
 		}
 
-		fence_write(reg, vma);
+		fence_write(reg, vma, NULL);
 		reg->vma = vma;
 	}
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.h b/drivers/gpu/drm/i915/i915_gem_fence_reg.h
index 99a31ded4dfd..f69d894997fb 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.h
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.h
@@ -36,7 +36,7 @@ struct drm_i915_fence_reg {
 	struct list_head link;
 	struct drm_i915_private *i915;
 	struct i915_vma *vma;
-	int pin_count;
+	atomic_t pin_count;
 	int id;
 	/**
 	 * Whether the tiling parameters for the currently
@@ -47,6 +47,7 @@ struct drm_i915_fence_reg {
 	 * command (such as BLT on gen2/3), as a "fence".
 	 */
 	bool dirty;
+	struct i915_gem_active pipelined;
 };
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index c7d5604b4489..34bb50582563 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -706,6 +706,7 @@ int i915_vma_unbind(struct i915_vma *vma)
 		__i915_vma_iounmap(vma);
 		vma->flags &= ~I915_VMA_CAN_FENCE;
 	}
+	GEM_BUG_ON(vma->fence);
 
 	if (likely(!vma->vm->closed)) {
 		trace_i915_vma_unbind(vma);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index ab4a3c0b1526..1c0f9def81ab 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -349,8 +349,8 @@ int __must_check i915_vma_put_fence(struct i915_vma *vma);
 
 static inline void __i915_vma_unpin_fence(struct i915_vma *vma)
 {
-	GEM_BUG_ON(vma->fence->pin_count <= 0);
-	vma->fence->pin_count--;
+	GEM_BUG_ON(atomic_read(&vma->fence->pin_count) <= 0);
+	atomic_dec(&vma->fence->pin_count);
 }
 
 /**
@@ -364,10 +364,17 @@ static inline void __i915_vma_unpin_fence(struct i915_vma *vma)
 static inline void
 i915_vma_unpin_fence(struct i915_vma *vma)
 {
-	lockdep_assert_held(&vma->obj->base.dev->struct_mutex);
+	/* The assumption is if the caller has a fence, the caller owns a pin
+	 * on that fence, i.e. that vma->fence cannot become NULL prior to us
+	 * releasing our pin.
+	 */
 	if (vma->fence)
 		__i915_vma_unpin_fence(vma);
 }
 
+int __must_check i915_vma_reserve_fence(struct i915_vma *vma);
+int i915_vma_emit_pipelined_fence(struct i915_vma *vma,
+				  struct drm_i915_gem_request *rq);
+
 #endif
 
-- 
2.14.1


* [PATCH 09/11] drm/i915: Track user GTT faulting per-vma
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (7 preceding siblings ...)
  2017-09-11  8:41 ` [PATCH 08/11] drm/i915: Emit pipelined fence changes Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-11  8:41 ` [PATCH 10/11] drm/i915: Check PIN_NONFAULT overlaps in evict_for_node Chris Wilson
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

We don't wish to refault the entire object (i.e. all of its other vma)
when unbinding one partial vma. To avoid that, track which vma have been
faulted into the user's address space.
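A hedged sketch of the bookkeeping (illustrative names; the real code keeps a flag on the vma and an obj->userfault_count guarding membership of the global mm.userfault_list): each vma records whether it has been faulted into userspace, and the object joins or leaves the list only on the 0 <-> 1 transitions of its count.

```c
#include <assert.h>
#include <stdbool.h>

struct uf_obj {
	int userfault_count;
	bool on_userfault_list;	/* stands in for list membership */
};

struct uf_vma {
	struct uf_obj *obj;
	bool userfault;		/* has this vma been faulted in? */
};

/* Mark a vma as mapped into userspace; bump the object's count and add
 * the object to the global list only on the first faulted vma. */
static void uf_vma_set(struct uf_vma *vma)
{
	if (vma->userfault)
		return;
	vma->userfault = true;
	if (!vma->obj->userfault_count++)
		vma->obj->on_userfault_list = true;
}

/* Revoke one vma's mapping; the object only leaves the list once no vma
 * remains faulted, so unbinding one partial vma leaves the rest alone. */
static void uf_vma_unset(struct uf_vma *vma)
{
	if (!vma->userfault)
		return;
	vma->userfault = false;
	assert(vma->obj->userfault_count > 0);
	if (!--vma->obj->userfault_count)
		vma->obj->on_userfault_list = false;
}
```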

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c       |  2 +-
 drivers/gpu/drm/i915/i915_gem.c           | 50 +++++++++++++++++++++----------
 drivers/gpu/drm/i915/i915_gem_evict.c     |  2 +-
 drivers/gpu/drm/i915/i915_gem_fence_reg.c | 10 ++++---
 drivers/gpu/drm/i915/i915_gem_object.h    |  1 +
 drivers/gpu/drm/i915/i915_vma.c           | 26 +++++++++++++++-
 drivers/gpu/drm/i915/i915_vma.h           | 21 ++++++++++++-
 7 files changed, 88 insertions(+), 24 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index c5602e2e9ec3..cc1d9bec3cbf 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -96,7 +96,7 @@ static char get_tiling_flag(struct drm_i915_gem_object *obj)
 
 static char get_global_flag(struct drm_i915_gem_object *obj)
 {
-	return !list_empty(&obj->userfault_link) ? 'g' : ' ';
+	return obj->userfault_count ? 'g' : ' ';
 }
 
 static char get_pin_mapped_flag(struct drm_i915_gem_object *obj)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 32ecf5d9436b..ebc15ae3e3a3 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1916,18 +1916,22 @@ int i915_gem_fault(struct vm_fault *vmf)
 	if (ret)
 		goto err_unpin;
 
-	/* Mark as being mmapped into userspace for later revocation */
-	assert_rpm_wakelock_held(dev_priv);
-	if (list_empty(&obj->userfault_link))
-		list_add(&obj->userfault_link, &dev_priv->mm.userfault_list);
-
 	/* Finally, remap it using the new GTT offset */
 	ret = remap_io_mapping(area,
 			       area->vm_start + (vma->ggtt_view.partial.offset << PAGE_SHIFT),
 			       (ggtt->mappable_base + vma->node.start) >> PAGE_SHIFT,
 			       min_t(u64, vma->size, area->vm_end - area->vm_start),
 			       &ggtt->mappable);
+	if (ret)
+		goto err_fence;
 
+	/* Mark as being mmapped into userspace for later revocation */
+	assert_rpm_wakelock_held(dev_priv);
+	if (!i915_vma_set_userfault(vma) && !obj->userfault_count++)
+		list_add(&obj->userfault_link, &dev_priv->mm.userfault_list);
+	GEM_BUG_ON(!obj->userfault_count);
+
+err_fence:
 	i915_vma_unpin_fence(vma);
 err_unpin:
 	__i915_vma_unpin(vma);
@@ -1980,6 +1984,25 @@ int i915_gem_fault(struct vm_fault *vmf)
 	return ret;
 }
 
+static void __i915_gem_object_release_mmap(struct drm_i915_gem_object *obj)
+{
+	struct i915_vma *vma;
+
+	GEM_BUG_ON(!obj->userfault_count);
+
+	obj->userfault_count = 0;
+	list_del(&obj->userfault_link);
+	drm_vma_node_unmap(&obj->base.vma_node,
+			   obj->base.dev->anon_inode->i_mapping);
+
+	list_for_each_entry(vma, &obj->vma_list, obj_link) {
+		if (!i915_vma_is_ggtt(vma))
+			break;
+
+		i915_vma_unset_userfault(vma);
+	}
+}
+
 /**
  * i915_gem_release_mmap - remove physical page mappings
  * @obj: obj in question
@@ -2010,12 +2033,10 @@ i915_gem_release_mmap(struct drm_i915_gem_object *obj)
 	lockdep_assert_held(&i915->drm.struct_mutex);
 	intel_runtime_pm_get(i915);
 
-	if (list_empty(&obj->userfault_link))
+	if (!obj->userfault_count)
 		goto out;
 
-	list_del_init(&obj->userfault_link);
-	drm_vma_node_unmap(&obj->base.vma_node,
-			   obj->base.dev->anon_inode->i_mapping);
+	__i915_gem_object_release_mmap(obj);
 
 	/* Ensure that the CPU's PTE are revoked and there are not outstanding
 	 * memory transactions from userspace before we return. The TLB
@@ -2043,11 +2064,8 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv)
 	 */
 
 	list_for_each_entry_safe(obj, on,
-				 &dev_priv->mm.userfault_list, userfault_link) {
-		list_del_init(&obj->userfault_link);
-		drm_vma_node_unmap(&obj->base.vma_node,
-				   obj->base.dev->anon_inode->i_mapping);
-	}
+				 &dev_priv->mm.userfault_list, userfault_link)
+		__i915_gem_object_release_mmap(obj);
 
 	/* The fence will be lost when the device powers down. If any were
 	 * in use by hardware (i.e. they are pinned), we should not be powering
@@ -2070,7 +2088,7 @@ void i915_gem_runtime_suspend(struct drm_i915_private *dev_priv)
 		if (!reg->vma)
 			continue;
 
-		GEM_BUG_ON(!list_empty(&reg->vma->obj->userfault_link));
+		GEM_BUG_ON(i915_vma_has_userfault(reg->vma));
 		reg->dirty = true;
 	}
 }
@@ -4321,7 +4339,6 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	mutex_init(&obj->mm.lock);
 
 	INIT_LIST_HEAD(&obj->global_link);
-	INIT_LIST_HEAD(&obj->userfault_link);
 	INIT_LIST_HEAD(&obj->mm.gtt_wakeref_link);
 	INIT_LIST_HEAD(&obj->vma_list);
 	INIT_LIST_HEAD(&obj->lut_list);
@@ -4481,6 +4498,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 
 	llist_for_each_entry_safe(obj, on, freed, freed) {
 		GEM_BUG_ON(obj->bind_count);
+		GEM_BUG_ON(obj->userfault_count);
 		GEM_BUG_ON(atomic_read(&obj->frontbuffer_bits));
 		GEM_BUG_ON(!list_empty(&obj->lut_list));
 
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 4df039ef2ce3..933ee8ecfa54 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -82,7 +82,7 @@ mark_free(struct drm_mm_scan *scan,
 	if (i915_vma_is_pinned(vma))
 		return false;
 
-	if (flags & PIN_NONFAULT && !list_empty(&vma->obj->userfault_link))
+	if (flags & PIN_NONFAULT && i915_vma_has_userfault(vma))
 		return false;
 
 	list_add(&vma->evict_link, unwind);
diff --git a/drivers/gpu/drm/i915/i915_gem_fence_reg.c b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
index be5b63547105..6e695426ec1f 100644
--- a/drivers/gpu/drm/i915/i915_gem_fence_reg.c
+++ b/drivers/gpu/drm/i915/i915_gem_fence_reg.c
@@ -291,7 +291,8 @@ static int fence_update(struct drm_i915_fence_reg *fence,
 		/* Ensure that all userspace CPU access is completed before
 		 * stealing the fence.
 		 */
-		i915_gem_release_mmap(fence->vma->obj);
+		GEM_BUG_ON(fence->vma->fence != fence);
+		i915_vma_revoke_mmap(fence->vma);
 
 		fence->vma->fence = NULL;
 		fence->vma = NULL;
@@ -459,7 +460,8 @@ int i915_vma_reserve_fence(struct i915_vma *vma)
 		vma->fence = fence;
 
 		if (fence->vma) {
-			i915_gem_release_mmap(fence->vma->obj);
+			GEM_BUG_ON(fence->vma->fence != fence);
+			i915_vma_revoke_mmap(fence->vma);
 			fence->vma->fence = NULL;
 		}
 		fence->vma = vma;
@@ -596,7 +598,7 @@ void i915_gem_revoke_fences(struct drm_i915_private *dev_priv)
 		GEM_BUG_ON(fence->vma && fence->vma->fence != fence);
 
 		if (fence->vma)
-			i915_gem_release_mmap(fence->vma->obj);
+			i915_vma_revoke_mmap(fence->vma);
 	}
 }
 
@@ -625,7 +627,7 @@ void i915_gem_restore_fences(struct drm_i915_private *dev_priv)
 		if (vma && !i915_gem_object_is_tiled(vma->obj)) {
 			GEM_BUG_ON(!reg->dirty);
 			GEM_BUG_ON(atomic_read(&reg->pin_count));
-			GEM_BUG_ON(!list_empty(&vma->obj->userfault_link));
+			GEM_BUG_ON(i915_vma_has_userfault(vma));
 
 			list_move(&reg->link, &dev_priv->mm.fence_list);
 			vma->fence = NULL;
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 5797b75a804b..26648355d92b 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -123,6 +123,7 @@ struct drm_i915_gem_object {
 	/**
 	 * Whether the object is currently in the GGTT mmap.
 	 */
+	unsigned int userfault_count;
 	struct list_head userfault_link;
 
 	struct list_head batch_pool_link;
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 34bb50582563..220a7064f466 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -638,6 +638,29 @@ static void __i915_vma_iounmap(struct i915_vma *vma)
 	vma->iomap = NULL;
 }
 
+void i915_vma_revoke_mmap(struct i915_vma *vma)
+{
+	struct drm_vma_offset_node *node = &vma->obj->base.vma_node;
+
+	lockdep_assert_held(&vma->vm->i915->drm.struct_mutex);
+
+	if (!i915_vma_has_userfault(vma))
+		return;
+
+	GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+	GEM_BUG_ON(!vma->obj->userfault_count);
+
+	unmap_mapping_range(vma->vm->i915->drm.anon_inode->i_mapping,
+			    drm_vma_node_offset_addr(node) +
+			    (vma->ggtt_view.partial.offset << PAGE_SHIFT),
+			    vma->size,
+			    1);
+
+	i915_vma_unset_userfault(vma);
+	if (!--vma->obj->userfault_count)
+		list_del(&vma->obj->userfault_link);
+}
+
 int i915_vma_unbind(struct i915_vma *vma)
 {
 	struct drm_i915_gem_object *obj = vma->obj;
@@ -701,12 +724,13 @@ int i915_vma_unbind(struct i915_vma *vma)
 			return ret;
 
 		/* Force a pagefault for domain tracking on next user access */
-		i915_gem_release_mmap(obj);
+		i915_vma_revoke_mmap(vma);
 
 		__i915_vma_iounmap(vma);
 		vma->flags &= ~I915_VMA_CAN_FENCE;
 	}
 	GEM_BUG_ON(vma->fence);
+	GEM_BUG_ON(i915_vma_has_userfault(vma));
 
 	if (likely(!vma->vm->closed)) {
 		trace_i915_vma_unbind(vma);
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 1c0f9def81ab..21cecf492568 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -65,7 +65,7 @@ struct i915_vma {
 	 * that exist in the ctx->handle_vmas LUT for this vma.
 	 */
 	unsigned int open_count;
-	unsigned int flags;
+	unsigned long flags;
 	/**
 	 * How many users have pinned this object in GTT space. The following
 	 * users can each hold at most one reference: pwrite/pread, execbuffer
@@ -87,6 +87,8 @@ struct i915_vma {
 #define I915_VMA_GGTT		BIT(8)
 #define I915_VMA_CAN_FENCE	BIT(9)
 #define I915_VMA_CLOSED		BIT(10)
+#define I915_VMA_USERFAULT_BIT	11
+#define I915_VMA_USERFAULT	BIT(I915_VMA_USERFAULT_BIT)
 
 	unsigned int active;
 	struct i915_gem_active last_read[I915_NUM_ENGINES];
@@ -145,6 +147,22 @@ static inline bool i915_vma_is_closed(const struct i915_vma *vma)
 	return vma->flags & I915_VMA_CLOSED;
 }
 
+static inline bool i915_vma_set_userfault(struct i915_vma *vma)
+{
+	GEM_BUG_ON(!i915_vma_is_map_and_fenceable(vma));
+	return __test_and_set_bit(I915_VMA_USERFAULT_BIT, &vma->flags);
+}
+
+static inline void i915_vma_unset_userfault(struct i915_vma *vma)
+{
+	return __clear_bit(I915_VMA_USERFAULT_BIT, &vma->flags);
+}
+
+static inline bool i915_vma_has_userfault(const struct i915_vma *vma)
+{
+	return test_bit(I915_VMA_USERFAULT_BIT, &vma->flags);
+}
+
 static inline unsigned int i915_vma_get_active(const struct i915_vma *vma)
 {
 	return vma->active;
@@ -243,6 +261,7 @@ bool i915_gem_valid_gtt_space(struct i915_vma *vma, unsigned long cache_level);
 bool i915_vma_misplaced(const struct i915_vma *vma,
 			u64 size, u64 alignment, u64 flags);
 void __i915_vma_set_map_and_fenceable(struct i915_vma *vma);
+void i915_vma_revoke_mmap(struct i915_vma *vma);
 int __must_check i915_vma_unbind(struct i915_vma *vma);
 void i915_vma_unlink_ctx(struct i915_vma *vma);
 void i915_vma_close(struct i915_vma *vma);
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* [PATCH 10/11] drm/i915: Check PIN_NONFAULT overlaps in evict_for_node
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (8 preceding siblings ...)
  2017-09-11  8:41 ` [PATCH 09/11] drm/i915: Track user GTT faulting per-vma Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-19 12:43   ` Chris Wilson
  2017-09-11  8:41 ` [PATCH 11/11] drm/i915: Try a minimal attempt to insert the whole object for relocations Chris Wilson
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

If the caller indicates that it does not want to evict any other faulting
vma, honour that flag. The check was already applied in evict_something,
but not in the more specific evict_for_node, which has been used as a
preliminary probe since commit 606fec956c0e ("drm/i915: Prefer random
replacement before eviction search").

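The behaviour described above can be modelled in isolation. The following is a minimal, self-contained sketch of an eviction scan honouring a no-fault flag; the names, flag values, and `fake_vma` structure are illustrative stand-ins, not the real i915 definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for the real i915 flag and error values. */
#define PIN_NONFAULT (1u << 0)
#define ENOSPC_ERR   (-28)

struct fake_vma {
	int has_userfault; /* stands in for i915_vma_has_userfault() */
	int pinned;        /* stands in for i915_vma_is_pinned() */
};

/* Walk the vmas overlapping a node; refuse to evict one that is
 * currently mmapped into userspace when the caller asked PIN_NONFAULT. */
static int evict_for_node_sketch(const struct fake_vma *vmas, size_t n,
				 unsigned int flags)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if ((flags & PIN_NONFAULT) && vmas[i].has_userfault)
			return ENOSPC_ERR; /* honour the caller's flag */
		if (vmas[i].pinned)
			return ENOSPC_ERR; /* overlap within the same batch */
	}
	return 0; /* all overlapping vmas may be evicted */
}
```

With PIN_NONFAULT set, the scan bails out as soon as it meets a faulted vma; without the flag, the same vma is considered evictable.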
Fixes: 606fec956c0e ("drm/i915: Prefer random replacement before eviction search")
Fixes: 821188778b9b ("drm/i915: Choose not to evict faultable objects from the GGTT")
References: https://patchwork.freedesktop.org/patch/174781/
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_evict.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index 933ee8ecfa54..a5a5b7e6daae 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -315,6 +315,11 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
 			break;
 		}
 
+		if (flags & PIN_NONFAULT && i915_vma_has_userfault(vma)) {
+			ret = -ENOSPC;
+			break;
+		}
+
 		/* Overlap of objects in the same batch? */
 		if (i915_vma_is_pinned(vma)) {
 			ret = -ENOSPC;
-- 
2.14.1


* [PATCH 11/11] drm/i915: Try a minimal attempt to insert the whole object for relocations
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (9 preceding siblings ...)
  2017-09-11  8:41 ` [PATCH 10/11] drm/i915: Check PIN_NONFAULT overlaps in evict_for_node Chris Wilson
@ 2017-09-11  8:41 ` Chris Wilson
  2017-09-11  9:16 ` ✗ Fi.CI.BAT: warning for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM Patchwork
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Chris Wilson @ 2017-09-11  8:41 UTC (permalink / raw)
  To: intel-gfx

As we have a lightweight fallback to insert a single page into the
aperture, try to avoid any heavier evictions when attempting to insert
the entire object.

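The two-tier strategy above can be sketched as a standalone pattern: first try to pin the whole object with strict flags (non-blocking, not evicting faultable neighbours), and fall back to the lightweight single-page path on failure. All names and the contention model below are hypothetical, not the real i915 API:

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins for the real i915 pin flags. */
#define PIN_MAPPABLE (1u << 0)
#define PIN_NONBLOCK (1u << 1)
#define PIN_NONFAULT (1u << 2)

/* A contended aperture makes the strict, non-evicting pin fail. */
static int try_pin_whole(unsigned int flags, int aperture_is_contended)
{
	if (aperture_is_contended && (flags & (PIN_NONBLOCK | PIN_NONFAULT)))
		return -ENOSPC;
	return 0;
}

static int pin_single_page(void)
{
	return 0; /* the lightweight fallback essentially always succeeds */
}

/* Returns 0; *whole reports whether the full object was pinned. */
static int reloc_pin_sketch(int aperture_is_contended, int *whole)
{
	int err = try_pin_whole(PIN_MAPPABLE | PIN_NONBLOCK | PIN_NONFAULT,
				aperture_is_contended);

	*whole = (err == 0);
	return err ? pin_single_page() : 0;
}
```

The design choice here is that the strict attempt is allowed to fail cheaply: adding PIN_NONFAULT means the whole-object pin never triggers a heavy eviction of another mmapped object, because the single-page path is always available.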
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index d3acbd15ce6d..d92a6e2ecbad 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -978,7 +978,9 @@ static void *reloc_iomap(struct drm_i915_gem_object *obj,
 			return ERR_PTR(err);
 
 		vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0,
-					       PIN_MAPPABLE | PIN_NONBLOCK);
+					       PIN_MAPPABLE |
+					       PIN_NONBLOCK |
+					       PIN_NONFAULT);
 		if (IS_ERR(vma)) {
 			memset(&cache->node, 0, sizeof(cache->node));
 			err = drm_mm_insert_node_in_range
-- 
2.14.1


* ✗ Fi.CI.BAT: warning for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (10 preceding siblings ...)
  2017-09-11  8:41 ` [PATCH 11/11] drm/i915: Try a minimal attempt to insert the whole object for relocations Chris Wilson
@ 2017-09-11  9:16 ` Patchwork
  2017-09-11 13:16 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Patchwork @ 2017-09-11  9:16 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
URL   : https://patchwork.freedesktop.org/series/30108/
State : warning

== Summary ==

Series 30108v1 series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
https://patchwork.freedesktop.org/api/1.0/series/30108/revisions/1/mbox/

Test kms_cursor_legacy:
        Subgroup basic-busy-flip-before-cursor-atomic:
                pass       -> FAIL       (fi-snb-2600) fdo#100215 +1
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                pass       -> SKIP       (fi-skl-x1585l) fdo#101781
Test kms_pipe_crc_basic:
        Subgroup suspend-read-crc-pipe-b:
                dmesg-warn -> PASS       (fi-byt-n2820) fdo#101705
Test pm_rpm:
        Subgroup basic-pci-d3-state:
                pass       -> DMESG-WARN (fi-cfl-s) fdo#102294
        Subgroup basic-rte:
                pass       -> DMESG-WARN (fi-cfl-s)

fdo#100215 https://bugs.freedesktop.org/show_bug.cgi?id=100215
fdo#101781 https://bugs.freedesktop.org/show_bug.cgi?id=101781
fdo#101705 https://bugs.freedesktop.org/show_bug.cgi?id=101705
fdo#102294 https://bugs.freedesktop.org/show_bug.cgi?id=102294

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:456s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:455s
fi-blb-e6850     total:289  pass:224  dwarn:1   dfail:0   fail:0   skip:64  time:379s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:538s
fi-bwr-2160      total:289  pass:184  dwarn:0   dfail:0   fail:0   skip:105 time:268s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:513s
fi-byt-j1900     total:289  pass:254  dwarn:1   dfail:0   fail:0   skip:34  time:505s
fi-byt-n2820     total:289  pass:251  dwarn:0   dfail:0   fail:0   skip:38  time:500s
fi-cfl-s         total:289  pass:248  dwarn:6   dfail:0   fail:0   skip:35  time:472s
fi-elk-e7500     total:289  pass:230  dwarn:0   dfail:0   fail:0   skip:59  time:448s
fi-glk-2a        total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:599s
fi-hsw-4770      total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:430s
fi-hsw-4770r     total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:413s
fi-ilk-650       total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:439s
fi-ivb-3520m     total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:476s
fi-ivb-3770      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:460s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:499s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:581s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:582s
fi-pnv-d510      total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:554s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:458s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:527s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:508s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:456s
fi-skl-x1585l    total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:480s
fi-snb-2520m     total:289  pass:251  dwarn:0   dfail:0   fail:0   skip:38  time:567s
fi-snb-2600      total:289  pass:248  dwarn:0   dfail:0   fail:2   skip:39  time:422s

9ef5732ddf6924e98a5c5d034a3535cff14f72bf drm-tip: 2017y-09m-10d-21h-59m-53s UTC integration manifest
8d8ecd3be23a drm/i915: Try a minimal attempt to insert the whole object for relocations
721ab4bca347 drm/i915: Check PIN_NONFAULT overlaps in evict_for_node
7eed8c4b5ec1 drm/i915: Track user GTT faulting per-vma
89f8f7a73e8a drm/i915: Emit pipelined fence changes
6693ae534e3b drm/i915: Consolidate get_fence with pin_fence
361f2087af47 drm/i915: Pin fence for iomap
35662d758163 drm/i915: Keep the device awake whilst in the GTT domain
a4ad75757fb2 drm/i915: Try harder to finish the idle-worker
fcc760c7170d drm/i915: Include fence-hint for timeout warning
3b52099b8042 drm/i915/fence: Avoid del_timer_sync() from inside a timer
8323891cbc32 drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5635/

* Re: [PATCH 01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
  2017-09-11  8:41 ` [PATCH 01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM Chris Wilson
@ 2017-09-11 11:33   ` Michał Winiarski
  2017-09-15  9:19     ` Chris Wilson
  0 siblings, 1 reply; 29+ messages in thread
From: Michał Winiarski @ 2017-09-11 11:33 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Mon, Sep 11, 2017 at 09:41:25AM +0100, Chris Wilson wrote:
> To silence the critics:
> 
> [56532.161115] workqueue: PF_MEMALLOC task 36(khugepaged) is flushing !WQ_MEM_RECLAIM i915-userptr-release:          (null)
> [56532.161138] ------------[ cut here ]------------
> [56532.161144] WARNING: CPU: 1 PID: 36 at kernel/workqueue.c:2418 check_flush_dependency+0xe8/0xf0
> [56532.161145] Modules linked in: wmi_bmof
> [56532.161148] CPU: 1 PID: 36 Comm: khugepaged Not tainted 4.13.0-krejzi #1
> [56532.161149] Hardware name: HP HP ProBook 470 G3/8102, BIOS N78 Ver. 01.17 06/08/2017
> [56532.161150] task: ffff8802371ee200 task.stack: ffffc90000174000
> [56532.161152] RIP: 0010:check_flush_dependency+0xe8/0xf0
> [56532.161152] RSP: 0018:ffffc900001777b8 EFLAGS: 00010286
> [56532.161153] RAX: 000000000000006c RBX: ffff88022fc5a000 RCX: 0000000000000001
> [56532.161154] RDX: 0000000000000000 RSI: 0000000000000086 RDI: 00000000ffffffff
> [56532.161155] RBP: 0000000000000000 R08: 14f038bb55f6dae0 R09: 0000000000000516
> [56532.161155] R10: ffffc900001778a0 R11: 000000006c756e28 R12: ffff8802371ee200
> [56532.161156] R13: 0000000000000000 R14: 000000000000000b R15: ffffc90000177810
> [56532.161157] FS:  0000000000000000(0000) GS:ffff880240480000(0000) knlGS:0000000000000000
> [56532.161158] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [56532.161158] CR2: 0000000004795ff8 CR3: 000000000220a000 CR4: 00000000003406e0
> [56532.161159] Call Trace:
> [56532.161161]  ? flush_workqueue+0x136/0x3e0
> [56532.161178]  ? _raw_spin_unlock_irqrestore+0xf/0x30
> [56532.161179]  ? try_to_wake_up+0x1ce/0x3b0
> [56532.161183]  ? i915_gem_userptr_mn_invalidate_range_start+0x13f/0x150
> [56532.161184]  ? _raw_spin_unlock+0xd/0x20
> [56532.161186]  ? i915_gem_userptr_mn_invalidate_range_start+0x13f/0x150
> [56532.161189]  ? __mmu_notifier_invalidate_range_start+0x4a/0x70
> [56532.161191]  ? try_to_unmap_one+0x5e5/0x660
> [56532.161193]  ? rmap_walk_file+0xe4/0x240
> [56532.161195]  ? __ClearPageMovable+0x10/0x10
> [56532.161196]  ? try_to_unmap+0x8c/0xe0
> [56532.161197]  ? page_remove_rmap+0x280/0x280
> [56532.161199]  ? page_not_mapped+0x10/0x10
> [56532.161200]  ? page_get_anon_vma+0x90/0x90
> [56532.161202]  ? migrate_pages+0x6a5/0x940
> [56532.161203]  ? isolate_freepages_block+0x330/0x330
> [56532.161205]  ? compact_zone+0x593/0x6a0
> [56532.161206]  ? enqueue_task_fair+0xc3/0x1180
> [56532.161208]  ? compact_zone_order+0x9b/0xc0
> [56532.161210]  ? get_page_from_freelist+0x24a/0x900
> [56532.161212]  ? try_to_compact_pages+0xc8/0x240
> [56532.161213]  ? try_to_compact_pages+0xc8/0x240
> [56532.161215]  ? __alloc_pages_direct_compact+0x45/0xe0
> [56532.161216]  ? __alloc_pages_slowpath+0x845/0xb90
> [56532.161218]  ? __alloc_pages_nodemask+0x176/0x1f0
> [56532.161220]  ? wait_woken+0x80/0x80
> [56532.161222]  ? khugepaged+0x29e/0x17d0
> [56532.161223]  ? wait_woken+0x80/0x80
> [56532.161225]  ? collapse_shmem.isra.39+0xa60/0xa60
> [56532.161226]  ? kthread+0x10d/0x130
> [56532.161227]  ? kthread_create_on_node+0x60/0x60
> [56532.161228]  ? ret_from_fork+0x22/0x30
> [56532.161229] Code: 00 8b b0 10 05 00 00 48 8d 8b b0 00 00 00 48 8d 90 b8 06 00 00 49 89 e8 48 c7 c7 38 55 09 82 c6 05 f9 c6 1d 01 01 e8 0e a1 03 00 <0f> ff e9 6b ff ff ff 90 48 8b 37 40 f6 c6 04 75 1b 48 c1 ee 05
> [56532.161251] ---[ end trace 2ce2b4f5f69b803b ]---
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

-Michał

> ---
>  drivers/gpu/drm/i915/i915_gem_userptr.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
> index 2ce078e8c763..15547c8f7d3f 100644
> --- a/drivers/gpu/drm/i915/i915_gem_userptr.c
> +++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
> @@ -813,7 +813,9 @@ int i915_gem_init_userptr(struct drm_i915_private *dev_priv)
>  	hash_init(dev_priv->mm_structs);
>  
>  	dev_priv->mm.userptr_wq =
> -		alloc_workqueue("i915-userptr-acquire", WQ_HIGHPRI, 0);
> +		alloc_workqueue("i915-userptr-acquire",
> +				WQ_HIGHPRI | WQ_MEM_RECLAIM,
> +				0);
>  	if (!dev_priv->mm.userptr_wq)
>  		return -ENOMEM;
>  
> -- 
> 2.14.1
> 

* ✓ Fi.CI.BAT: success for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (11 preceding siblings ...)
  2017-09-11  9:16 ` ✗ Fi.CI.BAT: warning for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM Patchwork
@ 2017-09-11 13:16 ` Patchwork
  2017-09-11 16:02 ` ✗ Fi.CI.IGT: warning " Patchwork
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 29+ messages in thread
From: Patchwork @ 2017-09-11 13:16 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
URL   : https://patchwork.freedesktop.org/series/30108/
State : success

== Summary ==

Series 30108v1 series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
https://patchwork.freedesktop.org/api/1.0/series/30108/revisions/1/mbox/

Test chamelium:
        Subgroup dp-crc-fast:
                pass       -> DMESG-FAIL (fi-kbl-7500u) fdo#102514
Test kms_cursor_legacy:
        Subgroup basic-busy-flip-before-cursor-atomic:
                pass       -> FAIL       (fi-snb-2600) fdo#100215 +1
Test kms_flip:
        Subgroup basic-flip-vs-modeset:
                pass       -> SKIP       (fi-skl-x1585l) fdo#101781
Test pm_rpm:
        Subgroup basic-pci-d3-state:
                pass       -> DMESG-WARN (fi-cfl-s) fdo#102294 +1

fdo#102514 https://bugs.freedesktop.org/show_bug.cgi?id=102514
fdo#100215 https://bugs.freedesktop.org/show_bug.cgi?id=100215
fdo#101781 https://bugs.freedesktop.org/show_bug.cgi?id=101781
fdo#102294 https://bugs.freedesktop.org/show_bug.cgi?id=102294

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:452s
fi-blb-e6850     total:289  pass:224  dwarn:1   dfail:0   fail:0   skip:64  time:375s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:541s
fi-bwr-2160      total:289  pass:184  dwarn:0   dfail:0   fail:0   skip:105 time:271s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:516s
fi-byt-j1900     total:289  pass:254  dwarn:1   dfail:0   fail:0   skip:34  time:506s
fi-byt-n2820     total:289  pass:250  dwarn:1   dfail:0   fail:0   skip:38  time:493s
fi-cfl-s         total:289  pass:248  dwarn:6   dfail:0   fail:0   skip:35  time:460s
fi-elk-e7500     total:289  pass:230  dwarn:0   dfail:0   fail:0   skip:59  time:456s
fi-glk-2a        total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:600s
fi-hsw-4770      total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:440s
fi-hsw-4770r     total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:411s
fi-ilk-650       total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:438s
fi-ivb-3520m     total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:478s
fi-ivb-3770      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:464s
fi-kbl-7500u     total:289  pass:263  dwarn:1   dfail:1   fail:0   skip:24  time:501s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:577s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:582s
fi-pnv-d510      total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:550s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:467s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:531s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:514s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:457s
fi-skl-x1585l    total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:475s
fi-snb-2520m     total:289  pass:251  dwarn:0   dfail:0   fail:0   skip:38  time:566s
fi-snb-2600      total:289  pass:248  dwarn:0   dfail:0   fail:2   skip:39  time:428s
fi-bdw-gvtdvm failed to connect after reboot

14ea1a9ec7378f8172037843a5e795bc9eacbd10 drm-tip: 2017y-09m-11d-12h-03m-37s UTC integration manifest
9207474b3f4a drm/i915: Try a minimal attempt to insert the whole object for relocations
1a425ee6ec6a drm/i915: Check PIN_NONFAULT overlaps in evict_for_node
1eac9778ae64 drm/i915: Track user GTT faulting per-vma
1558347fa327 drm/i915: Emit pipelined fence changes
bcb82a8bc321 drm/i915: Consolidate get_fence with pin_fence
6383d18e75c9 drm/i915: Pin fence for iomap
aacbf7846b1c drm/i915: Keep the device awake whilst in the GTT domain
13dcf99db19a drm/i915: Try harder to finish the idle-worker
6d90292f3b3e drm/i915: Include fence-hint for timeout warning
cf484fc31a2b drm/i915/fence: Avoid del_timer_sync() from inside a timer
dadbf5f20670 drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5640/

* ✗ Fi.CI.IGT: warning for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (12 preceding siblings ...)
  2017-09-11 13:16 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2017-09-11 16:02 ` Patchwork
  2017-09-14 15:30 ` ✓ Fi.CI.BAT: success for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM (rev2) Patchwork
  2017-09-14 19:39 ` ✗ Fi.CI.IGT: failure " Patchwork
  15 siblings, 0 replies; 29+ messages in thread
From: Patchwork @ 2017-09-11 16:02 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
URL   : https://patchwork.freedesktop.org/series/30108/
State : warning

== Summary ==

Test perf:
        Subgroup polling:
                fail       -> PASS       (shard-hsw) fdo#102252
Test kms_busy:
        Subgroup extended-modeset-hang-newfb-render-A:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-modeset-hang-newfb-render-B:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-modeset-hang-newfb-with-reset-render-A:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-modeset-hang-newfb-with-reset-render-B:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-pageflip-modeset-hang-oldfb-render-C:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-modeset-hang-newfb-render-C:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-modeset-hang-newfb-with-reset-render-C:
                pass       -> DMESG-WARN (shard-hsw) fdo#102249
        Subgroup extended-pageflip-modeset-hang-oldfb-render-B:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-pageflip-modeset-hang-oldfb-render-A:
                pass       -> DMESG-WARN (shard-hsw)

fdo#102252 https://bugs.freedesktop.org/show_bug.cgi?id=102252
fdo#102249 https://bugs.freedesktop.org/show_bug.cgi?id=102249

shard-hsw        total:2301 pass:1227 dwarn:9   dfail:0   fail:13  skip:1052 time:9164s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5640/shards.html

* [PATCH v2] drm/i915: Keep the device awake whilst in the GTT domain
  2017-09-11  8:41 ` [PATCH 05/11] drm/i915: Keep the device awake whilst in the GTT domain Chris Wilson
@ 2017-09-14 14:58   ` Chris Wilson
  0 siblings, 0 replies; 29+ messages in thread
From: Chris Wilson @ 2017-09-14 14:58 UTC (permalink / raw)
  To: intel-gfx

Since runtime suspend is very harsh on GTT mmappings (they all get
zapped on suspend), keep the device awake while the buffer remains in
the GTT domain. However, userspace controls the domain, and although
there is a soft contract that writes must be flushed (e.g. for flushing
scanouts and fbc), we are intentionally lax with respect to read
domains, allowing them to persist for as long as is feasible.

We acquire a wakeref when using the buffer in the GTT domain so that all
subsequent operations on that object are fast, trying to avoid
suspending while the GTT is still in active use by userspace. To ensure
that the device can eventually suspend, we install a timer to expire the
GTT wakeref. So in effect we have just a fancy pm autosuspend that tries
to estimate the cost of restoring actively used GTT mmappings.

v2: Don't forget to setup/cleanup selftests.

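The bookkeeping described above amounts to a counted wakeref with deferred release: each object entering the GTT domain takes a device wakeref, and the delayed work later drops the refs of idle objects. A minimal model of that pattern, with the timer simulated and all names hypothetical (this is not the i915 code):

```c
#include <assert.h>

/* Simplified model of the GTT wakeref bookkeeping. */
struct gtt_pm_sketch {
	int wakeref_count; /* objects currently holding the device awake */
	int device_awake;
};

/* An object moves into the GTT domain and pins the device awake. */
static void gtt_domain_enter(struct gtt_pm_sketch *pm)
{
	if (pm->wakeref_count++ == 0)
		pm->device_awake = 1; /* first user powers the device up */
}

/* Called from the (simulated) expiry timer for each idle object. */
static void gtt_wakeref_expire(struct gtt_pm_sketch *pm)
{
	if (--pm->wakeref_count == 0)
		pm->device_awake = 0; /* last ref gone: allow runtime suspend */
}
```

The device can only reach runtime suspend once the reaper has expired every outstanding GTT wakeref, which is what makes this behave like a pm autosuspend tuned to GTT mmap activity.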
References: https://bugs.freedesktop.org/show_bug.cgi?id=102490
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c              |   2 +
 drivers/gpu/drm/i915/i915_drv.h                  |  12 +++
 drivers/gpu/drm/i915/i915_gem.c                  | 122 +++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem_object.h           |   5 +
 drivers/gpu/drm/i915/i915_gem_shrinker.c         |   4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_object.c |   2 +-
 drivers/gpu/drm/i915/selftests/mock_gem_device.c |   6 ++
 7 files changed, 134 insertions(+), 19 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2f691cc7b563..9bd082957202 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2812,6 +2812,8 @@ static int i915_runtime_pm_status(struct seq_file *m, void *unused)
 	if (!HAS_RUNTIME_PM(dev_priv))
 		seq_puts(m, "Runtime power management not supported\n");
 
+	seq_printf(m, "GTT wakeref count: %d\n",
+		   atomic_read(&dev_priv->mm.gtt_wakeref.count));
 	seq_printf(m, "GPU idle: %s\n", yesno(!dev_priv->gt.awake));
 	seq_printf(m, "IRQs disabled: %s\n",
 		   yesno(!intel_irqs_enabled(dev_priv)));
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 28ad5dadbb18..ebacd046736e 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1487,6 +1487,18 @@ struct i915_gem_mm {
 	 */
 	struct list_head userfault_list;
 
+	/* List of all objects in gtt domain, holding a wakeref.
+	 * The list is reaped periodically, and protected by its own mutex.
+	 * Entries on this list do not carry a reference.
+	 */
+	struct {
+		struct mutex lock;
+		struct list_head list;
+		atomic_t count; /* -1 == no timeout scheduled */
+
+		struct delayed_work work;
+	} gtt_wakeref;
+
 	/**
 	 * List of objects which are pending destruction.
 	 */
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index d98f7b25d395..253bb893c17e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -235,6 +235,7 @@ i915_gem_object_get_pages_phys(struct drm_i915_gem_object *obj)
 
 static void __start_cpu_write(struct drm_i915_gem_object *obj)
 {
+	GEM_BUG_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	if (cpu_write_needs_clflush(obj))
@@ -667,11 +668,14 @@ fb_write_origin(struct drm_i915_gem_object *obj, unsigned int domain)
 		obj->frontbuffer_ggtt_origin : ORIGIN_CPU);
 }
 
-static void
-flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
+void
+i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
+				   unsigned int flush_domains)
 {
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
 
+	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+
 	if (!(obj->base.write_domain & flush_domains))
 		return;
 
@@ -694,16 +698,19 @@ flush_write_domain(struct drm_i915_gem_object *obj, unsigned int flush_domains)
 
 	switch (obj->base.write_domain) {
 	case I915_GEM_DOMAIN_GTT:
-		if (!HAS_LLC(dev_priv)) {
-			intel_runtime_pm_get(dev_priv);
-			spin_lock_irq(&dev_priv->uncore.lock);
-			POSTING_READ_FW(RING_HEAD(dev_priv->engine[RCS]->mmio_base));
-			spin_unlock_irq(&dev_priv->uncore.lock);
-			intel_runtime_pm_put(dev_priv);
-		}
+		mutex_lock(&dev_priv->mm.gtt_wakeref.lock);
+		if (!list_empty(&obj->mm.gtt_wakeref_link)) {
+			if (!HAS_LLC(dev_priv)) {
+				spin_lock_irq(&dev_priv->uncore.lock);
+				POSTING_READ_FW(RING_HEAD(dev_priv->engine[RCS]->mmio_base));
+				spin_unlock_irq(&dev_priv->uncore.lock);
+			}
 
-		intel_fb_obj_flush(obj,
-				   fb_write_origin(obj, I915_GEM_DOMAIN_GTT));
+			intel_fb_obj_flush(obj, fb_write_origin(obj, I915_GEM_DOMAIN_GTT));
+
+			list_del_init(&obj->mm.gtt_wakeref_link);
+		}
+		mutex_unlock(&dev_priv->mm.gtt_wakeref.lock);
 		break;
 
 	case I915_GEM_DOMAIN_CPU:
@@ -808,7 +815,7 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
 			goto out;
 	}
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* If we're not in the cpu read domain, set ourself into the gtt
 	 * read domain and manually flush cachelines (if required). This
@@ -861,7 +868,7 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj,
 			goto out;
 	}
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* If we're not in the cpu write domain, set ourself into the
 	 * gtt write domain and manually flush cachelines (as required).
@@ -3423,9 +3430,10 @@ static void __i915_gem_object_flush_for_display(struct drm_i915_gem_object *obj)
 	 * We manually flush the CPU domain so that we can override and
 	 * force the flush for the display, and perform it asyncrhonously.
 	 */
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 	if (obj->cache_dirty)
 		i915_gem_clflush_object(obj, I915_CLFLUSH_FORCE);
+	GEM_BUG_ON(obj->base.write_domain & ~I915_GEM_DOMAIN_CPU);
 	obj->base.write_domain = 0;
 }
 
@@ -3478,7 +3486,7 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_WC);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_WC);
 
 	/* Serialise direct access to this object with the barriers for
 	 * coherent writes from the GPU, by effectively invalidating the
@@ -3502,6 +3510,49 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 	return 0;
 }
 
+static void wakeref_timeout(struct work_struct *work)
+{
+	struct drm_i915_private *dev_priv =
+		container_of(work, typeof(*dev_priv), mm.gtt_wakeref.work.work);
+	struct drm_i915_gem_object *obj, *on;
+	unsigned int count;
+
+	mutex_lock(&dev_priv->mm.gtt_wakeref.lock);
+	count = atomic_xchg(&dev_priv->mm.gtt_wakeref.count, 0);
+	if (count) {
+		unsigned long timeout;
+
+		GEM_BUG_ON(count == -1);
+
+		if (INTEL_GEN(dev_priv) >= 6 && !HAS_LLC(dev_priv)) {
+			spin_lock_irq(&dev_priv->uncore.lock);
+			POSTING_READ_FW(RING_ACTHD(dev_priv->engine[RCS]->mmio_base));
+			spin_unlock_irq(&dev_priv->uncore.lock);
+		}
+
+		timeout = 0;
+		list_for_each_entry_safe(obj, on,
+					 &dev_priv->mm.gtt_wakeref.list,
+					 mm.gtt_wakeref_link) {
+			list_del_init(&obj->mm.gtt_wakeref_link);
+			if (obj->base.write_domain & I915_GEM_DOMAIN_GTT) {
+				intel_fb_obj_flush(obj, fb_write_origin(obj, I915_GEM_DOMAIN_GTT));
+				obj->base.write_domain = 0;
+				timeout++;
+			}
+			timeout++;
+		}
+
+		timeout = HZ * (ilog2(timeout) + 1) / 2;
+		schedule_delayed_work(&dev_priv->mm.gtt_wakeref.work,
+				      round_jiffies_up_relative(timeout));
+	} else {
+		intel_runtime_pm_put(dev_priv);
+		atomic_set(&dev_priv->mm.gtt_wakeref.count, -1);
+	}
+	mutex_unlock(&dev_priv->mm.gtt_wakeref.lock);
+}
+
 /**
  * Moves a single object to the GTT read, and possibly write domain.
  * @obj: object to act on
@@ -3513,6 +3564,7 @@ i915_gem_object_set_to_wc_domain(struct drm_i915_gem_object *obj, bool write)
 int
 i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 {
+	struct drm_i915_private *i915 = to_i915(obj->base.dev);
 	int ret;
 
 	lockdep_assert_held(&obj->base.dev->struct_mutex);
@@ -3526,6 +3578,17 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
+	mutex_lock(&i915->mm.gtt_wakeref.lock);
+	if (list_empty(&obj->mm.gtt_wakeref_link)) {
+		if (atomic_inc_and_test(&i915->mm.gtt_wakeref.count)) {
+			intel_runtime_pm_get(i915);
+			schedule_delayed_work(&i915->mm.gtt_wakeref.work,
+					      round_jiffies_up_relative(HZ));
+		}
+		list_add(&obj->mm.gtt_wakeref_link, &i915->mm.gtt_wakeref.list);
+	}
+	mutex_unlock(&i915->mm.gtt_wakeref.lock);
+
 	if (obj->base.write_domain == I915_GEM_DOMAIN_GTT)
 		return 0;
 
@@ -3541,7 +3604,7 @@ i915_gem_object_set_to_gtt_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_GTT);
 
 	/* Serialise direct access to this object with the barriers for
 	 * coherent writes from the GPU, by effectively invalidating the
@@ -3909,7 +3972,7 @@ i915_gem_object_set_to_cpu_domain(struct drm_i915_gem_object *obj, bool write)
 	if (ret)
 		return ret;
 
-	flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+	i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 	/* Flush the CPU cache if it's still invalid. */
 	if ((obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0) {
@@ -4258,6 +4321,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 
 	INIT_LIST_HEAD(&obj->global_link);
 	INIT_LIST_HEAD(&obj->userfault_link);
+	INIT_LIST_HEAD(&obj->mm.gtt_wakeref_link);
 	INIT_LIST_HEAD(&obj->vma_list);
 	INIT_LIST_HEAD(&obj->lut_list);
 	INIT_LIST_HEAD(&obj->batch_pool_link);
@@ -4395,6 +4459,8 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 
 		trace_i915_gem_object_destroy(obj);
 
+		i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+
 		GEM_BUG_ON(i915_gem_object_is_active(obj));
 		list_for_each_entry_safe(vma, vn,
 					 &obj->vma_list, obj_link) {
@@ -4417,6 +4483,12 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
 		GEM_BUG_ON(atomic_read(&obj->frontbuffer_bits));
 		GEM_BUG_ON(!list_empty(&obj->lut_list));
 
+		if (!list_empty_careful(&obj->mm.gtt_wakeref_link)) {
+			mutex_lock(&i915->mm.gtt_wakeref.lock);
+			list_del(&obj->mm.gtt_wakeref_link);
+			mutex_unlock(&i915->mm.gtt_wakeref.lock);
+		}
+
 		if (obj->ops->release)
 			obj->ops->release(obj);
 
@@ -4549,6 +4621,8 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 	int ret;
 
 	intel_runtime_pm_get(dev_priv);
+	drain_delayed_work(&dev_priv->mm.gtt_wakeref.work);
+	WARN_ON(atomic_read(&dev_priv->mm.gtt_wakeref.count) != -1);
 	intel_suspend_gt_powersave(dev_priv);
 
 	mutex_lock(&dev->struct_mutex);
@@ -4891,6 +4965,15 @@ i915_gem_load_init_fences(struct drm_i915_private *dev_priv)
 	i915_gem_detect_bit_6_swizzle(dev_priv);
 }
 
+static void
+i915_gem_load_init__wakeref(struct drm_i915_private *i915)
+{
+	INIT_LIST_HEAD(&i915->mm.gtt_wakeref.list);
+	INIT_DELAYED_WORK(&i915->mm.gtt_wakeref.work, wakeref_timeout);
+	mutex_init(&i915->mm.gtt_wakeref.lock);
+	atomic_set(&i915->mm.gtt_wakeref.count, -1);
+}
+
 int
 i915_gem_load_init(struct drm_i915_private *dev_priv)
 {
@@ -4932,6 +5015,8 @@ i915_gem_load_init(struct drm_i915_private *dev_priv)
 	if (err)
 		goto err_priorities;
 
+	i915_gem_load_init__wakeref(dev_priv);
+
 	INIT_WORK(&dev_priv->mm.free_work, __i915_gem_free_work);
 	init_llist_head(&dev_priv->mm.free_list);
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
@@ -4978,6 +5063,9 @@ void i915_gem_load_cleanup(struct drm_i915_private *dev_priv)
 	WARN_ON(!list_empty(&dev_priv->gt.timelines));
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
+	drain_delayed_work(&dev_priv->mm.gtt_wakeref.work);
+	WARN_ON(atomic_read(&dev_priv->mm.gtt_wakeref.count) != -1);
+
 	kmem_cache_destroy(dev_priv->priorities);
 	kmem_cache_destroy(dev_priv->dependencies);
 	kmem_cache_destroy(dev_priv->requests);
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index c30d8f808185..5797b75a804b 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -177,6 +177,8 @@ struct drm_i915_gem_object {
 			struct mutex lock; /* protects this cache */
 		} get_page;
 
+		struct list_head gtt_wakeref_link;
+
 		/**
 		 * Advice: are the backing pages purgeable?
 		 */
@@ -421,5 +423,8 @@ void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
 					 unsigned int cache_level);
 void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
 
+void i915_gem_object_flush_write_domain(struct drm_i915_gem_object *obj,
+					unsigned int flush_domains);
+
 #endif
 
diff --git a/drivers/gpu/drm/i915/i915_gem_shrinker.c b/drivers/gpu/drm/i915/i915_gem_shrinker.c
index 77fb39808131..8d146584ee49 100644
--- a/drivers/gpu/drm/i915/i915_gem_shrinker.c
+++ b/drivers/gpu/drm/i915/i915_gem_shrinker.c
@@ -127,8 +127,10 @@ static bool can_release_pages(struct drm_i915_gem_object *obj)
 
 static bool unsafe_drop_pages(struct drm_i915_gem_object *obj)
 {
-	if (i915_gem_object_unbind(obj) == 0)
+	if (i915_gem_object_unbind(obj) == 0) {
+		i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 		__i915_gem_object_put_pages(obj, I915_MM_SHRINKER);
+	}
 	return !READ_ONCE(obj->mm.pages);
 }
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_object.c b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
index 8f011c447e41..f3f6588afbdb 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_object.c
@@ -266,7 +266,7 @@ static int check_partial_mapping(struct drm_i915_gem_object *obj,
 		if (offset >= obj->base.size)
 			continue;
 
-		flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
+		i915_gem_object_flush_write_domain(obj, ~I915_GEM_DOMAIN_CPU);
 
 		p = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
 		cpu = kmap(p) + offset_in_page(offset);
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 678723430d78..ca260e4658af 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -67,6 +67,9 @@ static void mock_device_release(struct drm_device *dev)
 	i915_gem_contexts_fini(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 
+	drain_delayed_work(&i915->mm.gtt_wakeref.work);
+	WARN_ON(atomic_read(&i915->mm.gtt_wakeref.count) != -1);
+
 	drain_workqueue(i915->wq);
 	i915_gem_drain_freed_objects(i915);
 
@@ -166,6 +169,7 @@ struct drm_i915_private *mock_gem_device(void)
 	drm_mode_config_init(&i915->drm);
 
 	mkwrite_device_info(i915)->gen = -1;
+	mkwrite_device_info(i915)->has_llc = true;
 
 	spin_lock_init(&i915->mm.object_stat_lock);
 	mock_uncore_init(i915);
@@ -177,6 +181,8 @@ struct drm_i915_private *mock_gem_device(void)
 	if (!i915->wq)
 		goto put_device;
 
+	i915_gem_load_init__wakeref(i915);
+
 	INIT_WORK(&i915->mm.free_work, __i915_gem_free_work);
 	init_llist_head(&i915->mm.free_list);
 	INIT_LIST_HEAD(&i915->mm.unbound_list);
-- 
2.14.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* ✓ Fi.CI.BAT: success for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM (rev2)
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (13 preceding siblings ...)
  2017-09-11 16:02 ` ✗ Fi.CI.IGT: warning " Patchwork
@ 2017-09-14 15:30 ` Patchwork
  2017-09-14 19:39 ` ✗ Fi.CI.IGT: failure " Patchwork
  15 siblings, 0 replies; 29+ messages in thread
From: Patchwork @ 2017-09-14 15:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM (rev2)
URL   : https://patchwork.freedesktop.org/series/30108/
State : success

== Summary ==

Series 30108v2 series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
https://patchwork.freedesktop.org/api/1.0/series/30108/revisions/2/mbox/

Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                skip       -> PASS       (fi-skl-x1585l)
Test pm_rpm:
        Subgroup basic-rte:
                pass       -> DMESG-WARN (fi-cfl-s) fdo#102294

fdo#102294 https://bugs.freedesktop.org/show_bug.cgi?id=102294

fi-bdw-5557u     total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:452s
fi-bdw-gvtdvm    total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:454s
fi-blb-e6850     total:289  pass:224  dwarn:1   dfail:0   fail:0   skip:64  time:382s
fi-bsw-n3050     total:289  pass:243  dwarn:0   dfail:0   fail:0   skip:46  time:542s
fi-bwr-2160      total:289  pass:184  dwarn:0   dfail:0   fail:0   skip:105 time:268s
fi-bxt-j4205     total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:518s
fi-byt-j1900     total:289  pass:254  dwarn:1   dfail:0   fail:0   skip:34  time:503s
fi-byt-n2820     total:289  pass:250  dwarn:1   dfail:0   fail:0   skip:38  time:499s
fi-cfl-s         total:289  pass:222  dwarn:35  dfail:0   fail:0   skip:32  time:549s
fi-elk-e7500     total:289  pass:230  dwarn:0   dfail:0   fail:0   skip:59  time:444s
fi-glk-2a        total:289  pass:260  dwarn:0   dfail:0   fail:0   skip:29  time:599s
fi-hsw-4770      total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:432s
fi-hsw-4770r     total:289  pass:263  dwarn:0   dfail:0   fail:0   skip:26  time:413s
fi-ilk-650       total:289  pass:229  dwarn:0   dfail:0   fail:0   skip:60  time:429s
fi-ivb-3520m     total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:495s
fi-ivb-3770      total:289  pass:261  dwarn:0   dfail:0   fail:0   skip:28  time:464s
fi-kbl-7500u     total:289  pass:264  dwarn:1   dfail:0   fail:0   skip:24  time:497s
fi-kbl-7560u     total:289  pass:270  dwarn:0   dfail:0   fail:0   skip:19  time:579s
fi-kbl-r         total:289  pass:262  dwarn:0   dfail:0   fail:0   skip:27  time:587s
fi-pnv-d510      total:289  pass:223  dwarn:1   dfail:0   fail:0   skip:65  time:563s
fi-skl-6260u     total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:488s
fi-skl-6700k     total:289  pass:265  dwarn:0   dfail:0   fail:0   skip:24  time:530s
fi-skl-6770hq    total:289  pass:269  dwarn:0   dfail:0   fail:0   skip:20  time:501s
fi-skl-gvtdvm    total:289  pass:266  dwarn:0   dfail:0   fail:0   skip:23  time:462s
fi-skl-x1585l    total:289  pass:268  dwarn:0   dfail:0   fail:0   skip:21  time:483s
fi-snb-2520m     total:289  pass:251  dwarn:0   dfail:0   fail:0   skip:38  time:571s
fi-snb-2600      total:289  pass:250  dwarn:0   dfail:0   fail:0   skip:39  time:429s

46e05d756fe32271bca4ac3ef1b3f1f6006fcc0c drm-tip: 2017y-09m-14d-14h-46m-01s UTC integration manifest
f502b9f345ff drm/i915: Try a minimal attempt to insert the whole object for relocations
d0b5ab2d4173 drm/i915: Check PIN_NONFAULT overlaps in evict_for_node
f61470f155f3 drm/i915: Track user GTT faulting per-vma
17e6c088f2e0 drm/i915: Emit pipelined fence changes
4a3f7522b1f5 drm/i915: Consolidate get_fence with pin_fence
dcf00ff57792 drm/i915: Pin fence for iomap
dfbc8c28d85c drm/i915: Keep the device awake whilst in the GTT domain
b77092c771a2 drm/i915: Try harder to finish the idle-worker
fbebebb56629 drm/i915: Include fence-hint for timeout warning
a5258d561422 drm/i915/fence: Avoid del_timer_sync() from inside a timer
905cfa0ff0b1 drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5700/

^ permalink raw reply	[flat|nested] 29+ messages in thread

* ✗ Fi.CI.IGT: failure for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM (rev2)
  2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
                   ` (14 preceding siblings ...)
  2017-09-14 15:30 ` ✓ Fi.CI.BAT: success for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM (rev2) Patchwork
@ 2017-09-14 19:39 ` Patchwork
  15 siblings, 0 replies; 29+ messages in thread
From: Patchwork @ 2017-09-14 19:39 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM (rev2)
URL   : https://patchwork.freedesktop.org/series/30108/
State : failure

== Summary ==

Test kms_busy:
        Subgroup extended-modeset-hang-newfb-render-A:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-pageflip-modeset-hang-oldfb-render-B:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-pageflip-modeset-hang-oldfb-render-C:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-modeset-hang-newfb-render-B:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-modeset-hang-newfb-with-reset-render-B:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-pageflip-modeset-hang-oldfb-render-A:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-modeset-hang-newfb-with-reset-render-A:
                pass       -> DMESG-WARN (shard-hsw)
        Subgroup extended-modeset-hang-newfb-with-reset-render-C:
                pass       -> DMESG-WARN (shard-hsw) fdo#102249
        Subgroup extended-modeset-hang-newfb-render-C:
                pass       -> DMESG-WARN (shard-hsw)
Test kms_plane:
        Subgroup plane-panning-bottom-right-suspend-pipe-C-planes:
                skip       -> PASS       (shard-hsw)
Test kms_rmfb:
        Subgroup close-fd:
                skip       -> PASS       (shard-hsw)
Test kms_universal_plane:
        Subgroup universal-plane-pipe-B-functional:
                skip       -> PASS       (shard-hsw)
Test kms_cursor_crc:
        Subgroup cursor-256x256-dpms:
                skip       -> PASS       (shard-hsw)
Test gem_exec_store:
        Subgroup basic-render:
                pass       -> FAIL       (shard-hsw)
Test kms_cursor_legacy:
        Subgroup flip-vs-cursor-toggle:
                pass       -> FAIL       (shard-hsw) fdo#102670

fdo#102249 https://bugs.freedesktop.org/show_bug.cgi?id=102249
fdo#102670 https://bugs.freedesktop.org/show_bug.cgi?id=102670

shard-hsw        total:2313 pass:1235 dwarn:9   dfail:0   fail:14  skip:1055 time:9188s

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_5700/shards.html

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM
  2017-09-11 11:33   ` Michał Winiarski
@ 2017-09-15  9:19     ` Chris Wilson
  0 siblings, 0 replies; 29+ messages in thread
From: Chris Wilson @ 2017-09-15  9:19 UTC (permalink / raw)
  To: Michał Winiarski; +Cc: intel-gfx

Quoting Michał Winiarski (2017-09-11 12:33:41)
> On Mon, Sep 11, 2017 at 09:41:25AM +0100, Chris Wilson wrote:
> > To silence the critics:
> > 
> > [56532.161115] workqueue: PF_MEMALLOC task 36(khugepaged) is flushing !WQ_MEM_RECLAIM i915-userptr-release:          (null)
> > [56532.161138] ------------[ cut here ]------------
> > [56532.161144] WARNING: CPU: 1 PID: 36 at kernel/workqueue.c:2418 check_flush_dependency+0xe8/0xf0
> > [56532.161145] Modules linked in: wmi_bmof
> > [56532.161148] CPU: 1 PID: 36 Comm: khugepaged Not tainted 4.13.0-krejzi #1
> > [56532.161149] Hardware name: HP HP ProBook 470 G3/8102, BIOS N78 Ver. 01.17 06/08/2017
> > [56532.161150] task: ffff8802371ee200 task.stack: ffffc90000174000
> > [56532.161152] RIP: 0010:check_flush_dependency+0xe8/0xf0
> > [56532.161152] RSP: 0018:ffffc900001777b8 EFLAGS: 00010286
> > [56532.161153] RAX: 000000000000006c RBX: ffff88022fc5a000 RCX: 0000000000000001
> > [56532.161154] RDX: 0000000000000000 RSI: 0000000000000086 RDI: 00000000ffffffff
> > [56532.161155] RBP: 0000000000000000 R08: 14f038bb55f6dae0 R09: 0000000000000516
> > [56532.161155] R10: ffffc900001778a0 R11: 000000006c756e28 R12: ffff8802371ee200
> > [56532.161156] R13: 0000000000000000 R14: 000000000000000b R15: ffffc90000177810
> > [56532.161157] FS:  0000000000000000(0000) GS:ffff880240480000(0000) knlGS:0000000000000000
> > [56532.161158] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [56532.161158] CR2: 0000000004795ff8 CR3: 000000000220a000 CR4: 00000000003406e0
> > [56532.161159] Call Trace:
> > [56532.161161]  ? flush_workqueue+0x136/0x3e0
> > [56532.161178]  ? _raw_spin_unlock_irqrestore+0xf/0x30
> > [56532.161179]  ? try_to_wake_up+0x1ce/0x3b0
> > [56532.161183]  ? i915_gem_userptr_mn_invalidate_range_start+0x13f/0x150
> > [56532.161184]  ? _raw_spin_unlock+0xd/0x20
> > [56532.161186]  ? i915_gem_userptr_mn_invalidate_range_start+0x13f/0x150
> > [56532.161189]  ? __mmu_notifier_invalidate_range_start+0x4a/0x70
> > [56532.161191]  ? try_to_unmap_one+0x5e5/0x660
> > [56532.161193]  ? rmap_walk_file+0xe4/0x240
> > [56532.161195]  ? __ClearPageMovable+0x10/0x10
> > [56532.161196]  ? try_to_unmap+0x8c/0xe0
> > [56532.161197]  ? page_remove_rmap+0x280/0x280
> > [56532.161199]  ? page_not_mapped+0x10/0x10
> > [56532.161200]  ? page_get_anon_vma+0x90/0x90
> > [56532.161202]  ? migrate_pages+0x6a5/0x940
> > [56532.161203]  ? isolate_freepages_block+0x330/0x330
> > [56532.161205]  ? compact_zone+0x593/0x6a0
> > [56532.161206]  ? enqueue_task_fair+0xc3/0x1180
> > [56532.161208]  ? compact_zone_order+0x9b/0xc0
> > [56532.161210]  ? get_page_from_freelist+0x24a/0x900
> > [56532.161212]  ? try_to_compact_pages+0xc8/0x240
> > [56532.161213]  ? try_to_compact_pages+0xc8/0x240
> > [56532.161215]  ? __alloc_pages_direct_compact+0x45/0xe0
> > [56532.161216]  ? __alloc_pages_slowpath+0x845/0xb90
> > [56532.161218]  ? __alloc_pages_nodemask+0x176/0x1f0
> > [56532.161220]  ? wait_woken+0x80/0x80
> > [56532.161222]  ? khugepaged+0x29e/0x17d0
> > [56532.161223]  ? wait_woken+0x80/0x80
> > [56532.161225]  ? collapse_shmem.isra.39+0xa60/0xa60
> > [56532.161226]  ? kthread+0x10d/0x130
> > [56532.161227]  ? kthread_create_on_node+0x60/0x60
> > [56532.161228]  ? ret_from_fork+0x22/0x30
> > [56532.161229] Code: 00 8b b0 10 05 00 00 48 8d 8b b0 00 00 00 48 8d 90 b8 06 00 00 49 89 e8 48 c7 c7 38 55 09 82 c6 05 f9 c6 1d 01 01 e8 0e a1 03 00 <0f> ff e9 6b ff ff ff 90 48 8b 37 40 f6 c6 04 75 1b 48 c1 ee 05
> > [56532.161251] ---[ end trace 2ce2b4f5f69b803b ]---
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>

Plonked this one in, but the rest of the series remains untouched. Those
INCOMPLETES are not fixing themselves... :-p
-Chris

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 02/11] drm/i915/fence: Avoid del_timer_sync() from inside a timer
  2017-09-11  8:41 ` [PATCH 02/11] drm/i915/fence: Avoid del_timer_sync() from inside a timer Chris Wilson
@ 2017-09-19  9:24   ` Chris Wilson
  2017-09-19 10:19   ` Tvrtko Ursulin
  1 sibling, 0 replies; 29+ messages in thread
From: Chris Wilson @ 2017-09-19  9:24 UTC (permalink / raw)
  To: intel-gfx

Quoting Chris Wilson (2017-09-11 09:41:26)
> A fence may be signaled from any context, including from inside a timer.
> One example is timer_i915_sw_fence_wake() which is used to provide a
> safety-net when waiting on an external fence. If the external fence is
> not signaled within a timely fashion, we signal our fence on its behalf,
> and so we then may process subsequent fences in the chain from within
> that timer context.
> 
> Given that dma_i915_sw_fence_cb() may be from inside a timer, we cannot
> then use del_timer_sync() as that requires the timer lock for itself. To
> circumvent this, while trying to keep the signal propagation as low
> latency as possible, move the completion into a worker and use a bit of
> atomic switcheroo to serialise the timer-callback and the dma-callback.
> 
> Testcase: igt/gem_eio/in-flight-external
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Ping? It's ugly but it's the best I have to offer to escape the lock
inversion whilst trying to avoid adding any delay to signaling.
-Chris
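The switcheroo described above boils down to the two callbacks racing to
claim the fence pointer with xchg(), so the completion runs exactly
once, while the loser's cleanup (the del_timer_sync() and kfree()) is
deferred to irq_work so it never runs in timer context. A userspace
sketch of the claim-once idiom (illustrative names, not the driver's
types; the fence is modelled as a plain counter):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Two callbacks (the safety timer and the real dma-fence callback) may
 * both fire; whichever exchanges the pointer to NULL first "owns" the
 * completion, so it happens exactly once regardless of ordering.
 */
struct cb {
	_Atomic(int *) fence;	/* claimable completion token */
	int completions;	/* how many times we actually completed */
};

static void complete_once(struct cb *cb)
{
	/* xchg(&cb->fence, NULL): atomically claim the fence */
	int *fence = atomic_exchange(&cb->fence, NULL);
	if (!fence)
		return;		/* the other callback won the race */
	(*fence)++;		/* i915_sw_fence_complete(fence) */
	cb->completions++;
}
```

Running complete_once() from both paths leaves the token signalled
exactly once, which is the property the xchg() in
timer_i915_sw_fence_wake()/dma_i915_sw_fence_wake() provides.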

> ---
>  drivers/gpu/drm/i915/Kconfig         |  1 +
>  drivers/gpu/drm/i915/i915_sw_fence.c | 27 +++++++++++++++++++++------
>  2 files changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
> index e9e64e8e9765..dfd95889f4b7 100644
> --- a/drivers/gpu/drm/i915/Kconfig
> +++ b/drivers/gpu/drm/i915/Kconfig
> @@ -12,6 +12,7 @@ config DRM_I915
>         select DRM_PANEL
>         select DRM_MIPI_DSI
>         select RELAY
> +       select IRQ_WORK
>         # i915 depends on ACPI_VIDEO when ACPI is enabled
>         # but for select to work, need to select ACPI_VIDEO's dependencies, ick
>         select BACKLIGHT_LCD_SUPPORT if ACPI
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
> index f29540f922af..808ea4d5b962 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence.c
> +++ b/drivers/gpu/drm/i915/i915_sw_fence.c
> @@ -9,6 +9,7 @@
>  
>  #include <linux/slab.h>
>  #include <linux/dma-fence.h>
> +#include <linux/irq_work.h>
>  #include <linux/reservation.h>
>  
>  #include "i915_sw_fence.h"
> @@ -356,31 +357,44 @@ struct i915_sw_dma_fence_cb {
>         struct i915_sw_fence *fence;
>         struct dma_fence *dma;
>         struct timer_list timer;
> +       struct irq_work work;
>  };
>  
>  static void timer_i915_sw_fence_wake(unsigned long data)
>  {
>         struct i915_sw_dma_fence_cb *cb = (struct i915_sw_dma_fence_cb *)data;
> +       struct i915_sw_fence *fence;
> +
> +       fence = xchg(&cb->fence, NULL);
> +       if (!fence)
> +               return;
>  
>         pr_warn("asynchronous wait on fence %s:%s:%x timed out\n",
>                 cb->dma->ops->get_driver_name(cb->dma),
>                 cb->dma->ops->get_timeline_name(cb->dma),
>                 cb->dma->seqno);
> -       dma_fence_put(cb->dma);
> -       cb->dma = NULL;
>  
> -       i915_sw_fence_complete(cb->fence);
> -       cb->timer.function = NULL;
> +       i915_sw_fence_complete(fence);
>  }
>  
>  static void dma_i915_sw_fence_wake(struct dma_fence *dma,
>                                    struct dma_fence_cb *data)
>  {
>         struct i915_sw_dma_fence_cb *cb = container_of(data, typeof(*cb), base);
> +       struct i915_sw_fence *fence;
> +
> +       fence = xchg(&cb->fence, NULL);
> +       if (fence)
> +               i915_sw_fence_complete(fence);
> +
> +       irq_work_queue(&cb->work);
> +}
> +
> +static void irq_i915_sw_fence_work(struct irq_work *wrk)
> +{
> +       struct i915_sw_dma_fence_cb *cb = container_of(wrk, typeof(*cb), work);
>  
>         del_timer_sync(&cb->timer);
> -       if (cb->timer.function)
> -               i915_sw_fence_complete(cb->fence);
>         dma_fence_put(cb->dma);
>  
>         kfree(cb);
> @@ -414,6 +428,7 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
>         __setup_timer(&cb->timer,
>                       timer_i915_sw_fence_wake, (unsigned long)cb,
>                       TIMER_IRQSAFE);
> +       init_irq_work(&cb->work, irq_i915_sw_fence_work);
>         if (timeout) {
>                 cb->dma = dma_fence_get(dma);
>                 mod_timer(&cb->timer, round_jiffies_up(jiffies + timeout));
> -- 
> 2.14.1
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 02/11] drm/i915/fence: Avoid del_timer_sync() from inside a timer
  2017-09-11  8:41 ` [PATCH 02/11] drm/i915/fence: Avoid del_timer_sync() from inside a timer Chris Wilson
  2017-09-19  9:24   ` Chris Wilson
@ 2017-09-19 10:19   ` Tvrtko Ursulin
  2017-09-19 12:15     ` Chris Wilson
  1 sibling, 1 reply; 29+ messages in thread
From: Tvrtko Ursulin @ 2017-09-19 10:19 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 11/09/2017 09:41, Chris Wilson wrote:
> A fence may be signaled from any context, including from inside a timer.
> One example is timer_i915_sw_fence_wake() which is used to provide a
> safety-net when waiting on an external fence. If the external fence is
> not signaled within a timely fashion, we signal our fence on its behalf,
> and so we then may process subsequent fences in the chain from within
> that timer context.
> 
> Given that dma_i915_sw_fence_cb() may be from inside a timer, we cannot

s/dma_i915_sw_fence_cb/dma_i915_sw_fence_wake/ ?

> then use del_timer_sync() as that requires the timer lock for itself. To
> circumvent this, while trying to keep the signal propagation as low
> latency as possible, move the completion into a worker and use a bit of
> atomic switcheroo to serialise the timer-callback and the dma-callback.
> 
> Testcase: igt/gem_eio/in-flight-external
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/Kconfig         |  1 +
>   drivers/gpu/drm/i915/i915_sw_fence.c | 27 +++++++++++++++++++++------
>   2 files changed, 22 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
> index e9e64e8e9765..dfd95889f4b7 100644
> --- a/drivers/gpu/drm/i915/Kconfig
> +++ b/drivers/gpu/drm/i915/Kconfig
> @@ -12,6 +12,7 @@ config DRM_I915
>   	select DRM_PANEL
>   	select DRM_MIPI_DSI
>   	select RELAY
> +	select IRQ_WORK
>   	# i915 depends on ACPI_VIDEO when ACPI is enabled
>   	# but for select to work, need to select ACPI_VIDEO's dependencies, ick
>   	select BACKLIGHT_LCD_SUPPORT if ACPI
> diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
> index f29540f922af..808ea4d5b962 100644
> --- a/drivers/gpu/drm/i915/i915_sw_fence.c
> +++ b/drivers/gpu/drm/i915/i915_sw_fence.c
> @@ -9,6 +9,7 @@
>   
>   #include <linux/slab.h>
>   #include <linux/dma-fence.h>
> +#include <linux/irq_work.h>
>   #include <linux/reservation.h>
>   
>   #include "i915_sw_fence.h"
> @@ -356,31 +357,44 @@ struct i915_sw_dma_fence_cb {
>   	struct i915_sw_fence *fence;
>   	struct dma_fence *dma;
>   	struct timer_list timer;
> +	struct irq_work work;
>   };
>   
>   static void timer_i915_sw_fence_wake(unsigned long data)
>   {
>   	struct i915_sw_dma_fence_cb *cb = (struct i915_sw_dma_fence_cb *)data;
> +	struct i915_sw_fence *fence;
> +
> +	fence = xchg(&cb->fence, NULL);
> +	if (!fence)
> +		return;
>   
>   	pr_warn("asynchronous wait on fence %s:%s:%x timed out\n",
>   		cb->dma->ops->get_driver_name(cb->dma),
>   		cb->dma->ops->get_timeline_name(cb->dma),
>   		cb->dma->seqno);
> -	dma_fence_put(cb->dma);
> -	cb->dma = NULL;
>   
> -	i915_sw_fence_complete(cb->fence);
> -	cb->timer.function = NULL;
> +	i915_sw_fence_complete(fence);
>   }
>   
>   static void dma_i915_sw_fence_wake(struct dma_fence *dma,
>   				   struct dma_fence_cb *data)
>   {
>   	struct i915_sw_dma_fence_cb *cb = container_of(data, typeof(*cb), base);
> +	struct i915_sw_fence *fence;
> +
> +	fence = xchg(&cb->fence, NULL);
> +	if (fence)
> +		i915_sw_fence_complete(fence);
> +
> +	irq_work_queue(&cb->work);
> +}
> +
> +static void irq_i915_sw_fence_work(struct irq_work *wrk)
> +{
> +	struct i915_sw_dma_fence_cb *cb = container_of(wrk, typeof(*cb), work);
>   
>   	del_timer_sync(&cb->timer);
> -	if (cb->timer.function)
> -		i915_sw_fence_complete(cb->fence);
>   	dma_fence_put(cb->dma);
>   
>   	kfree(cb);
> @@ -414,6 +428,7 @@ int i915_sw_fence_await_dma_fence(struct i915_sw_fence *fence,
>   	__setup_timer(&cb->timer,
>   		      timer_i915_sw_fence_wake, (unsigned long)cb,
>   		      TIMER_IRQSAFE);
> +	init_irq_work(&cb->work, irq_i915_sw_fence_work);
>   	if (timeout) {
>   		cb->dma = dma_fence_get(dma);
>   		mod_timer(&cb->timer, round_jiffies_up(jiffies + timeout));
> 

Okay it took me a few minutes to figure it out. Looks correct.

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
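The xchg()-based handoff being reviewed above can be modelled in userspace. This is only a sketch with hypothetical names (C11 atomics standing in for the kernel's xchg(), a counter standing in for i915_sw_fence_complete()), showing how the timer callback and the dma callback agree that exactly one of them completes the fence:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace model of the cb->fence handoff: whichever callback
 * (timer or dma) runs first claims the pointer with an atomic
 * exchange; the loser observes NULL and skips the completion. */
struct fence {
	int completed;
};

struct cb {
	struct fence *_Atomic fence;	/* claimed exactly once */
};

static int try_complete(struct cb *cb)
{
	/* models fence = xchg(&cb->fence, NULL) in the patch */
	struct fence *f = atomic_exchange(&cb->fence, NULL);

	if (!f)
		return 0;	/* other callback already completed it */

	f->completed++;		/* models i915_sw_fence_complete(fence) */
	return 1;
}
```

Whether the callers race from timer context or dma-fence context, the exchange guarantees a single completion, which is why the remaining teardown (del_timer_sync(), dma_fence_put(), kfree()) can be deferred to the irq_work without double-completing.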

* Re: [PATCH 02/11] drm/i915/fence: Avoid del_timer_sync() from inside a timer
  2017-09-19 10:19   ` Tvrtko Ursulin
@ 2017-09-19 12:15     ` Chris Wilson
  0 siblings, 0 replies; 29+ messages in thread
From: Chris Wilson @ 2017-09-19 12:15 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2017-09-19 11:19:25)
> 
> On 11/09/2017 09:41, Chris Wilson wrote:
> > A fence may be signaled from any context, including from inside a timer.
> > One example is timer_i915_sw_fence_wake() which is used to provide a
> > safety-net when waiting on an external fence. If the external fence is
> > not signaled in a timely fashion, we signal our fence on its behalf,
> > and so we then may process subsequent fences in the chain from within
> > that timer context.
> > 
> > Given that dma_i915_sw_fence_cb() may be called from inside a timer, we cannot
> 
> s/dma_i915_sw_fence_cb/dma_i915_sw_fence_wake/ ?
> 
> > then use del_timer_sync() as that requires the timer lock for itself. To
> > circumvent this, while trying to keep the signal propagation as low
> > latency as possible, move the completion into a worker and use a bit of
> > atomic switcheroo to serialise the timer-callback and the dma-callback.
> > 
> > Testcase: igt/gem_eio/in-flight-external
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

> Okay it took me a few minutes to figure it out. Looks correct.
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Ta, fixed up the bad reference and pushed this along with the igt that
was happily deadlocking the kernel.
-Chris

* Re: [PATCH 10/11] drm/i915: Check PIN_NONFAULT overlaps in evict_for_node
  2017-09-11  8:41 ` [PATCH 10/11] drm/i915: Check PIN_NONFAULT overlaps in evict_for_node Chris Wilson
@ 2017-09-19 12:43   ` Chris Wilson
  2017-09-25  8:40     ` Chris Wilson
  0 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2017-09-19 12:43 UTC (permalink / raw)
  To: intel-gfx

Quoting Chris Wilson (2017-09-11 09:41:34)
> If the caller says that he doesn't want to evict any other faulting
> vma, honour that flag. The logic was used in evict_something, but not
> the more specific evict_for_node, now being used as a preliminary probe
> since commit 606fec956c0e ("drm/i915: Prefer random replacement before
> eviction search").
> 
> Fixes: 606fec956c0e ("drm/i915: Prefer random replacement before eviction search")
> Fixes: 821188778b9b ("drm/i915: Choose not to evict faultable objects from the GGTT")
> References: https://patchwork.freedesktop.org/patch/174781/
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Any chance of getting review to this point?

> ---
>  drivers/gpu/drm/i915/i915_gem_evict.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> index 933ee8ecfa54..a5a5b7e6daae 100644
> --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> @@ -315,6 +315,11 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
>                         break;
>                 }
>  
> +               if (flags & PIN_NONFAULT && i915_vma_has_userfault(vma)) {
> +                       ret = -ENOSPC;
> +                       break;
> +               }
> +
>                 /* Overlap of objects in the same batch? */
>                 if (i915_vma_is_pinned(vma)) {
>                         ret = -ENOSPC;
> -- 
> 2.14.1
> 

* Re: [PATCH 10/11] drm/i915: Check PIN_NONFAULT overlaps in evict_for_node
  2017-09-19 12:43   ` Chris Wilson
@ 2017-09-25  8:40     ` Chris Wilson
  2017-09-28 10:40       ` Joonas Lahtinen
  0 siblings, 1 reply; 29+ messages in thread
From: Chris Wilson @ 2017-09-25  8:40 UTC (permalink / raw)
  To: intel-gfx

Quoting Chris Wilson (2017-09-19 13:43:41)
> Quoting Chris Wilson (2017-09-11 09:41:34)
> > If the caller says that he doesn't want to evict any other faulting
> > vma, honour that flag. The logic was used in evict_something, but not
> > the more specific evict_for_node, now being used as a preliminary probe
> > since commit 606fec956c0e ("drm/i915: Prefer random replacement before
> > eviction search").
> > 
> > Fixes: 606fec956c0e ("drm/i915: Prefer random replacement before eviction search")
> > Fixes: 821188778b9b ("drm/i915: Choose not to evict faultable objects from the GGTT")
> > References: https://patchwork.freedesktop.org/patch/174781/
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> 
> Any chance of getting review to this point?

Any better suggestions for resolving the incompletes in CI? Is this good
enough to land right now and then we can improve again with a happy CI?

> > ---
> >  drivers/gpu/drm/i915/i915_gem_evict.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
> > index 933ee8ecfa54..a5a5b7e6daae 100644
> > --- a/drivers/gpu/drm/i915/i915_gem_evict.c
> > +++ b/drivers/gpu/drm/i915/i915_gem_evict.c
> > @@ -315,6 +315,11 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
> >                         break;
> >                 }
> >  
> > +               if (flags & PIN_NONFAULT && i915_vma_has_userfault(vma)) {
> > +                       ret = -ENOSPC;
> > +                       break;
> > +               }
> > +
> >                 /* Overlap of objects in the same batch? */
> >                 if (i915_vma_is_pinned(vma)) {
> >                         ret = -ENOSPC;
> > -- 
> > 2.14.1
> > 

* Re: [PATCH 03/11] drm/i915: Include fence-hint for timeout warning
  2017-09-11  8:41 ` [PATCH 03/11] drm/i915: Include fence-hint for timeout warning Chris Wilson
@ 2017-09-28 10:14   ` Joonas Lahtinen
  0 siblings, 0 replies; 29+ messages in thread
From: Joonas Lahtinen @ 2017-09-28 10:14 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On Mon, 2017-09-11 at 09:41 +0100, Chris Wilson wrote:
> If an asynchronous wait on a foriegn fence, we print a warning

"foreign"

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas


> indicating which fence was not signaled. As i915_sw_fences become more
> common, include the debug hint (the symbol-name of the target) to help
> identify the waiter. E.g.
> 
> [   31.968144] Asynchronous wait on fence sw_sync:gem_eio:1 timed out for hint:submit_notify [i915]
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

<SNIP>
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

* Re: [PATCH 04/11] drm/i915: Try harder to finish the idle-worker
  2017-09-11  8:41 ` [PATCH 04/11] drm/i915: Try harder to finish the idle-worker Chris Wilson
@ 2017-09-28 10:16   ` Joonas Lahtinen
  0 siblings, 0 replies; 29+ messages in thread
From: Joonas Lahtinen @ 2017-09-28 10:16 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Mika Kuoppala

On Mon, 2017-09-11 at 09:41 +0100, Chris Wilson wrote:
> If a worker requeues itself, it may switch to a different kworker pool,
> which flush_work() considers as complete. To be strict, we then need to
> keep flushing the work until it is no longer pending.
> 
> References: https://bugs.freedesktop.org/show_bug.cgi?id=102456
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
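The "keep flushing until no longer pending" idea from the commit message can be modelled outside the kernel. A minimal single-threaded sketch with hypothetical names (flush_work_model()/drain_work() standing in for flush_work() and a work_pending() loop), under the assumption that the handler may re-queue itself a bounded number of times:

```c
#include <assert.h>
#include <stdbool.h>

/* Model of a self-requeueing work item: each "flush" runs the handler
 * once, and the handler may re-arm the work, as the idle-worker can
 * when it hops to a different kworker pool. */
struct work {
	bool pending;
	int requeues_left;	/* times the handler re-arms itself */
	int runs;
};

static void flush_work_model(struct work *w)
{
	if (!w->pending)
		return;
	w->pending = false;
	w->runs++;
	if (w->requeues_left-- > 0)
		w->pending = true;	/* handler queued itself again */
}

/* A single flush is not enough if the handler re-queued itself,
 * so loop until the work is genuinely idle. */
static int drain_work(struct work *w)
{
	int flushes = 0;

	while (w->pending) {
		flush_work_model(w);
		flushes++;
	}
	return flushes;
}
```

One flush would leave the work pending in this model; only the loop guarantees the item is idle on return, which mirrors why the patch flushes repeatedly rather than calling flush_work() once.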

* Re: [PATCH 06/11] drm/i915: Pin fence for iomap
  2017-09-11  8:41 ` [PATCH 06/11] drm/i915: Pin fence for iomap Chris Wilson
@ 2017-09-28 10:20   ` Joonas Lahtinen
  0 siblings, 0 replies; 29+ messages in thread
From: Joonas Lahtinen @ 2017-09-28 10:20 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On Mon, 2017-09-11 at 09:41 +0100, Chris Wilson wrote:
> Acquire the fence register for the iomap in i915_vma_pin_iomap() on
> behalf of the caller.
> 
> We probably want for the caller to specify whether the fence should be
> pinned for their usage, but at the moment all callers do want the
> associated fence, or none, so take it on their behalf.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

<SNIP>

> @@ -278,6 +278,7 @@ int i915_vma_bind(struct i915_vma *vma, enum i915_cache_level cache_level,
>  void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
>  {
>  	void __iomem *ptr;
> +	int ret;
>  
>  	/* Access through the GTT requires the device to be awake. */
>  	assert_rpm_wakelock_held(vma->vm->i915);
> @@ -301,9 +302,27 @@ void __iomem *i915_vma_pin_iomap(struct i915_vma *vma)
>  	}
>  
>  	__i915_vma_pin(vma);
> +
> +	ret = i915_vma_get_fence(vma);
> +	if (ret) {
> +		__i915_vma_unpin(vma);
> +		return IO_ERR_PTR(ret);

Please consolidate the IO_ERR_PTR's into goto labels. Then this is;

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation
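One shape the goto-label consolidation requested above might take. This is a userspace sketch only, not the committed code: pin_iomap_model() and the pin counter are hypothetical stand-ins, and IO_ERR_PTR is modelled as a simple errno-encoding macro so every failure after the pin funnels through a single unwind label:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in for the kernel's IO_ERR_PTR(): encode a negative errno
 * into a pointer value. */
#define IO_ERR_PTR(x) ((void *)(intptr_t)(x))

static int pin_count;

static void vma_pin(void)   { pin_count++; }
static void vma_unpin(void) { pin_count--; }

/* Sketch of i915_vma_pin_iomap() with one unwind path: the error
 * return appears exactly once, behind the err_unpin label. */
static void *pin_iomap_model(int fence_err)
{
	int err;

	vma_pin();			/* models __i915_vma_pin(vma) */

	err = fence_err;		/* models i915_vma_get_fence(vma) */
	if (err)
		goto err_unpin;

	return (void *)0x1000;		/* the mapped pointer */

err_unpin:
	vma_unpin();
	return IO_ERR_PTR(err);
}
```

The point of the label is that later patches can add more failure steps after the pin without duplicating the unpin-and-IO_ERR_PTR tail at each site.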

* Re: [PATCH 10/11] drm/i915: Check PIN_NONFAULT overlaps in evict_for_node
  2017-09-25  8:40     ` Chris Wilson
@ 2017-09-28 10:40       ` Joonas Lahtinen
  0 siblings, 0 replies; 29+ messages in thread
From: Joonas Lahtinen @ 2017-09-28 10:40 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On Mon, 2017-09-25 at 09:40 +0100, Chris Wilson wrote:
> Quoting Chris Wilson (2017-09-19 13:43:41)
> > Quoting Chris Wilson (2017-09-11 09:41:34)
> > > If the caller says that he doesn't want to evict any other faulting
> > > vma, honour that flag. The logic was used in evict_something, but not
> > > the more specific evict_for_node, now being used as a preliminary probe
> > > since commit 606fec956c0e ("drm/i915: Prefer random replacement before
> > > eviction search").
> > > 
> > > Fixes: 606fec956c0e ("drm/i915: Prefer random replacement before eviction search")
> > > Fixes: 821188778b9b ("drm/i915: Choose not to evict faultable objects from the GGTT")
> > > References: https://patchwork.freedesktop.org/patch/174781/
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> > 
> > Any chance of getting review to this point?
> 
> Any better suggestions for resolving the incompletes in CI? Is this good
> enough to land right now and then we can improve again with a happy CI?

Review provided for the CI/bxt relevant parts.

Regards, Joonas
-- 
Joonas Lahtinen
Open Source Technology Center
Intel Corporation

end of thread, other threads:[~2017-09-28 10:40 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-11  8:41 Fixes for CI/bxt Chris Wilson
2017-09-11  8:41 ` [PATCH 01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM Chris Wilson
2017-09-11 11:33   ` Michał Winiarski
2017-09-15  9:19     ` Chris Wilson
2017-09-11  8:41 ` [PATCH 02/11] drm/i915/fence: Avoid del_timer_sync() from inside a timer Chris Wilson
2017-09-19  9:24   ` Chris Wilson
2017-09-19 10:19   ` Tvrtko Ursulin
2017-09-19 12:15     ` Chris Wilson
2017-09-11  8:41 ` [PATCH 03/11] drm/i915: Include fence-hint for timeout warning Chris Wilson
2017-09-28 10:14   ` Joonas Lahtinen
2017-09-11  8:41 ` [PATCH 04/11] drm/i915: Try harder to finish the idle-worker Chris Wilson
2017-09-28 10:16   ` Joonas Lahtinen
2017-09-11  8:41 ` [PATCH 05/11] drm/i915: Keep the device awake whilst in the GTT domain Chris Wilson
2017-09-14 14:58   ` [PATCH v2] " Chris Wilson
2017-09-11  8:41 ` [PATCH 06/11] drm/i915: Pin fence for iomap Chris Wilson
2017-09-28 10:20   ` Joonas Lahtinen
2017-09-11  8:41 ` [PATCH 07/11] drm/i915: Consolidate get_fence with pin_fence Chris Wilson
2017-09-11  8:41 ` [PATCH 08/11] drm/i915: Emit pipelined fence changes Chris Wilson
2017-09-11  8:41 ` [PATCH 09/11] drm/i915: Track user GTT faulting per-vma Chris Wilson
2017-09-11  8:41 ` [PATCH 10/11] drm/i915: Check PIN_NONFAULT overlaps in evict_for_node Chris Wilson
2017-09-19 12:43   ` Chris Wilson
2017-09-25  8:40     ` Chris Wilson
2017-09-28 10:40       ` Joonas Lahtinen
2017-09-11  8:41 ` [PATCH 11/11] drm/i915: Try a minimal attempt to insert the whole object for relocations Chris Wilson
2017-09-11  9:16 ` ✗ Fi.CI.BAT: warning for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM Patchwork
2017-09-11 13:16 ` ✓ Fi.CI.BAT: success " Patchwork
2017-09-11 16:02 ` ✗ Fi.CI.IGT: warning " Patchwork
2017-09-14 15:30 ` ✓ Fi.CI.BAT: success for series starting with [01/11] drm/i915: Mark the userptr invalidate workqueue as WQ_MEM_RECLAIM (rev2) Patchwork
2017-09-14 19:39 ` ✗ Fi.CI.IGT: failure " Patchwork
