All of lore.kernel.org
* [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
@ 2018-12-03 11:36 Chris Wilson
  2018-12-03 11:36 ` [PATCH 2/8] drm/i915: Complete the fences as they are cancelled due to wedging Chris Wilson
                   ` (15 more replies)
  0 siblings, 16 replies; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 11:36 UTC (permalink / raw)
  To: intel-gfx

Change the on-cpu check to on-runqueue to catch the case where the
waiter has been woken (and has reset its current_state back to
TASK_UNINTERRUPTIBLE to perform the seqno check) but is off the cpu
because it was preempted; a preempted task is still on the runqueue
and must not be mistaken for one that is asleep.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 84bf8d827136..447c5256f63a 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -27,11 +27,7 @@
 
 #include "i915_drv.h"
 
-#ifdef CONFIG_SMP
-#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL && !(tsk)->on_cpu)
-#else
-#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL)
-#endif
+#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL && !(tsk)->on_rq)
 
 static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b)
 {
-- 
2.20.0.rc1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/8] drm/i915: Complete the fences as they are cancelled due to wedging
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
@ 2018-12-03 11:36 ` Chris Wilson
  2018-12-03 17:11   ` Tvrtko Ursulin
  2018-12-03 11:36 ` [PATCH 3/8] drm/i915/ringbuffer: Clear semaphore sync registers on ring init Chris Wilson
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 11:36 UTC (permalink / raw)
  To: intel-gfx

We inspect the requests under the assumption that they will be marked
as completed when they are removed from the queue. Currently, however,
in the process of wedging, requests are removed from the queue before
they are completed, so rearrange the code to complete the fences before
the locks are dropped.

<1>[  354.473346] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250
<6>[  354.473363] PGD 0 P4D 0
<4>[  354.473370] Oops: 0000 [#1] PREEMPT SMP PTI
<4>[  354.473380] CPU: 0 PID: 4470 Comm: gem_eio Tainted: G     U            4.20.0-rc4-CI-CI_DRM_5216+ #1
<4>[  354.473393] Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, BIOS JYGLKCPX.86A.0027.2018.0125.1347 01/25/2018
<4>[  354.473480] RIP: 0010:__i915_schedule+0x311/0x5e0 [i915]
<4>[  354.473490] Code: 49 89 44 24 20 4d 89 4c 24 28 4d 89 29 44 39 b3 a0 04 00 00 7d 3a 41 8b 44 24 78 85 c0 74 13 48 8b 93 78 04 00 00 48 83 e2 fc <39> 82 50 02 00 00 79 1e 44 89 b3 a0 04 00 00 48 8d bb d0 03 00 00
<4>[  354.473515] RSP: 0018:ffffc900001bba90 EFLAGS: 00010046
<4>[  354.473524] RAX: 0000000000000003 RBX: ffff8882624c8008 RCX: f34a737800000000
<4>[  354.473535] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8882624c8048
<4>[  354.473545] RBP: ffffc900001bbab0 R08: 000000005963f1f1 R09: 0000000000000000
<4>[  354.473556] R10: ffffc900001bba10 R11: ffff8882624c8060 R12: ffff88824fdd7b98
<4>[  354.473567] R13: ffff88824fdd7bb8 R14: 0000000000000001 R15: ffff88824fdd7750
<4>[  354.473578] FS:  00007f44b4b5b980(0000) GS:ffff888277e00000(0000) knlGS:0000000000000000
<4>[  354.473590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[  354.473599] CR2: 0000000000000250 CR3: 000000026976e000 CR4: 0000000000340ef0
<4>[  354.473611] Call Trace:
<4>[  354.473622]  ? lock_acquire+0xa6/0x1c0
<4>[  354.473677]  ? i915_schedule_bump_priority+0x57/0xd0 [i915]
<4>[  354.473736]  i915_schedule_bump_priority+0x72/0xd0 [i915]
<4>[  354.473792]  i915_request_wait+0x4db/0x840 [i915]
<4>[  354.473804]  ? get_pwq.isra.4+0x2c/0x50
<4>[  354.473813]  ? ___preempt_schedule+0x16/0x18
<4>[  354.473824]  ? wake_up_q+0x70/0x70
<4>[  354.473831]  ? wake_up_q+0x70/0x70
<4>[  354.473882]  ? gen6_rps_boost+0x118/0x120 [i915]
<4>[  354.473936]  i915_gem_object_wait_fence+0x8a/0x110 [i915]
<4>[  354.473991]  i915_gem_object_wait+0x113/0x500 [i915]
<4>[  354.474047]  i915_gem_wait_ioctl+0x11c/0x2f0 [i915]
<4>[  354.474101]  ? i915_gem_unset_wedged+0x210/0x210 [i915]
<4>[  354.474113]  drm_ioctl_kernel+0x81/0xf0
<4>[  354.474123]  drm_ioctl+0x2de/0x390
<4>[  354.474175]  ? i915_gem_unset_wedged+0x210/0x210 [i915]
<4>[  354.474187]  ? finish_task_switch+0x95/0x260
<4>[  354.474197]  ? lock_acquire+0xa6/0x1c0
<4>[  354.474207]  do_vfs_ioctl+0xa0/0x6e0
<4>[  354.474217]  ? __fget+0xfc/0x1e0
<4>[  354.474225]  ksys_ioctl+0x35/0x60
<4>[  354.474233]  __x64_sys_ioctl+0x11/0x20
<4>[  354.474241]  do_syscall_64+0x55/0x190
<4>[  354.474251]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[  354.474260] RIP: 0033:0x7f44b3de65d7
<4>[  354.474267] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
<4>[  354.474293] RSP: 002b:00007fff974948e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
<4>[  354.474305] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f44b3de65d7
<4>[  354.474316] RDX: 00007fff97494940 RSI: 00000000c010646c RDI: 0000000000000007
<4>[  354.474327] RBP: 00007fff97494940 R08: 0000000000000000 R09: 00007f44b40bbc40
<4>[  354.474337] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c010646c
<4>[  354.474348] R13: 0000000000000007 R14: 0000000000000000 R15: 0000000000000000

v2: Avoid floating requests.
v3: Can't call dma_fence_signal() under the timeline lock!
v4: Can't call dma_fence_signal() from inside another fence either.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c         | 54 +++++--------------------
 drivers/gpu/drm/i915/intel_lrc.c        | 13 ++++--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 13 +++++-
 3 files changed, 31 insertions(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c55b1f75c980..834240a9b262 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -3308,16 +3308,6 @@ void i915_gem_reset_finish(struct drm_i915_private *dev_priv)
 }
 
 static void nop_submit_request(struct i915_request *request)
-{
-	GEM_TRACE("%s fence %llx:%d -> -EIO\n",
-		  request->engine->name,
-		  request->fence.context, request->fence.seqno);
-	dma_fence_set_error(&request->fence, -EIO);
-
-	i915_request_submit(request);
-}
-
-static void nop_complete_submit_request(struct i915_request *request)
 {
 	unsigned long flags;
 
@@ -3354,57 +3344,33 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
 	 * rolling the global seqno forward (since this would complete requests
 	 * for which we haven't set the fence error to EIO yet).
 	 */
-	for_each_engine(engine, i915, id) {
+	for_each_engine(engine, i915, id)
 		i915_gem_reset_prepare_engine(engine);
 
-		engine->submit_request = nop_submit_request;
-		engine->schedule = NULL;
-	}
-	i915->caps.scheduler = 0;
-
 	/* Even if the GPU reset fails, it should still stop the engines */
 	if (INTEL_GEN(i915) >= 5)
 		intel_gpu_reset(i915, ALL_ENGINES);
 
-	/*
-	 * Make sure no one is running the old callback before we proceed with
-	 * cancelling requests and resetting the completion tracking. Otherwise
-	 * we might submit a request to the hardware which never completes.
-	 */
-	synchronize_rcu();
-
 	for_each_engine(engine, i915, id) {
-		/* Mark all executing requests as skipped */
-		engine->cancel_requests(engine);
-
-		/*
-		 * Only once we've force-cancelled all in-flight requests can we
-		 * start to complete all requests.
-		 */
-		engine->submit_request = nop_complete_submit_request;
+		engine->submit_request = nop_submit_request;
+		engine->schedule = NULL;
 	}
+	i915->caps.scheduler = 0;
 
 	/*
 	 * Make sure no request can slip through without getting completed by
 	 * either this call here to intel_engine_init_global_seqno, or the one
-	 * in nop_complete_submit_request.
+	 * in nop_submit_request.
 	 */
 	synchronize_rcu();
 
-	for_each_engine(engine, i915, id) {
-		unsigned long flags;
-
-		/*
-		 * Mark all pending requests as complete so that any concurrent
-		 * (lockless) lookup doesn't try and wait upon the request as we
-		 * reset it.
-		 */
-		spin_lock_irqsave(&engine->timeline.lock, flags);
-		intel_engine_init_global_seqno(engine,
-					       intel_engine_last_submit(engine));
-		spin_unlock_irqrestore(&engine->timeline.lock, flags);
+	/* Mark all executing requests as skipped */
+	for_each_engine(engine, i915, id)
+		engine->cancel_requests(engine);
 
+	for_each_engine(engine, i915, id) {
 		i915_gem_reset_finish_engine(engine);
+		intel_engine_wakeup(engine);
 	}
 
 out:
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 11f4e6148557..b5511a054f30 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -818,8 +818,11 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 	/* Mark all executing requests as skipped. */
 	list_for_each_entry(rq, &engine->timeline.requests, link) {
 		GEM_BUG_ON(!rq->global_seqno);
-		if (!i915_request_completed(rq))
-			dma_fence_set_error(&rq->fence, -EIO);
+
+		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags))
+			continue;
+
+		dma_fence_set_error(&rq->fence, -EIO);
 	}
 
 	/* Flush the queued requests to the timeline list (for retiring). */
@@ -830,8 +833,8 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 		priolist_for_each_request_consume(rq, rn, p, i) {
 			list_del_init(&rq->sched.link);
 
-			dma_fence_set_error(&rq->fence, -EIO);
 			__i915_request_submit(rq);
+			dma_fence_set_error(&rq->fence, -EIO);
 		}
 
 		rb_erase_cached(&p->node, &execlists->queue);
@@ -839,6 +842,10 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
 			kmem_cache_free(engine->i915->priorities, p);
 	}
 
+	intel_write_status_page(engine,
+				I915_GEM_HWS_INDEX,
+				intel_engine_last_submit(engine));
+
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
 	execlists->queue_priority = INT_MIN;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 28ae1e436ea6..992889f9e0ff 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -748,9 +748,18 @@ static void cancel_requests(struct intel_engine_cs *engine)
 	/* Mark all submitted requests as skipped. */
 	list_for_each_entry(request, &engine->timeline.requests, link) {
 		GEM_BUG_ON(!request->global_seqno);
-		if (!i915_request_completed(request))
-			dma_fence_set_error(&request->fence, -EIO);
+
+		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+			     &request->fence.flags))
+			continue;
+
+		dma_fence_set_error(&request->fence, -EIO);
 	}
+
+	intel_write_status_page(engine,
+				I915_GEM_HWS_INDEX,
+				intel_engine_last_submit(engine));
+
 	/* Remaining _unready_ requests will be nop'ed when submitted */
 
 	spin_unlock_irqrestore(&engine->timeline.lock, flags);
-- 
2.20.0.rc1

* [PATCH 3/8] drm/i915/ringbuffer: Clear semaphore sync registers on ring init
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
  2018-12-03 11:36 ` [PATCH 2/8] drm/i915: Complete the fences as they are cancelled due to wedging Chris Wilson
@ 2018-12-03 11:36 ` Chris Wilson
  2018-12-03 12:05   ` Mika Kuoppala
  2018-12-03 11:36 ` [PATCH 4/8] drm/i915: Allocate a common scratch page Chris Wilson
                   ` (13 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 11:36 UTC (permalink / raw)
  To: intel-gfx

Ensure that the sync registers are cleared every time we restart the
ring to avoid stale values from creeping in from random neutrinos.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 992889f9e0ff..81b10d85b738 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -529,6 +529,13 @@ static int init_ring_common(struct intel_engine_cs *engine)
 
 	intel_engine_reset_breadcrumbs(engine);
 
+	if (HAS_LEGACY_SEMAPHORES(engine->i915)) {
+		I915_WRITE(RING_SYNC_0(engine->mmio_base), 0);
+		I915_WRITE(RING_SYNC_1(engine->mmio_base), 0);
+		if (HAS_VEBOX(dev_priv))
+			I915_WRITE(RING_SYNC_2(engine->mmio_base), 0);
+	}
+
 	/* Enforce ordering by reading HEAD register back */
 	I915_READ_HEAD(engine);
 
-- 
2.20.0.rc1

* [PATCH 4/8] drm/i915: Allocate a common scratch page
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
  2018-12-03 11:36 ` [PATCH 2/8] drm/i915: Complete the fences as they are cancelled due to wedging Chris Wilson
  2018-12-03 11:36 ` [PATCH 3/8] drm/i915/ringbuffer: Clear semaphore sync registers on ring init Chris Wilson
@ 2018-12-03 11:36 ` Chris Wilson
  2018-12-03 15:28   ` Mika Kuoppala
  2018-12-03 17:29   ` [PATCH v2] " Chris Wilson
  2018-12-03 11:36 ` [PATCH 5/8] drm/i915: Flush GPU relocs harder for gen3 Chris Wilson
                   ` (12 subsequent siblings)
  15 siblings, 2 replies; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 11:36 UTC (permalink / raw)
  To: intel-gfx

Currently we allocate a scratch page for each engine, but since we only
ever write into it for post-sync operations, it is not exposed to
userspace nor do we care for coherency. As we do not care about its
contents, we can use one page for all engines, reducing our allocations
and avoiding the complication of assuming per-engine isolation.

For later use, it simplifies engine initialisation (by removing the
allocation that required struct_mutex!) and means that we can always rely
on there being a scratch page.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  7 ++++
 drivers/gpu/drm/i915/i915_gem.c         | 50 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c  | 42 ---------------------
 drivers/gpu/drm/i915/intel_lrc.c        | 17 +++------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 33 +++++-----------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  5 ---
 7 files changed, 71 insertions(+), 85 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d45475287130..0ec65cc48b5a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1996,6 +1996,8 @@ struct drm_i915_private {
 		struct delayed_work idle_work;
 
 		ktime_t last_init_time;
+
+		struct i915_vma *scratch;
 	} gt;
 
 	/* perform PHY state sanity checks? */
@@ -3724,4 +3726,9 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
 		return I915_HWS_CSB_WRITE_INDEX;
 }
 
+static inline u32 i915_scratch_offset(const struct drm_i915_private *i915)
+{
+	return i915_ggtt_offset(i915->gt.scratch);
+}
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 834240a9b262..cca4285e3329 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5495,6 +5495,44 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	goto out_ctx;
 }
 
+static int
+i915_gem_init_scratch(struct drm_i915_private *i915, unsigned int size)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int ret;
+
+	obj = i915_gem_object_create_stolen(i915, size);
+	if (!obj)
+		obj = i915_gem_object_create_internal(i915, size);
+	if (IS_ERR(obj)) {
+		DRM_ERROR("Failed to allocate scratch page\n");
+		return PTR_ERR(obj);
+	}
+
+	vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto err_unref;
+	}
+
+	ret = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH);
+	if (ret)
+		goto err_unref;
+
+	i915->gt.scratch = vma;
+	return 0;
+
+err_unref:
+	i915_gem_object_put(obj);
+	return ret;
+}
+
+static void i915_gem_fini_scratch(struct drm_i915_private *i915)
+{
+	i915_vma_unpin_and_release(&i915->gt.scratch, 0);
+}
+
 int i915_gem_init(struct drm_i915_private *dev_priv)
 {
 	int ret;
@@ -5541,12 +5579,19 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		goto err_unlock;
 	}
 
-	ret = i915_gem_contexts_init(dev_priv);
+	ret = i915_gem_init_scratch(dev_priv,
+				    IS_GEN2(dev_priv) ? SZ_256K : PAGE_SIZE);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
 		goto err_ggtt;
 	}
 
+	ret = i915_gem_contexts_init(dev_priv);
+	if (ret) {
+		GEM_BUG_ON(ret == -EIO);
+		goto err_scratch;
+	}
+
 	ret = intel_engines_init(dev_priv);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
@@ -5619,6 +5664,8 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 err_context:
 	if (ret != -EIO)
 		i915_gem_contexts_fini(dev_priv);
+err_scratch:
+	i915_gem_fini_scratch(dev_priv);
 err_ggtt:
 err_unlock:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
@@ -5670,6 +5717,7 @@ void i915_gem_fini(struct drm_i915_private *dev_priv)
 	intel_uc_fini(dev_priv);
 	i915_gem_cleanup_engines(dev_priv);
 	i915_gem_contexts_fini(dev_priv);
+	i915_gem_fini_scratch(dev_priv);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 	intel_cleanup_gt_powersave(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index a6885a59568b..07465123c166 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1571,7 +1571,7 @@ static void gem_record_rings(struct i915_gpu_state *error)
 			if (HAS_BROKEN_CS_TLB(i915))
 				ee->wa_batchbuffer =
 					i915_error_object_create(i915,
-								 engine->scratch);
+								 i915->gt.scratch);
 			request_record_user_bo(request, ee);
 
 			ee->ctx =
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 759c0fd58f8c..2390985384d6 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -493,46 +493,6 @@ void intel_engine_setup_common(struct intel_engine_cs *engine)
 	intel_engine_init_cmd_parser(engine);
 }
 
-int intel_engine_create_scratch(struct intel_engine_cs *engine,
-				unsigned int size)
-{
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-	int ret;
-
-	WARN_ON(engine->scratch);
-
-	obj = i915_gem_object_create_stolen(engine->i915, size);
-	if (!obj)
-		obj = i915_gem_object_create_internal(engine->i915, size);
-	if (IS_ERR(obj)) {
-		DRM_ERROR("Failed to allocate scratch page\n");
-		return PTR_ERR(obj);
-	}
-
-	vma = i915_vma_instance(obj, &engine->i915->ggtt.vm, NULL);
-	if (IS_ERR(vma)) {
-		ret = PTR_ERR(vma);
-		goto err_unref;
-	}
-
-	ret = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH);
-	if (ret)
-		goto err_unref;
-
-	engine->scratch = vma;
-	return 0;
-
-err_unref:
-	i915_gem_object_put(obj);
-	return ret;
-}
-
-void intel_engine_cleanup_scratch(struct intel_engine_cs *engine)
-{
-	i915_vma_unpin_and_release(&engine->scratch, 0);
-}
-
 static void cleanup_status_page(struct intel_engine_cs *engine)
 {
 	if (HWS_NEEDS_PHYSICAL(engine->i915)) {
@@ -707,8 +667,6 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = engine->i915;
 
-	intel_engine_cleanup_scratch(engine);
-
 	cleanup_status_page(engine);
 
 	intel_engine_fini_breadcrumbs(engine);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b5511a054f30..de070dca4033 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1286,9 +1286,10 @@ static int execlists_request_alloc(struct i915_request *request)
 static u32 *
 gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
 {
+	/* NB no one else is allowed to scribble over scratch + 256! */
 	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
 	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
+	*batch++ = i915_scratch_offset(engine->i915) + 256;
 	*batch++ = 0;
 
 	*batch++ = MI_LOAD_REGISTER_IMM(1);
@@ -1302,7 +1303,7 @@ gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
 
 	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
 	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
+	*batch++ = i915_scratch_offset(engine->i915) + 256;
 	*batch++ = 0;
 
 	return batch;
@@ -1339,7 +1340,7 @@ static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
 				       PIPE_CONTROL_GLOBAL_GTT_IVB |
 				       PIPE_CONTROL_CS_STALL |
 				       PIPE_CONTROL_QW_WRITE,
-				       i915_ggtt_offset(engine->scratch) +
+				       i915_scratch_offset(engine->i915) +
 				       2 * CACHELINE_BYTES);
 
 	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
@@ -1969,7 +1970,7 @@ static int gen8_emit_flush_render(struct i915_request *request,
 {
 	struct intel_engine_cs *engine = request->engine;
 	u32 scratch_addr =
-		i915_ggtt_offset(engine->scratch) + 2 * CACHELINE_BYTES;
+		i915_scratch_offset(engine->i915) + 2 * CACHELINE_BYTES;
 	bool vf_flush_wa = false, dc_flush_wa = false;
 	u32 *cs, flags = 0;
 	int len;
@@ -2306,10 +2307,6 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
 	if (ret)
 		return ret;
 
-	ret = intel_engine_create_scratch(engine, PAGE_SIZE);
-	if (ret)
-		goto err_cleanup_common;
-
 	ret = intel_init_workaround_bb(engine);
 	if (ret) {
 		/*
@@ -2322,10 +2319,6 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
 	}
 
 	return 0;
-
-err_cleanup_common:
-	intel_engine_cleanup_common(engine);
-	return ret;
 }
 
 int logical_xcs_ring_init(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 81b10d85b738..a3d3126a3938 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -150,8 +150,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
 	 */
 	if (mode & EMIT_INVALIDATE) {
 		*cs++ = GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE;
-		*cs++ = i915_ggtt_offset(rq->engine->scratch) |
-			PIPE_CONTROL_GLOBAL_GTT;
+		*cs++ = i915_scratch_offset(rq->i915) | PIPE_CONTROL_GLOBAL_GTT;
 		*cs++ = 0;
 		*cs++ = 0;
 
@@ -159,8 +158,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
 			*cs++ = MI_FLUSH;
 
 		*cs++ = GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE;
-		*cs++ = i915_ggtt_offset(rq->engine->scratch) |
-			PIPE_CONTROL_GLOBAL_GTT;
+		*cs++ = i915_scratch_offset(rq->i915) | PIPE_CONTROL_GLOBAL_GTT;
 		*cs++ = 0;
 		*cs++ = 0;
 	}
@@ -212,8 +210,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
 static int
 intel_emit_post_sync_nonzero_flush(struct i915_request *rq)
 {
-	u32 scratch_addr =
-		i915_ggtt_offset(rq->engine->scratch) + 2 * CACHELINE_BYTES;
+	u32 scratch_addr = i915_scratch_offset(rq->i915) + 2 * CACHELINE_BYTES;
 	u32 *cs;
 
 	cs = intel_ring_begin(rq, 6);
@@ -246,8 +243,7 @@ intel_emit_post_sync_nonzero_flush(struct i915_request *rq)
 static int
 gen6_render_ring_flush(struct i915_request *rq, u32 mode)
 {
-	u32 scratch_addr =
-		i915_ggtt_offset(rq->engine->scratch) + 2 * CACHELINE_BYTES;
+	u32 scratch_addr = i915_scratch_offset(rq->i915) + 2 * CACHELINE_BYTES;
 	u32 *cs, flags = 0;
 	int ret;
 
@@ -316,8 +312,7 @@ gen7_render_ring_cs_stall_wa(struct i915_request *rq)
 static int
 gen7_render_ring_flush(struct i915_request *rq, u32 mode)
 {
-	u32 scratch_addr =
-		i915_ggtt_offset(rq->engine->scratch) + 2 * CACHELINE_BYTES;
+	u32 scratch_addr = i915_scratch_offset(rq->i915) + 2 * CACHELINE_BYTES;
 	u32 *cs, flags = 0;
 
 	/*
@@ -1002,7 +997,7 @@ i830_emit_bb_start(struct i915_request *rq,
 		   u64 offset, u32 len,
 		   unsigned int dispatch_flags)
 {
-	u32 *cs, cs_offset = i915_ggtt_offset(rq->engine->scratch);
+	u32 *cs, cs_offset = i915_scratch_offset(rq->i915);
 
 	cs = intel_ring_begin(rq, 6);
 	if (IS_ERR(cs))
@@ -1459,7 +1454,6 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 {
 	struct i915_timeline *timeline;
 	struct intel_ring *ring;
-	unsigned int size;
 	int err;
 
 	intel_engine_setup_common(engine);
@@ -1484,21 +1478,12 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 	GEM_BUG_ON(engine->buffer);
 	engine->buffer = ring;
 
-	size = PAGE_SIZE;
-	if (HAS_BROKEN_CS_TLB(engine->i915))
-		size = I830_WA_SIZE;
-	err = intel_engine_create_scratch(engine, size);
-	if (err)
-		goto err_unpin;
-
 	err = intel_engine_init_common(engine);
 	if (err)
-		goto err_scratch;
+		goto err_unpin;
 
 	return 0;
 
-err_scratch:
-	intel_engine_cleanup_scratch(engine);
 err_unpin:
 	intel_ring_unpin(ring);
 err_ring:
@@ -1572,7 +1557,7 @@ static int flush_pd_dir(struct i915_request *rq)
 	/* Stall until the page table load is complete */
 	*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
 	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_BASE(engine));
-	*cs++ = i915_ggtt_offset(engine->scratch);
+	*cs++ = i915_scratch_offset(rq->i915);
 	*cs++ = MI_NOOP;
 
 	intel_ring_advance(rq, cs);
@@ -1681,7 +1666,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 			/* Insert a delay before the next switch! */
 			*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
 			*cs++ = i915_mmio_reg_offset(last_reg);
-			*cs++ = i915_ggtt_offset(engine->scratch);
+			*cs++ = i915_scratch_offset(rq->i915);
 			*cs++ = MI_NOOP;
 		}
 		*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 8a2270b209b0..970fb5c05c36 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -451,7 +451,6 @@ struct intel_engine_cs {
 
 	struct intel_hw_status_page status_page;
 	struct i915_ctx_workarounds wa_ctx;
-	struct i915_vma *scratch;
 
 	u32             irq_keep_mask; /* always keep these interrupts */
 	u32		irq_enable_mask; /* bitmask to enable ring interrupt */
@@ -908,10 +907,6 @@ void intel_engine_setup_common(struct intel_engine_cs *engine);
 int intel_engine_init_common(struct intel_engine_cs *engine);
 void intel_engine_cleanup_common(struct intel_engine_cs *engine);
 
-int intel_engine_create_scratch(struct intel_engine_cs *engine,
-				unsigned int size);
-void intel_engine_cleanup_scratch(struct intel_engine_cs *engine);
-
 int intel_init_render_ring_buffer(struct intel_engine_cs *engine);
 int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine);
 int intel_init_blt_ring_buffer(struct intel_engine_cs *engine);
-- 
2.20.0.rc1

* [PATCH 5/8] drm/i915: Flush GPU relocs harder for gen3
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (2 preceding siblings ...)
  2018-12-03 11:36 ` [PATCH 4/8] drm/i915: Allocate a common scratch page Chris Wilson
@ 2018-12-03 11:36 ` Chris Wilson
  2018-12-03 11:36 ` [PATCH 6/8] drm/i915/selftests: Terminate hangcheck sanitycheck forcibly Chris Wilson
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 11:36 UTC (permalink / raw)
  To: intel-gfx

Adding an extra MI_STORE_DWORD_IMM to the gpu relocation path for gen3
was good, but still not good enough. To survive 24+ hours under test we
needed to perform not one, not two, but three extra store-dws. Doing so
for each GPU relocation was a little unsightly, and since we need to
worry about userspace hitting the same issues, we should fold the dummy
store-dws into the EMIT_FLUSH instead.
Fixes: 7dd4f6729f92 ("drm/i915: Async GPU relocation processing")
References: 7fa28e146994 ("drm/i915: Write GPU relocs harder with gen3")
Testcase: igt/gem_tiled_fence_blits # blb/pnv
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  7 +------
 drivers/gpu/drm/i915/intel_ringbuffer.c    | 15 ++++++++++++---
 2 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index d4fac09095f8..1aaccbe7e1de 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1268,7 +1268,7 @@ relocate_entry(struct i915_vma *vma,
 		else if (gen >= 4)
 			len = 4;
 		else
-			len = 6;
+			len = 3;
 
 		batch = reloc_gpu(eb, vma, len);
 		if (IS_ERR(batch))
@@ -1309,11 +1309,6 @@ relocate_entry(struct i915_vma *vma,
 			*batch++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
 			*batch++ = addr;
 			*batch++ = target_offset;
-
-			/* And again for good measure (blb/pnv) */
-			*batch++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
-			*batch++ = addr;
-			*batch++ = target_offset;
 		}
 
 		goto out;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index a3d3126a3938..37bd05cef0e9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -69,19 +69,28 @@ unsigned int intel_ring_update_space(struct intel_ring *ring)
 static int
 gen2_render_ring_flush(struct i915_request *rq, u32 mode)
 {
+	unsigned int num_store_dw;
 	u32 cmd, *cs;
 
 	cmd = MI_FLUSH;
-
+	num_store_dw = 0;
 	if (mode & EMIT_INVALIDATE)
 		cmd |= MI_READ_FLUSH;
+	if (mode & EMIT_FLUSH)
+		num_store_dw = 4;
 
-	cs = intel_ring_begin(rq, 2);
+	cs = intel_ring_begin(rq, 2 + 3 * num_store_dw);
 	if (IS_ERR(cs))
 		return PTR_ERR(cs);
 
 	*cs++ = cmd;
-	*cs++ = MI_NOOP;
+	while (num_store_dw--) {
+		*cs++ = MI_STORE_DWORD_IMM | MI_MEM_VIRTUAL;
+		*cs++ = i915_scratch_offset(rq->i915);
+		*cs++ = 0;
+	}
+	*cs++ = MI_FLUSH | MI_NO_WRITE_FLUSH;
+
 	intel_ring_advance(rq, cs);
 
 	return 0;
-- 
2.20.0.rc1

* [PATCH 6/8] drm/i915/selftests: Terminate hangcheck sanitycheck forcibly
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (3 preceding siblings ...)
  2018-12-03 11:36 ` [PATCH 5/8] drm/i915: Flush GPU relocs harder for gen3 Chris Wilson
@ 2018-12-03 11:36 ` Chris Wilson
  2018-12-03 12:09   ` Mika Kuoppala
  2018-12-03 11:37 ` [PATCH 7/8] drm/i915/selftests: Reorder request allocation vs vma pinning Chris Wilson
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 11:36 UTC (permalink / raw)
  To: intel-gfx

If all else fails and we are stuck eternally waiting for the undying
request, abandon all hope.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index defe671130ab..a48fbe2557ea 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -308,6 +308,7 @@ static int igt_hang_sanitycheck(void *arg)
 		goto unlock;
 
 	for_each_engine(engine, i915, id) {
+		struct igt_wedge_me w;
 		long timeout;
 
 		if (!intel_engine_can_store_dword(engine))
@@ -328,9 +329,14 @@ static int igt_hang_sanitycheck(void *arg)
 
 		i915_request_add(rq);
 
-		timeout = i915_request_wait(rq,
-					    I915_WAIT_LOCKED,
-					    MAX_SCHEDULE_TIMEOUT);
+		timeout = 0;
+		igt_wedge_on_timeout(&w, i915, HZ / 10 /* 100ms timeout*/)
+			timeout = i915_request_wait(rq,
+						    I915_WAIT_LOCKED,
+						    MAX_SCHEDULE_TIMEOUT);
+		if (i915_terminally_wedged(&i915->gpu_error))
+			timeout = -EIO;
+
 		i915_request_put(rq);
 
 		if (timeout < 0) {
-- 
2.20.0.rc1


* [PATCH 7/8] drm/i915/selftests: Reorder request allocation vs vma pinning
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (4 preceding siblings ...)
  2018-12-03 11:36 ` [PATCH 6/8] drm/i915/selftests: Terminate hangcheck sanitycheck forcibly Chris Wilson
@ 2018-12-03 11:37 ` Chris Wilson
  2018-12-04 11:06   ` Tvrtko Ursulin
  2018-12-03 11:37 ` [PATCH 8/8] drm/i915: Pipeline PDP updates for Braswell Chris Wilson
                   ` (9 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 11:37 UTC (permalink / raw)
  To: intel-gfx

Impose the restriction that all vmas are pinned for a request prior to
its allocation. This simplifies request construction and should
facilitate unravelling the lock interdependencies later.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/selftests/huge_pages.c   |  31 +++--
 drivers/gpu/drm/i915/selftests/igt_spinner.c  |  86 ++++++------
 .../gpu/drm/i915/selftests/intel_hangcheck.c  | 123 +++++++++---------
 3 files changed, 119 insertions(+), 121 deletions(-)

diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
index 26c065c8d2c0..a0c7cbc212ba 100644
--- a/drivers/gpu/drm/i915/selftests/huge_pages.c
+++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
@@ -972,7 +972,6 @@ static int gpu_write(struct i915_vma *vma,
 {
 	struct i915_request *rq;
 	struct i915_vma *batch;
-	int flags = 0;
 	int err;
 
 	GEM_BUG_ON(!intel_engine_can_store_dword(engine));
@@ -981,14 +980,14 @@ static int gpu_write(struct i915_vma *vma,
 	if (err)
 		return err;
 
-	rq = i915_request_alloc(engine, ctx);
-	if (IS_ERR(rq))
-		return PTR_ERR(rq);
-
 	batch = gpu_write_dw(vma, dword * sizeof(u32), value);
-	if (IS_ERR(batch)) {
-		err = PTR_ERR(batch);
-		goto err_request;
+	if (IS_ERR(batch))
+		return PTR_ERR(batch);
+
+	rq = i915_request_alloc(engine, ctx);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto err_batch;
 	}
 
 	err = i915_vma_move_to_active(batch, rq, 0);
@@ -996,21 +995,21 @@ static int gpu_write(struct i915_vma *vma,
 		goto err_request;
 
 	i915_gem_object_set_active_reference(batch->obj);
-	i915_vma_unpin(batch);
-	i915_vma_close(batch);
 
-	err = engine->emit_bb_start(rq,
-				    batch->node.start, batch->node.size,
-				    flags);
+	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
 	if (err)
 		goto err_request;
 
-	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
+	err = engine->emit_bb_start(rq,
+				    batch->node.start, batch->node.size,
+				    0);
+err_request:
 	if (err)
 		i915_request_skip(rq, err);
-
-err_request:
 	i915_request_add(rq);
+err_batch:
+	i915_vma_unpin(batch);
+	i915_vma_close(batch);
 
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
index 8cd34f6e6859..0e70df0230b8 100644
--- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
+++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
@@ -68,48 +68,65 @@ static u64 hws_address(const struct i915_vma *hws,
 	return hws->node.start + seqno_offset(rq->fence.context);
 }
 
-static int emit_recurse_batch(struct igt_spinner *spin,
-			      struct i915_request *rq,
-			      u32 arbitration_command)
+static int move_to_active(struct i915_vma *vma,
+			  struct i915_request *rq,
+			  unsigned int flags)
 {
-	struct i915_address_space *vm = &rq->gem_context->ppgtt->vm;
+	int err;
+
+	err = i915_vma_move_to_active(vma, rq, flags);
+	if (err)
+		return err;
+
+	if (!i915_gem_object_has_active_reference(vma->obj)) {
+		i915_gem_object_get(vma->obj);
+		i915_gem_object_set_active_reference(vma->obj);
+	}
+
+	return 0;
+}
+
+struct i915_request *
+igt_spinner_create_request(struct igt_spinner *spin,
+			   struct i915_gem_context *ctx,
+			   struct intel_engine_cs *engine,
+			   u32 arbitration_command)
+{
+	struct i915_address_space *vm = &ctx->ppgtt->vm;
+	struct i915_request *rq = NULL;
 	struct i915_vma *hws, *vma;
 	u32 *batch;
 	int err;
 
 	vma = i915_vma_instance(spin->obj, vm, NULL);
 	if (IS_ERR(vma))
-		return PTR_ERR(vma);
+		return ERR_CAST(vma);
 
 	hws = i915_vma_instance(spin->hws, vm, NULL);
 	if (IS_ERR(hws))
-		return PTR_ERR(hws);
+		return ERR_CAST(hws);
 
 	err = i915_vma_pin(vma, 0, 0, PIN_USER);
 	if (err)
-		return err;
+		return ERR_PTR(err);
 
 	err = i915_vma_pin(hws, 0, 0, PIN_USER);
 	if (err)
 		goto unpin_vma;
 
-	err = i915_vma_move_to_active(vma, rq, 0);
-	if (err)
+	rq = i915_request_alloc(engine, ctx);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
 		goto unpin_hws;
-
-	if (!i915_gem_object_has_active_reference(vma->obj)) {
-		i915_gem_object_get(vma->obj);
-		i915_gem_object_set_active_reference(vma->obj);
 	}
 
-	err = i915_vma_move_to_active(hws, rq, 0);
+	err = move_to_active(vma, rq, 0);
 	if (err)
-		goto unpin_hws;
+		goto cancel_rq;
 
-	if (!i915_gem_object_has_active_reference(hws->obj)) {
-		i915_gem_object_get(hws->obj);
-		i915_gem_object_set_active_reference(hws->obj);
-	}
+	err = move_to_active(hws, rq, 0);
+	if (err)
+		goto cancel_rq;
 
 	batch = spin->batch;
 
@@ -127,35 +144,18 @@ static int emit_recurse_batch(struct igt_spinner *spin,
 
 	i915_gem_chipset_flush(spin->i915);
 
-	err = rq->engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
+	err = engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
 
+cancel_rq:
+	if (err) {
+		i915_request_skip(rq, err);
+		i915_request_add(rq);
+	}
 unpin_hws:
 	i915_vma_unpin(hws);
 unpin_vma:
 	i915_vma_unpin(vma);
-	return err;
-}
-
-struct i915_request *
-igt_spinner_create_request(struct igt_spinner *spin,
-			   struct i915_gem_context *ctx,
-			   struct intel_engine_cs *engine,
-			   u32 arbitration_command)
-{
-	struct i915_request *rq;
-	int err;
-
-	rq = i915_request_alloc(engine, ctx);
-	if (IS_ERR(rq))
-		return rq;
-
-	err = emit_recurse_batch(spin, rq, arbitration_command);
-	if (err) {
-		i915_request_add(rq);
-		return ERR_PTR(err);
-	}
-
-	return rq;
+	return err ? ERR_PTR(err) : rq;
 }
 
 static u32
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index a48fbe2557ea..0ff813ad3462 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -102,52 +102,87 @@ static u64 hws_address(const struct i915_vma *hws,
 	return hws->node.start + offset_in_page(sizeof(u32)*rq->fence.context);
 }
 
-static int emit_recurse_batch(struct hang *h,
-			      struct i915_request *rq)
+static int move_to_active(struct i915_vma *vma,
+			  struct i915_request *rq,
+			  unsigned int flags)
+{
+	int err;
+
+	err = i915_vma_move_to_active(vma, rq, flags);
+	if (err)
+		return err;
+
+	if (!i915_gem_object_has_active_reference(vma->obj)) {
+		i915_gem_object_get(vma->obj);
+		i915_gem_object_set_active_reference(vma->obj);
+	}
+
+	return 0;
+}
+
+static struct i915_request *
+hang_create_request(struct hang *h, struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = h->i915;
 	struct i915_address_space *vm =
-		rq->gem_context->ppgtt ?
-		&rq->gem_context->ppgtt->vm :
-		&i915->ggtt.vm;
+		h->ctx->ppgtt ? &h->ctx->ppgtt->vm : &i915->ggtt.vm;
+	struct i915_request *rq = NULL;
 	struct i915_vma *hws, *vma;
 	unsigned int flags;
 	u32 *batch;
 	int err;
 
+	if (i915_gem_object_is_active(h->obj)) {
+		struct drm_i915_gem_object *obj;
+		void *vaddr;
+
+		obj = i915_gem_object_create_internal(h->i915, PAGE_SIZE);
+		if (IS_ERR(obj))
+			return ERR_CAST(obj);
+
+		vaddr = i915_gem_object_pin_map(obj,
+						i915_coherent_map_type(h->i915));
+		if (IS_ERR(vaddr)) {
+			i915_gem_object_put(obj);
+			return ERR_CAST(vaddr);
+		}
+
+		i915_gem_object_unpin_map(h->obj);
+		i915_gem_object_put(h->obj);
+
+		h->obj = obj;
+		h->batch = vaddr;
+	}
+
 	vma = i915_vma_instance(h->obj, vm, NULL);
 	if (IS_ERR(vma))
-		return PTR_ERR(vma);
+		return ERR_CAST(vma);
 
 	hws = i915_vma_instance(h->hws, vm, NULL);
 	if (IS_ERR(hws))
-		return PTR_ERR(hws);
+		return ERR_CAST(hws);
 
 	err = i915_vma_pin(vma, 0, 0, PIN_USER);
 	if (err)
-		return err;
+		return ERR_PTR(err);
 
 	err = i915_vma_pin(hws, 0, 0, PIN_USER);
 	if (err)
 		goto unpin_vma;
 
-	err = i915_vma_move_to_active(vma, rq, 0);
-	if (err)
+	rq = i915_request_alloc(engine, h->ctx);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
 		goto unpin_hws;
-
-	if (!i915_gem_object_has_active_reference(vma->obj)) {
-		i915_gem_object_get(vma->obj);
-		i915_gem_object_set_active_reference(vma->obj);
 	}
 
-	err = i915_vma_move_to_active(hws, rq, 0);
+	err = move_to_active(vma, rq, 0);
 	if (err)
-		goto unpin_hws;
+		goto cancel_rq;
 
-	if (!i915_gem_object_has_active_reference(hws->obj)) {
-		i915_gem_object_get(hws->obj);
-		i915_gem_object_set_active_reference(hws->obj);
-	}
+	err = move_to_active(hws, rq, 0);
+	if (err)
+		goto cancel_rq;
 
 	batch = h->batch;
 	if (INTEL_GEN(i915) >= 8) {
@@ -212,52 +247,16 @@ static int emit_recurse_batch(struct hang *h,
 
 	err = rq->engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, flags);
 
+cancel_rq:
+	if (err) {
+		i915_request_skip(rq, err);
+		i915_request_add(rq);
+	}
 unpin_hws:
 	i915_vma_unpin(hws);
 unpin_vma:
 	i915_vma_unpin(vma);
-	return err;
-}
-
-static struct i915_request *
-hang_create_request(struct hang *h, struct intel_engine_cs *engine)
-{
-	struct i915_request *rq;
-	int err;
-
-	if (i915_gem_object_is_active(h->obj)) {
-		struct drm_i915_gem_object *obj;
-		void *vaddr;
-
-		obj = i915_gem_object_create_internal(h->i915, PAGE_SIZE);
-		if (IS_ERR(obj))
-			return ERR_CAST(obj);
-
-		vaddr = i915_gem_object_pin_map(obj,
-						i915_coherent_map_type(h->i915));
-		if (IS_ERR(vaddr)) {
-			i915_gem_object_put(obj);
-			return ERR_CAST(vaddr);
-		}
-
-		i915_gem_object_unpin_map(h->obj);
-		i915_gem_object_put(h->obj);
-
-		h->obj = obj;
-		h->batch = vaddr;
-	}
-
-	rq = i915_request_alloc(engine, h->ctx);
-	if (IS_ERR(rq))
-		return rq;
-
-	err = emit_recurse_batch(h, rq);
-	if (err) {
-		i915_request_add(rq);
-		return ERR_PTR(err);
-	}
-
-	return rq;
+	return err ? ERR_PTR(err) : rq;
 }
 
 static u32 hws_seqno(const struct hang *h, const struct i915_request *rq)
-- 
2.20.0.rc1


* [PATCH 8/8] drm/i915: Pipeline PDP updates for Braswell
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (5 preceding siblings ...)
  2018-12-03 11:37 ` [PATCH 7/8] drm/i915/selftests: Reorder request allocation vs vma pinning Chris Wilson
@ 2018-12-03 11:37 ` Chris Wilson
  2018-12-04 11:53   ` Tvrtko Ursulin
  2018-12-03 12:00 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Patchwork
                   ` (8 subsequent siblings)
  15 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 11:37 UTC (permalink / raw)
  To: intel-gfx

Currently we face a severe problem on Braswell that manifests as invalid
ppGTT accesses. The code tries to maintain the PDP (page directory
pointers) inside the context in two ways: a direct write into the
context image and a pipelined LRI update. The direct write into the
context is fundamentally racy as it is unserialised with any access
(read or write) the GPU is doing. By asserting that Braswell is not used
with vGPU (currently an unsupported platform) we can eliminate the
dangerous direct write into the context image and rely solely on the
pipelined update.

However, the LRI of the PDP fouls up the GPU, causing it to freeze and
take out the machine with "forcewake ack timeouts". This seems possible
to work around by preventing the GPU from sleeping (by means of
disabling the power-state management interface, i.e. forcing each ring
to remain awake) around the update.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
References: https://bugs.freedesktop.org/show_bug.cgi?id=108714
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_gem_gtt.c     |   2 -
 drivers/gpu/drm/i915/i915_request.c     |   5 -
 drivers/gpu/drm/i915/intel_lrc.c        | 137 +++++++++++-------------
 drivers/gpu/drm/i915/intel_ringbuffer.c |   5 +-
 4 files changed, 68 insertions(+), 81 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index add1fe7aeb93..62bde517d383 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -1423,8 +1423,6 @@ static int gen8_ppgtt_alloc_pdp(struct i915_address_space *vm,
 			gen8_initialize_pd(vm, pd);
 			gen8_ppgtt_set_pdpe(vm, pdp, pd, pdpe);
 			GEM_BUG_ON(pdp->used_pdpes > i915_pdpes_per_pdp(vm));
-
-			mark_tlbs_dirty(i915_vm_to_ppgtt(vm));
 		}
 
 		ret = gen8_ppgtt_alloc_pd(vm, pd, start, length);
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index ca95ab2f4cfa..8ab8e8e6a086 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -719,11 +719,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
 	 */
 	rq->head = rq->ring->emit;
 
-	/* Unconditionally invalidate GPU caches and TLBs. */
-	ret = engine->emit_flush(rq, EMIT_INVALIDATE);
-	if (ret)
-		goto err_unwind;
-
 	ret = engine->request_alloc(rq);
 	if (ret)
 		goto err_unwind;
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index de070dca4033..1ec3f80a4472 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -363,31 +363,12 @@ execlists_context_schedule_out(struct i915_request *rq, unsigned long status)
 	trace_i915_request_out(rq);
 }
 
-static void
-execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
-{
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
-	ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
-}
-
 static u64 execlists_update_context(struct i915_request *rq)
 {
-	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
 	struct intel_context *ce = rq->hw_context;
-	u32 *reg_state = ce->lrc_reg_state;
-
-	reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
 
-	/*
-	 * True 32b PPGTT with dynamic page allocation: update PDP
-	 * registers and point the unallocated PDPs to scratch page.
-	 * PML4 is allocated during ppgtt init, so this is not needed
-	 * in 48-bit mode.
-	 */
-	if (!i915_vm_is_48bit(&ppgtt->vm))
-		execlists_update_context_pdps(ppgtt, reg_state);
+	ce->lrc_reg_state[CTX_RING_TAIL + 1] =
+		intel_ring_set_tail(rq->ring, rq->tail);
 
 	/*
 	 * Make sure the context image is complete before we submit it to HW.
@@ -1240,29 +1221,80 @@ execlists_context_pin(struct intel_engine_cs *engine,
 	return __execlists_context_pin(engine, ctx, ce);
 }
 
+static int emit_pdps(struct i915_request *rq)
+{
+	const struct intel_engine_cs * const engine = rq->engine;
+	struct i915_hw_ppgtt * const ppgtt = rq->gem_context->ppgtt;
+	int err, i;
+	u32 *cs;
+
+	err = engine->emit_flush(rq, EMIT_INVALIDATE);
+	if (err)
+		return err;
+
+	cs = intel_ring_begin(rq, 4 * GEN8_3LVL_PDPES + 2);
+	if (IS_ERR(cs))
+		return PTR_ERR(cs);
+
+	/*
+	 * Force the GPU (not just the local engine/powerwell!) to remain awake,
+	 * or else we may kill the machine with "timed out waiting for
+	 * forcewake ack request".
+	 */
+
+	*cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | MI_LRI_FORCE_POSTED;
+	for (i = GEN8_3LVL_PDPES; i--; ) {
+		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
+
+		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
+		*cs++ = upper_32_bits(pd_daddr);
+		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
+		*cs++ = lower_32_bits(pd_daddr);
+	}
+	*cs++ = MI_NOOP;
+
+	intel_ring_advance(rq, cs);
+
+	err = engine->emit_flush(rq, EMIT_INVALIDATE);
+	if (err)
+		return err;
+
+	return 0;
+}
+
 static int execlists_request_alloc(struct i915_request *request)
 {
 	int ret;
 
 	GEM_BUG_ON(!request->hw_context->pin_count);
 
-	/* Flush enough space to reduce the likelihood of waiting after
+	/*
+	 * Flush enough space to reduce the likelihood of waiting after
 	 * we start building the request - in which case we will just
 	 * have to repeat work.
 	 */
 	request->reserved_space += EXECLISTS_REQUEST_SIZE;
 
-	ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
-	if (ret)
-		return ret;
-
-	/* Note that after this point, we have committed to using
+	/*
+	 * Note that after this point, we have committed to using
 	 * this request as it is being used to both track the
 	 * state of engine initialisation and liveness of the
 	 * golden renderstate above. Think twice before you try
 	 * to cancel/unwind this request now.
 	 */
 
+	/* Unconditionally invalidate GPU caches and TLBs. */
+	if (i915_vm_is_48bit(&request->gem_context->ppgtt->vm)) {
+		ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
+		if (ret)
+			return ret;
+	} else {
+		GEM_BUG_ON(intel_vgpu_active(request->i915));
+		ret = emit_pdps(request);
+		if (ret)
+			return ret;
+	}
+
 	request->reserved_space -= EXECLISTS_REQUEST_SIZE;
 	return 0;
 }
@@ -1832,56 +1864,11 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
 		  atomic_read(&execlists->tasklet.count));
 }
 
-static int intel_logical_ring_emit_pdps(struct i915_request *rq)
-{
-	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
-	struct intel_engine_cs *engine = rq->engine;
-	const int num_lri_cmds = GEN8_3LVL_PDPES * 2;
-	u32 *cs;
-	int i;
-
-	cs = intel_ring_begin(rq, num_lri_cmds * 2 + 2);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	*cs++ = MI_LOAD_REGISTER_IMM(num_lri_cmds);
-	for (i = GEN8_3LVL_PDPES - 1; i >= 0; i--) {
-		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
-
-		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
-		*cs++ = upper_32_bits(pd_daddr);
-		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
-		*cs++ = lower_32_bits(pd_daddr);
-	}
-
-	*cs++ = MI_NOOP;
-	intel_ring_advance(rq, cs);
-
-	return 0;
-}
-
 static int gen8_emit_bb_start(struct i915_request *rq,
 			      u64 offset, u32 len,
 			      const unsigned int flags)
 {
 	u32 *cs;
-	int ret;
-
-	/* Don't rely in hw updating PDPs, specially in lite-restore.
-	 * Ideally, we should set Force PD Restore in ctx descriptor,
-	 * but we can't. Force Restore would be a second option, but
-	 * it is unsafe in case of lite-restore (because the ctx is
-	 * not idle). PML4 is allocated during ppgtt init so this is
-	 * not needed in 48-bit.*/
-	if ((intel_engine_flag(rq->engine) & rq->gem_context->ppgtt->pd_dirty_rings) &&
-	    !i915_vm_is_48bit(&rq->gem_context->ppgtt->vm) &&
-	    !intel_vgpu_active(rq->i915)) {
-		ret = intel_logical_ring_emit_pdps(rq);
-		if (ret)
-			return ret;
-
-		rq->gem_context->ppgtt->pd_dirty_rings &= ~intel_engine_flag(rq->engine);
-	}
 
 	cs = intel_ring_begin(rq, 6);
 	if (IS_ERR(cs))
@@ -1914,6 +1901,7 @@ static int gen8_emit_bb_start(struct i915_request *rq,
 
 	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
 	*cs++ = MI_NOOP;
+
 	intel_ring_advance(rq, cs);
 
 	return 0;
@@ -2544,6 +2532,11 @@ static void execlists_init_reg_state(u32 *regs,
 		 * other PDP Descriptors are ignored.
 		 */
 		ASSIGN_CTX_PML4(ctx->ppgtt, regs);
+	} else {
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 3);
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 2);
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 1);
+		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 0);
 	}
 
 	if (rcs) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 37bd05cef0e9..4591f568547c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1833,11 +1833,12 @@ static int ring_request_alloc(struct i915_request *request)
 	 */
 	request->reserved_space += LEGACY_REQUEST_SIZE;
 
-	ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
+	ret = switch_context(request);
 	if (ret)
 		return ret;
 
-	ret = switch_context(request);
+	/* Unconditionally invalidate GPU caches and TLBs. */
+	ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
 	if (ret)
 		return ret;
 
-- 
2.20.0.rc1


* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (6 preceding siblings ...)
  2018-12-03 11:37 ` [PATCH 8/8] drm/i915: Pipeline PDP updates for Braswell Chris Wilson
@ 2018-12-03 12:00 ` Patchwork
  2018-12-03 12:03 ` ✗ Fi.CI.SPARSE: " Patchwork
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2018-12-03 12:00 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
URL   : https://patchwork.freedesktop.org/series/53396/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
766ea4f65bdf drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
-:28: CHECK:MACRO_ARG_REUSE: Macro argument reuse 'tsk' - possible side-effects?
#28: FILE: drivers/gpu/drm/i915/intel_breadcrumbs.c:30:
+#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL && !(tsk)->on_rq)

total: 0 errors, 0 warnings, 1 checks, 12 lines checked
fa8bf690be7f drm/i915: Complete the fences as they are cancelled due to wedging
-:13: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#13: 
<1>[  354.473346] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250

total: 0 errors, 1 warnings, 0 checks, 135 lines checked
1053059305d7 drm/i915/ringbuffer: Clear semaphore sync registers on ring init
977ecf89aa55 drm/i915: Allocate a common scratch page
3e6b3d9e45c9 drm/i915: Flush GPU relocs harder for gen3
-:14: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 7fa28e146994 ("drm/i915: Write GPU relocs harder with gen3")'
#14: 
References: 7fa28e146994 ("drm/i915: Write GPU relocs harder with gen3")

total: 1 errors, 0 warnings, 0 checks, 50 lines checked
c43dcef30f43 drm/i915/selftests: Terminate hangcheck sanitycheck forcibly
6ab551e4569e drm/i915/selftests: Reorder request allocation vs vma pinning
258c6cca41c7 drm/i915: Pipeline PDP updates for Braswell


* ✗ Fi.CI.SPARSE: warning for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (7 preceding siblings ...)
  2018-12-03 12:00 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Patchwork
@ 2018-12-03 12:03 ` Patchwork
  2018-12-03 12:17 ` ✓ Fi.CI.BAT: success " Patchwork
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2018-12-03 12:03 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
URL   : https://patchwork.freedesktop.org/series/53396/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
Okay!

Commit: drm/i915: Complete the fences as they are cancelled due to wedging
Okay!

Commit: drm/i915/ringbuffer: Clear semaphore sync registers on ring init
Okay!

Commit: drm/i915: Allocate a common scratch page
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3571:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3573:16: warning: expression using sizeof(void)

Commit: drm/i915: Flush GPU relocs harder for gen3
Okay!

Commit: drm/i915/selftests: Terminate hangcheck sanitycheck forcibly
Okay!

Commit: drm/i915/selftests: Reorder request allocation vs vma pinning
Okay!

Commit: drm/i915: Pipeline PDP updates for Braswell
Okay!


* Re: [PATCH 3/8] drm/i915/ringbuffer: Clear semaphore sync registers on ring init
  2018-12-03 11:36 ` [PATCH 3/8] drm/i915/ringbuffer: Clear semaphore sync registers on ring init Chris Wilson
@ 2018-12-03 12:05   ` Mika Kuoppala
  2018-12-03 12:15     ` Chris Wilson
  0 siblings, 1 reply; 31+ messages in thread
From: Mika Kuoppala @ 2018-12-03 12:05 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Ensure that the sync registers are cleared every time we restart the
> ring to avoid stale values from creeping in from random neutrinos.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 992889f9e0ff..81b10d85b738 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -529,6 +529,13 @@ static int init_ring_common(struct intel_engine_cs *engine)
>  
>  	intel_engine_reset_breadcrumbs(engine);
>  
> +	if (HAS_LEGACY_SEMAPHORES(engine->i915)) {
> +		I915_WRITE(RING_SYNC_0(engine->mmio_base), 0);
> +		I915_WRITE(RING_SYNC_1(engine->mmio_base), 0);
> +		if (HAS_VEBOX(dev_priv))

Minor nitpick: mixed i915 and dev_priv usage. 

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> +			I915_WRITE(RING_SYNC_2(engine->mmio_base), 0);
> +	}
> +
>  	/* Enforce ordering by reading HEAD register back */
>  	I915_READ_HEAD(engine);
>  
> -- 
> 2.20.0.rc1

* Re: [PATCH 6/8] drm/i915/selftests: Terminate hangcheck sanitycheck forcibly
  2018-12-03 11:36 ` [PATCH 6/8] drm/i915/selftests: Terminate hangcheck sanitycheck forcibly Chris Wilson
@ 2018-12-03 12:09   ` Mika Kuoppala
  2018-12-03 12:17     ` Chris Wilson
  0 siblings, 1 reply; 31+ messages in thread
From: Mika Kuoppala @ 2018-12-03 12:09 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> If all else fails and we are stuck eternally waiting for the undying
> request, abandon all hope.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> index defe671130ab..a48fbe2557ea 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> @@ -308,6 +308,7 @@ static int igt_hang_sanitycheck(void *arg)
>  		goto unlock;
>  
>  	for_each_engine(engine, i915, id) {
> +		struct igt_wedge_me w;
>  		long timeout;
>  
>  		if (!intel_engine_can_store_dword(engine))
> @@ -328,9 +329,14 @@ static int igt_hang_sanitycheck(void *arg)
>  
>  		i915_request_add(rq);
>  
> -		timeout = i915_request_wait(rq,
> -					    I915_WAIT_LOCKED,
> -					    MAX_SCHEDULE_TIMEOUT);
> +		timeout = 0;
> +		igt_wedge_on_timeout(&w, i915, HZ / 10 /* 100ms timeout*/)

100ms? We are emitting a hanging batch here, so there must be something
I am missing.

-Mika


> +			timeout = i915_request_wait(rq,
> +						    I915_WAIT_LOCKED,
> +						    MAX_SCHEDULE_TIMEOUT);
> +		if (i915_terminally_wedged(&i915->gpu_error))
> +			timeout = -EIO;
> +
>  		i915_request_put(rq);
>  
>  		if (timeout < 0) {
> -- 
> 2.20.0.rc1

* Re: [PATCH 3/8] drm/i915/ringbuffer: Clear semaphore sync registers on ring init
  2018-12-03 12:05   ` Mika Kuoppala
@ 2018-12-03 12:15     ` Chris Wilson
  0 siblings, 0 replies; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 12:15 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2018-12-03 12:05:25)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Ensure that the sync registers are cleared every time we restart the
> > ring to avoid stale values from creeping in from random neutrinos.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index 992889f9e0ff..81b10d85b738 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -529,6 +529,13 @@ static int init_ring_common(struct intel_engine_cs *engine)
> >  
> >       intel_engine_reset_breadcrumbs(engine);
> >  
> > +     if (HAS_LEGACY_SEMAPHORES(engine->i915)) {
> > +             I915_WRITE(RING_SYNC_0(engine->mmio_base), 0);
> > +             I915_WRITE(RING_SYNC_1(engine->mmio_base), 0);
> > +             if (HAS_VEBOX(dev_priv))
> 
> Minor nitpick: mixed i915 and dev_priv usage. 

I don't think you are ready yet for i915_write32(i915, reg, value). :-p
-Chris

* ✓ Fi.CI.BAT: success for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (8 preceding siblings ...)
  2018-12-03 12:03 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2018-12-03 12:17 ` Patchwork
  2018-12-03 13:49 ` [PATCH 1/8] " Tvrtko Ursulin
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2018-12-03 12:17 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
URL   : https://patchwork.freedesktop.org/series/53396/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5239 -> Patchwork_10996
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/53396/revisions/1/mbox/

Known issues
------------

  Here are the changes found in Patchwork_10996 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s4-devices:
    - fi-ivb-3520m:       PASS -> FAIL [fdo#108880]

  * igt@kms_chamelium@common-hpd-after-suspend:
    - fi-skl-6700k2:      PASS -> INCOMPLETE [fdo#104108] / [fdo#105524] / [k.org#199541]

  
#### Possible fixes ####

  * igt@i915_selftest@live_execlists:
    - fi-apl-guc:         INCOMPLETE [fdo#103927] / [fdo#108622] -> PASS

  
#### Warnings ####

  * igt@i915_selftest@live_contexts:
    - {fi-icl-u3}:        INCOMPLETE [fdo#108315] -> DMESG-FAIL [fdo#108569]

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#103927]: https://bugs.freedesktop.org/show_bug.cgi?id=103927
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#105524]: https://bugs.freedesktop.org/show_bug.cgi?id=105524
  [fdo#108315]: https://bugs.freedesktop.org/show_bug.cgi?id=108315
  [fdo#108569]: https://bugs.freedesktop.org/show_bug.cgi?id=108569
  [fdo#108622]: https://bugs.freedesktop.org/show_bug.cgi?id=108622
  [fdo#108880]: https://bugs.freedesktop.org/show_bug.cgi?id=108880
  [k.org#199541]: https://bugzilla.kernel.org/show_bug.cgi?id=199541


Participating hosts (47 -> 42)
------------------------------

  Missing    (5): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 


Build changes
-------------

    * Linux: CI_DRM_5239 -> Patchwork_10996

  CI_DRM_5239: 6b82ae50cbf9b70fb3884937a221f69261c4c41c @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4736: 285ebfb3b7adc56586031afa5150c4e5ad40c229 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10996: 258c6cca41c715c7b29ff8f320a5d578774724a1 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

258c6cca41c7 drm/i915: Pipeline PDP updates for Braswell
6ab551e4569e drm/i915/selftests: Reorder request allocation vs vma pinning
c43dcef30f43 drm/i915/selftests: Terminate hangcheck sanitycheck forcibly
3e6b3d9e45c9 drm/i915: Flush GPU relocs harder for gen3
977ecf89aa55 drm/i915: Allocate a common scratch page
1053059305d7 drm/i915/ringbuffer: Clear semaphore sync registers on ring init
fa8bf690be7f drm/i915: Complete the fences as they are cancelled due to wedging
766ea4f65bdf drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10996/

* Re: [PATCH 6/8] drm/i915/selftests: Terminate hangcheck sanitycheck forcibly
  2018-12-03 12:09   ` Mika Kuoppala
@ 2018-12-03 12:17     ` Chris Wilson
  2018-12-03 12:21       ` Mika Kuoppala
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 12:17 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2018-12-03 12:09:39)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > If all else fails and we are stuck eternally waiting for the undying
> > request, abandon all hope.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> > index defe671130ab..a48fbe2557ea 100644
> > --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> > +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> > @@ -308,6 +308,7 @@ static int igt_hang_sanitycheck(void *arg)
> >               goto unlock;
> >  
> >       for_each_engine(engine, i915, id) {
> > +             struct igt_wedge_me w;
> >               long timeout;
> >  
> >               if (!intel_engine_can_store_dword(engine))
> > @@ -328,9 +329,14 @@ static int igt_hang_sanitycheck(void *arg)
> >  
> >               i915_request_add(rq);
> >  
> > -             timeout = i915_request_wait(rq,
> > -                                         I915_WAIT_LOCKED,
> > -                                         MAX_SCHEDULE_TIMEOUT);
> > +             timeout = 0;
> > +             igt_wedge_on_timeout(&w, i915, HZ / 10 /* 100ms timeout */)
> 
> 100ms? We are emitting a hanging batch here, so there is something
> I am missing.

It's not a hanging batch anymore, due to the terminator applied a couple
of lines above.
-Chris

* Re: [PATCH 6/8] drm/i915/selftests: Terminate hangcheck sanitycheck forcibly
  2018-12-03 12:17     ` Chris Wilson
@ 2018-12-03 12:21       ` Mika Kuoppala
  0 siblings, 0 replies; 31+ messages in thread
From: Mika Kuoppala @ 2018-12-03 12:21 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Quoting Mika Kuoppala (2018-12-03 12:09:39)
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > If all else fails and we are stuck eternally waiting for the undying
>> > request, abandon all hope.
>> >
>> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> > ---
>> >  drivers/gpu/drm/i915/selftests/intel_hangcheck.c | 12 +++++++++---
>> >  1 file changed, 9 insertions(+), 3 deletions(-)
>> >
>> > diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
>> > index defe671130ab..a48fbe2557ea 100644
>> > --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
>> > +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
>> > @@ -308,6 +308,7 @@ static int igt_hang_sanitycheck(void *arg)
>> >               goto unlock;
>> >  
>> >       for_each_engine(engine, i915, id) {
>> > +             struct igt_wedge_me w;
>> >               long timeout;
>> >  
>> >               if (!intel_engine_can_store_dword(engine))
>> > @@ -328,9 +329,14 @@ static int igt_hang_sanitycheck(void *arg)
>> >  
>> >               i915_request_add(rq);
>> >  
>> > -             timeout = i915_request_wait(rq,
>> > -                                         I915_WAIT_LOCKED,
>> > -                                         MAX_SCHEDULE_TIMEOUT);
>> > +             timeout = 0;
>> > +             igt_wedge_on_timeout(&w, i915, HZ / 10 /* 100ms timeout */)
>> 
>> 100ms? We are emitting a hanging batch here, so there is something
>> I am missing.
>
> It's not a hanging batch anymore, due to the terminator applied a couple
> of lines above.

There it is. I did read the code, I did have coffee. It is Monday.

Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>

> -Chris

* Re: [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (9 preceding siblings ...)
  2018-12-03 12:17 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-12-03 13:49 ` Tvrtko Ursulin
  2018-12-03 15:01 ` ✓ Fi.CI.IGT: success for series starting with [1/8] " Patchwork
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 31+ messages in thread
From: Tvrtko Ursulin @ 2018-12-03 13:49 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/12/2018 11:36, Chris Wilson wrote:
> Change the on-cpu check to on-runqueue to catch if the waiter has been
> woken (and reset its current_state back to TASK_UNINTERRUPTIBLE to
> perform the seqno check) but is sleeping due to being preempted off the
> cpu.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/intel_breadcrumbs.c | 6 +-----
>   1 file changed, 1 insertion(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index 84bf8d827136..447c5256f63a 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -27,11 +27,7 @@
>   
>   #include "i915_drv.h"
>   
> -#ifdef CONFIG_SMP
> -#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL && !(tsk)->on_cpu)
> -#else
> -#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL)
> -#endif
> +#define task_asleep(tsk) ((tsk)->state & TASK_NORMAL && !(tsk)->on_rq)
>   
>   static unsigned int __intel_breadcrumbs_wakeup(struct intel_breadcrumbs *b)
>   {
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

* ✓ Fi.CI.IGT: success for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (10 preceding siblings ...)
  2018-12-03 13:49 ` [PATCH 1/8] " Tvrtko Ursulin
@ 2018-12-03 15:01 ` Patchwork
  2018-12-03 17:42 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2) Patchwork
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2018-12-03 15:01 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate
URL   : https://patchwork.freedesktop.org/series/53396/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5239_full -> Patchwork_10996_full
====================================================

Summary
-------

  **WARNING**

  Minor unknown changes coming with Patchwork_10996_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_10996_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_10996_full:

### IGT changes ###

#### Warnings ####

  * igt@kms_plane_multiple@atomic-pipe-b-tiling-x:
    - shard-snb:          PASS -> SKIP

  * igt@kms_vblank@pipe-a-wait-busy:
    - shard-snb:          SKIP -> PASS +2

  
Known issues
------------

  Here are the changes found in Patchwork_10996_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_ppgtt@blt-vs-render-ctxn:
    - shard-skl:          PASS -> TIMEOUT [fdo#108039]

  * igt@gem_userptr_blits@readonly-unsync:
    - shard-skl:          PASS -> TIMEOUT [fdo#108887]

  * igt@kms_atomic_transition@plane-toggle-modeset-transition:
    - {shard-iclb}:       NOTRUN -> DMESG-WARN [fdo#107724] +8

  * igt@kms_busy@extended-pageflip-modeset-hang-oldfb-render-b:
    - shard-snb:          PASS -> DMESG-WARN [fdo#107956]

  * igt@kms_busy@extended-pageflip-modeset-hang-oldfb-render-c:
    - shard-skl:          NOTRUN -> DMESG-WARN [fdo#107956]

  * igt@kms_ccs@pipe-a-crc-sprite-planes-basic:
    - shard-glk:          PASS -> FAIL [fdo#108145]

  * igt@kms_chv_cursor_fail@pipe-a-128x128-top-edge:
    - {shard-iclb}:       PASS -> DMESG-WARN [fdo#107724] / [fdo#108336] +1

  * igt@kms_color@pipe-a-ctm-max:
    - shard-apl:          PASS -> FAIL [fdo#108147]

  * igt@kms_cursor_crc@cursor-128x42-onscreen:
    - shard-skl:          NOTRUN -> FAIL [fdo#103232]

  * igt@kms_cursor_crc@cursor-64x21-random:
    - shard-glk:          PASS -> FAIL [fdo#103232]

  * igt@kms_draw_crc@draw-method-xrgb2101010-mmap-wc-xtiled:
    - {shard-iclb}:       NOTRUN -> WARN [fdo#108336]

  * igt@kms_fbcon_fbt@psr:
    - shard-skl:          NOTRUN -> FAIL [fdo#107882]

  * igt@kms_flip@flip-vs-expired-vblank-interruptible:
    - shard-skl:          PASS -> FAIL [fdo#105363]

  * igt@kms_flip@modeset-vs-vblank-race-interruptible:
    - shard-glk:          PASS -> FAIL [fdo#103060]

  * igt@kms_flip@wf_vblank-ts-check-interruptible:
    - shard-skl:          NOTRUN -> FAIL [fdo#100368]

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-mmap-cpu:
    - shard-apl:          PASS -> FAIL [fdo#103167]

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-fullscreen:
    - shard-glk:          PASS -> FAIL [fdo#103167] +2

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-offscren-pri-shrfb-draw-mmap-gtt:
    - {shard-iclb}:       NOTRUN -> DMESG-FAIL [fdo#107724] +1

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-spr-indfb-draw-pwrite:
    - {shard-iclb}:       PASS -> FAIL [fdo#103167] +3

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:          PASS -> FAIL [fdo#107815] / [fdo#108145]

  * igt@kms_plane_alpha_blend@pipe-b-alpha-transparant-fb:
    - shard-kbl:          NOTRUN -> FAIL [fdo#108145]

  * igt@kms_plane_alpha_blend@pipe-b-constant-alpha-max:
    - {shard-iclb}:       NOTRUN -> DMESG-WARN [fdo#107724] / [fdo#108336] +3

  * igt@kms_plane_alpha_blend@pipe-c-alpha-basic:
    - {shard-iclb}:       PASS -> DMESG-FAIL [fdo#107724] +2
    - shard-skl:          NOTRUN -> FAIL [fdo#107815] / [fdo#108145]

  * igt@kms_plane_multiple@atomic-pipe-a-tiling-yf:
    - shard-glk:          PASS -> FAIL [fdo#103166]
    - shard-apl:          PASS -> FAIL [fdo#103166] +1

  * igt@kms_plane_multiple@atomic-pipe-c-tiling-x:
    - {shard-iclb}:       PASS -> FAIL [fdo#103166]

  * igt@kms_properties@connector-properties-atomic:
    - shard-skl:          NOTRUN -> FAIL [fdo#108642]

  * igt@kms_psr@no_drrs:
    - {shard-iclb}:       PASS -> FAIL [fdo#108341]

  * igt@pm_rpm@reg-read-ioctl:
    - {shard-iclb}:       PASS -> DMESG-WARN [fdo#107724] +11

  
#### Possible fixes ####

  * igt@gem_ppgtt@blt-vs-render-ctxn:
    - shard-kbl:          INCOMPLETE [fdo#103665] / [fdo#106023] / [fdo#106887] -> PASS

  * igt@kms_busy@extended-modeset-hang-newfb-render-a:
    - shard-snb:          DMESG-WARN [fdo#107956] -> SKIP

  * igt@kms_busy@extended-modeset-hang-newfb-render-c:
    - {shard-iclb}:       DMESG-WARN [fdo#107956] -> PASS

  * igt@kms_busy@extended-pageflip-modeset-hang-oldfb-render-c:
    - shard-glk:          DMESG-WARN [fdo#107956] -> PASS

  * igt@kms_cursor_crc@cursor-128x128-offscreen:
    - shard-skl:          FAIL [fdo#103232] -> PASS

  * igt@kms_cursor_crc@cursor-64x64-onscreen:
    - shard-apl:          FAIL [fdo#103232] -> PASS +2

  * igt@kms_cursor_legacy@2x-long-cursor-vs-flip-atomic:
    - shard-hsw:          FAIL [fdo#105767] -> PASS

  * igt@kms_frontbuffer_tracking@fbcpsr-rgb565-draw-mmap-wc:
    - {shard-iclb}:       DMESG-FAIL [fdo#107724] -> PASS

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-spr-indfb-draw-pwrite:
    - {shard-iclb}:       FAIL [fdo#103167] -> PASS +9

  * igt@kms_plane@plane-position-covered-pipe-b-planes:
    - shard-apl:          FAIL [fdo#103166] -> PASS

  * igt@kms_plane@plane-position-covered-pipe-c-planes:
    - shard-glk:          FAIL [fdo#103166] -> PASS

  * igt@kms_plane_alpha_blend@pipe-c-coverage-7efc:
    - shard-skl:          FAIL [fdo#107815] -> PASS

  * igt@kms_plane_multiple@atomic-pipe-c-tiling-yf:
    - {shard-iclb}:       FAIL [fdo#103166] -> PASS +1

  * igt@kms_rmfb@rmfb-ioctl:
    - {shard-iclb}:       DMESG-WARN [fdo#107724] -> PASS +3

  * igt@kms_setmode@basic:
    - shard-kbl:          FAIL [fdo#99912] -> PASS

  * igt@kms_vblank@pipe-a-ts-continuation-suspend:
    - shard-hsw:          FAIL [fdo#104894] -> PASS

  * igt@kms_vblank@pipe-c-ts-continuation-dpms-suspend:
    - shard-apl:          DMESG-WARN [fdo#103558] / [fdo#105602] -> PASS +7

  * igt@pm_rpm@gem-pread:
    - shard-skl:          INCOMPLETE [fdo#107807] -> PASS

  * igt@pm_rpm@system-suspend-execbuf:
    - {shard-iclb}:       DMESG-WARN [fdo#108654] -> PASS

  
#### Warnings ####

  * igt@kms_hdmi_inject@inject-audio:
    - {shard-iclb}:       FAIL [fdo#102370] -> DMESG-FAIL [fdo#107724]

  * igt@kms_plane_alpha_blend@pipe-c-alpha-7efc:
    - shard-apl:          DMESG-FAIL [fdo#103558] / [fdo#105602] / [fdo#108145] -> FAIL [fdo#108145] +1

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#100368]: https://bugs.freedesktop.org/show_bug.cgi?id=100368
  [fdo#102370]: https://bugs.freedesktop.org/show_bug.cgi?id=102370
  [fdo#103060]: https://bugs.freedesktop.org/show_bug.cgi?id=103060
  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#103558]: https://bugs.freedesktop.org/show_bug.cgi?id=103558
  [fdo#103665]: https://bugs.freedesktop.org/show_bug.cgi?id=103665
  [fdo#104894]: https://bugs.freedesktop.org/show_bug.cgi?id=104894
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#105602]: https://bugs.freedesktop.org/show_bug.cgi?id=105602
  [fdo#105767]: https://bugs.freedesktop.org/show_bug.cgi?id=105767
  [fdo#106023]: https://bugs.freedesktop.org/show_bug.cgi?id=106023
  [fdo#106887]: https://bugs.freedesktop.org/show_bug.cgi?id=106887
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#107807]: https://bugs.freedesktop.org/show_bug.cgi?id=107807
  [fdo#107815]: https://bugs.freedesktop.org/show_bug.cgi?id=107815
  [fdo#107882]: https://bugs.freedesktop.org/show_bug.cgi?id=107882
  [fdo#107956]: https://bugs.freedesktop.org/show_bug.cgi?id=107956
  [fdo#108039]: https://bugs.freedesktop.org/show_bug.cgi?id=108039
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108147]: https://bugs.freedesktop.org/show_bug.cgi?id=108147
  [fdo#108336]: https://bugs.freedesktop.org/show_bug.cgi?id=108336
  [fdo#108341]: https://bugs.freedesktop.org/show_bug.cgi?id=108341
  [fdo#108642]: https://bugs.freedesktop.org/show_bug.cgi?id=108642
  [fdo#108654]: https://bugs.freedesktop.org/show_bug.cgi?id=108654
  [fdo#108887]: https://bugs.freedesktop.org/show_bug.cgi?id=108887
  [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912


Participating hosts (7 -> 7)
------------------------------

  No changes in participating hosts


Build changes
-------------

    * Linux: CI_DRM_5239 -> Patchwork_10996

  CI_DRM_5239: 6b82ae50cbf9b70fb3884937a221f69261c4c41c @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4736: 285ebfb3b7adc56586031afa5150c4e5ad40c229 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_10996: 258c6cca41c715c7b29ff8f320a5d578774724a1 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_10996/

* Re: [PATCH 4/8] drm/i915: Allocate a common scratch page
  2018-12-03 11:36 ` [PATCH 4/8] drm/i915: Allocate a common scratch page Chris Wilson
@ 2018-12-03 15:28   ` Mika Kuoppala
  2018-12-03 17:10     ` Chris Wilson
  2018-12-03 17:29   ` [PATCH v2] " Chris Wilson
  1 sibling, 1 reply; 31+ messages in thread
From: Mika Kuoppala @ 2018-12-03 15:28 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> Currently we allocate a scratch page for each engine, but since we only
> ever write into it for post-sync operations, it is not exposed to
> userspace nor do we care for coherency. As we then do not care about its
> contents, we can use one page for all, reducing our allocations and
> avoiding complications by not assuming per-engine isolation.
>
> For later use, it simplifies engine initialisation (by removing the
> allocation that required struct_mutex!) and means that we can always rely
> on there being a scratch page.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  7 ++++
>  drivers/gpu/drm/i915/i915_gem.c         | 50 ++++++++++++++++++++++++-
>  drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
>  drivers/gpu/drm/i915/intel_engine_cs.c  | 42 ---------------------
>  drivers/gpu/drm/i915/intel_lrc.c        | 17 +++------
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 33 +++++-----------
>  drivers/gpu/drm/i915/intel_ringbuffer.h |  5 ---
>  7 files changed, 71 insertions(+), 85 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index d45475287130..0ec65cc48b5a 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1996,6 +1996,8 @@ struct drm_i915_private {
>  		struct delayed_work idle_work;
>  
>  		ktime_t last_init_time;
> +
> +		struct i915_vma *scratch;
>  	} gt;
>  
>  	/* perform PHY state sanity checks? */
> @@ -3724,4 +3726,9 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
>  		return I915_HWS_CSB_WRITE_INDEX;
>  }
>  
> +static inline u32 i915_scratch_offset(const struct drm_i915_private *i915)
> +{
> +	return i915_ggtt_offset(i915->gt.scratch);
> +}
> +
>  #endif
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 834240a9b262..cca4285e3329 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -5495,6 +5495,44 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
>  	goto out_ctx;
>  }
>  
> +static int
> +i915_gem_init_scratch(struct drm_i915_private *i915, unsigned int size)
> +{
> +	struct drm_i915_gem_object *obj;
> +	struct i915_vma *vma;
> +	int ret;
> +
> +	obj = i915_gem_object_create_stolen(i915, size);
> +	if (!obj)
> +		obj = i915_gem_object_create_internal(i915, size);
> +	if (IS_ERR(obj)) {
> +		DRM_ERROR("Failed to allocate scratch page\n");
> +		return PTR_ERR(obj);
> +	}
> +
> +	vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
> +	if (IS_ERR(vma)) {
> +		ret = PTR_ERR(vma);
> +		goto err_unref;
> +	}
> +
> +	ret = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH);
> +	if (ret)
> +		goto err_unref;
> +
> +	i915->gt.scratch = vma;
> +	return 0;
> +
> +err_unref:
> +	i915_gem_object_put(obj);
> +	return ret;
> +}
> +
> +static void i915_gem_fini_scratch(struct drm_i915_private *i915)
> +{
> +	i915_vma_unpin_and_release(&i915->gt.scratch, 0);
> +}
> +
>  int i915_gem_init(struct drm_i915_private *dev_priv)
>  {
>  	int ret;
> @@ -5541,12 +5579,19 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
>  		goto err_unlock;
>  	}
>  
> -	ret = i915_gem_contexts_init(dev_priv);
> +	ret = i915_gem_init_scratch(dev_priv,
> +				    IS_GEN2(dev_priv) ? SZ_256K : PAGE_SIZE);
>  	if (ret) {
>  		GEM_BUG_ON(ret == -EIO);
>  		goto err_ggtt;
>  	}
>  
> +	ret = i915_gem_contexts_init(dev_priv);
> +	if (ret) {
> +		GEM_BUG_ON(ret == -EIO);
> +		goto err_scratch;
> +	}
> +
>  	ret = intel_engines_init(dev_priv);
>  	if (ret) {
>  		GEM_BUG_ON(ret == -EIO);
> @@ -5619,6 +5664,8 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
>  err_context:
>  	if (ret != -EIO)
>  		i915_gem_contexts_fini(dev_priv);
> +err_scratch:
> +	i915_gem_fini_scratch(dev_priv);
>  err_ggtt:
>  err_unlock:
>  	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
> @@ -5670,6 +5717,7 @@ void i915_gem_fini(struct drm_i915_private *dev_priv)
>  	intel_uc_fini(dev_priv);
>  	i915_gem_cleanup_engines(dev_priv);
>  	i915_gem_contexts_fini(dev_priv);
> +	i915_gem_fini_scratch(dev_priv);
>  	mutex_unlock(&dev_priv->drm.struct_mutex);
>  
>  	intel_cleanup_gt_powersave(dev_priv);
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index a6885a59568b..07465123c166 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -1571,7 +1571,7 @@ static void gem_record_rings(struct i915_gpu_state *error)
>  			if (HAS_BROKEN_CS_TLB(i915))
>  				ee->wa_batchbuffer =
>  					i915_error_object_create(i915,
> -								 engine->scratch);
> +								 i915->gt.scratch);
>  			request_record_user_bo(request, ee);
>  
>  			ee->ctx =
> diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
> index 759c0fd58f8c..2390985384d6 100644
> --- a/drivers/gpu/drm/i915/intel_engine_cs.c
> +++ b/drivers/gpu/drm/i915/intel_engine_cs.c
> @@ -493,46 +493,6 @@ void intel_engine_setup_common(struct intel_engine_cs *engine)
>  	intel_engine_init_cmd_parser(engine);
>  }
>  
> -int intel_engine_create_scratch(struct intel_engine_cs *engine,
> -				unsigned int size)
> -{
> -	struct drm_i915_gem_object *obj;
> -	struct i915_vma *vma;
> -	int ret;
> -
> -	WARN_ON(engine->scratch);
> -
> -	obj = i915_gem_object_create_stolen(engine->i915, size);
> -	if (!obj)
> -		obj = i915_gem_object_create_internal(engine->i915, size);
> -	if (IS_ERR(obj)) {
> -		DRM_ERROR("Failed to allocate scratch page\n");
> -		return PTR_ERR(obj);
> -	}
> -
> -	vma = i915_vma_instance(obj, &engine->i915->ggtt.vm, NULL);
> -	if (IS_ERR(vma)) {
> -		ret = PTR_ERR(vma);
> -		goto err_unref;
> -	}
> -
> -	ret = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH);
> -	if (ret)
> -		goto err_unref;
> -
> -	engine->scratch = vma;
> -	return 0;
> -
> -err_unref:
> -	i915_gem_object_put(obj);
> -	return ret;
> -}
> -
> -void intel_engine_cleanup_scratch(struct intel_engine_cs *engine)
> -{
> -	i915_vma_unpin_and_release(&engine->scratch, 0);
> -}
> -
>  static void cleanup_status_page(struct intel_engine_cs *engine)
>  {
>  	if (HWS_NEEDS_PHYSICAL(engine->i915)) {
> @@ -707,8 +667,6 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
>  {
>  	struct drm_i915_private *i915 = engine->i915;
>  
> -	intel_engine_cleanup_scratch(engine);
> -
>  	cleanup_status_page(engine);
>  
>  	intel_engine_fini_breadcrumbs(engine);
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index b5511a054f30..de070dca4033 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1286,9 +1286,10 @@ static int execlists_request_alloc(struct i915_request *request)
>  static u32 *
>  gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
>  {
> +	/* NB no one else is allowed to scribble over scratch + 256! */
>  	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
>  	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
> -	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
> +	*batch++ = i915_scratch_offset(engine->i915) + 256;
>  	*batch++ = 0;
>  
>  	*batch++ = MI_LOAD_REGISTER_IMM(1);
> @@ -1302,7 +1303,7 @@ gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
>  
>  	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
>  	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
> -	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
> +	*batch++ = i915_scratch_offset(engine->i915) + 256;
>  	*batch++ = 0;
>  
>  	return batch;
> @@ -1339,7 +1340,7 @@ static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
>  				       PIPE_CONTROL_GLOBAL_GTT_IVB |
>  				       PIPE_CONTROL_CS_STALL |
>  				       PIPE_CONTROL_QW_WRITE,
> -				       i915_ggtt_offset(engine->scratch) +
> +				       i915_scratch_offset(engine->i915) +
>  				       2 * CACHELINE_BYTES);
>  
>  	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
> @@ -1969,7 +1970,7 @@ static int gen8_emit_flush_render(struct i915_request *request,
>  {
>  	struct intel_engine_cs *engine = request->engine;
>  	u32 scratch_addr =
> -		i915_ggtt_offset(engine->scratch) + 2 * CACHELINE_BYTES;
> +		i915_scratch_offset(engine->i915) + 2 * CACHELINE_BYTES;
>  	bool vf_flush_wa = false, dc_flush_wa = false;
>  	u32 *cs, flags = 0;
>  	int len;
> @@ -2306,10 +2307,6 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
>  	if (ret)
>  		return ret;
>  
> -	ret = intel_engine_create_scratch(engine, PAGE_SIZE);
> -	if (ret)
> -		goto err_cleanup_common;
> -
>  	ret = intel_init_workaround_bb(engine);
>  	if (ret) {
>  		/*
> @@ -2322,10 +2319,6 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
>  	}
>  
>  	return 0;
> -
> -err_cleanup_common:
> -	intel_engine_cleanup_common(engine);
> -	return ret;
>  }
>  
>  int logical_xcs_ring_init(struct intel_engine_cs *engine)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 81b10d85b738..a3d3126a3938 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -150,8 +150,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
>  	 */
>  	if (mode & EMIT_INVALIDATE) {
>  		*cs++ = GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE;
> -		*cs++ = i915_ggtt_offset(rq->engine->scratch) |
> -			PIPE_CONTROL_GLOBAL_GTT;
> +		*cs++ = i915_scratch_offset(rq->i915) | PIPE_CONTROL_GLOBAL_GTT;
>  		*cs++ = 0;
>  		*cs++ = 0;
>  
> @@ -159,8 +158,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
>  			*cs++ = MI_FLUSH;
>  
>  		*cs++ = GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE;
> -		*cs++ = i915_ggtt_offset(rq->engine->scratch) |
> -			PIPE_CONTROL_GLOBAL_GTT;
> +		*cs++ = i915_scratch_offset(rq->i915) | PIPE_CONTROL_GLOBAL_GTT;
>  		*cs++ = 0;
>  		*cs++ = 0;
>  	}
> @@ -212,8 +210,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
>  static int
>  intel_emit_post_sync_nonzero_flush(struct i915_request *rq)
>  {
> -	u32 scratch_addr =
> -		i915_ggtt_offset(rq->engine->scratch) + 2 * CACHELINE_BYTES;
> +	u32 scratch_addr = i915_scratch_offset(rq->i915) + 2 * CACHELINE_BYTES;
>  	u32 *cs;
>  
>  	cs = intel_ring_begin(rq, 6);
> @@ -246,8 +243,7 @@ intel_emit_post_sync_nonzero_flush(struct i915_request *rq)
>  static int
>  gen6_render_ring_flush(struct i915_request *rq, u32 mode)
>  {
> -	u32 scratch_addr =
> -		i915_ggtt_offset(rq->engine->scratch) + 2 * CACHELINE_BYTES;
> +	u32 scratch_addr = i915_scratch_offset(rq->i915) + 2 * CACHELINE_BYTES;
>  	u32 *cs, flags = 0;
>  	int ret;
>  
> @@ -316,8 +312,7 @@ gen7_render_ring_cs_stall_wa(struct i915_request *rq)
>  static int
>  gen7_render_ring_flush(struct i915_request *rq, u32 mode)
>  {
> -	u32 scratch_addr =
> -		i915_ggtt_offset(rq->engine->scratch) + 2 * CACHELINE_BYTES;
> +	u32 scratch_addr = i915_scratch_offset(rq->i915) + 2 * CACHELINE_BYTES;
>  	u32 *cs, flags = 0;
>  
>  	/*
> @@ -1002,7 +997,7 @@ i830_emit_bb_start(struct i915_request *rq,
>  		   u64 offset, u32 len,
>  		   unsigned int dispatch_flags)
>  {
> -	u32 *cs, cs_offset = i915_ggtt_offset(rq->engine->scratch);
> +	u32 *cs, cs_offset = i915_scratch_offset(rq->i915);
>  
>  	cs = intel_ring_begin(rq, 6);
>  	if (IS_ERR(cs))
> @@ -1459,7 +1454,6 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
>  {
>  	struct i915_timeline *timeline;
>  	struct intel_ring *ring;
> -	unsigned int size;
>  	int err;
>  
>  	intel_engine_setup_common(engine);
> @@ -1484,21 +1478,12 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
>  	GEM_BUG_ON(engine->buffer);
>  	engine->buffer = ring;
>  
> -	size = PAGE_SIZE;
> -	if (HAS_BROKEN_CS_TLB(engine->i915))
> -		size = I830_WA_SIZE;

Hmm, this disappeared. I don't know the details of this
w/a, but apparently we do need 2 pages for this kind of
machine.

-Mika

> -	err = intel_engine_create_scratch(engine, size);
> -	if (err)
> -		goto err_unpin;
> -
>  	err = intel_engine_init_common(engine);
>  	if (err)
> -		goto err_scratch;
> +		goto err_unpin;
>  
>  	return 0;
>  
> -err_scratch:
> -	intel_engine_cleanup_scratch(engine);
>  err_unpin:
>  	intel_ring_unpin(ring);
>  err_ring:
> @@ -1572,7 +1557,7 @@ static int flush_pd_dir(struct i915_request *rq)
>  	/* Stall until the page table load is complete */
>  	*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
>  	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_BASE(engine));
> -	*cs++ = i915_ggtt_offset(engine->scratch);
> +	*cs++ = i915_scratch_offset(rq->i915);
>  	*cs++ = MI_NOOP;
>  
>  	intel_ring_advance(rq, cs);
> @@ -1681,7 +1666,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
>  			/* Insert a delay before the next switch! */
>  			*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
>  			*cs++ = i915_mmio_reg_offset(last_reg);
> -			*cs++ = i915_ggtt_offset(engine->scratch);
> +			*cs++ = i915_scratch_offset(rq->i915);
>  			*cs++ = MI_NOOP;
>  		}
>  		*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 8a2270b209b0..970fb5c05c36 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -451,7 +451,6 @@ struct intel_engine_cs {
>  
>  	struct intel_hw_status_page status_page;
>  	struct i915_ctx_workarounds wa_ctx;
> -	struct i915_vma *scratch;
>  
>  	u32             irq_keep_mask; /* always keep these interrupts */
>  	u32		irq_enable_mask; /* bitmask to enable ring interrupt */
> @@ -908,10 +907,6 @@ void intel_engine_setup_common(struct intel_engine_cs *engine);
>  int intel_engine_init_common(struct intel_engine_cs *engine);
>  void intel_engine_cleanup_common(struct intel_engine_cs *engine);
>  
> -int intel_engine_create_scratch(struct intel_engine_cs *engine,
> -				unsigned int size);
> -void intel_engine_cleanup_scratch(struct intel_engine_cs *engine);
> -
>  int intel_init_render_ring_buffer(struct intel_engine_cs *engine);
>  int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine);
>  int intel_init_blt_ring_buffer(struct intel_engine_cs *engine);
> -- 
> 2.20.0.rc1
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: [PATCH 4/8] drm/i915: Allocate a common scratch page
  2018-12-03 15:28   ` Mika Kuoppala
@ 2018-12-03 17:10     ` Chris Wilson
  0 siblings, 0 replies; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 17:10 UTC (permalink / raw)
  To: Mika Kuoppala, intel-gfx

Quoting Mika Kuoppala (2018-12-03 15:28:22)
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > Currently we allocate a scratch page for each engine, but since we only
> > ever write into it for post-sync operations, it is not exposed to
> > userspace nor do we care for coherency. As we then do not care about its
> > contents, we can use one page for all, reducing our allocations and
> > avoid complications by not assuming per-engine isolation.
> >
> > For later use, it simplifies engine initialisation (by removing the
> > allocation that required struct_mutex!) and means that we can always rely
> > on there being a scratch page.
> >
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> > -     ret = i915_gem_contexts_init(dev_priv);
> > +     ret = i915_gem_init_scratch(dev_priv,
> > +                                 IS_GEN2(dev_priv) ? SZ_256K : PAGE_SIZE);

> > @@ -1484,21 +1478,12 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
> >       GEM_BUG_ON(engine->buffer);
> >       engine->buffer = ring;
> >  
> > -     size = PAGE_SIZE;
> > -     if (HAS_BROKEN_CS_TLB(engine->i915))
> > -             size = I830_WA_SIZE;
> 
> Hmm, this disappeared. I don't know the details of this
> w/a, but apparently we do need 2 pages for this kind of
> machine.

2 pages? More like 64 pages. See above.
-Chris


* Re: [PATCH 2/8] drm/i915: Complete the fences as they are cancelled due to wedging
  2018-12-03 11:36 ` [PATCH 2/8] drm/i915: Complete the fences as they are cancelled due to wedging Chris Wilson
@ 2018-12-03 17:11   ` Tvrtko Ursulin
  2018-12-03 17:36     ` Chris Wilson
  0 siblings, 1 reply; 31+ messages in thread
From: Tvrtko Ursulin @ 2018-12-03 17:11 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/12/2018 11:36, Chris Wilson wrote:
> We inspect the requests under the assumption that they will be marked as
> completed when they are removed from the queue. Currently however, in the
> process of wedging the requests will be removed from the queue before they
> are completed, so rearrange the code to complete the fences before the
> locks are dropped.
> 
> <1>[  354.473346] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250
> <6>[  354.473363] PGD 0 P4D 0
> <4>[  354.473370] Oops: 0000 [#1] PREEMPT SMP PTI
> <4>[  354.473380] CPU: 0 PID: 4470 Comm: gem_eio Tainted: G     U            4.20.0-rc4-CI-CI_DRM_5216+ #1
> <4>[  354.473393] Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, BIOS JYGLKCPX.86A.0027.2018.0125.1347 01/25/2018
> <4>[  354.473480] RIP: 0010:__i915_schedule+0x311/0x5e0 [i915]
> <4>[  354.473490] Code: 49 89 44 24 20 4d 89 4c 24 28 4d 89 29 44 39 b3 a0 04 00 00 7d 3a 41 8b 44 24 78 85 c0 74 13 48 8b 93 78 04 00 00 48 83 e2 fc <39> 82 50 02 00 00 79 1e 44 89 b3 a0 04 00 00 48 8d bb d0 03 00 00

This confuses me: isn't the Code line usually at the end? And then
you have another after the call trace which doesn't match
__i915_schedule... Anyway, _this_ code seems to be this part:

                 if (node_to_request(node)->global_seqno &&
  90d:   8b 43 78                mov    eax,DWORD PTR [rbx+0x78]
  910:   85 c0                   test   eax,eax
  912:   74 13                   je     927 <__i915_schedule+0x317>
 
i915_seqno_passed(port_request(engine->execlists.port)->global_seqno,
  914:   49 8b 97 c0 04 00 00    mov    rdx,QWORD PTR [r15+0x4c0]
  91b:   48 83 e2 fc             and    rdx,0xfffffffffffffffc
                 if (node_to_request(node)->global_seqno &&
  91f:   39 82 50 02 00 00       cmp    DWORD PTR [rdx+0x250],eax
  925:   79 1e                   jns    945 <__i915_schedule+0x335>

> <4>[  354.473515] RSP: 0018:ffffc900001bba90 EFLAGS: 00010046
> <4>[  354.473524] RAX: 0000000000000003 RBX: ffff8882624c8008 RCX: f34a737800000000
> <4>[  354.473535] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8882624c8048
> <4>[  354.473545] RBP: ffffc900001bbab0 R08: 000000005963f1f1 R09: 0000000000000000
> <4>[  354.473556] R10: ffffc900001bba10 R11: ffff8882624c8060 R12: ffff88824fdd7b98
> <4>[  354.473567] R13: ffff88824fdd7bb8 R14: 0000000000000001 R15: ffff88824fdd7750
> <4>[  354.473578] FS:  00007f44b4b5b980(0000) GS:ffff888277e00000(0000) knlGS:0000000000000000
> <4>[  354.473590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> <4>[  354.473599] CR2: 0000000000000250 CR3: 000000026976e000 CR4: 0000000000340ef0

Given the registers above, I think it means this: eax is the global_seqno
of the node's rq, and rdx is the port_request, so NULL and bang. There is
no request in the port, but why would there always be one at the point we
are scheduling a new request into the runnable queue?

Regards,

Tvrtko

> <4>[  354.473611] Call Trace:
> <4>[  354.473622]  ? lock_acquire+0xa6/0x1c0
> <4>[  354.473677]  ? i915_schedule_bump_priority+0x57/0xd0 [i915]
> <4>[  354.473736]  i915_schedule_bump_priority+0x72/0xd0 [i915]
> <4>[  354.473792]  i915_request_wait+0x4db/0x840 [i915]
> <4>[  354.473804]  ? get_pwq.isra.4+0x2c/0x50
> <4>[  354.473813]  ? ___preempt_schedule+0x16/0x18
> <4>[  354.473824]  ? wake_up_q+0x70/0x70
> <4>[  354.473831]  ? wake_up_q+0x70/0x70
> <4>[  354.473882]  ? gen6_rps_boost+0x118/0x120 [i915]
> <4>[  354.473936]  i915_gem_object_wait_fence+0x8a/0x110 [i915]
> <4>[  354.473991]  i915_gem_object_wait+0x113/0x500 [i915]
> <4>[  354.474047]  i915_gem_wait_ioctl+0x11c/0x2f0 [i915]
> <4>[  354.474101]  ? i915_gem_unset_wedged+0x210/0x210 [i915]
> <4>[  354.474113]  drm_ioctl_kernel+0x81/0xf0
> <4>[  354.474123]  drm_ioctl+0x2de/0x390
> <4>[  354.474175]  ? i915_gem_unset_wedged+0x210/0x210 [i915]
> <4>[  354.474187]  ? finish_task_switch+0x95/0x260
> <4>[  354.474197]  ? lock_acquire+0xa6/0x1c0
> <4>[  354.474207]  do_vfs_ioctl+0xa0/0x6e0
> <4>[  354.474217]  ? __fget+0xfc/0x1e0
> <4>[  354.474225]  ksys_ioctl+0x35/0x60
> <4>[  354.474233]  __x64_sys_ioctl+0x11/0x20
> <4>[  354.474241]  do_syscall_64+0x55/0x190
> <4>[  354.474251]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> <4>[  354.474260] RIP: 0033:0x7f44b3de65d7
> <4>[  354.474267] Code: b3 66 90 48 8b 05 b1 48 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 81 48 2d 00 f7 d8 64 89 01 48
> <4>[  354.474293] RSP: 002b:00007fff974948e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> <4>[  354.474305] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f44b3de65d7
> <4>[  354.474316] RDX: 00007fff97494940 RSI: 00000000c010646c RDI: 0000000000000007
> <4>[  354.474327] RBP: 00007fff97494940 R08: 0000000000000000 R09: 00007f44b40bbc40
> <4>[  354.474337] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c010646c
> <4>[  354.474348] R13: 0000000000000007 R14: 0000000000000000 R15: 0000000000000000
> 
> v2: Avoid floating requests.
> v3: Can't call dma_fence_signal() under the timeline lock!
> v4: Can't call dma_fence_signal() from inside another fence either.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem.c         | 54 +++++--------------------
>   drivers/gpu/drm/i915/intel_lrc.c        | 13 ++++--
>   drivers/gpu/drm/i915/intel_ringbuffer.c | 13 +++++-
>   3 files changed, 31 insertions(+), 49 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index c55b1f75c980..834240a9b262 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -3308,16 +3308,6 @@ void i915_gem_reset_finish(struct drm_i915_private *dev_priv)
>   }
>   
>   static void nop_submit_request(struct i915_request *request)
> -{
> -	GEM_TRACE("%s fence %llx:%d -> -EIO\n",
> -		  request->engine->name,
> -		  request->fence.context, request->fence.seqno);
> -	dma_fence_set_error(&request->fence, -EIO);
> -
> -	i915_request_submit(request);
> -}
> -
> -static void nop_complete_submit_request(struct i915_request *request)
>   {
>   	unsigned long flags;
>   
> @@ -3354,57 +3344,33 @@ void i915_gem_set_wedged(struct drm_i915_private *i915)
>   	 * rolling the global seqno forward (since this would complete requests
>   	 * for which we haven't set the fence error to EIO yet).
>   	 */
> -	for_each_engine(engine, i915, id) {
> +	for_each_engine(engine, i915, id)
>   		i915_gem_reset_prepare_engine(engine);
>   
> -		engine->submit_request = nop_submit_request;
> -		engine->schedule = NULL;
> -	}
> -	i915->caps.scheduler = 0;
> -
>   	/* Even if the GPU reset fails, it should still stop the engines */
>   	if (INTEL_GEN(i915) >= 5)
>   		intel_gpu_reset(i915, ALL_ENGINES);
>   
> -	/*
> -	 * Make sure no one is running the old callback before we proceed with
> -	 * cancelling requests and resetting the completion tracking. Otherwise
> -	 * we might submit a request to the hardware which never completes.
> -	 */
> -	synchronize_rcu();
> -
>   	for_each_engine(engine, i915, id) {
> -		/* Mark all executing requests as skipped */
> -		engine->cancel_requests(engine);
> -
> -		/*
> -		 * Only once we've force-cancelled all in-flight requests can we
> -		 * start to complete all requests.
> -		 */
> -		engine->submit_request = nop_complete_submit_request;
> +		engine->submit_request = nop_submit_request;
> +		engine->schedule = NULL;
>   	}
> +	i915->caps.scheduler = 0;
>   
>   	/*
>   	 * Make sure no request can slip through without getting completed by
>   	 * either this call here to intel_engine_init_global_seqno, or the one
> -	 * in nop_complete_submit_request.
> +	 * in nop_submit_request.
>   	 */
>   	synchronize_rcu();
>   
> -	for_each_engine(engine, i915, id) {
> -		unsigned long flags;
> -
> -		/*
> -		 * Mark all pending requests as complete so that any concurrent
> -		 * (lockless) lookup doesn't try and wait upon the request as we
> -		 * reset it.
> -		 */
> -		spin_lock_irqsave(&engine->timeline.lock, flags);
> -		intel_engine_init_global_seqno(engine,
> -					       intel_engine_last_submit(engine));
> -		spin_unlock_irqrestore(&engine->timeline.lock, flags);
> +	/* Mark all executing requests as skipped */
> +	for_each_engine(engine, i915, id)
> +		engine->cancel_requests(engine);
>   
> +	for_each_engine(engine, i915, id) {
>   		i915_gem_reset_finish_engine(engine);
> +		intel_engine_wakeup(engine);
>   	}
>   
>   out:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 11f4e6148557..b5511a054f30 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -818,8 +818,11 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
>   	/* Mark all executing requests as skipped. */
>   	list_for_each_entry(rq, &engine->timeline.requests, link) {
>   		GEM_BUG_ON(!rq->global_seqno);
> -		if (!i915_request_completed(rq))
> -			dma_fence_set_error(&rq->fence, -EIO);
> +
> +		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &rq->fence.flags))
> +			continue;
> +
> +		dma_fence_set_error(&rq->fence, -EIO);
>   	}
>   
>   	/* Flush the queued requests to the timeline list (for retiring). */
> @@ -830,8 +833,8 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
>   		priolist_for_each_request_consume(rq, rn, p, i) {
>   			list_del_init(&rq->sched.link);
>   
> -			dma_fence_set_error(&rq->fence, -EIO);
>   			__i915_request_submit(rq);
> +			dma_fence_set_error(&rq->fence, -EIO);
>   		}
>   
>   		rb_erase_cached(&p->node, &execlists->queue);
> @@ -839,6 +842,10 @@ static void execlists_cancel_requests(struct intel_engine_cs *engine)
>   			kmem_cache_free(engine->i915->priorities, p);
>   	}
>   
> +	intel_write_status_page(engine,
> +				I915_GEM_HWS_INDEX,
> +				intel_engine_last_submit(engine));
> +
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
>   	execlists->queue_priority = INT_MIN;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 28ae1e436ea6..992889f9e0ff 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -748,9 +748,18 @@ static void cancel_requests(struct intel_engine_cs *engine)
>   	/* Mark all submitted requests as skipped. */
>   	list_for_each_entry(request, &engine->timeline.requests, link) {
>   		GEM_BUG_ON(!request->global_seqno);
> -		if (!i915_request_completed(request))
> -			dma_fence_set_error(&request->fence, -EIO);
> +
> +		if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
> +			     &request->fence.flags))
> +			continue;
> +
> +		dma_fence_set_error(&request->fence, -EIO);
>   	}
> +
> +	intel_write_status_page(engine,
> +				I915_GEM_HWS_INDEX,
> +				intel_engine_last_submit(engine));
> +
>   	/* Remaining _unready_ requests will be nop'ed when submitted */
>   
>   	spin_unlock_irqrestore(&engine->timeline.lock, flags);
> 


* [PATCH v2] drm/i915: Allocate a common scratch page
  2018-12-03 11:36 ` [PATCH 4/8] drm/i915: Allocate a common scratch page Chris Wilson
  2018-12-03 15:28   ` Mika Kuoppala
@ 2018-12-03 17:29   ` Chris Wilson
  1 sibling, 0 replies; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 17:29 UTC (permalink / raw)
  To: intel-gfx

Currently we allocate a scratch page for each engine, but since we only
ever write into it for post-sync operations, it is not exposed to
userspace nor do we care for coherency. As we then do not care about its
contents, we can use one page for all, reducing our allocations and
avoid complications by not assuming per-engine isolation.

For later use, it simplifies engine initialisation (by removing the
allocation that required struct_mutex!) and means that we can always rely
on there being a scratch page.

v2: Check that we allocated a large enough scratch for I830 w/a

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  7 ++++
 drivers/gpu/drm/i915/i915_gem.c         | 50 ++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
 drivers/gpu/drm/i915/intel_engine_cs.c  | 42 ---------------------
 drivers/gpu/drm/i915/intel_lrc.c        | 17 +++------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 37 ++++++------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  5 ---
 7 files changed, 74 insertions(+), 86 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d45475287130..0ec65cc48b5a 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1996,6 +1996,8 @@ struct drm_i915_private {
 		struct delayed_work idle_work;
 
 		ktime_t last_init_time;
+
+		struct i915_vma *scratch;
 	} gt;
 
 	/* perform PHY state sanity checks? */
@@ -3724,4 +3726,9 @@ static inline int intel_hws_csb_write_index(struct drm_i915_private *i915)
 		return I915_HWS_CSB_WRITE_INDEX;
 }
 
+static inline u32 i915_scratch_offset(const struct drm_i915_private *i915)
+{
+	return i915_ggtt_offset(i915->gt.scratch);
+}
+
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 834240a9b262..cca4285e3329 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -5495,6 +5495,44 @@ static int __intel_engines_record_defaults(struct drm_i915_private *i915)
 	goto out_ctx;
 }
 
+static int
+i915_gem_init_scratch(struct drm_i915_private *i915, unsigned int size)
+{
+	struct drm_i915_gem_object *obj;
+	struct i915_vma *vma;
+	int ret;
+
+	obj = i915_gem_object_create_stolen(i915, size);
+	if (!obj)
+		obj = i915_gem_object_create_internal(i915, size);
+	if (IS_ERR(obj)) {
+		DRM_ERROR("Failed to allocate scratch page\n");
+		return PTR_ERR(obj);
+	}
+
+	vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto err_unref;
+	}
+
+	ret = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH);
+	if (ret)
+		goto err_unref;
+
+	i915->gt.scratch = vma;
+	return 0;
+
+err_unref:
+	i915_gem_object_put(obj);
+	return ret;
+}
+
+static void i915_gem_fini_scratch(struct drm_i915_private *i915)
+{
+	i915_vma_unpin_and_release(&i915->gt.scratch, 0);
+}
+
 int i915_gem_init(struct drm_i915_private *dev_priv)
 {
 	int ret;
@@ -5541,12 +5579,19 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 		goto err_unlock;
 	}
 
-	ret = i915_gem_contexts_init(dev_priv);
+	ret = i915_gem_init_scratch(dev_priv,
+				    IS_GEN2(dev_priv) ? SZ_256K : PAGE_SIZE);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
 		goto err_ggtt;
 	}
 
+	ret = i915_gem_contexts_init(dev_priv);
+	if (ret) {
+		GEM_BUG_ON(ret == -EIO);
+		goto err_scratch;
+	}
+
 	ret = intel_engines_init(dev_priv);
 	if (ret) {
 		GEM_BUG_ON(ret == -EIO);
@@ -5619,6 +5664,8 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 err_context:
 	if (ret != -EIO)
 		i915_gem_contexts_fini(dev_priv);
+err_scratch:
+	i915_gem_fini_scratch(dev_priv);
 err_ggtt:
 err_unlock:
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
@@ -5670,6 +5717,7 @@ void i915_gem_fini(struct drm_i915_private *dev_priv)
 	intel_uc_fini(dev_priv);
 	i915_gem_cleanup_engines(dev_priv);
 	i915_gem_contexts_fini(dev_priv);
+	i915_gem_fini_scratch(dev_priv);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 	intel_cleanup_gt_powersave(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index a6885a59568b..07465123c166 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -1571,7 +1571,7 @@ static void gem_record_rings(struct i915_gpu_state *error)
 			if (HAS_BROKEN_CS_TLB(i915))
 				ee->wa_batchbuffer =
 					i915_error_object_create(i915,
-								 engine->scratch);
+								 i915->gt.scratch);
 			request_record_user_bo(request, ee);
 
 			ee->ctx =
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 759c0fd58f8c..2390985384d6 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -493,46 +493,6 @@ void intel_engine_setup_common(struct intel_engine_cs *engine)
 	intel_engine_init_cmd_parser(engine);
 }
 
-int intel_engine_create_scratch(struct intel_engine_cs *engine,
-				unsigned int size)
-{
-	struct drm_i915_gem_object *obj;
-	struct i915_vma *vma;
-	int ret;
-
-	WARN_ON(engine->scratch);
-
-	obj = i915_gem_object_create_stolen(engine->i915, size);
-	if (!obj)
-		obj = i915_gem_object_create_internal(engine->i915, size);
-	if (IS_ERR(obj)) {
-		DRM_ERROR("Failed to allocate scratch page\n");
-		return PTR_ERR(obj);
-	}
-
-	vma = i915_vma_instance(obj, &engine->i915->ggtt.vm, NULL);
-	if (IS_ERR(vma)) {
-		ret = PTR_ERR(vma);
-		goto err_unref;
-	}
-
-	ret = i915_vma_pin(vma, 0, 0, PIN_GLOBAL | PIN_HIGH);
-	if (ret)
-		goto err_unref;
-
-	engine->scratch = vma;
-	return 0;
-
-err_unref:
-	i915_gem_object_put(obj);
-	return ret;
-}
-
-void intel_engine_cleanup_scratch(struct intel_engine_cs *engine)
-{
-	i915_vma_unpin_and_release(&engine->scratch, 0);
-}
-
 static void cleanup_status_page(struct intel_engine_cs *engine)
 {
 	if (HWS_NEEDS_PHYSICAL(engine->i915)) {
@@ -707,8 +667,6 @@ void intel_engine_cleanup_common(struct intel_engine_cs *engine)
 {
 	struct drm_i915_private *i915 = engine->i915;
 
-	intel_engine_cleanup_scratch(engine);
-
 	cleanup_status_page(engine);
 
 	intel_engine_fini_breadcrumbs(engine);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 77c23088f599..f9d24bcbc3e5 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1288,9 +1288,10 @@ static int execlists_request_alloc(struct i915_request *request)
 static u32 *
 gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
 {
+	/* NB no one else is allowed to scribble over scratch + 256! */
 	*batch++ = MI_STORE_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
 	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
+	*batch++ = i915_scratch_offset(engine->i915) + 256;
 	*batch++ = 0;
 
 	*batch++ = MI_LOAD_REGISTER_IMM(1);
@@ -1304,7 +1305,7 @@ gen8_emit_flush_coherentl3_wa(struct intel_engine_cs *engine, u32 *batch)
 
 	*batch++ = MI_LOAD_REGISTER_MEM_GEN8 | MI_SRM_LRM_GLOBAL_GTT;
 	*batch++ = i915_mmio_reg_offset(GEN8_L3SQCREG4);
-	*batch++ = i915_ggtt_offset(engine->scratch) + 256;
+	*batch++ = i915_scratch_offset(engine->i915) + 256;
 	*batch++ = 0;
 
 	return batch;
@@ -1341,7 +1342,7 @@ static u32 *gen8_init_indirectctx_bb(struct intel_engine_cs *engine, u32 *batch)
 				       PIPE_CONTROL_GLOBAL_GTT_IVB |
 				       PIPE_CONTROL_CS_STALL |
 				       PIPE_CONTROL_QW_WRITE,
-				       i915_ggtt_offset(engine->scratch) +
+				       i915_scratch_offset(engine->i915) +
 				       2 * CACHELINE_BYTES);
 
 	*batch++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
@@ -1971,7 +1972,7 @@ static int gen8_emit_flush_render(struct i915_request *request,
 {
 	struct intel_engine_cs *engine = request->engine;
 	u32 scratch_addr =
-		i915_ggtt_offset(engine->scratch) + 2 * CACHELINE_BYTES;
+		i915_scratch_offset(engine->i915) + 2 * CACHELINE_BYTES;
 	bool vf_flush_wa = false, dc_flush_wa = false;
 	u32 *cs, flags = 0;
 	int len;
@@ -2290,10 +2291,6 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
 	if (ret)
 		return ret;
 
-	ret = intel_engine_create_scratch(engine, PAGE_SIZE);
-	if (ret)
-		goto err_cleanup_common;
-
 	ret = intel_init_workaround_bb(engine);
 	if (ret) {
 		/*
@@ -2306,10 +2303,6 @@ int logical_render_ring_init(struct intel_engine_cs *engine)
 	}
 
 	return 0;
-
-err_cleanup_common:
-	intel_engine_cleanup_common(engine);
-	return ret;
 }
 
 int logical_xcs_ring_init(struct intel_engine_cs *engine)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 81b10d85b738..480b359ce61e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -150,8 +150,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
 	 */
 	if (mode & EMIT_INVALIDATE) {
 		*cs++ = GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE;
-		*cs++ = i915_ggtt_offset(rq->engine->scratch) |
-			PIPE_CONTROL_GLOBAL_GTT;
+		*cs++ = i915_scratch_offset(rq->i915) | PIPE_CONTROL_GLOBAL_GTT;
 		*cs++ = 0;
 		*cs++ = 0;
 
@@ -159,8 +158,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
 			*cs++ = MI_FLUSH;
 
 		*cs++ = GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE;
-		*cs++ = i915_ggtt_offset(rq->engine->scratch) |
-			PIPE_CONTROL_GLOBAL_GTT;
+		*cs++ = i915_scratch_offset(rq->i915) | PIPE_CONTROL_GLOBAL_GTT;
 		*cs++ = 0;
 		*cs++ = 0;
 	}
@@ -212,8 +210,7 @@ gen4_render_ring_flush(struct i915_request *rq, u32 mode)
 static int
 intel_emit_post_sync_nonzero_flush(struct i915_request *rq)
 {
-	u32 scratch_addr =
-		i915_ggtt_offset(rq->engine->scratch) + 2 * CACHELINE_BYTES;
+	u32 scratch_addr = i915_scratch_offset(rq->i915) + 2 * CACHELINE_BYTES;
 	u32 *cs;
 
 	cs = intel_ring_begin(rq, 6);
@@ -246,8 +243,7 @@ intel_emit_post_sync_nonzero_flush(struct i915_request *rq)
 static int
 gen6_render_ring_flush(struct i915_request *rq, u32 mode)
 {
-	u32 scratch_addr =
-		i915_ggtt_offset(rq->engine->scratch) + 2 * CACHELINE_BYTES;
+	u32 scratch_addr = i915_scratch_offset(rq->i915) + 2 * CACHELINE_BYTES;
 	u32 *cs, flags = 0;
 	int ret;
 
@@ -316,8 +312,7 @@ gen7_render_ring_cs_stall_wa(struct i915_request *rq)
 static int
 gen7_render_ring_flush(struct i915_request *rq, u32 mode)
 {
-	u32 scratch_addr =
-		i915_ggtt_offset(rq->engine->scratch) + 2 * CACHELINE_BYTES;
+	u32 scratch_addr = i915_scratch_offset(rq->i915) + 2 * CACHELINE_BYTES;
 	u32 *cs, flags = 0;
 
 	/*
@@ -994,7 +989,7 @@ i965_emit_bb_start(struct i915_request *rq,
 }
 
 /* Just userspace ABI convention to limit the wa batch bo to a resonable size */
-#define I830_BATCH_LIMIT (256*1024)
+#define I830_BATCH_LIMIT SZ_256K
 #define I830_TLB_ENTRIES (2)
 #define I830_WA_SIZE max(I830_TLB_ENTRIES*4096, I830_BATCH_LIMIT)
 static int
@@ -1002,7 +997,9 @@ i830_emit_bb_start(struct i915_request *rq,
 		   u64 offset, u32 len,
 		   unsigned int dispatch_flags)
 {
-	u32 *cs, cs_offset = i915_ggtt_offset(rq->engine->scratch);
+	u32 *cs, cs_offset = i915_scratch_offset(rq->i915);
+
+	GEM_BUG_ON(rq->i915->gt.scratch->size < I830_WA_SIZE);
 
 	cs = intel_ring_begin(rq, 6);
 	if (IS_ERR(cs))
@@ -1459,7 +1456,6 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 {
 	struct i915_timeline *timeline;
 	struct intel_ring *ring;
-	unsigned int size;
 	int err;
 
 	intel_engine_setup_common(engine);
@@ -1484,21 +1480,12 @@ static int intel_init_ring_buffer(struct intel_engine_cs *engine)
 	GEM_BUG_ON(engine->buffer);
 	engine->buffer = ring;
 
-	size = PAGE_SIZE;
-	if (HAS_BROKEN_CS_TLB(engine->i915))
-		size = I830_WA_SIZE;
-	err = intel_engine_create_scratch(engine, size);
-	if (err)
-		goto err_unpin;
-
 	err = intel_engine_init_common(engine);
 	if (err)
-		goto err_scratch;
+		goto err_unpin;
 
 	return 0;
 
-err_scratch:
-	intel_engine_cleanup_scratch(engine);
 err_unpin:
 	intel_ring_unpin(ring);
 err_ring:
@@ -1572,7 +1559,7 @@ static int flush_pd_dir(struct i915_request *rq)
 	/* Stall until the page table load is complete */
 	*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
 	*cs++ = i915_mmio_reg_offset(RING_PP_DIR_BASE(engine));
-	*cs++ = i915_ggtt_offset(engine->scratch);
+	*cs++ = i915_scratch_offset(rq->i915);
 	*cs++ = MI_NOOP;
 
 	intel_ring_advance(rq, cs);
@@ -1681,7 +1668,7 @@ static inline int mi_set_context(struct i915_request *rq, u32 flags)
 			/* Insert a delay before the next switch! */
 			*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
 			*cs++ = i915_mmio_reg_offset(last_reg);
-			*cs++ = i915_ggtt_offset(engine->scratch);
+			*cs++ = i915_scratch_offset(rq->i915);
 			*cs++ = MI_NOOP;
 		}
 		*cs++ = MI_ARB_ON_OFF | MI_ARB_ENABLE;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index f4988f7bc756..096043b784f0 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -435,7 +435,6 @@ struct intel_engine_cs {
 
 	struct intel_hw_status_page status_page;
 	struct i915_ctx_workarounds wa_ctx;
-	struct i915_vma *scratch;
 
 	u32             irq_keep_mask; /* always keep these interrupts */
 	u32		irq_enable_mask; /* bitmask to enable ring interrupt */
@@ -892,10 +891,6 @@ void intel_engine_setup_common(struct intel_engine_cs *engine);
 int intel_engine_init_common(struct intel_engine_cs *engine);
 void intel_engine_cleanup_common(struct intel_engine_cs *engine);
 
-int intel_engine_create_scratch(struct intel_engine_cs *engine,
-				unsigned int size);
-void intel_engine_cleanup_scratch(struct intel_engine_cs *engine);
-
 int intel_init_render_ring_buffer(struct intel_engine_cs *engine);
 int intel_init_bsd_ring_buffer(struct intel_engine_cs *engine);
 int intel_init_blt_ring_buffer(struct intel_engine_cs *engine);
-- 
2.20.0.rc1



* Re: [PATCH 2/8] drm/i915: Complete the fences as they are cancelled due to wedging
  2018-12-03 17:11   ` Tvrtko Ursulin
@ 2018-12-03 17:36     ` Chris Wilson
  2018-12-04 10:30       ` Tvrtko Ursulin
  0 siblings, 1 reply; 31+ messages in thread
From: Chris Wilson @ 2018-12-03 17:36 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-12-03 17:11:59)
> 
> On 03/12/2018 11:36, Chris Wilson wrote:
> > We inspect the requests under the assumption that they will be marked as
> > completed when they are removed from the queue. Currently however, in the
> > process of wedging the requests will be removed from the queue before they
> > are completed, so rearrange the code to complete the fences before the
> > locks are dropped.
> > 
> > <1>[  354.473346] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250
> > <6>[  354.473363] PGD 0 P4D 0
> > <4>[  354.473370] Oops: 0000 [#1] PREEMPT SMP PTI
> > <4>[  354.473380] CPU: 0 PID: 4470 Comm: gem_eio Tainted: G     U            4.20.0-rc4-CI-CI_DRM_5216+ #1
> > <4>[  354.473393] Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, BIOS JYGLKCPX.86A.0027.2018.0125.1347 01/25/2018
> > <4>[  354.473480] RIP: 0010:__i915_schedule+0x311/0x5e0 [i915]
> > <4>[  354.473490] Code: 49 89 44 24 20 4d 89 4c 24 28 4d 89 29 44 39 b3 a0 04 00 00 7d 3a 41 8b 44 24 78 85 c0 74 13 48 8b 93 78 04 00 00 48 83 e2 fc <39> 82 50 02 00 00 79 1e 44 89 b3 a0 04 00 00 48 8d bb d0 03 00 00
> 
> This confuses me, isn't the code segment usually at the end?

*shrug* It was cut and paste.

> And then 
> you have another after the call trace which doesn't match 
>> __i915_schedule... anyway, _this_ code seems to be this part:
> 
>                  if (node_to_request(node)->global_seqno &&
>   90d:   8b 43 78                mov    eax,DWORD PTR [rbx+0x78]
>   910:   85 c0                   test   eax,eax
>   912:   74 13                   je     927 <__i915_schedule+0x317>
>  
> i915_seqno_passed(port_request(engine->execlists.port)->global_seqno,
>   914:   49 8b 97 c0 04 00 00    mov    rdx,QWORD PTR [r15+0x4c0]
>   91b:   48 83 e2 fc             and    rdx,0xfffffffffffffffc
>                  if (node_to_request(node)->global_seqno &&
>   91f:   39 82 50 02 00 00       cmp    DWORD PTR [rdx+0x250],eax
>   925:   79 1e                   jns    945 <__i915_schedule+0x335>
> 
> > <4>[  354.473515] RSP: 0018:ffffc900001bba90 EFLAGS: 00010046
> > <4>[  354.473524] RAX: 0000000000000003 RBX: ffff8882624c8008 RCX: f34a737800000000
> > <4>[  354.473535] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8882624c8048
> > <4>[  354.473545] RBP: ffffc900001bbab0 R08: 000000005963f1f1 R09: 0000000000000000
> > <4>[  354.473556] R10: ffffc900001bba10 R11: ffff8882624c8060 R12: ffff88824fdd7b98
> > <4>[  354.473567] R13: ffff88824fdd7bb8 R14: 0000000000000001 R15: ffff88824fdd7750
> > <4>[  354.473578] FS:  00007f44b4b5b980(0000) GS:ffff888277e00000(0000) knlGS:0000000000000000
> > <4>[  354.473590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > <4>[  354.473599] CR2: 0000000000000250 CR3: 000000026976e000 CR4: 0000000000340ef0
> 
> Given the registers above, I think it means this - eax is global_seqno 
> of the node rq. rdx is port_request, so NULL and bang. No request in 
> port, but why would there always be one at the point we are scheduling 
> in a new request to the runnable queue?

Correct. The answer, as I chose to interpret it, is because of the
incomplete submitted+dequeued requests during cancellation which this
patch attempts to address.
-Chris
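[Editor's note: the NULL dereference decoded above can be modelled with a small, self-contained sketch. The scheduler compared node_to_request(node)->global_seqno against port_request(engine->execlists.port)->global_seqno without checking whether anything was in the port, so an empty port (NULL) faulted at the global_seqno offset. All names below (mini_request, port_request, seqno_passed, need_preempt) are illustrative stand-ins, not the real i915 structures or functions.]

```c
#include <assert.h>
#include <stddef.h>

struct mini_request {
	unsigned int global_seqno;
};

/* i915 packs a count into the low bits of the port pointer; mask it off. */
static struct mini_request *port_request(void *port)
{
	return (struct mini_request *)((unsigned long)port & ~3UL);
}

static int seqno_passed(unsigned int a, unsigned int b)
{
	return (int)(a - b) >= 0;	/* has seqno a passed seqno b? */
}

/* Guarded form: an empty port means there is nothing in flight to
 * compare against, which is the state the wedging fix stops the
 * scheduler from tripping over. */
static int need_preempt(void *port, unsigned int node_seqno)
{
	struct mini_request *rq = port_request(port);

	if (!rq)
		return 0;
	return !seqno_passed(rq->global_seqno, node_seqno);
}
```

With the guard, `need_preempt(NULL, seqno)` returns 0 instead of faulting; the unguarded read of `rq->global_seqno` corresponds to the `cmp DWORD PTR [rdx+0x250],eax` with `rdx == 0` in the oops.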

* ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2)
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (11 preceding siblings ...)
  2018-12-03 15:01 ` ✓ Fi.CI.IGT: success for series starting with [1/8] " Patchwork
@ 2018-12-03 17:42 ` Patchwork
  2018-12-03 17:44 ` ✗ Fi.CI.SPARSE: " Patchwork
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2018-12-03 17:42 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2)
URL   : https://patchwork.freedesktop.org/series/53396/
State : warning

== Summary ==

$ dim checkpatch origin/drm-tip
0c7799b14282 drm/i915: Complete the fences as they are cancelled due to wedging
-:13: WARNING:COMMIT_LOG_LONG_LINE: Possible unwrapped commit description (prefer a maximum 75 chars per line)
#13: 
<1>[  354.473346] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250

total: 0 errors, 1 warnings, 0 checks, 135 lines checked
69375fb8c710 drm/i915: Allocate a common scratch page
005a997a8239 drm/i915: Flush GPU relocs harder for gen3
-:14: ERROR:GIT_COMMIT_ID: Please use git commit description style 'commit <12+ chars of sha1> ("<title line>")' - ie: 'commit 7fa28e146994 ("drm/i915: Write GPU relocs harder with gen3")'
#14: 
References: 7fa28e146994 ("drm/i915: Write GPU relocs harder with gen3")

total: 1 errors, 0 warnings, 0 checks, 50 lines checked
b469bbe67ddb drm/i915/selftests: Reorder request allocation vs vma pinning
768429adcb4b drm/i915: Pipeline PDP updates for Braswell


* ✗ Fi.CI.SPARSE: warning for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2)
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (12 preceding siblings ...)
  2018-12-03 17:42 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2) Patchwork
@ 2018-12-03 17:44 ` Patchwork
  2018-12-04  8:40 ` ✓ Fi.CI.BAT: success " Patchwork
  2018-12-04 11:22 ` ✓ Fi.CI.IGT: " Patchwork
  15 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2018-12-03 17:44 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2)
URL   : https://patchwork.freedesktop.org/series/53396/
State : warning

== Summary ==

$ dim sparse origin/drm-tip
Sparse version: v0.5.2
Commit: drm/i915: Complete the fences as they are cancelled due to wedging
Okay!

Commit: drm/i915: Allocate a common scratch page
-drivers/gpu/drm/i915/selftests/../i915_drv.h:3571:16: warning: expression using sizeof(void)
+drivers/gpu/drm/i915/selftests/../i915_drv.h:3573:16: warning: expression using sizeof(void)

Commit: drm/i915: Flush GPU relocs harder for gen3
Okay!

Commit: drm/i915/selftests: Reorder request allocation vs vma pinning
Okay!

Commit: drm/i915: Pipeline PDP updates for Braswell
Okay!


* ✓ Fi.CI.BAT: success for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2)
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (13 preceding siblings ...)
  2018-12-03 17:44 ` ✗ Fi.CI.SPARSE: " Patchwork
@ 2018-12-04  8:40 ` Patchwork
  2018-12-04 11:22 ` ✓ Fi.CI.IGT: " Patchwork
  15 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2018-12-04  8:40 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2)
URL   : https://patchwork.freedesktop.org/series/53396/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5247 -> Patchwork_11001
====================================================

Summary
-------

  **SUCCESS**

  No regressions found.

  External URL: https://patchwork.freedesktop.org/api/1.0/series/53396/revisions/2/mbox/

Known issues
------------

  Here are the changes found in Patchwork_11001 that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_exec_suspend@basic-s4-devices:
    - fi-ivb-3520m:       PASS -> FAIL [fdo#108880]

  * igt@kms_frontbuffer_tracking@basic:
    - fi-hsw-peppy:       PASS -> DMESG-WARN [fdo#102614]

  
#### Possible fixes ####

  * igt@gem_ctx_create@basic-files:
    - fi-bsw-n3050:       FAIL [fdo#108656] -> PASS

  * igt@gem_exec_suspend@basic-s3:
    - fi-cfl-8109u:       DMESG-WARN [fdo#107345] -> PASS +3

  * igt@i915_selftest@live_evict:
    - fi-bsw-kefka:       DMESG-WARN [fdo#107709] -> PASS

  
  [fdo#102614]: https://bugs.freedesktop.org/show_bug.cgi?id=102614
  [fdo#107345]: https://bugs.freedesktop.org/show_bug.cgi?id=107345
  [fdo#107709]: https://bugs.freedesktop.org/show_bug.cgi?id=107709
  [fdo#108656]: https://bugs.freedesktop.org/show_bug.cgi?id=108656
  [fdo#108880]: https://bugs.freedesktop.org/show_bug.cgi?id=108880


Participating hosts (48 -> 42)
------------------------------

  Missing    (6): fi-ilk-m540 fi-hsw-4200u fi-byt-squawks fi-bsw-cyan fi-ctg-p8600 fi-icl-y 


Build changes
-------------

    * Linux: CI_DRM_5247 -> Patchwork_11001

  CI_DRM_5247: ff3d336ae5b3ef75e12ed400fd4ccb55701c212a @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4736: 285ebfb3b7adc56586031afa5150c4e5ad40c229 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_11001: 768429adcb4be8e83e7728fcf825135f1dbd11c7 @ git://anongit.freedesktop.org/gfx-ci/linux


== Linux commits ==

768429adcb4b drm/i915: Pipeline PDP updates for Braswell
b469bbe67ddb drm/i915/selftests: Reorder request allocation vs vma pinning
005a997a8239 drm/i915: Flush GPU relocs harder for gen3
69375fb8c710 drm/i915: Allocate a common scratch page
0c7799b14282 drm/i915: Complete the fences as they are cancelled due to wedging

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_11001/

* Re: [PATCH 2/8] drm/i915: Complete the fences as they are cancelled due to wedging
  2018-12-03 17:36     ` Chris Wilson
@ 2018-12-04 10:30       ` Tvrtko Ursulin
  0 siblings, 0 replies; 31+ messages in thread
From: Tvrtko Ursulin @ 2018-12-04 10:30 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/12/2018 17:36, Chris Wilson wrote:
> Quoting Tvrtko Ursulin (2018-12-03 17:11:59)
>>
>> On 03/12/2018 11:36, Chris Wilson wrote:
>>> We inspect the requests under the assumption that they will be marked as
>>> completed when they are removed from the queue. Currently however, in the
>>> process of wedging the requests will be removed from the queue before they
>>> are completed, so rearrange the code to complete the fences before the
>>> locks are dropped.
>>>
>>> <1>[  354.473346] BUG: unable to handle kernel NULL pointer dereference at 0000000000000250
>>> <6>[  354.473363] PGD 0 P4D 0
>>> <4>[  354.473370] Oops: 0000 [#1] PREEMPT SMP PTI
>>> <4>[  354.473380] CPU: 0 PID: 4470 Comm: gem_eio Tainted: G     U            4.20.0-rc4-CI-CI_DRM_5216+ #1
>>> <4>[  354.473393] Hardware name: Intel Corporation NUC7CJYH/NUC7JYB, BIOS JYGLKCPX.86A.0027.2018.0125.1347 01/25/2018
>>> <4>[  354.473480] RIP: 0010:__i915_schedule+0x311/0x5e0 [i915]
>>> <4>[  354.473490] Code: 49 89 44 24 20 4d 89 4c 24 28 4d 89 29 44 39 b3 a0 04 00 00 7d 3a 41 8b 44 24 78 85 c0 74 13 48 8b 93 78 04 00 00 48 83 e2 fc <39> 82 50 02 00 00 79 1e 44 89 b3 a0 04 00 00 48 8d bb d0 03 00 00
>>
>> This confuses me, isn't the code segment usually at the end?
> 
> *shrug* It was cut and paste.
> 
>> And then
>> you have another after the call trace which doesn't match
>> __i915_schedule... anyway, _this_ code seems to be this part:
>>
>>                   if (node_to_request(node)->global_seqno &&
>>    90d:   8b 43 78                mov    eax,DWORD PTR [rbx+0x78]
>>    910:   85 c0                   test   eax,eax
>>    912:   74 13                   je     927 <__i915_schedule+0x317>
>>   
>> i915_seqno_passed(port_request(engine->execlists.port)->global_seqno,
>>    914:   49 8b 97 c0 04 00 00    mov    rdx,QWORD PTR [r15+0x4c0]
>>    91b:   48 83 e2 fc             and    rdx,0xfffffffffffffffc
>>                   if (node_to_request(node)->global_seqno &&
>>    91f:   39 82 50 02 00 00       cmp    DWORD PTR [rdx+0x250],eax
>>    925:   79 1e                   jns    945 <__i915_schedule+0x335>
>>
>>> <4>[  354.473515] RSP: 0018:ffffc900001bba90 EFLAGS: 00010046
>>> <4>[  354.473524] RAX: 0000000000000003 RBX: ffff8882624c8008 RCX: f34a737800000000
>>> <4>[  354.473535] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff8882624c8048
>>> <4>[  354.473545] RBP: ffffc900001bbab0 R08: 000000005963f1f1 R09: 0000000000000000
>>> <4>[  354.473556] R10: ffffc900001bba10 R11: ffff8882624c8060 R12: ffff88824fdd7b98
>>> <4>[  354.473567] R13: ffff88824fdd7bb8 R14: 0000000000000001 R15: ffff88824fdd7750
>>> <4>[  354.473578] FS:  00007f44b4b5b980(0000) GS:ffff888277e00000(0000) knlGS:0000000000000000
>>> <4>[  354.473590] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> <4>[  354.473599] CR2: 0000000000000250 CR3: 000000026976e000 CR4: 0000000000340ef0
>>
>> Given the registers above, I think it means this - eax is global_seqno
>> of the node rq. rdx is port_request, so NULL and bang. No request in
>> port, but why would there always be one at the point we are scheduling
>> in a new request to the runnable queue?
> 
> Correct. The answer, as I chose to interpret it, is because of the
> incomplete submitted+dequeued requests during cancellation which this
> patch attempts to address.

I couldn't find any other route to this state myself, so on the basis of 
that, albeit with a little bit of fear of "Could it have really been so 
much simpler all along?!":

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko



* Re: [PATCH 7/8] drm/i915/selftests: Reorder request allocation vs vma pinning
  2018-12-03 11:37 ` [PATCH 7/8] drm/i915/selftests: Reorder request allocation vs vma pinning Chris Wilson
@ 2018-12-04 11:06   ` Tvrtko Ursulin
  0 siblings, 0 replies; 31+ messages in thread
From: Tvrtko Ursulin @ 2018-12-04 11:06 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/12/2018 11:37, Chris Wilson wrote:
> Impose a constraint that we have all vma pinned for a request prior to
> its allocation. This is to simplify request construction, and should
> facilitate unravelling the lock interdependencies later.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/selftests/huge_pages.c   |  31 +++--
>   drivers/gpu/drm/i915/selftests/igt_spinner.c  |  86 ++++++------
>   .../gpu/drm/i915/selftests/intel_hangcheck.c  | 123 +++++++++---------
>   3 files changed, 119 insertions(+), 121 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/selftests/huge_pages.c b/drivers/gpu/drm/i915/selftests/huge_pages.c
> index 26c065c8d2c0..a0c7cbc212ba 100644
> --- a/drivers/gpu/drm/i915/selftests/huge_pages.c
> +++ b/drivers/gpu/drm/i915/selftests/huge_pages.c
> @@ -972,7 +972,6 @@ static int gpu_write(struct i915_vma *vma,
>   {
>   	struct i915_request *rq;
>   	struct i915_vma *batch;
> -	int flags = 0;
>   	int err;
>   
>   	GEM_BUG_ON(!intel_engine_can_store_dword(engine));
> @@ -981,14 +980,14 @@ static int gpu_write(struct i915_vma *vma,
>   	if (err)
>   		return err;
>   
> -	rq = i915_request_alloc(engine, ctx);
> -	if (IS_ERR(rq))
> -		return PTR_ERR(rq);
> -
>   	batch = gpu_write_dw(vma, dword * sizeof(u32), value);
> -	if (IS_ERR(batch)) {
> -		err = PTR_ERR(batch);
> -		goto err_request;
> +	if (IS_ERR(batch))
> +		return PTR_ERR(batch);
> +
> +	rq = i915_request_alloc(engine, ctx);
> +	if (IS_ERR(rq)) {
> +		err = PTR_ERR(rq);
> +		goto err_batch;
>   	}
>   
>   	err = i915_vma_move_to_active(batch, rq, 0);
> @@ -996,21 +995,21 @@ static int gpu_write(struct i915_vma *vma,
>   		goto err_request;
>   
>   	i915_gem_object_set_active_reference(batch->obj);
> -	i915_vma_unpin(batch);
> -	i915_vma_close(batch);
>   
> -	err = engine->emit_bb_start(rq,
> -				    batch->node.start, batch->node.size,
> -				    flags);
> +	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
>   	if (err)
>   		goto err_request;
>   
> -	err = i915_vma_move_to_active(vma, rq, EXEC_OBJECT_WRITE);
> +	err = engine->emit_bb_start(rq,
> +				    batch->node.start, batch->node.size,
> +				    0);
> +err_request:
>   	if (err)
>   		i915_request_skip(rq, err);
> -
> -err_request:
>   	i915_request_add(rq);
> +err_batch:
> +	i915_vma_unpin(batch);
> +	i915_vma_close(batch);
>   
>   	return err;
>   }
> diff --git a/drivers/gpu/drm/i915/selftests/igt_spinner.c b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> index 8cd34f6e6859..0e70df0230b8 100644
> --- a/drivers/gpu/drm/i915/selftests/igt_spinner.c
> +++ b/drivers/gpu/drm/i915/selftests/igt_spinner.c
> @@ -68,48 +68,65 @@ static u64 hws_address(const struct i915_vma *hws,
>   	return hws->node.start + seqno_offset(rq->fence.context);
>   }
>   
> -static int emit_recurse_batch(struct igt_spinner *spin,
> -			      struct i915_request *rq,
> -			      u32 arbitration_command)
> +static int move_to_active(struct i915_vma *vma,
> +			  struct i915_request *rq,
> +			  unsigned int flags)
>   {
> -	struct i915_address_space *vm = &rq->gem_context->ppgtt->vm;
> +	int err;
> +
> +	err = i915_vma_move_to_active(vma, rq, flags);
> +	if (err)
> +		return err;
> +
> +	if (!i915_gem_object_has_active_reference(vma->obj)) {
> +		i915_gem_object_get(vma->obj);
> +		i915_gem_object_set_active_reference(vma->obj);
> +	}
> +
> +	return 0;
> +}
> +
> +struct i915_request *
> +igt_spinner_create_request(struct igt_spinner *spin,
> +			   struct i915_gem_context *ctx,
> +			   struct intel_engine_cs *engine,
> +			   u32 arbitration_command)
> +{
> +	struct i915_address_space *vm = &ctx->ppgtt->vm;
> +	struct i915_request *rq = NULL;
>   	struct i915_vma *hws, *vma;
>   	u32 *batch;
>   	int err;
>   
>   	vma = i915_vma_instance(spin->obj, vm, NULL);
>   	if (IS_ERR(vma))
> -		return PTR_ERR(vma);
> +		return ERR_CAST(vma);
>   
>   	hws = i915_vma_instance(spin->hws, vm, NULL);
>   	if (IS_ERR(hws))
> -		return PTR_ERR(hws);
> +		return ERR_CAST(hws);
>   
>   	err = i915_vma_pin(vma, 0, 0, PIN_USER);
>   	if (err)
> -		return err;
> +		return ERR_PTR(err);
>   
>   	err = i915_vma_pin(hws, 0, 0, PIN_USER);
>   	if (err)
>   		goto unpin_vma;
>   
> -	err = i915_vma_move_to_active(vma, rq, 0);
> -	if (err)
> +	rq = i915_request_alloc(engine, ctx);
> +	if (IS_ERR(rq)) {
> +		err = PTR_ERR(rq);
>   		goto unpin_hws;
> -
> -	if (!i915_gem_object_has_active_reference(vma->obj)) {
> -		i915_gem_object_get(vma->obj);
> -		i915_gem_object_set_active_reference(vma->obj);
>   	}
>   
> -	err = i915_vma_move_to_active(hws, rq, 0);
> +	err = move_to_active(vma, rq, 0);
>   	if (err)
> -		goto unpin_hws;
> +		goto cancel_rq;
>   
> -	if (!i915_gem_object_has_active_reference(hws->obj)) {
> -		i915_gem_object_get(hws->obj);
> -		i915_gem_object_set_active_reference(hws->obj);
> -	}
> +	err = move_to_active(hws, rq, 0);
> +	if (err)
> +		goto cancel_rq;
>   
>   	batch = spin->batch;
>   
> @@ -127,35 +144,18 @@ static int emit_recurse_batch(struct igt_spinner *spin,
>   
>   	i915_gem_chipset_flush(spin->i915);
>   
> -	err = rq->engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
> +	err = engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, 0);
>   
> +cancel_rq:
> +	if (err) {
> +		i915_request_skip(rq, err);
> +		i915_request_add(rq);
> +	}
>   unpin_hws:
>   	i915_vma_unpin(hws);
>   unpin_vma:
>   	i915_vma_unpin(vma);
> -	return err;
> -}
> -
> -struct i915_request *
> -igt_spinner_create_request(struct igt_spinner *spin,
> -			   struct i915_gem_context *ctx,
> -			   struct intel_engine_cs *engine,
> -			   u32 arbitration_command)
> -{
> -	struct i915_request *rq;
> -	int err;
> -
> -	rq = i915_request_alloc(engine, ctx);
> -	if (IS_ERR(rq))
> -		return rq;
> -
> -	err = emit_recurse_batch(spin, rq, arbitration_command);
> -	if (err) {
> -		i915_request_add(rq);
> -		return ERR_PTR(err);
> -	}
> -
> -	return rq;
> +	return err ? ERR_PTR(err) : rq;
>   }
>   
>   static u32
> diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> index a48fbe2557ea..0ff813ad3462 100644
> --- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> +++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
> @@ -102,52 +102,87 @@ static u64 hws_address(const struct i915_vma *hws,
>   	return hws->node.start + offset_in_page(sizeof(u32)*rq->fence.context);
>   }
>   
> -static int emit_recurse_batch(struct hang *h,
> -			      struct i915_request *rq)
> +static int move_to_active(struct i915_vma *vma,
> +			  struct i915_request *rq,
> +			  unsigned int flags)
> +{
> +	int err;
> +
> +	err = i915_vma_move_to_active(vma, rq, flags);
> +	if (err)
> +		return err;
> +
> +	if (!i915_gem_object_has_active_reference(vma->obj)) {
> +		i915_gem_object_get(vma->obj);
> +		i915_gem_object_set_active_reference(vma->obj);
> +	}
> +
> +	return 0;
> +}
> +
> +static struct i915_request *
> +hang_create_request(struct hang *h, struct intel_engine_cs *engine)
>   {
>   	struct drm_i915_private *i915 = h->i915;
>   	struct i915_address_space *vm =
> -		rq->gem_context->ppgtt ?
> -		&rq->gem_context->ppgtt->vm :
> -		&i915->ggtt.vm;
> +		h->ctx->ppgtt ? &h->ctx->ppgtt->vm : &i915->ggtt.vm;
> +	struct i915_request *rq = NULL;
>   	struct i915_vma *hws, *vma;
>   	unsigned int flags;
>   	u32 *batch;
>   	int err;
>   
> +	if (i915_gem_object_is_active(h->obj)) {
> +		struct drm_i915_gem_object *obj;
> +		void *vaddr;
> +
> +		obj = i915_gem_object_create_internal(h->i915, PAGE_SIZE);
> +		if (IS_ERR(obj))
> +			return ERR_CAST(obj);
> +
> +		vaddr = i915_gem_object_pin_map(obj,
> +						i915_coherent_map_type(h->i915));
> +		if (IS_ERR(vaddr)) {
> +			i915_gem_object_put(obj);
> +			return ERR_CAST(vaddr);
> +		}
> +
> +		i915_gem_object_unpin_map(h->obj);
> +		i915_gem_object_put(h->obj);
> +
> +		h->obj = obj;
> +		h->batch = vaddr;
> +	}
> +
>   	vma = i915_vma_instance(h->obj, vm, NULL);
>   	if (IS_ERR(vma))
> -		return PTR_ERR(vma);
> +		return ERR_CAST(vma);
>   
>   	hws = i915_vma_instance(h->hws, vm, NULL);
>   	if (IS_ERR(hws))
> -		return PTR_ERR(hws);
> +		return ERR_CAST(hws);
>   
>   	err = i915_vma_pin(vma, 0, 0, PIN_USER);
>   	if (err)
> -		return err;
> +		return ERR_PTR(err);
>   
>   	err = i915_vma_pin(hws, 0, 0, PIN_USER);
>   	if (err)
>   		goto unpin_vma;
>   
> -	err = i915_vma_move_to_active(vma, rq, 0);
> -	if (err)
> +	rq = i915_request_alloc(engine, h->ctx);
> +	if (IS_ERR(rq)) {
> +		err = PTR_ERR(rq);
>   		goto unpin_hws;
> -
> -	if (!i915_gem_object_has_active_reference(vma->obj)) {
> -		i915_gem_object_get(vma->obj);
> -		i915_gem_object_set_active_reference(vma->obj);
>   	}
>   
> -	err = i915_vma_move_to_active(hws, rq, 0);
> +	err = move_to_active(vma, rq, 0);
>   	if (err)
> -		goto unpin_hws;
> +		goto cancel_rq;
>   
> -	if (!i915_gem_object_has_active_reference(hws->obj)) {
> -		i915_gem_object_get(hws->obj);
> -		i915_gem_object_set_active_reference(hws->obj);
> -	}
> +	err = move_to_active(hws, rq, 0);
> +	if (err)
> +		goto cancel_rq;
>   
>   	batch = h->batch;
>   	if (INTEL_GEN(i915) >= 8) {
> @@ -212,52 +247,16 @@ static int emit_recurse_batch(struct hang *h,
>   
>   	err = rq->engine->emit_bb_start(rq, vma->node.start, PAGE_SIZE, flags);
>   
> +cancel_rq:
> +	if (err) {
> +		i915_request_skip(rq, err);
> +		i915_request_add(rq);
> +	}
>   unpin_hws:
>   	i915_vma_unpin(hws);
>   unpin_vma:
>   	i915_vma_unpin(vma);
> -	return err;
> -}
> -
> -static struct i915_request *
> -hang_create_request(struct hang *h, struct intel_engine_cs *engine)
> -{
> -	struct i915_request *rq;
> -	int err;
> -
> -	if (i915_gem_object_is_active(h->obj)) {
> -		struct drm_i915_gem_object *obj;
> -		void *vaddr;
> -
> -		obj = i915_gem_object_create_internal(h->i915, PAGE_SIZE);
> -		if (IS_ERR(obj))
> -			return ERR_CAST(obj);
> -
> -		vaddr = i915_gem_object_pin_map(obj,
> -						i915_coherent_map_type(h->i915));
> -		if (IS_ERR(vaddr)) {
> -			i915_gem_object_put(obj);
> -			return ERR_CAST(vaddr);
> -		}
> -
> -		i915_gem_object_unpin_map(h->obj);
> -		i915_gem_object_put(h->obj);
> -
> -		h->obj = obj;
> -		h->batch = vaddr;
> -	}
> -
> -	rq = i915_request_alloc(engine, h->ctx);
> -	if (IS_ERR(rq))
> -		return rq;
> -
> -	err = emit_recurse_batch(h, rq);
> -	if (err) {
> -		i915_request_add(rq);
> -		return ERR_PTR(err);
> -	}
> -
> -	return rq;
> +	return err ? ERR_PTR(err) : rq;
>   }
>   
>   static u32 hws_seqno(const struct hang *h, const struct i915_request *rq)
> 

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko
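[Editor's note: the resource-ordering rule the reviewed patch enforces — pin everything, then allocate the request, and on a late failure skip+add the request rather than leak it — can be sketched with toy stand-ins. None of the names below are the real i915 API; `pinned`, `pin_vma` and `toy_create_request` are hypothetical.]

```c
#include <assert.h>
#include <stdlib.h>

static int pinned;	/* toy pin count standing in for vma pins */

static int pin_vma(int fail) { if (fail) return -1; pinned++; return 0; }
static void unpin_vma(void) { pinned--; }

struct toy_request { int dummy; };

static struct toy_request *request_alloc(int fail)
{
	return fail ? NULL : calloc(1, sizeof(struct toy_request));
}

/* Pin the vma *before* allocating the request: a pin failure needs no
 * request cleanup at all, and an allocation failure unwinds through the
 * same unpin path, mirroring the unpin_vma/cancel_rq labels in the
 * patch. */
static struct toy_request *toy_create_request(int fail_pin, int fail_alloc)
{
	struct toy_request *rq;

	if (pin_vma(fail_pin))
		return NULL;		/* nothing else held yet */

	rq = request_alloc(fail_alloc);
	/* ...on success the batch would be emitted into rq here... */

	unpin_vma();	/* pins are dropped on every exit path */
	return rq;
}
```

Either failure mode leaves `pinned` at zero, which is the invariant the reordering buys: the request becomes the last resource acquired, so it is the only one needing the heavyweight skip-and-add cancellation.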

* ✓ Fi.CI.IGT: success for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2)
  2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
                   ` (14 preceding siblings ...)
  2018-12-04  8:40 ` ✓ Fi.CI.BAT: success " Patchwork
@ 2018-12-04 11:22 ` Patchwork
  15 siblings, 0 replies; 31+ messages in thread
From: Patchwork @ 2018-12-04 11:22 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2)
URL   : https://patchwork.freedesktop.org/series/53396/
State : success

== Summary ==

CI Bug Log - changes from CI_DRM_5247_full -> Patchwork_11001_full
====================================================

Summary
-------

  **WARNING**

  Minor unknown changes coming with Patchwork_11001_full need to be verified
  manually.
  
  If you think the reported changes have nothing to do with the changes
  introduced in Patchwork_11001_full, please notify your bug team to allow them
  to document this new failure mode, which will reduce false positives in CI.

  

Possible new issues
-------------------

  Here are the unknown changes that may have been introduced in Patchwork_11001_full:

### IGT changes ###

#### Warnings ####

  * igt@kms_vblank@pipe-a-wait-forked-hang:
    - shard-snb:          SKIP -> PASS

  
Known issues
------------

  Here are the changes found in Patchwork_11001_full that come from known issues:

### IGT changes ###

#### Issues hit ####

  * igt@gem_eio@reset-stress:
    - shard-glk:          PASS -> FAIL [fdo#107799]

  * igt@gem_ppgtt@blt-vs-render-ctx0:
    - shard-skl:          PASS -> TIMEOUT [fdo#108039]

  * igt@gem_userptr_blits@readonly-unsync:
    - shard-skl:          NOTRUN -> TIMEOUT [fdo#108887]

  * igt@kms_busy@extended-modeset-hang-newfb-render-c:
    - shard-skl:          NOTRUN -> DMESG-WARN [fdo#107956]

  * igt@kms_ccs@pipe-a-crc-sprite-planes-basic:
    - shard-glk:          PASS -> FAIL [fdo#108145]

  * igt@kms_chv_cursor_fail@pipe-a-128x128-bottom-edge:
    - shard-skl:          PASS -> FAIL [fdo#104671]

  * igt@kms_cursor_crc@cursor-128x128-suspend:
    - shard-skl:          PASS -> INCOMPLETE [fdo#104108]

  * igt@kms_cursor_crc@cursor-256x256-offscreen:
    - shard-skl:          NOTRUN -> FAIL [fdo#103232]

  * igt@kms_cursor_crc@cursor-256x256-random:
    - shard-glk:          PASS -> FAIL [fdo#103232] +2

  * igt@kms_cursor_crc@cursor-256x85-offscreen:
    - shard-skl:          PASS -> FAIL [fdo#103232]

  * igt@kms_cursor_crc@cursor-64x64-onscreen:
    - shard-apl:          PASS -> FAIL [fdo#103232] +1

  * igt@kms_cursor_legacy@2x-long-flip-vs-cursor-legacy:
    - shard-glk:          PASS -> FAIL [fdo#104873]

  * igt@kms_draw_crc@draw-method-xrgb2101010-mmap-cpu-untiled:
    - shard-skl:          NOTRUN -> FAIL [fdo#103184]

  * igt@kms_draw_crc@draw-method-xrgb8888-mmap-wc-untiled:
    - shard-skl:          PASS -> FAIL [fdo#108145] / [fdo#108472]

  * igt@kms_flip@2x-flip-vs-expired-vblank:
    - shard-hsw:          PASS -> FAIL [fdo#102887]

  * igt@kms_flip@flip-vs-expired-vblank:
    - shard-skl:          NOTRUN -> FAIL [fdo#105363]

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-draw-mmap-cpu:
    - shard-apl:          PASS -> FAIL [fdo#103167] +1

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-spr-indfb-fullscreen:
    - shard-glk:          PASS -> FAIL [fdo#103167] +3

  * igt@kms_frontbuffer_tracking@fbcpsr-1p-primscrn-pri-indfb-draw-mmap-gtt:
    - shard-skl:          NOTRUN -> FAIL [fdo#105682] +1

  * igt@kms_frontbuffer_tracking@fbcpsr-tilingchange:
    - shard-skl:          PASS -> FAIL [fdo#105682]

  * igt@kms_frontbuffer_tracking@psr-1p-primscrn-pri-indfb-draw-mmap-cpu:
    - shard-skl:          NOTRUN -> FAIL [fdo#103167] +2

  * igt@kms_frontbuffer_tracking@psr-rgb101010-draw-blt:
    - shard-skl:          PASS -> FAIL [fdo#103167] +2

  * igt@kms_mmap_write_crc:
    - shard-hsw:          PASS -> DMESG-WARN [fdo#102614]

  * igt@kms_plane_alpha_blend@pipe-a-coverage-7efc:
    - shard-skl:          NOTRUN -> FAIL [fdo#107815] / [fdo#108145]

  * igt@kms_plane_alpha_blend@pipe-b-alpha-transparant-fb:
    - shard-kbl:          NOTRUN -> FAIL [fdo#108145]

  * igt@kms_plane_alpha_blend@pipe-c-alpha-opaque-fb:
    - shard-skl:          NOTRUN -> FAIL [fdo#108145] +2

  * igt@kms_plane_multiple@atomic-pipe-c-tiling-x:
    - shard-glk:          PASS -> FAIL [fdo#103166] +1

  * igt@kms_rotation_crc@primary-rotation-180:
    - shard-skl:          PASS -> FAIL [fdo#103925] / [fdo#107815]

  * igt@pm_backlight@basic-brightness:
    - {shard-iclb}:       PASS -> DMESG-WARN [fdo#107724]

  * igt@pm_rpm@dpms-mode-unset-lpsp:
    - shard-skl:          PASS -> INCOMPLETE [fdo#107807]

  * igt@pm_rpm@dpms-mode-unset-non-lpsp:
    - shard-skl:          SKIP -> INCOMPLETE [fdo#107807]

  
#### Possible fixes ####

  * igt@debugfs_test@read_all_entries_display_off:
    - shard-skl:          INCOMPLETE [fdo#104108] -> PASS

  * igt@gem_ppgtt@blt-vs-render-ctxn:
    - shard-kbl:          INCOMPLETE [fdo#103665] / [fdo#106023] / [fdo#106887] -> PASS

  * igt@kms_busy@extended-pageflip-hang-newfb-render-c:
    - shard-glk:          DMESG-WARN [fdo#107956] -> PASS

  * igt@kms_cursor_crc@cursor-256x256-suspend:
    - shard-apl:          FAIL [fdo#103191] / [fdo#103232] -> PASS

  * igt@kms_cursor_crc@cursor-64x21-random:
    - shard-glk:          FAIL [fdo#103232] -> PASS

  * igt@kms_cursor_crc@cursor-64x64-random:
    - shard-skl:          FAIL [fdo#103232] -> PASS

  * igt@kms_cursor_crc@cursor-size-change:
    - shard-apl:          FAIL [fdo#103232] -> PASS

  * igt@kms_cursor_legacy@2x-nonblocking-modeset-vs-cursor-atomic:
    - shard-glk:          FAIL [fdo#105454] -> PASS

  * igt@kms_frontbuffer_tracking@fbc-1p-primscrn-cur-indfb-draw-mmap-wc:
    - shard-apl:          FAIL [fdo#103167] -> PASS

  * igt@kms_frontbuffer_tracking@fbc-2p-primscrn-spr-indfb-onoff:
    - shard-glk:          FAIL [fdo#103167] -> PASS +1

  * igt@kms_plane@plane-position-covered-pipe-b-planes:
    - shard-glk:          FAIL [fdo#103166] -> PASS +2

  * igt@kms_plane_alpha_blend@pipe-b-coverage-7efc:
    - shard-skl:          FAIL [fdo#107815] -> PASS

  * igt@kms_setmode@basic:
    - shard-kbl:          FAIL [fdo#99912] -> PASS

  * igt@perf@short-reads:
    - shard-skl:          FAIL [fdo#103183] -> PASS

  * igt@pm_rpm@gem-mmap-gtt:
    - shard-skl:          INCOMPLETE [fdo#107807] -> PASS +1

  
#### Warnings ####

  * igt@i915_suspend@shrink:
    - shard-skl:          INCOMPLETE [fdo#106886] -> DMESG-WARN [fdo#108784]
    - shard-kbl:          INCOMPLETE [fdo#103665] / [fdo#106886] -> DMESG-WARN [fdo#108784]

  
  {name}: This element is suppressed. This means it is ignored when computing
          the status of the difference (SUCCESS, WARNING, or FAILURE).

  [fdo#102614]: https://bugs.freedesktop.org/show_bug.cgi?id=102614
  [fdo#102887]: https://bugs.freedesktop.org/show_bug.cgi?id=102887
  [fdo#103166]: https://bugs.freedesktop.org/show_bug.cgi?id=103166
  [fdo#103167]: https://bugs.freedesktop.org/show_bug.cgi?id=103167
  [fdo#103183]: https://bugs.freedesktop.org/show_bug.cgi?id=103183
  [fdo#103184]: https://bugs.freedesktop.org/show_bug.cgi?id=103184
  [fdo#103191]: https://bugs.freedesktop.org/show_bug.cgi?id=103191
  [fdo#103232]: https://bugs.freedesktop.org/show_bug.cgi?id=103232
  [fdo#103665]: https://bugs.freedesktop.org/show_bug.cgi?id=103665
  [fdo#103925]: https://bugs.freedesktop.org/show_bug.cgi?id=103925
  [fdo#104108]: https://bugs.freedesktop.org/show_bug.cgi?id=104108
  [fdo#104671]: https://bugs.freedesktop.org/show_bug.cgi?id=104671
  [fdo#104873]: https://bugs.freedesktop.org/show_bug.cgi?id=104873
  [fdo#105363]: https://bugs.freedesktop.org/show_bug.cgi?id=105363
  [fdo#105454]: https://bugs.freedesktop.org/show_bug.cgi?id=105454
  [fdo#105682]: https://bugs.freedesktop.org/show_bug.cgi?id=105682
  [fdo#106023]: https://bugs.freedesktop.org/show_bug.cgi?id=106023
  [fdo#106886]: https://bugs.freedesktop.org/show_bug.cgi?id=106886
  [fdo#106887]: https://bugs.freedesktop.org/show_bug.cgi?id=106887
  [fdo#107724]: https://bugs.freedesktop.org/show_bug.cgi?id=107724
  [fdo#107799]: https://bugs.freedesktop.org/show_bug.cgi?id=107799
  [fdo#107807]: https://bugs.freedesktop.org/show_bug.cgi?id=107807
  [fdo#107815]: https://bugs.freedesktop.org/show_bug.cgi?id=107815
  [fdo#107956]: https://bugs.freedesktop.org/show_bug.cgi?id=107956
  [fdo#108039]: https://bugs.freedesktop.org/show_bug.cgi?id=108039
  [fdo#108145]: https://bugs.freedesktop.org/show_bug.cgi?id=108145
  [fdo#108472]: https://bugs.freedesktop.org/show_bug.cgi?id=108472
  [fdo#108784]: https://bugs.freedesktop.org/show_bug.cgi?id=108784
  [fdo#108887]: https://bugs.freedesktop.org/show_bug.cgi?id=108887
  [fdo#99912]: https://bugs.freedesktop.org/show_bug.cgi?id=99912


Participating hosts (7 -> 7)
------------------------------

  No changes in participating hosts


Build changes
-------------

    * Linux: CI_DRM_5247 -> Patchwork_11001

  CI_DRM_5247: ff3d336ae5b3ef75e12ed400fd4ccb55701c212a @ git://anongit.freedesktop.org/gfx-ci/linux
  IGT_4736: 285ebfb3b7adc56586031afa5150c4e5ad40c229 @ git://anongit.freedesktop.org/xorg/app/intel-gpu-tools
  Patchwork_11001: 768429adcb4be8e83e7728fcf825135f1dbd11c7 @ git://anongit.freedesktop.org/gfx-ci/linux
  piglit_4509: fdc5a4ca11124ab8413c7988896eec4c97336694 @ git://anongit.freedesktop.org/piglit

== Logs ==

For more details see: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_11001/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 8/8] drm/i915: Pipeline PDP updates for Braswell
  2018-12-03 11:37 ` [PATCH 8/8] drm/i915: Pipeline PDP updates for Braswell Chris Wilson
@ 2018-12-04 11:53   ` Tvrtko Ursulin
  2018-12-04 12:09     ` Chris Wilson
  0 siblings, 1 reply; 31+ messages in thread
From: Tvrtko Ursulin @ 2018-12-04 11:53 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 03/12/2018 11:37, Chris Wilson wrote:
> Currently we face a severe problem on Braswell that manifests as invalid
> ppGTT accesses. The code tries to maintain the PDP (page directory
> pointers) inside the context in two ways, direct write into the context
> and a pipelined LRI update. The direct write into the context is
> fundamentally racy as it is unserialised with any access (read or write)
> the GPU is doing. By asserting that Braswell is not used with vGPU
> (currently an unsupported platform) we can eliminate the dangerous
> direct write into the context image and solely use the pipelined update.
> 
> However, the LRI of the PDP fouls up the GPU, causing it to freeze and
> take out the machine with "forcewake ack timeouts". This seems possible
> to workaround by preventing the GPU from sleeping (via means of
> disabling the power-state management interface, i.e. forcing each ring
> to remain awake) around the update.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
> References: https://bugs.freedesktop.org/show_bug.cgi?id=108714
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_gem_gtt.c     |   2 -
>   drivers/gpu/drm/i915/i915_request.c     |   5 -
>   drivers/gpu/drm/i915/intel_lrc.c        | 137 +++++++++++-------------
>   drivers/gpu/drm/i915/intel_ringbuffer.c |   5 +-
>   4 files changed, 68 insertions(+), 81 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
> index add1fe7aeb93..62bde517d383 100644
> --- a/drivers/gpu/drm/i915/i915_gem_gtt.c
> +++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
> @@ -1423,8 +1423,6 @@ static int gen8_ppgtt_alloc_pdp(struct i915_address_space *vm,
>   			gen8_initialize_pd(vm, pd);
>   			gen8_ppgtt_set_pdpe(vm, pdp, pd, pdpe);
>   			GEM_BUG_ON(pdp->used_pdpes > i915_pdpes_per_pdp(vm));
> -
> -			mark_tlbs_dirty(i915_vm_to_ppgtt(vm));
>   		}
>   
>   		ret = gen8_ppgtt_alloc_pd(vm, pd, start, length);
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index ca95ab2f4cfa..8ab8e8e6a086 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -719,11 +719,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
>   	 */
>   	rq->head = rq->ring->emit;
>   
> -	/* Unconditionally invalidate GPU caches and TLBs. */
> -	ret = engine->emit_flush(rq, EMIT_INVALIDATE);

It seems this is still always called at least once on the common path, so 
why are you moving it to the backend? Just because it is more of a 
"backendy" type of operation? It makes sense, I guess.

> -	if (ret)
> -		goto err_unwind;
> -
>   	ret = engine->request_alloc(rq);
>   	if (ret)
>   		goto err_unwind;
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index de070dca4033..1ec3f80a4472 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -363,31 +363,12 @@ execlists_context_schedule_out(struct i915_request *rq, unsigned long status)
>   	trace_i915_request_out(rq);
>   }
>   
> -static void
> -execlists_update_context_pdps(struct i915_hw_ppgtt *ppgtt, u32 *reg_state)
> -{
> -	ASSIGN_CTX_PDP(ppgtt, reg_state, 3);
> -	ASSIGN_CTX_PDP(ppgtt, reg_state, 2);
> -	ASSIGN_CTX_PDP(ppgtt, reg_state, 1);
> -	ASSIGN_CTX_PDP(ppgtt, reg_state, 0);
> -}
> -
>   static u64 execlists_update_context(struct i915_request *rq)
>   {
> -	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
>   	struct intel_context *ce = rq->hw_context;
> -	u32 *reg_state = ce->lrc_reg_state;
> -
> -	reg_state[CTX_RING_TAIL+1] = intel_ring_set_tail(rq->ring, rq->tail);
>   
> -	/*
> -	 * True 32b PPGTT with dynamic page allocation: update PDP
> -	 * registers and point the unallocated PDPs to scratch page.
> -	 * PML4 is allocated during ppgtt init, so this is not needed
> -	 * in 48-bit mode.
> -	 */
> -	if (!i915_vm_is_48bit(&ppgtt->vm))
> -		execlists_update_context_pdps(ppgtt, reg_state);
> +	ce->lrc_reg_state[CTX_RING_TAIL + 1] =
> +		intel_ring_set_tail(rq->ring, rq->tail);
>   
>   	/*
>   	 * Make sure the context image is complete before we submit it to HW.
> @@ -1240,29 +1221,80 @@ execlists_context_pin(struct intel_engine_cs *engine,
>   	return __execlists_context_pin(engine, ctx, ce);
>   }
>   
> +static int emit_pdps(struct i915_request *rq)
> +{
> +	const struct intel_engine_cs * const engine = rq->engine;
> +	struct i915_hw_ppgtt * const ppgtt = rq->gem_context->ppgtt;
> +	int err, i;
> +	u32 *cs;
> +
> +	err = engine->emit_flush(rq, EMIT_INVALIDATE);
> +	if (err)
> +		return err;
> +
> +	cs = intel_ring_begin(rq, 4 * GEN8_3LVL_PDPES + 2);
> +	if (IS_ERR(cs))
> +		return PTR_ERR(cs);
> +
> +	/*
> +	 * Force the GPU (not just the local engine/powerwell!) to remain awake,
> +	 * or else we may kill the machine with "timed out waiting for
> +	 * forcewake ack request".
> +	 */
> +
> +	*cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | MI_LRI_FORCE_POSTED;
> +	for (i = GEN8_3LVL_PDPES; i--; ) {
> +		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
> +
> +		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
> +		*cs++ = upper_32_bits(pd_daddr);
> +		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
> +		*cs++ = lower_32_bits(pd_daddr);
> +	}
> +	*cs++ = MI_NOOP;
> +
> +	intel_ring_advance(rq, cs);
> +
> +	err = engine->emit_flush(rq, EMIT_INVALIDATE);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> +
>   static int execlists_request_alloc(struct i915_request *request)
>   {
>   	int ret;
>   
>   	GEM_BUG_ON(!request->hw_context->pin_count);
>   
> -	/* Flush enough space to reduce the likelihood of waiting after
> +	/*
> +	 * Flush enough space to reduce the likelihood of waiting after
>   	 * we start building the request - in which case we will just
>   	 * have to repeat work.
>   	 */
>   	request->reserved_space += EXECLISTS_REQUEST_SIZE;
>   
> -	ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
> -	if (ret)
> -		return ret;

Removing this in favour of what intel_ring_begin will do? But is it the 
same? Could be... just not sure, due to all the arithmetic that's 
happening in these areas.

> -
> -	/* Note that after this point, we have committed to using
> +	/*
> +	 * Note that after this point, we have committed to using
>   	 * this request as it is being used to both track the
>   	 * state of engine initialisation and liveness of the
>   	 * golden renderstate above. Think twice before you try
>   	 * to cancel/unwind this request now.
>   	 */
>   
> +	/* Unconditionally invalidate GPU caches and TLBs. */
> +	if (i915_vm_is_48bit(&request->gem_context->ppgtt->vm)) {
> +		ret = request->engine->emit_flush(request, EMIT_INVALIDATE);

Excuse my ignorance, but what is the PDP update mechanism in 48-bit mode?

> +		if (ret)
> +			return ret;
> +	} else {
> +		GEM_BUG_ON(intel_vgpu_active(request->i915));
> +		ret = emit_pdps(request);
> +		if (ret)
> +			return ret;
> +	}
> +

There is always a starting EMIT_INVALIDATE on both of these branches, so 
you could pull it out before the if block.

>   	request->reserved_space -= EXECLISTS_REQUEST_SIZE;
>   	return 0;
>   }
> @@ -1832,56 +1864,11 @@ static void execlists_reset_finish(struct intel_engine_cs *engine)
>   		  atomic_read(&execlists->tasklet.count));
>   }
>   
> -static int intel_logical_ring_emit_pdps(struct i915_request *rq)
> -{
> -	struct i915_hw_ppgtt *ppgtt = rq->gem_context->ppgtt;
> -	struct intel_engine_cs *engine = rq->engine;
> -	const int num_lri_cmds = GEN8_3LVL_PDPES * 2;
> -	u32 *cs;
> -	int i;
> -
> -	cs = intel_ring_begin(rq, num_lri_cmds * 2 + 2);
> -	if (IS_ERR(cs))
> -		return PTR_ERR(cs);
> -
> -	*cs++ = MI_LOAD_REGISTER_IMM(num_lri_cmds);
> -	for (i = GEN8_3LVL_PDPES - 1; i >= 0; i--) {
> -		const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
> -
> -		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
> -		*cs++ = upper_32_bits(pd_daddr);
> -		*cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
> -		*cs++ = lower_32_bits(pd_daddr);
> -	}
> -
> -	*cs++ = MI_NOOP;
> -	intel_ring_advance(rq, cs);
> -
> -	return 0;
> -}
> -
>   static int gen8_emit_bb_start(struct i915_request *rq,
>   			      u64 offset, u32 len,
>   			      const unsigned int flags)
>   {
>   	u32 *cs;
> -	int ret;
> -
> -	/* Don't rely in hw updating PDPs, specially in lite-restore.
> -	 * Ideally, we should set Force PD Restore in ctx descriptor,
> -	 * but we can't. Force Restore would be a second option, but
> -	 * it is unsafe in case of lite-restore (because the ctx is
> -	 * not idle). PML4 is allocated during ppgtt init so this is
> -	 * not needed in 48-bit.*/
> -	if ((intel_engine_flag(rq->engine) & rq->gem_context->ppgtt->pd_dirty_rings) &&
> -	    !i915_vm_is_48bit(&rq->gem_context->ppgtt->vm) &&
> -	    !intel_vgpu_active(rq->i915)) {
> -		ret = intel_logical_ring_emit_pdps(rq);

What is the reason to move from emit_bb_start to request_alloc?

> -		if (ret)
> -			return ret;
> -
> -		rq->gem_context->ppgtt->pd_dirty_rings &= ~intel_engine_flag(rq->engine);
> -	}
>   
>   	cs = intel_ring_begin(rq, 6);
>   	if (IS_ERR(cs))
> @@ -1914,6 +1901,7 @@ static int gen8_emit_bb_start(struct i915_request *rq,
>   
>   	*cs++ = MI_ARB_ON_OFF | MI_ARB_DISABLE;
>   	*cs++ = MI_NOOP;
> +
>   	intel_ring_advance(rq, cs);
>   
>   	return 0;
> @@ -2544,6 +2532,11 @@ static void execlists_init_reg_state(u32 *regs,
>   		 * other PDP Descriptors are ignored.
>   		 */
>   		ASSIGN_CTX_PML4(ctx->ppgtt, regs);
> +	} else {
> +		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 3);
> +		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 2);
> +		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 1);
> +		ASSIGN_CTX_PDP(ctx->ppgtt, regs, 0);
>   	}
>   
>   	if (rcs) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 37bd05cef0e9..4591f568547c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -1833,11 +1833,12 @@ static int ring_request_alloc(struct i915_request *request)
>   	 */
>   	request->reserved_space += LEGACY_REQUEST_SIZE;
>   
> -	ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
> +	ret = switch_context(request);
>   	if (ret)
>   		return ret;
>   
> -	ret = switch_context(request);
> +	/* Unconditionally invalidate GPU caches and TLBs. */
> +	ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
>   	if (ret)
>   		return ret;
>   
> 

Regards,

Tvrtko

* Re: [PATCH 8/8] drm/i915: Pipeline PDP updates for Braswell
  2018-12-04 11:53   ` Tvrtko Ursulin
@ 2018-12-04 12:09     ` Chris Wilson
  0 siblings, 0 replies; 31+ messages in thread
From: Chris Wilson @ 2018-12-04 12:09 UTC (permalink / raw)
  To: Tvrtko Ursulin, intel-gfx

Quoting Tvrtko Ursulin (2018-12-04 11:53:22)
> 
> On 03/12/2018 11:37, Chris Wilson wrote:
> > Currently we face a severe problem on Braswell that manifests as invalid
> > ppGTT accesses. The code tries to maintain the PDP (page directory
> > pointers) inside the context in two ways, direct write into the context
> > and a pipelined LRI update. The direct write into the context is
> > fundamentally racy as it is unserialised with any access (read or write)
> > the GPU is doing. By asserting that Braswell is not used with vGPU
> > (currently an unsupported platform) we can eliminate the dangerous
> > direct write into the context image and solely use the pipelined update.
> > 
> > However, the LRI of the PDP fouls up the GPU, causing it to freeze and
> > take out the machine with "forcewake ack timeouts". This seems possible
> > to workaround by preventing the GPU from sleeping (via means of
> > disabling the power-state management interface, i.e. forcing each ring
> > to remain awake) around the update.
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108656
> > References: https://bugs.freedesktop.org/show_bug.cgi?id=108714
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> > ---
> > diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> > index ca95ab2f4cfa..8ab8e8e6a086 100644
> > --- a/drivers/gpu/drm/i915/i915_request.c
> > +++ b/drivers/gpu/drm/i915/i915_request.c
> > @@ -719,11 +719,6 @@ i915_request_alloc(struct intel_engine_cs *engine, struct i915_gem_context *ctx)
> >        */
> >       rq->head = rq->ring->emit;
> >   
> > -     /* Unconditionally invalidate GPU caches and TLBs. */
> > -     ret = engine->emit_flush(rq, EMIT_INVALIDATE);
> 
> It seems this is still always called at least once on the common path, so 
> why are you moving it to the backend? Just because it is more of a 
> "backendy" type of operation? It makes sense, I guess.

Yup. Having gone through many iterations, not all quite as symmetrical
as the current incarnation, specialisation of flushes to the backend
made sense. We pulled it into the core because everyone had to do their
invalidate first, but that might not actually be so true (at least, we
probably should do the invalidate after the context load in ringbuffer,
but one bug at a time).

> > +static int emit_pdps(struct i915_request *rq)
> > +{
> > +     const struct intel_engine_cs * const engine = rq->engine;
> > +     struct i915_hw_ppgtt * const ppgtt = rq->gem_context->ppgtt;
> > +     int err, i;
> > +     u32 *cs;
> > +
> > +     err = engine->emit_flush(rq, EMIT_INVALIDATE);
> > +     if (err)
> > +             return err;
> > +
> > +     cs = intel_ring_begin(rq, 4 * GEN8_3LVL_PDPES + 2);
> > +     if (IS_ERR(cs))
> > +             return PTR_ERR(cs);
> > +
> > +     /*
> > +      * Force the GPU (not just the local engine/powerwell!) to remain awake,
> > +      * or else we may kill the machine with "timed out waiting for
> > +      * forcewake ack request".
> > +      */
> > +
> > +     *cs++ = MI_LOAD_REGISTER_IMM(2 * GEN8_3LVL_PDPES) | MI_LRI_FORCE_POSTED;
> > +     for (i = GEN8_3LVL_PDPES; i--; ) {
> > +             const dma_addr_t pd_daddr = i915_page_dir_dma_addr(ppgtt, i);
> > +
> > +             *cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_UDW(engine, i));
> > +             *cs++ = upper_32_bits(pd_daddr);
> > +             *cs++ = i915_mmio_reg_offset(GEN8_RING_PDP_LDW(engine, i));
> > +             *cs++ = lower_32_bits(pd_daddr);
> > +     }
> > +     *cs++ = MI_NOOP;
> > +
> > +     intel_ring_advance(rq, cs);
> > +
> > +     err = engine->emit_flush(rq, EMIT_INVALIDATE);
> > +     if (err)
> > +             return err;
> > +
> > +     return 0;
> > +}
> > +
> >   static int execlists_request_alloc(struct i915_request *request)
> >   {
> >       int ret;
> >   
> >       GEM_BUG_ON(!request->hw_context->pin_count);
> >   
> > -     /* Flush enough space to reduce the likelihood of waiting after
> > +     /*
> > +      * Flush enough space to reduce the likelihood of waiting after
> >        * we start building the request - in which case we will just
> >        * have to repeat work.
> >        */
> >       request->reserved_space += EXECLISTS_REQUEST_SIZE;
> >   
> > -     ret = intel_ring_wait_for_space(request->ring, request->reserved_space);
> > -     if (ret)
> > -             return ret;
> 
> Removing this in favour of what intel_ring_begin will do? But is it the 
> same? Could be... just not sure, due to all the arithmetic that's 
> happening in these areas.

It's the same effect. intel_ring_wait_for_space() was just a shorthand
to avoid the secondary effect of preparing the ring for the request.

> > -
> > -     /* Note that after this point, we have committed to using
> > +     /*
> > +      * Note that after this point, we have committed to using
> >        * this request as it is being used to both track the
> >        * state of engine initialisation and liveness of the
> >        * golden renderstate above. Think twice before you try
> >        * to cancel/unwind this request now.
> >        */
> >   
> > +     /* Unconditionally invalidate GPU caches and TLBs. */
> > +     if (i915_vm_is_48bit(&request->gem_context->ppgtt->vm)) {
> > +             ret = request->engine->emit_flush(request, EMIT_INVALIDATE);
> 
> Excuse my ignorance, but what is the PDP update mechanism in 48-bit mode?

The top-most level is fixed (i.e. the address register inside the context
image for the pml4 page), and the TLBs are flushed by EMIT_INVALIDATE. Each
batch then traverses the new page directory tree afresh.

> > +             if (ret)
> > +                     return ret;
> > +     } else {
> > +             GEM_BUG_ON(intel_vgpu_active(request->i915));
> > +             ret = emit_pdps(request);
> > +             if (ret)
> > +                     return ret;
> > +     }
> > +
> 
> There is always a starting EMIT_INVALIDATE on both of these branches, so 
> you could pull it out before the if block.

Still playing. Though at the moment, it's 8 EMIT_FLUSHes after the LRI. My
point is that emit_pdps() is decidedly a magic sequence: dropping that
first EMIT_INVALIDATE (in favour of just one after) causes "media forcewake
errors". I should probably add a much bigger "here be unknown dragons"
warning.

> >   static int gen8_emit_bb_start(struct i915_request *rq,
> >                             u64 offset, u32 len,
> >                             const unsigned int flags)
> >   {
> >       u32 *cs;
> > -     int ret;
> > -
> > -     /* Don't rely in hw updating PDPs, specially in lite-restore.
> > -      * Ideally, we should set Force PD Restore in ctx descriptor,
> > -      * but we can't. Force Restore would be a second option, but
> > -      * it is unsafe in case of lite-restore (because the ctx is
> > -      * not idle). PML4 is allocated during ppgtt init so this is
> > -      * not needed in 48-bit.*/
> > -     if ((intel_engine_flag(rq->engine) & rq->gem_context->ppgtt->pd_dirty_rings) &&
> > -         !i915_vm_is_48bit(&rq->gem_context->ppgtt->vm) &&
> > -         !intel_vgpu_active(rq->i915)) {
> > -             ret = intel_logical_ring_emit_pdps(rq);
> 
> What is the reason to move from emit_bb_start to request_alloc?

Long-standing pet peeve. Imho, this is not about the BB_START but is an
integral part of TLB invalidation, so it is a misuse of
emit_bb_start.
-Chris

end of thread, other threads:[~2018-12-04 12:09 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-03 11:36 [PATCH 1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Chris Wilson
2018-12-03 11:36 ` [PATCH 2/8] drm/i915: Complete the fences as they are cancelled due to wedging Chris Wilson
2018-12-03 17:11   ` Tvrtko Ursulin
2018-12-03 17:36     ` Chris Wilson
2018-12-04 10:30       ` Tvrtko Ursulin
2018-12-03 11:36 ` [PATCH 3/8] drm/i915/ringbuffer: Clear semaphore sync registers on ring init Chris Wilson
2018-12-03 12:05   ` Mika Kuoppala
2018-12-03 12:15     ` Chris Wilson
2018-12-03 11:36 ` [PATCH 4/8] drm/i915: Allocate a common scratch page Chris Wilson
2018-12-03 15:28   ` Mika Kuoppala
2018-12-03 17:10     ` Chris Wilson
2018-12-03 17:29   ` [PATCH v2] " Chris Wilson
2018-12-03 11:36 ` [PATCH 5/8] drm/i915: Flush GPU relocs harder for gen3 Chris Wilson
2018-12-03 11:36 ` [PATCH 6/8] drm/i915/selftests: Terminate hangcheck sanitycheck forcibly Chris Wilson
2018-12-03 12:09   ` Mika Kuoppala
2018-12-03 12:17     ` Chris Wilson
2018-12-03 12:21       ` Mika Kuoppala
2018-12-03 11:37 ` [PATCH 7/8] drm/i915/selftests: Reorder request allocation vs vma pinning Chris Wilson
2018-12-04 11:06   ` Tvrtko Ursulin
2018-12-03 11:37 ` [PATCH 8/8] drm/i915: Pipeline PDP updates for Braswell Chris Wilson
2018-12-04 11:53   ` Tvrtko Ursulin
2018-12-04 12:09     ` Chris Wilson
2018-12-03 12:00 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate Patchwork
2018-12-03 12:03 ` ✗ Fi.CI.SPARSE: " Patchwork
2018-12-03 12:17 ` ✓ Fi.CI.BAT: success " Patchwork
2018-12-03 13:49 ` [PATCH 1/8] " Tvrtko Ursulin
2018-12-03 15:01 ` ✓ Fi.CI.IGT: success for series starting with [1/8] " Patchwork
2018-12-03 17:42 ` ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/8] drm/i915/breadcrumbs: Reduce missed-breadcrumb false positive rate (rev2) Patchwork
2018-12-03 17:44 ` ✗ Fi.CI.SPARSE: " Patchwork
2018-12-04  8:40 ` ✓ Fi.CI.BAT: success " Patchwork
2018-12-04 11:22 ` ✓ Fi.CI.IGT: " Patchwork
