All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance()
@ 2017-02-06 13:25 Chris Wilson
  2017-02-06 13:25 ` [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer Chris Wilson
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Chris Wilson @ 2017-02-06 13:25 UTC (permalink / raw)
  To: intel-gfx

It is required that the caller declare the exact number of dwords they
wish to write into the ring. This is required for two reasons, we need
to allocate sufficient space for the entire command packet and we need
to be sure that the contents are completely written to avoid executing
stale data. The current interface requires for any bug to be caught in
review, the reader has to carefully count the number of
intel_ring_emit() between intel_ring_begin() and intel_ring_advance().
If we record the end of the packet of each intel_ring_begin() we can
also have CI check for us.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem.h         | 9 +++++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h | 3 +++
 3 files changed, 13 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index a585d47c420a..412fa2794cbb 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -28,9 +28,18 @@
 #ifdef CONFIG_DRM_I915_DEBUG_GEM
 #define GEM_BUG_ON(expr) BUG_ON(expr)
 #define GEM_WARN_ON(expr) WARN_ON(expr)
+
+#define GEM_BUG_ONLY(expr) expr
+#define GEM_BUG_ONLY_DECLARE(var) var
+#define GEM_BUG_ONLY_ON(expr) GEM_BUG_ON(expr)
+
 #else
 #define GEM_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
 #define GEM_WARN_ON(expr) (BUILD_BUG_ON_INVALID(expr), 0)
+
+#define GEM_BUG_ONLY(expr) do { } while (0)
+#define GEM_BUG_ONLY_DECLARE(var)
+#define GEM_BUG_ONLY_ON(expr)
 #endif
 
 #define I915_NUM_ENGINES 5
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index d32cbba25d98..383083ef2210 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2238,6 +2238,7 @@ int intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
 
 	ring->space -= bytes;
 	GEM_BUG_ON(ring->space < 0);
+	GEM_BUG_ONLY(ring->advance = ring->tail + bytes);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index b9c15cd40fbf..2c6d3655985e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -144,6 +144,8 @@ struct intel_ring {
 
 	u32 head;
 	u32 tail;
+	GEM_BUG_ONLY_DECLARE(u32 advance);
+
 	int space;
 	int size;
 	int effective_size;
@@ -516,6 +518,7 @@ static inline void intel_ring_advance(struct intel_ring *ring)
 	 * reserved for the command packet (i.e. the value passed to
 	 * intel_ring_begin()).
 	 */
+	GEM_BUG_ONLY_ON(ring->tail != ring->advance);
 }
 
 static inline u32 intel_ring_offset(struct intel_ring *ring, void *addr)
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer
  2017-02-06 13:25 [PATCH 1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance() Chris Wilson
@ 2017-02-06 13:25 ` Chris Wilson
  2017-02-06 13:57   ` Mika Kuoppala
  2017-02-06 14:01   ` Mika Kuoppala
  2017-02-06 13:53 ` [PATCH 1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance() Mika Kuoppala
  2017-02-06 18:54 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] " Patchwork
  2 siblings, 2 replies; 9+ messages in thread
From: Chris Wilson @ 2017-02-06 13:25 UTC (permalink / raw)
  To: intel-gfx; +Cc: Mika Kuoppala

In commit 86aa7e760a67 ("drm/i915: Assert that the context-switch
completion matches our context") I added a read to the irq tasklet
handler that compared the on-chip status with that of our sw tracking,
using an unguarded read of the request pointer to get the context and
beyond. Whilst we hold a reference to the request, we do not hold
anything on the context and if we are unlucky it may be reaped from a
second thread retiring the request (since it may retire the request as
soon as the breadcrumb is complete, even before we finish processing the
context switch) as we try to read from the context pointer.

Avoid the racy read from underneath the request by storing the expected
result in the execlist_port[].

Fixes: 86aa7e760a67 ("drm/i915: Assert that the context-switch completion matches our context")
Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
---
 drivers/gpu/drm/i915/intel_lrc.c        | 7 ++++---
 drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 754f77c394fb..ba39c2952438 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -350,6 +350,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		execlists_context_status_change(port[0].request,
 						INTEL_CONTEXT_SCHEDULE_IN);
 	desc[0] = execlists_update_context(port[0].request);
+	GEM_BUG_ONLY(port[0].context_id = upper_32_bits(desc[0]));
 	port[0].count++;
 
 	if (port[1].request) {
@@ -357,6 +358,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
 		execlists_context_status_change(port[1].request,
 						INTEL_CONTEXT_SCHEDULE_IN);
 		desc[1] = execlists_update_context(port[1].request);
+		GEM_BUG_ONLY(port[1].context_id = upper_32_bits(desc[1]));
 		port[1].count = 1;
 	} else {
 		desc[1] = 0;
@@ -563,9 +565,8 @@ static void intel_lrc_irq_handler(unsigned long data)
 				continue;
 
 			/* Check the context/desc id for this event matches */
-			GEM_BUG_ON(readl(buf + 2 * idx + 1) !=
-				   upper_32_bits(intel_lr_context_descriptor(port[0].request->ctx,
-									     engine)));
+			GEM_BUG_ONLY_ON(readl(buf + 2 * idx + 1) !=
+					port[0].context_id);
 
 			GEM_BUG_ON(port[0].count == 0);
 			if (--port[0].count == 0) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 2c6d3655985e..896838ca502c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -381,6 +381,7 @@ struct intel_engine_cs {
 	struct execlist_port {
 		struct drm_i915_gem_request *request;
 		unsigned int count;
+		GEM_BUG_ONLY_DECLARE(u32 context_id);
 	} execlist_port[2];
 	struct rb_root execlist_queue;
 	struct rb_node *execlist_first;
-- 
2.11.0

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance()
  2017-02-06 13:25 [PATCH 1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance() Chris Wilson
  2017-02-06 13:25 ` [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer Chris Wilson
@ 2017-02-06 13:53 ` Mika Kuoppala
  2017-02-06 18:54 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] " Patchwork
  2 siblings, 0 replies; 9+ messages in thread
From: Mika Kuoppala @ 2017-02-06 13:53 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> It is required that the caller declare the exact number of dwords they
> wish to write into the ring. This is required for two reasons, we need
> to allocate sufficient space for the entire command packet and we need
> to be sure that the contents are completely written to avoid executing
> stale data. The current interface requires for any bug to be caught in
> review, the reader has to carefully count the number of
> intel_ring_emit() between intel_ring_begin() and intel_ring_advance().
> If we record the end of the packet of each intel_ring_begin() we can
> also have CI check for us.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

> ---
>  drivers/gpu/drm/i915/i915_gem.h         | 9 +++++++++
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 1 +
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 3 +++
>  3 files changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
> index a585d47c420a..412fa2794cbb 100644
> --- a/drivers/gpu/drm/i915/i915_gem.h
> +++ b/drivers/gpu/drm/i915/i915_gem.h
> @@ -28,9 +28,18 @@
>  #ifdef CONFIG_DRM_I915_DEBUG_GEM
>  #define GEM_BUG_ON(expr) BUG_ON(expr)
>  #define GEM_WARN_ON(expr) WARN_ON(expr)
> +
> +#define GEM_BUG_ONLY(expr) expr
> +#define GEM_BUG_ONLY_DECLARE(var) var
> +#define GEM_BUG_ONLY_ON(expr) GEM_BUG_ON(expr)
> +
>  #else
>  #define GEM_BUG_ON(expr) BUILD_BUG_ON_INVALID(expr)
>  #define GEM_WARN_ON(expr) (BUILD_BUG_ON_INVALID(expr), 0)
> +
> +#define GEM_BUG_ONLY(expr) do { } while (0)
> +#define GEM_BUG_ONLY_DECLARE(var)
> +#define GEM_BUG_ONLY_ON(expr)
>  #endif
>  
>  #define I915_NUM_ENGINES 5
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index d32cbba25d98..383083ef2210 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2238,6 +2238,7 @@ int intel_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
>  
>  	ring->space -= bytes;
>  	GEM_BUG_ON(ring->space < 0);
> +	GEM_BUG_ONLY(ring->advance = ring->tail + bytes);
>  	return 0;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index b9c15cd40fbf..2c6d3655985e 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -144,6 +144,8 @@ struct intel_ring {
>  
>  	u32 head;
>  	u32 tail;
> +	GEM_BUG_ONLY_DECLARE(u32 advance);
> +
>  	int space;
>  	int size;
>  	int effective_size;
> @@ -516,6 +518,7 @@ static inline void intel_ring_advance(struct intel_ring *ring)
>  	 * reserved for the command packet (i.e. the value passed to
>  	 * intel_ring_begin()).
>  	 */
> +	GEM_BUG_ONLY_ON(ring->tail != ring->advance);
>  }
>  
>  static inline u32 intel_ring_offset(struct intel_ring *ring, void *addr)
> -- 
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer
  2017-02-06 13:25 ` [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer Chris Wilson
@ 2017-02-06 13:57   ` Mika Kuoppala
  2017-02-06 14:06     ` Chris Wilson
  2017-02-06 14:16     ` Chris Wilson
  2017-02-06 14:01   ` Mika Kuoppala
  1 sibling, 2 replies; 9+ messages in thread
From: Mika Kuoppala @ 2017-02-06 13:57 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> In commit 86aa7e760a67 ("drm/i915: Assert that the context-switch
> completion matches our context") I added a read to the irq tasklet
> handler that compared the on-chip status with that of our sw tracking,
> using an unguarded read of the request pointer to get the context and
> beyond. Whilst we hold a reference to the request, we do not hold
> anything on the context and if we are unlucky it may be reaped from a
> second thread retiring the request (since it may retire the request as
> soon as the breadcrumb is complete, even before we finish processing the
> context switch) as we try to read from the context pointer.
>

Please add warning of the possibility of context vanishing beneath
our feet. Perhaps a good spot is when we store a bug on variable
context_id.

Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>

> Avoid the racy read from underneath the request by storing the expected
> result in the execlist_port[].
>
> Fixes: 86aa7e760a67 ("drm/i915: Assert that the context-switch completion matches our context")
> Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c        | 7 ++++---
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
>  2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 754f77c394fb..ba39c2952438 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -350,6 +350,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>  		execlists_context_status_change(port[0].request,
>  						INTEL_CONTEXT_SCHEDULE_IN);
>  	desc[0] = execlists_update_context(port[0].request);
> +	GEM_BUG_ONLY(port[0].context_id = upper_32_bits(desc[0]));
>  	port[0].count++;
>  
>  	if (port[1].request) {
> @@ -357,6 +358,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>  		execlists_context_status_change(port[1].request,
>  						INTEL_CONTEXT_SCHEDULE_IN);
>  		desc[1] = execlists_update_context(port[1].request);
> +		GEM_BUG_ONLY(port[1].context_id = upper_32_bits(desc[1]));
>  		port[1].count = 1;
>  	} else {
>  		desc[1] = 0;
> @@ -563,9 +565,8 @@ static void intel_lrc_irq_handler(unsigned long data)
>  				continue;
>  
>  			/* Check the context/desc id for this event matches */
> -			GEM_BUG_ON(readl(buf + 2 * idx + 1) !=
> -				   upper_32_bits(intel_lr_context_descriptor(port[0].request->ctx,
> -									     engine)));
> +			GEM_BUG_ONLY_ON(readl(buf + 2 * idx + 1) !=
> +					port[0].context_id);
>  
>  			GEM_BUG_ON(port[0].count == 0);
>  			if (--port[0].count == 0) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 2c6d3655985e..896838ca502c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -381,6 +381,7 @@ struct intel_engine_cs {
>  	struct execlist_port {
>  		struct drm_i915_gem_request *request;
>  		unsigned int count;
> +		GEM_BUG_ONLY_DECLARE(u32 context_id);
>  	} execlist_port[2];
>  	struct rb_root execlist_queue;
>  	struct rb_node *execlist_first;
> -- 
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer
  2017-02-06 13:25 ` [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer Chris Wilson
  2017-02-06 13:57   ` Mika Kuoppala
@ 2017-02-06 14:01   ` Mika Kuoppala
  1 sibling, 0 replies; 9+ messages in thread
From: Mika Kuoppala @ 2017-02-06 14:01 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> In commit 86aa7e760a67 ("drm/i915: Assert that the context-switch
> completion matches our context") I added a read to the irq tasklet
> handler that compared the on-chip status with that of our sw tracking,
> using an unguarded read of the request pointer to get the context and
> beyond. Whilst we hold a reference to the request, we do not hold
> anything on the context and if we are unlucky it may be reaped from a
> second thread retiring the request (since it may retire the request as
> soon as the breadcrumb is complete, even before we finish processing the
> context switch) as we try to read from the context pointer.
>
> Avoid the racy read from underneath the request by storing the expected
> result in the execlist_port[].
>
> Fixes: 86aa7e760a67 ("drm/i915: Assert that the context-switch
> completion matches our context")

+ Testcase: igt/gem_ctx_create

> Reported-by: Mika Kuoppala <mika.kuoppala@intel.com>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> ---
>  drivers/gpu/drm/i915/intel_lrc.c        | 7 ++++---
>  drivers/gpu/drm/i915/intel_ringbuffer.h | 1 +
>  2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 754f77c394fb..ba39c2952438 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -350,6 +350,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>  		execlists_context_status_change(port[0].request,
>  						INTEL_CONTEXT_SCHEDULE_IN);
>  	desc[0] = execlists_update_context(port[0].request);
> +	GEM_BUG_ONLY(port[0].context_id = upper_32_bits(desc[0]));
>  	port[0].count++;
>  
>  	if (port[1].request) {
> @@ -357,6 +358,7 @@ static void execlists_submit_ports(struct intel_engine_cs *engine)
>  		execlists_context_status_change(port[1].request,
>  						INTEL_CONTEXT_SCHEDULE_IN);
>  		desc[1] = execlists_update_context(port[1].request);
> +		GEM_BUG_ONLY(port[1].context_id = upper_32_bits(desc[1]));
>  		port[1].count = 1;
>  	} else {
>  		desc[1] = 0;
> @@ -563,9 +565,8 @@ static void intel_lrc_irq_handler(unsigned long data)
>  				continue;
>  
>  			/* Check the context/desc id for this event matches */
> -			GEM_BUG_ON(readl(buf + 2 * idx + 1) !=
> -				   upper_32_bits(intel_lr_context_descriptor(port[0].request->ctx,
> -									     engine)));
> +			GEM_BUG_ONLY_ON(readl(buf + 2 * idx + 1) !=
> +					port[0].context_id);
>  
>  			GEM_BUG_ON(port[0].count == 0);
>  			if (--port[0].count == 0) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 2c6d3655985e..896838ca502c 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -381,6 +381,7 @@ struct intel_engine_cs {
>  	struct execlist_port {
>  		struct drm_i915_gem_request *request;
>  		unsigned int count;
> +		GEM_BUG_ONLY_DECLARE(u32 context_id);
>  	} execlist_port[2];
>  	struct rb_root execlist_queue;
>  	struct rb_node *execlist_first;
> -- 
> 2.11.0
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer
  2017-02-06 13:57   ` Mika Kuoppala
@ 2017-02-06 14:06     ` Chris Wilson
  2017-02-06 14:16     ` Chris Wilson
  1 sibling, 0 replies; 9+ messages in thread
From: Chris Wilson @ 2017-02-06 14:06 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Mon, Feb 06, 2017 at 03:57:47PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > In commit 86aa7e760a67 ("drm/i915: Assert that the context-switch
> > completion matches our context") I added a read to the irq tasklet
> > handler that compared the on-chip status with that of our sw tracking,
> > using an unguarded read of the request pointer to get the context and
> > beyond. Whilst we hold a reference to the request, we do not hold
> > anything on the context and if we are unlucky it may be reaped from a
> > second thread retiring the request (since it may retire the request as
> > soon as the breadcrumb is complete, even before we finish processing the
> > context switch) as we try to read from the context pointer.
> >
> 
> Please add warning of the possibility of context vanishing beneath
> our feet. Perhaps a good spot is when we store a bug on variable
> context_id.

I prefer it in ctx-switch, since that is where we are in a softirq
section trying to operate under the minimal of locking.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer
  2017-02-06 13:57   ` Mika Kuoppala
  2017-02-06 14:06     ` Chris Wilson
@ 2017-02-06 14:16     ` Chris Wilson
  2017-02-06 15:04       ` Mika Kuoppala
  1 sibling, 1 reply; 9+ messages in thread
From: Chris Wilson @ 2017-02-06 14:16 UTC (permalink / raw)
  To: Mika Kuoppala; +Cc: intel-gfx

On Mon, Feb 06, 2017 at 03:57:47PM +0200, Mika Kuoppala wrote:
> Chris Wilson <chris@chris-wilson.co.uk> writes:
> 
> > In commit 86aa7e760a67 ("drm/i915: Assert that the context-switch
> > completion matches our context") I added a read to the irq tasklet
> > handler that compared the on-chip status with that of our sw tracking,
> > using an unguarded read of the request pointer to get the context and
> > beyond. Whilst we hold a reference to the request, we do not hold
> > anything on the context and if we are unlucky it may be reaped from a
> > second thread retiring the request (since it may retire the request as
> > soon as the breadcrumb is complete, even before we finish processing the
> > context switch) as we try to read from the context pointer.
> >
> 
> Please add warning of the possibility of context vanishing beneath
> our feet. Perhaps a good spot is when we store a bug on variable
> context_id.

diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index c6c5050c79c0..937e2af2da64 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -564,6 +564,22 @@ static void intel_lrc_irq_handler(unsigned long data)
                        unsigned int idx = ++head % GEN8_CSB_ENTRIES;
                        unsigned int status = readl(buf + 2 * idx);
 
+                       /* We are flying near dragons again.
+                        *
+                        * We hold a reference to the request in execlist_port[]
+                        * but no more than that. We are operating in softirq
+                        * context and so cannot hold any mutex or sleep. That
+                        * prevents us stopping the requests we are processing
+                        * in port[] from being retired simultaneously (the
+                        * breadcrumb will be complete before we see the
+                        * context-switch). As we only hold the reference to
+                        * request, any pointer chasing underneath the request
+                        * is subject to a potential use-after-free. Thus we
+                        * store all of the bookkeeping within port[], if
+                        * required, and avoid using request itself. The
+                        * same applies to the atomic status notifier.
+                        */
+
                        if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
                                continue;

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer
  2017-02-06 14:16     ` Chris Wilson
@ 2017-02-06 15:04       ` Mika Kuoppala
  0 siblings, 0 replies; 9+ messages in thread
From: Mika Kuoppala @ 2017-02-06 15:04 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

Chris Wilson <chris@chris-wilson.co.uk> writes:

> On Mon, Feb 06, 2017 at 03:57:47PM +0200, Mika Kuoppala wrote:
>> Chris Wilson <chris@chris-wilson.co.uk> writes:
>> 
>> > In commit 86aa7e760a67 ("drm/i915: Assert that the context-switch
>> > completion matches our context") I added a read to the irq tasklet
>> > handler that compared the on-chip status with that of our sw tracking,
>> > using an unguarded read of the request pointer to get the context and
>> > beyond. Whilst we hold a reference to the request, we do not hold
>> > anything on the context and if we are unlucky it may be reaped from a
>> > second thread retiring the request (since it may retire the request as
>> > soon as the breadcrumb is complete, even before we finish processing the
>> > context switch) as we try to read from the context pointer.
>> >
>> 
>> Please add warning of the possibility of context vanishing beneath
>> our feet. Perhaps a good spot is when we store a bug on variable
>> context_id.
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index c6c5050c79c0..937e2af2da64 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -564,6 +564,22 @@ static void intel_lrc_irq_handler(unsigned long data)
>                         unsigned int idx = ++head % GEN8_CSB_ENTRIES;
>                         unsigned int status = readl(buf + 2 * idx);
>  
> +                       /* We are flying near dragons again.
> +                        *
> +                        * We hold a reference to the request in execlist_port[]
> +                        * but no more than that. We are operating in softirq
> +                        * context and so cannot hold any mutex or sleep. That
> +                        * prevents us stopping the requests we are processing
> +                        * in port[] from being retired simultaneously (the
> +                        * breadcrumb will be complete before we see the
> +                        * context-switch). As we only hold the reference to
> +                        * request, any pointer chasing underneath the request
> +                        * is subject to a potential use-after-free. Thus we
> +                        * store all of the bookkeeping within port[], if
> +                        * required, and avoid using request itself. The
> +                        * same applies to the atomic status notifier.
> +                        */
> +

Agreed that it is better in this spot and that any pointer chasing
will lead to trouble.

-Mika

>                         if (!(status & GEN8_CTX_STATUS_COMPLETED_MASK))
>                                 continue;
>
> -- 
> Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance()
  2017-02-06 13:25 [PATCH 1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance() Chris Wilson
  2017-02-06 13:25 ` [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer Chris Wilson
  2017-02-06 13:53 ` [PATCH 1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance() Mika Kuoppala
@ 2017-02-06 18:54 ` Patchwork
  2 siblings, 0 replies; 9+ messages in thread
From: Patchwork @ 2017-02-06 18:54 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance()
URL   : https://patchwork.freedesktop.org/series/19160/
State : failure

== Summary ==

Series 19160v1 Series without cover letter
https://patchwork.freedesktop.org/api/1.0/series/19160/revisions/1/mbox/

Test drv_hangman:
        Subgroup error-state-basic:
                pass       -> INCOMPLETE (fi-hsw-4770)
Test kms_force_connector_basic:
        Subgroup force-connector-state:
                skip       -> PASS       (fi-snb-2520m)

fi-bdw-5557u     total:252  pass:238  dwarn:0   dfail:0   fail:0   skip:14 
fi-bsw-n3050     total:252  pass:213  dwarn:0   dfail:0   fail:0   skip:39 
fi-bxt-j4205     total:252  pass:230  dwarn:0   dfail:0   fail:0   skip:22 
fi-bxt-t5700     total:83   pass:70   dwarn:0   dfail:0   fail:0   skip:12 
fi-byt-j1900     total:252  pass:225  dwarn:0   dfail:0   fail:0   skip:27 
fi-byt-n2820     total:252  pass:221  dwarn:0   dfail:0   fail:0   skip:31 
fi-hsw-4770      total:5    pass:4    dwarn:0   dfail:0   fail:0   skip:0  
fi-hsw-4770r     total:252  pass:233  dwarn:0   dfail:0   fail:0   skip:19 
fi-ilk-650       total:252  pass:199  dwarn:0   dfail:0   fail:0   skip:53 
fi-ivb-3520m     total:252  pass:231  dwarn:0   dfail:0   fail:0   skip:21 
fi-ivb-3770      total:252  pass:231  dwarn:0   dfail:0   fail:0   skip:21 
fi-kbl-7500u     total:252  pass:229  dwarn:0   dfail:0   fail:2   skip:21 
fi-skl-6260u     total:252  pass:239  dwarn:0   dfail:0   fail:0   skip:13 
fi-skl-6700hq    total:252  pass:232  dwarn:0   dfail:0   fail:0   skip:20 
fi-skl-6700k     total:252  pass:227  dwarn:4   dfail:0   fail:0   skip:21 
fi-skl-6770hq    total:252  pass:239  dwarn:0   dfail:0   fail:0   skip:13 
fi-snb-2520m     total:252  pass:221  dwarn:0   dfail:0   fail:0   skip:31 
fi-snb-2600      total:252  pass:220  dwarn:0   dfail:0   fail:0   skip:32 

6e475fdb541d8148ea949f62e7c2082e57bc6736 drm-tip: 2017y-02m-06d-16h-01m-53s UTC integration manifest
77ec998 drm/i915: Avoid unguarded reads from the request pointer
ac59067 drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance()

== Logs ==

For more details see: https://intel-gfx-ci.01.org/CI/Patchwork_3713/
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-02-06 18:54 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-06 13:25 [PATCH 1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance() Chris Wilson
2017-02-06 13:25 ` [PATCH 2/2] drm/i915: Avoid unguarded reads from the request pointer Chris Wilson
2017-02-06 13:57   ` Mika Kuoppala
2017-02-06 14:06     ` Chris Wilson
2017-02-06 14:16     ` Chris Wilson
2017-02-06 15:04       ` Mika Kuoppala
2017-02-06 14:01   ` Mika Kuoppala
2017-02-06 13:53 ` [PATCH 1/2] drm/i915: Mark the end of intel_ring_begin() and check in intel_ring_advance() Mika Kuoppala
2017-02-06 18:54 ` ✗ Fi.CI.BAT: failure for series starting with [1/2] " Patchwork

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.