* [CI-ping 01/15] drm/i915: Force clean compilation with -Werror
@ 2016-04-12 20:02 Chris Wilson
  2016-04-12 20:02 ` [CI-ping 02/15] drm/i915: Disentangle i915_drv.h includes Chris Wilson
                   ` (14 more replies)
  0 siblings, 15 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: Jani Nikula

Our driver compiles clean (nowadays thanks to 0day) but for me, at least,
it would be beneficial if the compiler threw an error rather than a
warning when it found a piece of suspect code. (I use this to
compile-check patch series and want to break on the first compiler error
in order to fix the patch.)
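
For reference, a minimal config fragment to enable this locally (a
sketch: DRM_I915_WERROR is the symbol added by the diff below; EXPERT
and COMPILE_TEST are the existing upstream symbols it depends on):

    CONFIG_EXPERT=y
    # CONFIG_COMPILE_TEST is not set
    CONFIG_DRM_I915_WERROR=y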

v2: Kick off a new "Debugging" submenu for i915.ko

At this point, we applied it to the kernel and promptly kicked it out
again as it broke buildbots (due to a compiler warning on 32-bit builds):

commit 908d759b210effb33d927a8cb6603a16448474e4
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date:   Tue May 26 07:46:21 2015 +0200

    Revert "drm/i915: Force clean compilation with -Werror"

v3: Avoid enabling -Werror for allyesconfig/allmodconfig builds, using
COMPILE_TEST as a suitable proxy suggested by Andrew Morton. (Damien)
Only make the option available for EXPERT to reinforce that the option
should not be casually enabled.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Damien Lespiau <damien.lespiau@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/Kconfig.debug | 17 +++++++++++++++++
 drivers/gpu/drm/i915/Makefile      |  2 ++
 2 files changed, 19 insertions(+)

diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 649a562ddf17..61485c8ba3a8 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -1,3 +1,20 @@
+config DRM_I915_WERROR
+        bool "Force GCC to throw an error instead of a warning when compiling"
+        # As this may inadvertently break the build, only allow the user
+        # to shoot oneself in the foot iff they aim really hard
+        depends on EXPERT
+        # We use the dependency on !COMPILE_TEST to not be enabled in
+        # allmodconfig or allyesconfig configurations
+        depends on !COMPILE_TEST
+        default n
+        help
+          Add -Werror to the build flags for (and only for) i915.ko.
+          Do not enable this unless you are writing code for the i915.ko module.
+
+          Recommended for driver developers only.
+
+          If in doubt, say "N".
+
 config DRM_I915_DEBUG
         bool "Enable additional driver debugging"
         depends on DRM_I915
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 7ffb51b0cbc2..0b88ba0f3c1f 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -2,6 +2,8 @@
 # Makefile for the drm device driver.  This driver provides support for the
 # Direct Rendering Infrastructure (DRI) in XFree86 4.1.0 and higher.
 
+subdir-ccflags-$(CONFIG_DRM_I915_WERROR) := -Werror
+
 # Please keep these build lists sorted!
 
 # core driver code
-- 
2.8.0.rc3


* [CI-ping 02/15] drm/i915: Disentangle i915_drv.h includes
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
@ 2016-04-12 20:02 ` Chris Wilson
  2016-04-12 20:02 ` [CI-ping 03/15] drm/i915: Add GEM debugging Kconfig option Chris Wilson
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:02 UTC (permalink / raw)
  To: intel-gfx

Separate out the layers of includes (linux, drm, intel, i915) so that it
is a little easier to order our definitions between our multiple
reentrant headers. A couple of headers needed fixes to make them more
standalone (forgotten includes, forward declarations, etc.).
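
As an illustration of the standalone fixes (a sketch, not lifted
verbatim from the diffs): a header that only uses pointers to a type
needs a forward declaration, not the full include:

    /* intel_guc.h only refers to requests by pointer, so a forward
     * declaration suffices and avoids a circular include: */
    struct drm_i915_gem_request;

    struct example_client { /* hypothetical stand-in for i915_guc_client */
            struct drm_i915_gem_request *last_request;
    };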

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h  | 29 +++++++++++++++++------------
 drivers/gpu/drm/i915/intel_guc.h |  2 ++
 drivers/gpu/drm/i915/intel_lrc.h |  2 ++
 3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f5c91b01194f..f058f7d68929 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -33,27 +33,32 @@
 #include <uapi/drm/i915_drm.h>
 #include <uapi/drm/drm_fourcc.h>
 
-#include <drm/drmP.h>
-#include "i915_params.h"
-#include "i915_reg.h"
-#include "intel_bios.h"
-#include "intel_ringbuffer.h"
-#include "intel_lrc.h"
-#include "i915_gem_gtt.h"
-#include "i915_gem_render_state.h"
 #include <linux/io-mapping.h>
 #include <linux/i2c.h>
 #include <linux/i2c-algo-bit.h>
-#include <drm/intel-gtt.h>
-#include <drm/drm_legacy.h> /* for struct drm_dma_handle */
-#include <drm/drm_gem.h>
 #include <linux/backlight.h>
 #include <linux/hashtable.h>
 #include <linux/intel-iommu.h>
 #include <linux/kref.h>
 #include <linux/pm_qos.h>
-#include "intel_guc.h"
+#include <linux/shmem_fs.h>
+
+#include <drm/drmP.h>
+#include <drm/intel-gtt.h>
+#include <drm/drm_legacy.h> /* for struct drm_dma_handle */
+#include <drm/drm_gem.h>
+
+#include "i915_params.h"
+#include "i915_reg.h"
+
+#include "intel_bios.h"
 #include "intel_dpll_mgr.h"
+#include "intel_guc.h"
+#include "intel_lrc.h"
+#include "intel_ringbuffer.h"
+
+#include "i915_gem_gtt.h"
+#include "i915_gem_render_state.h"
 
 /* General customization:
  */
diff --git a/drivers/gpu/drm/i915/intel_guc.h b/drivers/gpu/drm/i915/intel_guc.h
index 73002e901ff2..3bb85b127cb0 100644
--- a/drivers/gpu/drm/i915/intel_guc.h
+++ b/drivers/gpu/drm/i915/intel_guc.h
@@ -27,6 +27,8 @@
 #include "intel_guc_fwif.h"
 #include "i915_guc_reg.h"
 
+struct drm_i915_gem_request;
+
 struct i915_guc_client {
 	struct drm_i915_gem_object *client_obj;
 	struct intel_context *owner;
diff --git a/drivers/gpu/drm/i915/intel_lrc.h b/drivers/gpu/drm/i915/intel_lrc.h
index 8de1ea536ad4..f82ea3f59211 100644
--- a/drivers/gpu/drm/i915/intel_lrc.h
+++ b/drivers/gpu/drm/i915/intel_lrc.h
@@ -24,6 +24,8 @@
 #ifndef _INTEL_LRC_H_
 #define _INTEL_LRC_H_
 
+#include "intel_ringbuffer.h"
+
 #define GEN8_LR_CONTEXT_ALIGN 4096
 
 /* Execlists regs */
-- 
2.8.0.rc3


* [CI-ping 03/15] drm/i915: Add GEM debugging Kconfig option
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
  2016-04-12 20:02 ` [CI-ping 02/15] drm/i915: Disentangle i915_drv.h includes Chris Wilson
@ 2016-04-12 20:02 ` Chris Wilson
  2016-04-12 20:02 ` [CI-ping 04/15] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:02 UTC (permalink / raw)
  To: intel-gfx

Currently there is a #define (RQ_BUG_ON) to enable extra BUG_ON checks
for debugging requests and associated activities. I want to expand its
use to cover all of GEM internals (so that we can saturate the code with
asserts). We can add a Kconfig option to make it easier to enable - with
the usual caveats of not enabling unless explicitly requested.
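
A usage sketch of the macro added below (see the new i915_gem.h); note
that with the option disabled the expression is compiled away entirely,
so it must never carry side effects:

    /* Expands to BUG_ON() under CONFIG_DRM_I915_DEBUG_GEM=y, and to
     * nothing otherwise - the expression is not even evaluated. */
    GEM_BUG_ON(obj->last_write_req == NULL);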

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
---
 drivers/gpu/drm/i915/Kconfig.debug | 12 ++++++++++++
 drivers/gpu/drm/i915/i915_drv.h    |  1 +
 drivers/gpu/drm/i915/i915_gem.c    | 12 +++++-------
 drivers/gpu/drm/i915/i915_gem.h    | 34 ++++++++++++++++++++++++++++++++++
 4 files changed, 52 insertions(+), 7 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/i915_gem.h

diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 61485c8ba3a8..8f404103341d 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -27,3 +27,15 @@ config DRM_I915_DEBUG
 
           If in doubt, say "N".
 
+config DRM_I915_DEBUG_GEM
+        bool "Insert extra checks into the GEM internals"
+        default n
+        depends on DRM_I915_WERROR
+        help
+          Enable extra sanity checks (including BUGs) along the GEM driver
+          paths that may slow the system down and if hit hang the machine.
+
+          Recommended for driver developers only.
+
+          If in doubt, say "N".
+
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f058f7d68929..452f64acd7de 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -57,6 +57,7 @@
 #include "intel_lrc.h"
 #include "intel_ringbuffer.h"
 
+#include "i915_gem.h"
 #include "i915_gem_gtt.h"
 #include "i915_gem_render_state.h"
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b37ffea8b458..00bed3b79793 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -38,8 +38,6 @@
 #include <linux/pci.h>
 #include <linux/dma-buf.h>
 
-#define RQ_BUG_ON(expr)
-
 static void i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj);
 static void i915_gem_object_flush_cpu_write_domain(struct drm_i915_gem_object *obj);
 static void
@@ -1521,7 +1519,7 @@ i915_gem_object_wait_rendering(struct drm_i915_gem_object *obj,
 
 			i915_gem_object_retire__read(obj, i);
 		}
-		RQ_BUG_ON(obj->active);
+		GEM_BUG_ON(obj->active);
 	}
 
 	return 0;
@@ -2473,8 +2471,8 @@ void i915_vma_move_to_active(struct i915_vma *vma,
 static void
 i915_gem_object_retire__write(struct drm_i915_gem_object *obj)
 {
-	RQ_BUG_ON(obj->last_write_req == NULL);
-	RQ_BUG_ON(!(obj->active & intel_engine_flag(obj->last_write_req->engine)));
+	GEM_BUG_ON(obj->last_write_req == NULL);
+	GEM_BUG_ON(!(obj->active & intel_engine_flag(obj->last_write_req->engine)));
 
 	i915_gem_request_assign(&obj->last_write_req, NULL);
 	intel_fb_obj_flush(obj, true, ORIGIN_CS);
@@ -2485,8 +2483,8 @@ i915_gem_object_retire__read(struct drm_i915_gem_object *obj, int ring)
 {
 	struct i915_vma *vma;
 
-	RQ_BUG_ON(obj->last_read_req[ring] == NULL);
-	RQ_BUG_ON(!(obj->active & (1 << ring)));
+	GEM_BUG_ON(obj->last_read_req[ring] == NULL);
+	GEM_BUG_ON(!(obj->active & (1 << ring)));
 
 	list_del_init(&obj->engine_list[ring]);
 	i915_gem_request_assign(&obj->last_read_req[ring], NULL);
diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
new file mode 100644
index 000000000000..8292e797d9b5
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -0,0 +1,34 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __I915_GEM_H__
+#define __I915_GEM_H__
+
+#ifdef CONFIG_DRM_I915_DEBUG_GEM
+#define GEM_BUG_ON(expr) BUG_ON(expr)
+#else
+#define GEM_BUG_ON(expr)
+#endif
+
+#endif /* __I915_GEM_H__ */
-- 
2.8.0.rc3


* [CI-ping 04/15] drm/i915: Hide the atomic_read(reset_counter) behind a helper
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
  2016-04-12 20:02 ` [CI-ping 02/15] drm/i915: Disentangle i915_drv.h includes Chris Wilson
  2016-04-12 20:02 ` [CI-ping 03/15] drm/i915: Add GEM debugging Kconfig option Chris Wilson
@ 2016-04-12 20:02 ` Chris Wilson
  2016-04-12 20:02 ` [CI-ping 05/15] drm/i915: Simplify checking of GPU reset_counter in display pageflips Chris Wilson
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

This is principally a little bit of syntactic sugar to hide the
atomic_read()s throughout the code that retrieve the current reset_counter.
It also provides utility functions to check the reset state on an
already-read reset_counter, so that (in later patches) we can read it once
and do multiple tests rather than risk the value changing between tests.
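
The intended pattern (a sketch; example_can_submit() is a hypothetical
caller, the helpers are those added below): snapshot the counter once,
then run every test against that snapshot so the answers cannot be
skewed by a reset occurring between the tests:

    static bool example_can_submit(struct i915_gpu_error *error)
    {
            u32 reset = i915_reset_counter(error);

            if (__i915_terminally_wedged(reset))
                    return false; /* reset failed, GPU is terminally wedged */

            return !__i915_reset_in_progress(reset);
    }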

v2: Be more strict on converting existing i915_reset_in_progress() over to
the more verbose i915_reset_in_progress_or_wedged().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  4 ++--
 drivers/gpu/drm/i915/i915_drv.h         | 32 ++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem.c         | 16 ++++++++--------
 drivers/gpu/drm/i915/i915_irq.c         |  2 +-
 drivers/gpu/drm/i915/intel_display.c    | 18 +++++++++++-------
 drivers/gpu/drm/i915/intel_lrc.c        |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |  7 ++++---
 7 files changed, 55 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 2d11b4948a74..f1470834e874 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4721,7 +4721,7 @@ i915_wedged_get(void *data, u64 *val)
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	*val = atomic_read(&dev_priv->gpu_error.reset_counter);
+	*val = i915_reset_counter(&dev_priv->gpu_error);
 
 	return 0;
 }
@@ -4740,7 +4740,7 @@ i915_wedged_set(void *data, u64 val)
 	 * while it is writing to 'i915_wedged'
 	 */
 
-	if (i915_reset_in_progress(&dev_priv->gpu_error))
+	if (i915_reset_in_progress_or_wedged(&dev_priv->gpu_error))
 		return -EAGAIN;
 
 	intel_runtime_pm_get(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 452f64acd7de..1053cb3f9e7b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3088,20 +3088,44 @@ void i915_gem_retire_requests_ring(struct intel_engine_cs *engine);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
 
+static inline u32 i915_reset_counter(struct i915_gpu_error *error)
+{
+	return atomic_read(&error->reset_counter);
+}
+
+static inline bool __i915_reset_in_progress(u32 reset)
+{
+	return unlikely(reset & I915_RESET_IN_PROGRESS_FLAG);
+}
+
+static inline bool __i915_reset_in_progress_or_wedged(u32 reset)
+{
+	return unlikely(reset & (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
+}
+
+static inline bool __i915_terminally_wedged(u32 reset)
+{
+	return unlikely(reset & I915_WEDGED);
+}
+
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
-	return unlikely(atomic_read(&error->reset_counter)
-			& (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
+	return __i915_reset_in_progress(i915_reset_counter(error));
+}
+
+static inline bool i915_reset_in_progress_or_wedged(struct i915_gpu_error *error)
+{
+	return __i915_reset_in_progress_or_wedged(i915_reset_counter(error));
 }
 
 static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
 {
-	return atomic_read(&error->reset_counter) & I915_WEDGED;
+	return __i915_terminally_wedged(i915_reset_counter(error));
 }
 
 static inline u32 i915_reset_count(struct i915_gpu_error *error)
 {
-	return ((atomic_read(&error->reset_counter) & ~I915_WEDGED) + 1) / 2;
+	return ((i915_reset_counter(error) & ~I915_WEDGED) + 1) / 2;
 }
 
 static inline bool i915_stop_ring_allow_ban(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 00bed3b79793..ecbe56810796 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -83,7 +83,7 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
 {
 	int ret;
 
-#define EXIT_COND (!i915_reset_in_progress(error) || \
+#define EXIT_COND (!i915_reset_in_progress_or_wedged(error) || \
 		   i915_terminally_wedged(error))
 	if (EXIT_COND)
 		return 0;
@@ -1112,7 +1112,7 @@ int
 i915_gem_check_wedge(struct i915_gpu_error *error,
 		     bool interruptible)
 {
-	if (i915_reset_in_progress(error)) {
+	if (i915_reset_in_progress_or_wedged(error)) {
 		/* Non-interruptible callers can't handle -EAGAIN, hence return
 		 * -EIO unconditionally for these. */
 		if (!interruptible)
@@ -1299,7 +1299,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 
 		/* We need to check whether any gpu reset happened in between
 		 * the caller grabbing the seqno and now ... */
-		if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) {
+		if (reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
 			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
 			 * is truely gone. */
 			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
@@ -1474,7 +1474,7 @@ i915_wait_request(struct drm_i915_gem_request *req)
 		return ret;
 
 	ret = __i915_wait_request(req,
-				  atomic_read(&dev_priv->gpu_error.reset_counter),
+				  i915_reset_counter(&dev_priv->gpu_error),
 				  interruptible, NULL, NULL);
 	if (ret)
 		return ret;
@@ -1563,7 +1563,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
-	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 
 	if (readonly) {
 		struct drm_i915_gem_request *req;
@@ -3179,7 +3179,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	}
 
 	drm_gem_object_unreference(&obj->base);
-	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 
 	for (i = 0; i < I915_NUM_ENGINES; i++) {
 		if (obj->last_read_req[i] == NULL)
@@ -3224,7 +3224,7 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 	if (!i915_semaphore_is_enabled(obj->base.dev)) {
 		struct drm_i915_private *i915 = to_i915(obj->base.dev);
 		ret = __i915_wait_request(from_req,
-					  atomic_read(&i915->gpu_error.reset_counter),
+					  i915_reset_counter(&i915->gpu_error),
 					  i915->mm.interruptible,
 					  NULL,
 					  &i915->rps.semaphores);
@@ -4205,7 +4205,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 
 		target = request;
 	}
-	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 	if (target)
 		i915_gem_request_reference(target);
 	spin_unlock(&file_priv->mm.lock);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 247d962afabb..c2269c103e30 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2501,7 +2501,7 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 	 * the reset in-progress bit is only ever set by code outside of this
 	 * work we don't need to worry about any other races.
 	 */
-	if (i915_reset_in_progress(error) && !i915_terminally_wedged(error)) {
+	if (i915_reset_in_progress_or_wedged(error) && !i915_terminally_wedged(error)) {
 		DRM_DEBUG_DRIVER("resetting chip\n");
 		kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE,
 				   reset_event);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 607dc41bcc68..0bb78009379b 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3200,10 +3200,12 @@ static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
 	struct drm_device *dev = crtc->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned reset_counter;
 	bool pending;
 
-	if (i915_reset_in_progress(&dev_priv->gpu_error) ||
-	    intel_crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter))
+	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
+	if (intel_crtc->reset_counter != reset_counter ||
+	    __i915_reset_in_progress_or_wedged(reset_counter))
 		return false;
 
 	spin_lock_irq(&dev->event_lock);
@@ -10908,9 +10910,11 @@ static bool page_flip_finished(struct intel_crtc *crtc)
 {
 	struct drm_device *dev = crtc->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	unsigned reset_counter;
 
-	if (i915_reset_in_progress(&dev_priv->gpu_error) ||
-	    crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter))
+	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
+	if (crtc->reset_counter != reset_counter ||
+	    __i915_reset_in_progress_or_wedged(reset_counter))
 		return true;
 
 	/*
@@ -11573,7 +11577,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 		goto cleanup;
 
 	atomic_inc(&intel_crtc->unpin_work_count);
-	intel_crtc->reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+	intel_crtc->reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 
 	if (INTEL_INFO(dev)->gen >= 5 || IS_G4X(dev))
 		work->flip_count = I915_READ(PIPE_FLIPCOUNT_G4X(pipe)) + 1;
@@ -13419,10 +13423,10 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 		return ret;
 
 	ret = drm_atomic_helper_prepare_planes(dev, state);
-	if (!ret && !async && !i915_reset_in_progress(&dev_priv->gpu_error)) {
+	if (!ret && !async && !i915_reset_in_progress_or_wedged(&dev_priv->gpu_error)) {
 		u32 reset_counter;
 
-		reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+		reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 		mutex_unlock(&dev->struct_mutex);
 
 		for_each_plane_in_state(state, plane, plane_state, i) {
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index e6e69c2f2386..4d2b75e880fa 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1058,7 +1058,7 @@ void intel_logical_ring_stop(struct intel_engine_cs *engine)
 		return;
 
 	ret = intel_engine_idle(engine);
-	if (ret && !i915_reset_in_progress(&to_i915(engine->dev)->gpu_error))
+	if (ret && !i915_reset_in_progress_or_wedged(&dev_priv->gpu_error))
 		DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n",
 			  engine->name, ret);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 41b604e69db7..9b9cec330ff7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2363,8 +2363,8 @@ int intel_engine_idle(struct intel_engine_cs *engine)
 
 	/* Make sure we do not trigger any retires */
 	return __i915_wait_request(req,
-				   atomic_read(&to_i915(engine->dev)->gpu_error.reset_counter),
-				   to_i915(engine->dev)->mm.interruptible,
+				   i915_reset_counter(&req->i915->gpu_error),
+				   req->i915->mm.interruptible,
 				   NULL, NULL);
 }
 
@@ -3189,7 +3189,8 @@ intel_stop_engine(struct intel_engine_cs *engine)
 		return;
 
 	ret = intel_engine_idle(engine);
-	if (ret && !i915_reset_in_progress(&to_i915(engine->dev)->gpu_error))
+	if (ret &&
+	    !i915_reset_in_progress_or_wedged(&to_i915(engine->dev)->gpu_error))
 		DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n",
 			  engine->name, ret);
 
-- 
2.8.0.rc3


* [CI-ping 05/15] drm/i915: Simplify checking of GPU reset_counter in display pageflips
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (2 preceding siblings ...)
  2016-04-12 20:02 ` [CI-ping 04/15] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
@ 2016-04-12 20:02 ` Chris Wilson
  2016-04-12 20:03 ` [CI-ping 06/15] drm/i915: Tighten reset_counter for reset status Chris Wilson
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:02 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

If, when we store the reset_counter for the operation, we ensure that
it is neither wedged nor in the middle of a reset, we can then assert that
if any reset occurs the reset_counter must change. Later we can simply
compare the operation's reset epoch against the current counter to see
if we need to abort the operation (to handle the hang).
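
The resulting idiom, assembled from the hunks below (a sketch):

    /* When queuing the pageflip: refuse to start across a reset. */
    intel_crtc->reset_counter = i915_reset_counter(&dev_priv->gpu_error);
    if (__i915_reset_in_progress_or_wedged(intel_crtc->reset_counter)) {
            ret = -EIO;
            goto cleanup;
    }

    /* When checking completion: a single comparison now suffices,
     * since any reset must have advanced the counter. */
    if (crtc->reset_counter != i915_reset_counter(&dev_priv->gpu_error))
            return true; /* a reset occurred; treat the flip as finished */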

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_display.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0bb78009379b..ec11c7ef08a2 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3198,14 +3198,12 @@ void intel_finish_reset(struct drm_device *dev)
 static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
 {
 	struct drm_device *dev = crtc->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	unsigned reset_counter;
 	bool pending;
 
-	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
-	if (intel_crtc->reset_counter != reset_counter ||
-	    __i915_reset_in_progress_or_wedged(reset_counter))
+	reset_counter = i915_reset_counter(&to_i915(dev)->gpu_error);
+	if (intel_crtc->reset_counter != reset_counter)
 		return false;
 
 	spin_lock_irq(&dev->event_lock);
@@ -10913,8 +10911,7 @@ static bool page_flip_finished(struct intel_crtc *crtc)
 	unsigned reset_counter;
 
 	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
-	if (crtc->reset_counter != reset_counter ||
-	    __i915_reset_in_progress_or_wedged(reset_counter))
+	if (crtc->reset_counter != reset_counter)
 		return true;
 
 	/*
@@ -11576,8 +11573,13 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 	if (ret)
 		goto cleanup;
 
-	atomic_inc(&intel_crtc->unpin_work_count);
 	intel_crtc->reset_counter = i915_reset_counter(&dev_priv->gpu_error);
+	if (__i915_reset_in_progress_or_wedged(intel_crtc->reset_counter)) {
+		ret = -EIO;
+		goto cleanup;
+	}
+
+	atomic_inc(&intel_crtc->unpin_work_count);
 
 	if (INTEL_INFO(dev)->gen >= 5 || IS_G4X(dev))
 		work->flip_count = I915_READ(PIPE_FLIPCOUNT_G4X(pipe)) + 1;
-- 
2.8.0.rc3


* [CI-ping 06/15] drm/i915: Tighten reset_counter for reset status
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (3 preceding siblings ...)
  2016-04-12 20:02 ` [CI-ping 05/15] drm/i915: Simplify checking of GPU reset_counter in display pageflips Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-12 20:03 ` [CI-ping 07/15] drm/i915: Store the reset counter when constructing a request Chris Wilson
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

In the reset_counter, we use two bits to track a GPU hang and reset. The
low bit is a "reset-in-progress" flag that we set to signal when we need
to break waiters in order for the recovery task to grab the mutex. As
soon as the recovery task has the mutex, we can clear that flag (which
we do by incrementing the reset_counter, thereby incrementing the global
reset epoch). By clearing that flag when the recovery task holds the
struct_mutex, we can forgo a second flag that simply tells GEM to ignore
the "reset-in-progress" flag.

The second flag we store in the reset_counter is whether the
reset failed and we consider the GPU terminally wedged. Whilst this flag
is set, all access to the GPU (at least through GEM rather than direct mmio
access) is verboten.
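
For orientation, the two flags and the epoch arithmetic involved (the
#defines are the pre-existing ones from i915_drv.h; the count expression
is from the helper introduced two patches earlier):

    #define I915_RESET_IN_PROGRESS_FLAG    1
    #define I915_WEDGED                    (1 << 31)

    /* The counter is odd whilst a reset is in progress; the increment
     * that clears the low bit also advances the global epoch, so the
     * number of completed resets is: */
    ((i915_reset_counter(error) & ~I915_WEDGED) + 1) / 2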

PS: Fun is in store, as in the future we want to move from a global
reset epoch to per-engine resets with request recovery.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_debugfs.c |  4 ++--
 drivers/gpu/drm/i915/i915_drv.c     | 39 ++++++++++++++++++++++---------------
 drivers/gpu/drm/i915/i915_drv.h     |  3 ---
 drivers/gpu/drm/i915/i915_gem.c     | 27 +++++++++----------------
 drivers/gpu/drm/i915/i915_irq.c     | 21 ++------------------
 5 files changed, 36 insertions(+), 58 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index f1470834e874..1d5c2e23c47e 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4721,7 +4721,7 @@ i915_wedged_get(void *data, u64 *val)
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	*val = i915_reset_counter(&dev_priv->gpu_error);
+	*val = i915_terminally_wedged(&dev_priv->gpu_error);
 
 	return 0;
 }
@@ -4740,7 +4740,7 @@ i915_wedged_set(void *data, u64 val)
 	 * while it is writing to 'i915_wedged'
 	 */
 
-	if (i915_reset_in_progress_or_wedged(&dev_priv->gpu_error))
+	if (i915_reset_in_progress(&dev_priv->gpu_error))
 		return -EAGAIN;
 
 	intel_runtime_pm_get(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 29b4e79c85a6..310dc61817fa 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -880,23 +880,32 @@ int i915_resume_switcheroo(struct drm_device *dev)
 int i915_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	bool simulated;
+	struct i915_gpu_error *error = &dev_priv->gpu_error;
+	unsigned reset_counter;
 	int ret;
 
 	intel_reset_gt_powersave(dev);
 
 	mutex_lock(&dev->struct_mutex);
 
-	i915_gem_reset(dev);
+	/* Clear any previous failed attempts at recovery. Time to try again. */
+	atomic_andnot(I915_WEDGED, &error->reset_counter);
 
-	simulated = dev_priv->gpu_error.stop_rings != 0;
+	/* Clear the reset-in-progress flag and increment the reset epoch. */
+	reset_counter = atomic_inc_return(&error->reset_counter);
+	if (WARN_ON(__i915_reset_in_progress(reset_counter))) {
+		ret = -EIO;
+		goto error;
+	}
+
+	i915_gem_reset(dev);
 
 	ret = intel_gpu_reset(dev, ALL_ENGINES);
 
 	/* Also reset the gpu hangman. */
-	if (simulated) {
+	if (error->stop_rings != 0) {
 		DRM_INFO("Simulated gpu hang, resetting stop_rings\n");
-		dev_priv->gpu_error.stop_rings = 0;
+		error->stop_rings = 0;
 		if (ret == -ENODEV) {
 			DRM_INFO("Reset not implemented, but ignoring "
 				 "error for simulated gpu hangs\n");
@@ -909,8 +918,7 @@ int i915_reset(struct drm_device *dev)
 
 	if (ret) {
 		DRM_ERROR("Failed to reset chip: %i\n", ret);
-		mutex_unlock(&dev->struct_mutex);
-		return ret;
+		goto error;
 	}
 
 	intel_overlay_reset(dev_priv);
@@ -929,20 +937,14 @@ int i915_reset(struct drm_device *dev)
 	 * was running at the time of the reset (i.e. we weren't VT
 	 * switched away).
 	 */
-
-	/* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */
-	dev_priv->gpu_error.reload_in_reset = true;
-
 	ret = i915_gem_init_hw(dev);
-
-	dev_priv->gpu_error.reload_in_reset = false;
-
-	mutex_unlock(&dev->struct_mutex);
 	if (ret) {
 		DRM_ERROR("Failed hw init on reset %d\n", ret);
-		return ret;
+		goto error;
 	}
 
+	mutex_unlock(&dev->struct_mutex);
+
 	/*
 	 * rps/rc6 re-init is necessary to restore state lost after the
 	 * reset and the re-install of gt irqs. Skip for ironlake per
@@ -953,6 +955,11 @@ int i915_reset(struct drm_device *dev)
 		intel_enable_gt_powersave(dev);
 
 	return 0;
+
+error:
+	atomic_or(I915_WEDGED, &error->reset_counter);
+	mutex_unlock(&dev->struct_mutex);
+	return ret;
 }
 
 static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1053cb3f9e7b..1da57d309618 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1400,9 +1400,6 @@ struct i915_gpu_error {
 
 	/* For missed irq/seqno simulation. */
 	unsigned int test_irq_rings;
-
-	/* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset   */
-	bool reload_in_reset;
 };
 
 enum modeset_restore {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ecbe56810796..f1e91cf8cbf9 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -83,9 +83,7 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
 {
 	int ret;
 
-#define EXIT_COND (!i915_reset_in_progress_or_wedged(error) || \
-		   i915_terminally_wedged(error))
-	if (EXIT_COND)
+	if (!i915_reset_in_progress(error))
 		return 0;
 
 	/*
@@ -94,17 +92,16 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
 	 * we should simply try to bail out and fail as gracefully as possible.
 	 */
 	ret = wait_event_interruptible_timeout(error->reset_queue,
-					       EXIT_COND,
+					       !i915_reset_in_progress(error),
 					       10*HZ);
 	if (ret == 0) {
 		DRM_ERROR("Timed out waiting for the gpu reset to complete\n");
 		return -EIO;
 	} else if (ret < 0) {
 		return ret;
+	} else {
+		return 0;
 	}
-#undef EXIT_COND
-
-	return 0;
 }
 
 int i915_mutex_lock_interruptible(struct drm_device *dev)
@@ -1113,22 +1110,16 @@ i915_gem_check_wedge(struct i915_gpu_error *error,
 		     bool interruptible)
 {
 	if (i915_reset_in_progress_or_wedged(error)) {
+		/* Recovery complete, but the reset failed ... */
+		if (i915_terminally_wedged(error))
+			return -EIO;
+
 		/* Non-interruptible callers can't handle -EAGAIN, hence return
 		 * -EIO unconditionally for these. */
 		if (!interruptible)
 			return -EIO;
 
-		/* Recovery complete, but the reset failed ... */
-		if (i915_terminally_wedged(error))
-			return -EIO;
-
-		/*
-		 * Check if GPU Reset is in progress - we need intel_ring_begin
-		 * to work properly to reinit the hw state while the gpu is
-		 * still marked as reset-in-progress. Handle this with a flag.
-		 */
-		if (!error->reload_in_reset)
-			return -EAGAIN;
+		return -EAGAIN;
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index c2269c103e30..be78f5229114 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2483,7 +2483,6 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 static void i915_reset_and_wakeup(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct i915_gpu_error *error = &dev_priv->gpu_error;
 	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
 	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
 	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
@@ -2501,7 +2500,7 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 	 * the reset in-progress bit is only ever set by code outside of this
 	 * work we don't need to worry about any other races.
 	 */
-	if (i915_reset_in_progress_or_wedged(error) && !i915_terminally_wedged(error)) {
+	if (i915_reset_in_progress(&dev_priv->gpu_error)) {
 		DRM_DEBUG_DRIVER("resetting chip\n");
 		kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE,
 				   reset_event);
@@ -2529,25 +2528,9 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 
 		intel_runtime_pm_put(dev_priv);
 
-		if (ret == 0) {
-			/*
-			 * After all the gem state is reset, increment the reset
-			 * counter and wake up everyone waiting for the reset to
-			 * complete.
-			 *
-			 * Since unlock operations are a one-sided barrier only,
-			 * we need to insert a barrier here to order any seqno
-			 * updates before
-			 * the counter increment.
-			 */
-			smp_mb__before_atomic();
-			atomic_inc(&dev_priv->gpu_error.reset_counter);
-
+		if (ret == 0)
 			kobject_uevent_env(&dev->primary->kdev->kobj,
 					   KOBJ_CHANGE, reset_done_event);
-		} else {
-			atomic_or(I915_WEDGED, &error->reset_counter);
-		}
 
 		/*
 		 * Note: The wake_up also serves as a memory barrier so that
-- 
2.8.0.rc3


* [CI-ping 07/15] drm/i915: Store the reset counter when constructing a request
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (4 preceding siblings ...)
  2016-04-12 20:03 ` [CI-ping 06/15] drm/i915: Tighten reset_counter for reset status Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-12 20:03 ` [CI-ping 08/15] drm/i915: Simplify reset_counter handling during atomic modesetting Chris Wilson
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

As the request is only valid during the same global reset epoch, we can
record the current reset_counter when constructing the request and reuse
it when waiting upon that request in future. This removes a very hairy
atomic check serialised by the struct_mutex at the time of waiting and
allows us to transfer those waits to a central dispatcher for all
waiters and all requests.
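
In outline, the two ends of the scheme (a sketch; both sides appear in
the hunks below):

    /* At construction, under struct_mutex: */
    req->reset_counter = i915_reset_counter(&dev_priv->gpu_error);

    /* At wait time, possibly much later and without the mutex: */
    if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
            /* A reset fired after the request was queued; break the
             * wait rather than sleep on a dead request. */
    }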

PS: With per-engine resets, we obviously cannot assume a global reset
epoch for the requests - a per-engine epoch makes the most sense. The
challenge then is how to handle checking in the waiter for when to break
the wait, as the fine-grained reset may also want to requeue the
request (i.e. the assumption that just because the epoch changes the
request is completed may be broken - or we just avoid breaking that
assumption with the fine-grained resets).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.h         |  2 +-
 drivers/gpu/drm/i915/i915_gem.c         | 40 +++++++++++----------------------
 drivers/gpu/drm/i915/i915_gem_userptr.c |  5 +----
 drivers/gpu/drm/i915/intel_display.c    |  7 +-----
 drivers/gpu/drm/i915/intel_lrc.c        |  7 ------
 drivers/gpu/drm/i915/intel_ringbuffer.c |  6 -----
 6 files changed, 16 insertions(+), 51 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 1da57d309618..8821d38c07ed 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2250,6 +2250,7 @@ struct drm_i915_gem_request {
 	/** On Which ring this request was generated */
 	struct drm_i915_private *i915;
 	struct intel_engine_cs *engine;
+	unsigned reset_counter;
 
 	 /** GEM sequence number associated with the previous request,
 	  * when the HWS breadcrumb is equal to this the GPU is processing
@@ -3155,7 +3156,6 @@ void __i915_add_request(struct drm_i915_gem_request *req,
 #define i915_add_request_no_flush(req) \
 	__i915_add_request(req, NULL, false)
 int __i915_wait_request(struct drm_i915_gem_request *req,
-			unsigned reset_counter,
 			bool interruptible,
 			s64 *timeout,
 			struct intel_rps_client *rps);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f1e91cf8cbf9..01cdebfea27a 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1213,7 +1213,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 /**
  * __i915_wait_request - wait until execution of request has finished
  * @req: duh!
- * @reset_counter: reset sequence associated with the given request
  * @interruptible: do an interruptible wait (normally yes)
  * @timeout: in - how long to wait (NULL forever); out - how much time remaining
  *
@@ -1228,7 +1227,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
  * errno with remaining time filled in timeout argument.
  */
 int __i915_wait_request(struct drm_i915_gem_request *req,
-			unsigned reset_counter,
 			bool interruptible,
 			s64 *timeout,
 			struct intel_rps_client *rps)
@@ -1290,7 +1288,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 
 		/* We need to check whether any gpu reset happened in between
 		 * the caller grabbing the seqno and now ... */
-		if (reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
+		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
 			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
 			 * is truely gone. */
 			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
@@ -1460,13 +1458,7 @@ i915_wait_request(struct drm_i915_gem_request *req)
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
-	if (ret)
-		return ret;
-
-	ret = __i915_wait_request(req,
-				  i915_reset_counter(&dev_priv->gpu_error),
-				  interruptible, NULL, NULL);
+	ret = __i915_wait_request(req, interruptible, NULL, NULL);
 	if (ret)
 		return ret;
 
@@ -1541,7 +1533,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_request *requests[I915_NUM_ENGINES];
-	unsigned reset_counter;
 	int ret, i, n = 0;
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
@@ -1550,12 +1541,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	if (!obj->active)
 		return 0;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, true);
-	if (ret)
-		return ret;
-
-	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
-
 	if (readonly) {
 		struct drm_i915_gem_request *req;
 
@@ -1577,9 +1562,9 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	}
 
 	mutex_unlock(&dev->struct_mutex);
+	ret = 0;
 	for (i = 0; ret == 0 && i < n; i++)
-		ret = __i915_wait_request(requests[i], reset_counter, true,
-					  NULL, rps);
+		ret = __i915_wait_request(requests[i], true, NULL, rps);
 	mutex_lock(&dev->struct_mutex);
 
 	for (i = 0; i < n; i++) {
@@ -2735,6 +2720,7 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
 			 struct drm_i915_gem_request **req_out)
 {
 	struct drm_i915_private *dev_priv = to_i915(engine->dev);
+	unsigned reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 	struct drm_i915_gem_request *req;
 	int ret;
 
@@ -2743,6 +2729,11 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
 
 	*req_out = NULL;
 
+	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
+				   dev_priv->mm.interruptible);
+	if (ret)
+		return ret;
+
 	req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
 	if (req == NULL)
 		return -ENOMEM;
@@ -2754,6 +2745,7 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
 	kref_init(&req->ref);
 	req->i915 = dev_priv;
 	req->engine = engine;
+	req->reset_counter = reset_counter;
 	req->ctx  = ctx;
 	i915_gem_context_reference(req->ctx);
 
@@ -3132,11 +3124,9 @@ retire:
 int
 i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_wait *args = data;
 	struct drm_i915_gem_object *obj;
 	struct drm_i915_gem_request *req[I915_NUM_ENGINES];
-	unsigned reset_counter;
 	int i, n = 0;
 	int ret;
 
@@ -3170,7 +3160,6 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	}
 
 	drm_gem_object_unreference(&obj->base);
-	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 
 	for (i = 0; i < I915_NUM_ENGINES; i++) {
 		if (obj->last_read_req[i] == NULL)
@@ -3183,7 +3172,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 
 	for (i = 0; i < n; i++) {
 		if (ret == 0)
-			ret = __i915_wait_request(req[i], reset_counter, true,
+			ret = __i915_wait_request(req[i], true,
 						  args->timeout_ns > 0 ? &args->timeout_ns : NULL,
 						  to_rps_client(file));
 		i915_gem_request_unreference__unlocked(req[i]);
@@ -3215,7 +3204,6 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 	if (!i915_semaphore_is_enabled(obj->base.dev)) {
 		struct drm_i915_private *i915 = to_i915(obj->base.dev);
 		ret = __i915_wait_request(from_req,
-					  i915_reset_counter(&i915->gpu_error),
 					  i915->mm.interruptible,
 					  NULL,
 					  &i915->rps.semaphores);
@@ -4171,7 +4159,6 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	unsigned long recent_enough = jiffies - DRM_I915_THROTTLE_JIFFIES;
 	struct drm_i915_gem_request *request, *target = NULL;
-	unsigned reset_counter;
 	int ret;
 
 	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
@@ -4196,7 +4183,6 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 
 		target = request;
 	}
-	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 	if (target)
 		i915_gem_request_reference(target);
 	spin_unlock(&file_priv->mm.lock);
@@ -4204,7 +4190,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (target == NULL)
 		return 0;
 
-	ret = __i915_wait_request(target, reset_counter, true, NULL, NULL);
+	ret = __i915_wait_request(target, true, NULL, NULL);
 	if (ret == 0)
 		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
 
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index bebaf75d5348..e6b5938ce6e2 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -65,7 +65,6 @@ static void wait_rendering(struct drm_i915_gem_object *obj)
 {
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_gem_request *requests[I915_NUM_ENGINES];
-	unsigned reset_counter;
 	int i, n;
 
 	if (!obj->active)
@@ -82,12 +81,10 @@ static void wait_rendering(struct drm_i915_gem_object *obj)
 		requests[n++] = i915_gem_request_reference(req);
 	}
 
-	reset_counter = atomic_read(&to_i915(dev)->gpu_error.reset_counter);
 	mutex_unlock(&dev->struct_mutex);
 
 	for (i = 0; i < n; i++)
-		__i915_wait_request(requests[i], reset_counter, false,
-				    NULL, NULL);
+		__i915_wait_request(requests[i], false, NULL, NULL);
 
 	mutex_lock(&dev->struct_mutex);
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index ec11c7ef08a2..8763d953f1df 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11365,7 +11365,6 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
 
 	if (mmio_flip->req) {
 		WARN_ON(__i915_wait_request(mmio_flip->req,
-					    mmio_flip->crtc->reset_counter,
 					    false, NULL,
 					    &mmio_flip->i915->rps.mmioflips));
 		i915_gem_request_unreference__unlocked(mmio_flip->req);
@@ -13426,9 +13425,6 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 
 	ret = drm_atomic_helper_prepare_planes(dev, state);
 	if (!ret && !async && !i915_reset_in_progress_or_wedged(&dev_priv->gpu_error)) {
-		u32 reset_counter;
-
-		reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 		mutex_unlock(&dev->struct_mutex);
 
 		for_each_plane_in_state(state, plane, plane_state, i) {
@@ -13439,8 +13435,7 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 				continue;
 
 			ret = __i915_wait_request(intel_plane_state->wait_req,
-						  reset_counter, true,
-						  NULL, NULL);
+						  true, NULL, NULL);
 
 			/* Swallow -EIO errors to allow updates during hw lockup. */
 			if (ret == -EIO)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4d2b75e880fa..027c47dff55b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -893,16 +893,9 @@ static int logical_ring_prepare(struct drm_i915_gem_request *req, int bytes)
  */
 int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
 {
-	struct drm_i915_private *dev_priv;
 	int ret;
 
 	WARN_ON(req == NULL);
-	dev_priv = req->i915;
-
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
-				   dev_priv->mm.interruptible);
-	if (ret)
-		return ret;
 
 	ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
 	if (ret)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 9b9cec330ff7..55829b5ff837 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2363,7 +2363,6 @@ int intel_engine_idle(struct intel_engine_cs *engine)
 
 	/* Make sure we do not trigger any retires */
 	return __i915_wait_request(req,
-				   i915_reset_counter(&req->i915->gpu_error),
 				   req->i915->mm.interruptible,
 				   NULL, NULL);
 }
@@ -2494,11 +2493,6 @@ int intel_ring_begin(struct drm_i915_gem_request *req,
 	engine = req->engine;
 	dev_priv = req->i915;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
-				   dev_priv->mm.interruptible);
-	if (ret)
-		return ret;
-
 	ret = __intel_ring_prepare(engine, num_dwords * sizeof(uint32_t));
 	if (ret)
 		return ret;
-- 
2.8.0.rc3


* [CI-ping 08/15] drm/i915: Simplify reset_counter handling during atomic modesetting
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (5 preceding siblings ...)
  2016-04-12 20:03 ` [CI-ping 07/15] drm/i915: Store the reset counter when constructing a request Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-12 20:03 ` [CI-ping 09/15] drm/i915: Prevent leaking of -EIO from i915_wait_request() Chris Wilson
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

Now that the reset_counter is stored on the request, we can rearrange
the code that handles reading the counter versus waiting during atomic
modesetting, for readability (by deleting the hairiest of the code).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/intel_display.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 8763d953f1df..6500f77fc78e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13424,9 +13424,9 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 		return ret;
 
 	ret = drm_atomic_helper_prepare_planes(dev, state);
-	if (!ret && !async && !i915_reset_in_progress_or_wedged(&dev_priv->gpu_error)) {
-		mutex_unlock(&dev->struct_mutex);
+	mutex_unlock(&dev->struct_mutex);
 
+	if (!ret && !async) {
 		for_each_plane_in_state(state, plane, plane_state, i) {
 			struct intel_plane_state *intel_plane_state =
 				to_intel_plane_state(plane_state);
@@ -13440,19 +13440,15 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 			/* Swallow -EIO errors to allow updates during hw lockup. */
 			if (ret == -EIO)
 				ret = 0;
-
-			if (ret)
+			if (ret) {
+				mutex_lock(&dev->struct_mutex);
+				drm_atomic_helper_cleanup_planes(dev, state);
+				mutex_unlock(&dev->struct_mutex);
 				break;
+			}
 		}
-
-		if (!ret)
-			return 0;
-
-		mutex_lock(&dev->struct_mutex);
-		drm_atomic_helper_cleanup_planes(dev, state);
 	}
 
-	mutex_unlock(&dev->struct_mutex);
 	return ret;
 }
 
-- 
2.8.0.rc3


* [CI-ping 09/15] drm/i915: Prevent leaking of -EIO from i915_wait_request()
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (6 preceding siblings ...)
  2016-04-12 20:03 ` [CI-ping 08/15] drm/i915: Simplify reset_counter handling during atomic modesetting Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-12 20:03 ` [CI-ping 10/15] drm/i915: Suppress error message when GPU resets are disabled Chris Wilson
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

Reporting -EIO from i915_wait_request() has proven very problematic
over the years, with numerous hard-to-reproduce bugs cropping up in the
corner case where a reset occurs and the code wasn't expecting such
an error.

If we reset the GPU, or have detected a hang and wish to reset the
GPU, the request is forcibly completed and the wait broken. Currently, we
report either -EAGAIN or -EIO in order for the caller to retreat and
restart the wait (if appropriate) after dropping and then reacquiring
the struct_mutex (essential to allow the GPU reset to proceed). However,
if we take the view that the request is complete (no further work will
be done on it by the GPU because it is dead and soon to be reset), then
we can proceed with the task at hand and then drop the struct_mutex,
allowing the reset to occur. This transfers the burden of checking
whether it is safe to proceed to the caller; in all but one instance
it is safe, completely eliminating the source of all spurious -EIO.
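
From the caller's side, the new contract is roughly (a sketch; see the
hunks below for the exact behaviour):

    ret = __i915_wait_request(req, interruptible, NULL, NULL);
    /* A wait broken by a GPU reset now returns 0: the request is
     * treated as complete and the caller proceeds, dropping
     * struct_mutex so the recovery task can run. Genuine errors
     * (-EINTR, a timeout) still propagate; -EIO is confined to
     * execbuf and the throttle ioctl. */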

Of note, we only have two API entry points where we expect that
userspace can observe an EIO. The first is when submitting an execbuf:
if the GPU is terminally wedged, the operation cannot succeed and an
-EIO is reported. Secondly, existing userspace uses the throttle ioctl
to detect an already wedged GPU before starting to use HW acceleration
(or to confirm that the GPU is wedged after an error condition). So if
the GPU is wedged when the user calls throttle, we also report -EIO.

v2: Split more carefully the change to i915_wait_request() and assorted
ABI from the reset handling.
v3: Add a couple of WARN_ON(EIO) to the interruptible modesetting code
so that we don't start to leak EIO there in future (and break our hang
resistant modesetting).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.h         |  2 --
 drivers/gpu/drm/i915/i915_gem.c         | 44 ++++++++++++++++-----------------
 drivers/gpu/drm/i915/i915_gem_userptr.c |  6 ++---
 drivers/gpu/drm/i915/intel_display.c    | 13 +++++-----
 drivers/gpu/drm/i915/intel_lrc.c        |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |  3 +--
 6 files changed, 32 insertions(+), 38 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8821d38c07ed..061ecc43d935 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3083,8 +3083,6 @@ i915_gem_find_active_request(struct intel_engine_cs *engine);
 
 bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *engine);
-int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
-				      bool interruptible);
 
 static inline u32 i915_reset_counter(struct i915_gpu_error *error)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 01cdebfea27a..765efa88db32 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -206,11 +206,10 @@ i915_gem_object_put_pages_phys(struct drm_i915_gem_object *obj)
 	BUG_ON(obj->madv == __I915_MADV_PURGED);
 
 	ret = i915_gem_object_set_to_cpu_domain(obj, true);
-	if (ret) {
+	if (WARN_ON(ret)) {
 		/* In the event of a disaster, abandon all caches and
 		 * hope for the best.
 		 */
-		WARN_ON(ret != -EIO);
 		obj->base.read_domains = obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	}
 
@@ -1105,15 +1104,13 @@ put_rpm:
 	return ret;
 }
 
-int
-i915_gem_check_wedge(struct i915_gpu_error *error,
-		     bool interruptible)
+static int
+i915_gem_check_wedge(unsigned reset_counter, bool interruptible)
 {
-	if (i915_reset_in_progress_or_wedged(error)) {
-		/* Recovery complete, but the reset failed ... */
-		if (i915_terminally_wedged(error))
-			return -EIO;
+	if (__i915_terminally_wedged(reset_counter))
+		return -EIO;
 
+	if (__i915_reset_in_progress(reset_counter)) {
 		/* Non-interruptible callers can't handle -EAGAIN, hence return
 		 * -EIO unconditionally for these. */
 		if (!interruptible)
@@ -1287,13 +1284,14 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 		prepare_to_wait(&engine->irq_queue, &wait, state);
 
 		/* We need to check whether any gpu reset happened in between
-		 * the caller grabbing the seqno and now ... */
+		 * the request being submitted and now. If a reset has occurred,
+		 * the request is effectively complete (we either are in the
+		 * process of or have discarded the rendering and completely
+		 * reset the GPU. The results of the request are lost and we
+		 * are free to continue on with the original operation.
+		 */
 		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
-			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
-			 * is truely gone. */
-			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
-			if (ret == 0)
-				ret = -EAGAIN;
+			ret = 0;
 			break;
 		}
 
@@ -2154,11 +2152,10 @@ i915_gem_object_put_pages_gtt(struct drm_i915_gem_object *obj)
 	BUG_ON(obj->madv == __I915_MADV_PURGED);
 
 	ret = i915_gem_object_set_to_cpu_domain(obj, true);
-	if (ret) {
+	if (WARN_ON(ret)) {
 		/* In the event of a disaster, abandon all caches and
 		 * hope for the best.
 		 */
-		WARN_ON(ret != -EIO);
 		i915_gem_clflush_object(obj, true);
 		obj->base.read_domains = obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	}
@@ -2729,8 +2726,11 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
 
 	*req_out = NULL;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
-				   dev_priv->mm.interruptible);
+	/* ABI: Before userspace accesses the GPU (e.g. execbuffer), report
+	 * EIO if the GPU is already wedged, or EAGAIN to drop the struct_mutex
+	 * and restart.
+	 */
+	ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
 	if (ret)
 		return ret;
 
@@ -4165,9 +4165,9 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (ret)
 		return ret;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
-	if (ret)
-		return ret;
+	/* ABI: return -EIO if already wedged */
+	if (i915_terminally_wedged(&dev_priv->gpu_error))
+		return -EIO;
 
 	spin_lock(&file_priv->mm.lock);
 	list_for_each_entry(request, &file_priv->mm.request_list, client_list) {
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index e6b5938ce6e2..32d9726e38b1 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -112,10 +112,8 @@ static void cancel_userptr(struct work_struct *work)
 		was_interruptible = dev_priv->mm.interruptible;
 		dev_priv->mm.interruptible = false;
 
-		list_for_each_entry_safe(vma, tmp, &obj->vma_list, obj_link) {
-			int ret = i915_vma_unbind(vma);
-			WARN_ON(ret && ret != -EIO);
-		}
+		list_for_each_entry_safe(vma, tmp, &obj->vma_list, obj_link)
+			WARN_ON(i915_vma_unbind(vma));
 		WARN_ON(i915_gem_object_put_pages(obj));
 
 		dev_priv->mm.interruptible = was_interruptible;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 6500f77fc78e..b1b457864e17 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -13436,11 +13436,9 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 
 			ret = __i915_wait_request(intel_plane_state->wait_req,
 						  true, NULL, NULL);
-
-			/* Swallow -EIO errors to allow updates during hw lockup. */
-			if (ret == -EIO)
-				ret = 0;
 			if (ret) {
+				/* Any hang should be swallowed by the wait */
+				WARN_ON(ret == -EIO);
 				mutex_lock(&dev->struct_mutex);
 				drm_atomic_helper_cleanup_planes(dev, state);
 				mutex_unlock(&dev->struct_mutex);
@@ -13792,10 +13790,11 @@ intel_prepare_plane_fb(struct drm_plane *plane,
 		 */
 		if (needs_modeset(crtc_state))
 			ret = i915_gem_object_wait_rendering(old_obj, true);
-
-		/* Swallow -EIO errors to allow updates during hw lockup. */
-		if (ret && ret != -EIO)
+		if (ret) {
+			/* GPU hangs should have been swallowed by the wait */
+			WARN_ON(ret == -EIO);
 			return ret;
+		}
 	}
 
 	/* For framebuffer backed by dmabuf, wait for fence */
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 027c47dff55b..b8f6b96472a6 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1051,7 +1051,7 @@ void intel_logical_ring_stop(struct intel_engine_cs *engine)
 		return;
 
 	ret = intel_engine_idle(engine);
-	if (ret && !i915_reset_in_progress_or_wedged(&dev_priv->gpu_error))
+	if (ret)
 		DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n",
 			  engine->name, ret);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 55829b5ff837..8391382431b2 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -3183,8 +3183,7 @@ intel_stop_engine(struct intel_engine_cs *engine)
 		return;
 
 	ret = intel_engine_idle(engine);
-	if (ret &&
-	    !i915_reset_in_progress_or_wedged(&to_i915(engine->dev)->gpu_error))
+	if (ret)
 		DRM_ERROR("failed to quiesce %s whilst cleaning up: %d\n",
 			  engine->name, ret);
 
-- 
2.8.0.rc3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [CI-ping 10/15] drm/i915: Suppress error message when GPU resets are disabled
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (7 preceding siblings ...)
  2016-04-12 20:03 ` [CI-ping 09/15] drm/i915: Prevent leaking of -EIO from i915_wait_request() Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-12 20:03 ` [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching Chris Wilson
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

If we do not have low-level support for resetting the GPU, or if the
user has explicitly disabled resetting the device, the failure is
expected.
Since it is an expected failure, we should be using a lower priority
message than *ERROR*, perhaps NOTICE. In the absence of DRM_NOTICE, just
emit the expected failure as a DEBUG message.
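
A minimal sketch of the intent (assuming the low-level reset path
reports -ENODEV both when the platform lacks a reset method and when
the user has disabled reset via the i915.reset module parameter):

	if (ret == -ENODEV)	/* expected: reset unsupported or disabled */
		DRM_DEBUG_DRIVER("GPU reset disabled\n");
	else if (ret)		/* anything else is a real failure */
		DRM_ERROR("Failed to reset chip: %i\n", ret);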

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_drv.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 310dc61817fa..18eb3e698763 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -917,7 +917,10 @@ int i915_reset(struct drm_device *dev)
 		pr_notice("drm/i915: Resetting chip after gpu hang\n");
 
 	if (ret) {
-		DRM_ERROR("Failed to reset chip: %i\n", ret);
+		if (ret != -ENODEV)
+			DRM_ERROR("Failed to reset chip: %i\n", ret);
+		else
+			DRM_DEBUG_DRIVER("GPU reset disabled\n");
 		goto error;
 	}
 
-- 
2.8.0.rc3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (8 preceding siblings ...)
  2016-04-12 20:03 ` [CI-ping 10/15] drm/i915: Suppress error message when GPU resets are disabled Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-13  9:33     ` Daniel Vetter
  2016-04-12 20:03 ` [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0 Chris Wilson
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, Daniel Vetter, Ville Syrjälä, stable

Two concurrent writes into the same register cacheline have a chance of
killing the machine on Ivybridge and other gen7. This includes LRI
emitted from the command parser.  The MI_SET_CONTEXT itself serves as
serialising barrier and prevents the pair of register writes in the first
packet from triggering the fault.  However, if a second switch-context
immediately occurs then we may have two adjacent blocks of LRI to the
same registers which may then trigger the hang. To counteract this we
need to insert a delay after the second register write using SRM.
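
With the fix, the tail of each MI_SET_CONTEXT packet on gen7 looks
roughly like this (an illustrative dword layout, not the emission code):

	MI_LOAD_REGISTER_IMM(num_rings)
		RING_PSMI_CTL(ring), sleep-message disable	/* per remote ring */
	MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT	/* the delay */
		last_reg			/* last register written above */
		engine->scratch.gtt_offset	/* scratch page destination */
	MI_NOOP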

This is easiest to reproduce with something like
igt/gem_ctx_switch/interruptible that triggers back-to-back context
switches (with no operations in between them in the command stream,
which requires the execbuf operation to be interrupted after the
MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
interruptible igt. No reports from the wild though, so it must be of low
enough frequency that no one has correlated the random machine freezes
with i915.ko.

The issue was introduced with
commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Dec 16 10:02:27 2014 +0000

    drm/i915: Disable PSMI sleep messages on all rings around context switches

Testcase: igt/gem_ctx_switch/render-interruptible #ivb
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index fe580cb9501a..e5ad7b21e356 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 
 	len = 4;
 	if (INTEL_INFO(engine->dev)->gen >= 7)
-		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
+		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
 
 	ret = intel_ring_begin(req, len);
 	if (ret)
@@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 	if (INTEL_INFO(engine->dev)->gen >= 7) {
 		if (num_rings) {
 			struct intel_engine_cs *signaller;
+			i915_reg_t last_reg = {}; /* keep gcc quiet */
 
 			intel_ring_emit(engine,
 					MI_LOAD_REGISTER_IMM(num_rings));
@@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 				if (signaller == engine)
 					continue;
 
-				intel_ring_emit_reg(engine,
-						    RING_PSMI_CTL(signaller->mmio_base));
+				last_reg = RING_PSMI_CTL(signaller->mmio_base);
+				intel_ring_emit_reg(engine, last_reg);
 				intel_ring_emit(engine,
 						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
 			}
+
+			/* Insert a delay before the next switch! */
+			intel_ring_emit(engine,
+					MI_STORE_REGISTER_MEM |
+					MI_SRM_LRM_GLOBAL_GTT);
+			intel_ring_emit_reg(engine, last_reg);
+			intel_ring_emit(engine, engine->scratch.gtt_offset);
+			intel_ring_emit(engine, MI_NOOP);
 		}
 		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
 	}
-- 
2.8.0.rc3


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (9 preceding siblings ...)
  2016-04-12 20:03 ` [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-13  9:34   ` [Intel-gfx] " Daniel Vetter
  2016-04-12 20:03 ` [CI-ping 13/15] drm/i915: Move the mb() following release-mmap into release-mmap Chris Wilson
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, stable

For reasons unknown, Sandybridge GT1 (at least) will eventually hang when
it encounters a ring wraparound at offset 0. The test case that
reproduces the bug reliably forces a large number of interrupted context
switches, thereby causing very frequent ring wraparounds, but there are
similar bug reports in the wild with the same symptoms: seqno writes
stop just before the wrap with the ringbuffer at address 0. It is also
timing-sensitive, but adding various delays hasn't helped pinpoint where
the window lies.

Whether the fault is restricted to the ringbuffer itself or the GTT
addressing is unclear, but moving the ringbuffer fixes all the hangs I
have been able to reproduce.
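
A minimal sketch of the workaround (assuming PIN_OFFSET_BIAS ORs the
minimum acceptable GGTT offset into the low bits of the pin flags):

	/* Never place the ringbuffer at GGTT offset 0; bias the pin up
	 * by one page so a ring wraparound can never land on address 0.
	 */
	unsigned flags = PIN_OFFSET_BIAS | 4096;

	ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, flags);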

References: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=93262
Testcase: igt/gem_exec_whisper/render-contexts-interruptible #snb-gt1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8391382431b2..904a8a276f6a 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2096,10 +2096,12 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct i915_ggtt *ggtt = &dev_priv->ggtt;
 	struct drm_i915_gem_object *obj = ringbuf->obj;
+	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
+	unsigned flags = PIN_OFFSET_BIAS | 4096;
 	int ret;
 
 	if (HAS_LLC(dev_priv) && !obj->stolen) {
-		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, 0);
+		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, flags);
 		if (ret)
 			return ret;
 
@@ -2113,7 +2115,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
 			goto err_unpin;
 		}
 	} else {
-		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
+		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
+					    flags | PIN_MAPPABLE);
 		if (ret)
 			return ret;
 
-- 
2.8.0.rc3


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [CI-ping 13/15] drm/i915: Move the mb() following release-mmap into release-mmap
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (10 preceding siblings ...)
  2016-04-12 20:03 ` [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0 Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-12 20:03 ` [CI-ping 14/15] drm/i915: Reorganise legacy context switch to cope with late failure Chris Wilson
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Goel, Akash

As paranoia, we want to ensure that the CPU's PTEs have been revoked for
the object before we return from i915_gem_release_mmap(). This allows us
to rely on there being no outstanding memory accesses and guarantees
serialisation of the code against concurrent access just by calling
i915_gem_release_mmap().
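
As a sketch of the pattern callers may now rely upon (illustrative
only):

	mutex_lock(&dev->struct_mutex);
	i915_gem_release_mmap(obj); /* CPU PTEs revoked before returning */
	/* No userspace GTT access to the object can occur past this
	 * point; any new access must first fault and wait on the mutex.
	 */
	mutex_unlock(&dev->struct_mutex);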

v2: Reduce the mb() into a wmb() following the revoke.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: "Goel, Akash" <akash.goel@intel.com
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 765efa88db32..b6879d43dd74 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1936,11 +1936,21 @@ out:
 void
 i915_gem_release_mmap(struct drm_i915_gem_object *obj)
 {
+	/* Serialisation between user GTT access and our code depends upon
+	 * revoking the CPU's PTE whilst the mutex is held. The next user
+	 * pagefault then has to wait until we release the mutex.
+	 */
+	lockdep_assert_held(&obj->base.dev->struct_mutex);
+
 	if (!obj->fault_mappable)
 		return;
 
 	drm_vma_node_unmap(&obj->base.vma_node,
 			   obj->base.dev->anon_inode->i_mapping);
+
+	/* Ensure that the CPU's PTE are revoked before we return */
+	wmb();
+
 	obj->fault_mappable = false;
 }
 
@@ -3324,9 +3334,6 @@ static void i915_gem_object_finish_gtt(struct drm_i915_gem_object *obj)
 	if ((obj->base.read_domains & I915_GEM_DOMAIN_GTT) == 0)
 		return;
 
-	/* Wait for any direct GTT access to complete */
-	mb();
-
 	old_read_domains = obj->base.read_domains;
 	old_write_domain = obj->base.write_domain;
 
-- 
2.8.0.rc3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [CI-ping 14/15] drm/i915: Reorganise legacy context switch to cope with late failure
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (11 preceding siblings ...)
  2016-04-12 20:03 ` [CI-ping 13/15] drm/i915: Move the mb() following release-mmap into release-mmap Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-13  9:59   ` Daniel Vetter
  2016-04-12 20:03 ` [CI-ping 15/15] drm/i915: Late request cancellations are harmful Chris Wilson
  2016-04-14  8:45 ` ✗ Fi.CI.BAT: failure for series starting with [CI-ping,01/15] drm/i915: Force clean compilation with -Werror (rev3) Patchwork
  14 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

After mi_set_context() succeeds, we need to update the state of the
engine's last_context. This ensures that we hold a pin on the context
whilst the hardware may write to it. However, since we didn't complete
the post-switch setup of the context, we need to force the subsequent
use of the same context to complete the setup (which means updating
should_skip_switch()).
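
In outline, the reordered switch looks like this (a sketch of the flow,
not the full patch):

	ret = mi_set_context(req, hw_flags);
	if (ret)
		goto unpin_out;		/* nothing committed yet */

	/* The hardware may now write to the new context, so pin it by
	 * making it the engine's last_context before the remaining
	 * setup steps run.
	 */
	i915_gem_context_reference(to);
	engine->last_context = to;

	/* Any late failure leaves to->legacy_hw_ctx.initialized unset,
	 * forcing the next switch to this context to retry the setup.
	 */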

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 215 ++++++++++++++++----------------
 1 file changed, 109 insertions(+), 106 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e5ad7b21e356..361959b1d53a 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -609,50 +609,40 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 	return ret;
 }
 
-static inline bool should_skip_switch(struct intel_engine_cs *engine,
-				      struct intel_context *from,
+static inline bool should_skip_switch(struct intel_context *from,
 				      struct intel_context *to)
 {
 	if (to->remap_slice)
 		return false;
 
-	if (to->ppgtt && from == to &&
-	    !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
-		return true;
+	if (!to->legacy_hw_ctx.initialized)
+		return false;
 
-	return false;
+	if (to->ppgtt && to->ppgtt->pd_dirty_rings & (1 << RCS))
+		return false;
+
+	return to == from;
 }
 
 static bool
-needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
+needs_pd_load_pre(struct intel_context *to)
 {
-	struct drm_i915_private *dev_priv = engine->dev->dev_private;
-
 	if (!to->ppgtt)
 		return false;
 
-	if (INTEL_INFO(engine->dev)->gen < 8)
-		return true;
-
-	if (engine != &dev_priv->engine[RCS])
+	if (INTEL_INFO(to->i915)->gen < 8)
 		return true;
 
 	return false;
 }
 
 static bool
-needs_pd_load_post(struct intel_engine_cs *engine, struct intel_context *to,
-		   u32 hw_flags)
+needs_pd_load_post(struct intel_context *to, u32 hw_flags)
 {
-	struct drm_i915_private *dev_priv = engine->dev->dev_private;
-
 	if (!to->ppgtt)
 		return false;
 
-	if (!IS_GEN8(engine->dev))
-		return false;
-
-	if (engine != &dev_priv->engine[RCS])
+	if (!IS_GEN8(to->i915))
 		return false;
 
 	if (hw_flags & MI_RESTORE_INHIBIT)
@@ -661,60 +651,33 @@ needs_pd_load_post(struct intel_engine_cs *engine, struct intel_context *to,
 	return false;
 }
 
-static int do_switch(struct drm_i915_gem_request *req)
+static int do_rcs_switch(struct drm_i915_gem_request *req)
 {
 	struct intel_context *to = req->ctx;
 	struct intel_engine_cs *engine = req->engine;
-	struct drm_i915_private *dev_priv = req->i915;
-	struct intel_context *from = engine->last_context;
-	u32 hw_flags = 0;
-	bool uninitialized = false;
+	struct intel_context *from;
+	u32 hw_flags;
 	int ret, i;
 
-	if (from != NULL && engine == &dev_priv->engine[RCS]) {
-		BUG_ON(from->legacy_hw_ctx.rcs_state == NULL);
-		BUG_ON(!i915_gem_obj_is_pinned(from->legacy_hw_ctx.rcs_state));
-	}
-
-	if (should_skip_switch(engine, from, to))
+	if (should_skip_switch(engine->last_context, to))
 		return 0;
 
 	/* Trying to pin first makes error handling easier. */
-	if (engine == &dev_priv->engine[RCS]) {
-		ret = i915_gem_obj_ggtt_pin(to->legacy_hw_ctx.rcs_state,
-					    get_context_alignment(engine->dev),
-					    0);
-		if (ret)
-			return ret;
-	}
+	ret = i915_gem_obj_ggtt_pin(to->legacy_hw_ctx.rcs_state,
+				    get_context_alignment(engine->dev),
+				    0);
+	if (ret)
+		return ret;
 
 	/*
 	 * Pin can switch back to the default context if we end up calling into
 	 * evict_everything - as a last ditch gtt defrag effort that also
 	 * switches to the default context. Hence we need to reload from here.
+	 *
+	 * XXX: Doing so is painfully broken!
 	 */
 	from = engine->last_context;
 
-	if (needs_pd_load_pre(engine, to)) {
-		/* Older GENs and non render rings still want the load first,
-		 * "PP_DCLV followed by PP_DIR_BASE register through Load
-		 * Register Immediate commands in Ring Buffer before submitting
-		 * a context."*/
-		trace_switch_mm(engine, to);
-		ret = to->ppgtt->switch_mm(to->ppgtt, req);
-		if (ret)
-			goto unpin_out;
-
-		/* Doing a PD load always reloads the page dirs */
-		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
-	}
-
-	if (engine != &dev_priv->engine[RCS]) {
-		if (from)
-			i915_gem_context_unreference(from);
-		goto done;
-	}
-
 	/*
 	 * Clear this page out of any CPU caches for coherent swap-in/out. Note
 	 * that thanks to write = false in this call and us not setting any gpu
@@ -727,55 +690,46 @@ static int do_switch(struct drm_i915_gem_request *req)
 	if (ret)
 		goto unpin_out;
 
-	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) {
-		hw_flags |= MI_RESTORE_INHIBIT;
+	if (needs_pd_load_pre(to)) {
+		/* Older GENs and non render rings still want the load first,
+		 * "PP_DCLV followed by PP_DIR_BASE register through Load
+		 * Register Immediate commands in Ring Buffer before submitting
+		 * a context."*/
+		trace_switch_mm(engine, to);
+		ret = to->ppgtt->switch_mm(to->ppgtt, req);
+		if (ret)
+			goto unpin_out;
+
+		/* Doing a PD load always reloads the page dirs */
+		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
+	}
+
+	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
 		/* NB: If we inhibit the restore, the context is not allowed to
 		 * die because future work may end up depending on valid address
 		 * space. This means we must enforce that a page table load
 		 * occur when this occurs. */
-	} else if (to->ppgtt &&
-		   (intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings)) {
-		hw_flags |= MI_FORCE_RESTORE;
-		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
-	}
+		hw_flags = MI_RESTORE_INHIBIT;
+	else if (to->ppgtt && to->ppgtt->pd_dirty_rings & (1 << RCS))
+		hw_flags = MI_FORCE_RESTORE;
+	else
+		hw_flags = 0;
 
 	/* We should never emit switch_mm more than once */
-	WARN_ON(needs_pd_load_pre(engine, to) &&
-		needs_pd_load_post(engine, to, hw_flags));
+	WARN_ON(needs_pd_load_pre(to) && needs_pd_load_post(to, hw_flags));
 
-	ret = mi_set_context(req, hw_flags);
-	if (ret)
-		goto unpin_out;
-
-	/* GEN8 does *not* require an explicit reload if the PDPs have been
-	 * setup, and we do not wish to move them.
-	 */
-	if (needs_pd_load_post(engine, to, hw_flags)) {
-		trace_switch_mm(engine, to);
-		ret = to->ppgtt->switch_mm(to->ppgtt, req);
-		/* The hardware context switch is emitted, but we haven't
-		 * actually changed the state - so it's probably safe to bail
-		 * here. Still, let the user know something dangerous has
-		 * happened.
-		 */
+	if (to != from || (hw_flags & MI_FORCE_RESTORE)) {
+		ret = mi_set_context(req, hw_flags);
 		if (ret) {
-			DRM_ERROR("Failed to change address space on context switch\n");
+			/* Force the reload of this and the previous mm */
+			if (needs_pd_load_pre(to))
+				to->ppgtt->pd_dirty_rings |= 1 << RCS;
+			if (from && needs_pd_load_pre(from))
+				from->ppgtt->pd_dirty_rings |= 1 << RCS;
 			goto unpin_out;
 		}
 	}
 
-	for (i = 0; i < MAX_L3_SLICES; i++) {
-		if (!(to->remap_slice & (1<<i)))
-			continue;
-
-		ret = i915_gem_l3_remap(req, i);
-		/* If it failed, try again next round */
-		if (ret)
-			DRM_DEBUG_DRIVER("L3 remapping failed\n");
-		else
-			to->remap_slice &= ~(1<<i);
-	}
-
 	/* The backing object for the context is done after switching to the
 	 * *next* context. Therefore we cannot retire the previous context until
 	 * the next context has already started running. In fact, the below code
@@ -798,27 +752,52 @@ static int do_switch(struct drm_i915_gem_request *req)
 		i915_gem_object_ggtt_unpin(from->legacy_hw_ctx.rcs_state);
 		i915_gem_context_unreference(from);
 	}
-
-	uninitialized = !to->legacy_hw_ctx.initialized;
-	to->legacy_hw_ctx.initialized = true;
-
-done:
 	i915_gem_context_reference(to);
 	engine->last_context = to;
 
-	if (uninitialized) {
+	/* GEN8 does *not* require an explicit reload if the PDPs have been
+	 * setup, and we do not wish to move them.
+	 */
+	if (needs_pd_load_post(to, hw_flags)) {
+		trace_switch_mm(engine, to);
+		ret = to->ppgtt->switch_mm(to->ppgtt, req);
+		/* The hardware context switch is emitted, but we haven't
+		 * actually changed the state - so it's probably safe to bail
+		 * here. Still, let the user know something dangerous has
+		 * happened.
+		 */
+		if (ret)
+			return ret;
+
+		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
+	}
+	if (hw_flags & MI_FORCE_RESTORE)
+		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
+
+	for (i = 0; i < MAX_L3_SLICES; i++) {
+		if (!(to->remap_slice & (1<<i)))
+			continue;
+
+		ret = i915_gem_l3_remap(req, i);
+		if (ret)
+			return ret;
+
+		to->remap_slice &= ~(1<<i);
+	}
+
+	if (!to->legacy_hw_ctx.initialized) {
 		if (engine->init_context) {
 			ret = engine->init_context(req);
 			if (ret)
-				DRM_ERROR("ring init context: %d\n", ret);
+				return ret;
 		}
+		to->legacy_hw_ctx.initialized = true;
 	}
 
 	return 0;
 
 unpin_out:
-	if (engine->id == RCS)
-		i915_gem_object_ggtt_unpin(to->legacy_hw_ctx.rcs_state);
+	i915_gem_object_ggtt_unpin(to->legacy_hw_ctx.rcs_state);
 	return ret;
 }
 
@@ -853,7 +832,31 @@ int i915_switch_context(struct drm_i915_gem_request *req)
 		return 0;
 	}
 
-	return do_switch(req);
+	if (engine->id != RCS) {
+		struct intel_context *from = engine->last_context;
+		struct intel_context *to = req->ctx;
+
+		if (to->ppgtt &&
+		    (from != to ||
+		     intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings)) {
+			int ret;
+
+			trace_switch_mm(engine, to);
+			ret = to->ppgtt->switch_mm(to->ppgtt, req);
+			if (ret)
+				return ret;
+
+			to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
+		}
+
+		if (from)
+			i915_gem_context_unreference(from);
+		i915_gem_context_reference(to);
+		engine->last_context = to;
+		return 0;
+	}
+
+	return do_rcs_switch(req);
 }
 
 static bool contexts_enabled(struct drm_device *dev)
-- 
2.8.0.rc3

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (12 preceding siblings ...)
  2016-04-12 20:03 ` [CI-ping 14/15] drm/i915: Reorganise legacy context switch to cope with late failure Chris Wilson
@ 2016-04-12 20:03 ` Chris Wilson
  2016-04-13  9:57     ` Daniel Vetter
  2016-04-14  8:45 ` ✗ Fi.CI.BAT: failure for series starting with [CI-ping,01/15] drm/i915: Force clean compilation with -Werror (rev3) Patchwork
  14 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-04-12 20:03 UTC (permalink / raw)
  To: intel-gfx; +Cc: Chris Wilson, Daniel Vetter, Tvrtko Ursulin, stable

Conceptually, each request is a record of a hardware transaction - we
build up a list of pending commands and then either commit them to
hardware, or cancel them. However, whilst building up the list of
pending commands, we may modify state outside of the request and make
references to the pending request. If we do so and then cancel that
request, external objects then point to the deleted request leading to
both graphical and memory corruption.

The easiest example is to consider object/VMA tracking. When we mark an
object as active in a request, we store a pointer to this, the most
recent request, in the object. Then we want to free that object, we wait
for the most recent request to be idle before proceeding (otherwise the
hardware will write to pages now owned by the system, or we will attempt
to read from those pages before the hardware is finished writing). If
the request was cancelled instead, that wait completes immediately. As a
result, all requests must be committed and not cancelled if the external
state is unknown.
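
As an illustration of the hazard (schematic, not actual code):

	obj->last_read_req = req;	/* external pointer to the request */
	/* ... building the request then fails ... */
	i915_gem_request_cancel(req);	/* the request may now be freed */
	/* ... later ... */
	i915_wait_request(obj->last_read_req);	/* use-after-free */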

All that remains of i915_gem_request_cancel() users is just a couple of
extremely unlikely allocation failures, so remove the API entirely.

A consequence of committing all incomplete requests is that we generate
excess breadcrumbs and fill the ring much more often with dummy work. We
have completely undone the outstanding_last_seqno optimisation.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_drv.h            |  2 --
 drivers/gpu/drm/i915/i915_gem.c            | 50 ++++++++++++------------------
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++------
 drivers/gpu/drm/i915/intel_display.c       |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  4 +--
 drivers/gpu/drm/i915/intel_overlay.c       |  8 ++---
 6 files changed, 30 insertions(+), 51 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 061ecc43d935..de84dd7be971 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2331,7 +2331,6 @@ struct drm_i915_gem_request {
 struct drm_i915_gem_request * __must_check
 i915_gem_request_alloc(struct intel_engine_cs *engine,
 		       struct intel_context *ctx);
-void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 void i915_gem_request_free(struct kref *req_ref);
 int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
 				   struct drm_file *file);
@@ -2883,7 +2882,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
 			     struct drm_file *file_priv);
 void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 					struct drm_i915_gem_request *req);
-void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
 int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 				   struct drm_i915_gem_execbuffer2 *args,
 				   struct list_head *vmas);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index b6879d43dd74..c6f09e7839ea 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2785,7 +2785,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
 		 * fully prepared. Thus it can be cleaned up using the proper
 		 * free code.
 		 */
-		i915_gem_request_cancel(req);
+		intel_ring_reserved_space_cancel(req->ringbuf);
+		i915_gem_request_unreference(req);
 		return ret;
 	}
 
@@ -2822,13 +2823,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 	return err ? ERR_PTR(err) : req;
 }
 
-void i915_gem_request_cancel(struct drm_i915_gem_request *req)
-{
-	intel_ring_reserved_space_cancel(req->ringbuf);
-
-	i915_gem_request_unreference(req);
-}
-
 struct drm_i915_gem_request *
 i915_gem_find_active_request(struct intel_engine_cs *engine)
 {
@@ -3438,12 +3432,9 @@ int i915_gpu_idle(struct drm_device *dev)
 				return PTR_ERR(req);
 
 			ret = i915_switch_context(req);
-			if (ret) {
-				i915_gem_request_cancel(req);
-				return ret;
-			}
-
 			i915_add_request_no_flush(req);
+			if (ret)
+				return ret;
 		}
 
 		ret = intel_engine_idle(engine);
@@ -4943,34 +4934,33 @@ i915_gem_init_hw(struct drm_device *dev)
 		req = i915_gem_request_alloc(engine, NULL);
 		if (IS_ERR(req)) {
 			ret = PTR_ERR(req);
-			i915_gem_cleanup_engines(dev);
-			goto out;
+			break;
 		}
 
 		if (engine->id == RCS) {
-			for (j = 0; j < NUM_L3_SLICES(dev); j++)
-				i915_gem_l3_remap(req, j);
+			for (j = 0; j < NUM_L3_SLICES(dev); j++) {
+				ret = i915_gem_l3_remap(req, j);
+				if (ret)
+					goto err_request;
+			}
 		}
 
 		ret = i915_ppgtt_init_ring(req);
-		if (ret && ret != -EIO) {
-			DRM_ERROR("PPGTT enable %s failed %d\n",
-				  engine->name, ret);
-			i915_gem_request_cancel(req);
-			i915_gem_cleanup_engines(dev);
-			goto out;
-		}
+		if (ret)
+			goto err_request;
 
 		ret = i915_gem_context_enable(req);
-		if (ret && ret != -EIO) {
-			DRM_ERROR("Context enable %s failed %d\n",
+		if (ret)
+			goto err_request;
+
+err_request:
+		i915_add_request_no_flush(req);
+		if (ret) {
+			DRM_ERROR("Failed to enable %s, error=%d\n",
 				  engine->name, ret);
-			i915_gem_request_cancel(req);
 			i915_gem_cleanup_engines(dev);
-			goto out;
+			break;
 		}
-
-		i915_add_request_no_flush(req);
 	}
 
 out:
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index 6ee4f00f620c..6f4f2a6cdf93 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1137,7 +1137,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
 	}
 }
 
-void
+static void
 i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
 {
 	/* Unconditionally force add_request to emit a full flush. */
@@ -1322,7 +1322,6 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
 	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, params->request);
-	i915_gem_execbuffer_retire_commands(params);
 
 	return 0;
 }
@@ -1624,7 +1623,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 
 	ret = i915_gem_request_add_to_client(req, file);
 	if (ret)
-		goto err_batch_unpin;
+		goto err_request;
 
 	/*
 	 * Save assorted stuff away to pass through to *_submission().
@@ -1641,6 +1640,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	params->request                 = req;
 
 	ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
+err_request:
+	i915_gem_execbuffer_retire_commands(params);
 
 err_batch_unpin:
 	/*
@@ -1657,14 +1658,6 @@ err:
 	i915_gem_context_unreference(ctx);
 	eb_destroy(eb);
 
-	/*
-	 * If the request was created but not successfully submitted then it
-	 * must be freed again. If it was submitted then it is being tracked
-	 * on the active request list and no clean up is required here.
-	 */
-	if (ret && !IS_ERR_OR_NULL(req))
-		i915_gem_request_cancel(req);
-
 	mutex_unlock(&dev->struct_mutex);
 
 pre_mutex_err:
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index b1b457864e17..3cae596d10a3 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11664,7 +11664,7 @@ cleanup_unpin:
 	intel_unpin_fb_obj(fb, crtc->primary->state->rotation);
 cleanup_pending:
 	if (!IS_ERR_OR_NULL(request))
-		i915_gem_request_cancel(request);
+		i915_add_request_no_flush(request);
 	atomic_dec(&intel_crtc->unpin_work_count);
 	mutex_unlock(&dev->struct_mutex);
 cleanup:
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b8f6b96472a6..6fc24deaa16a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1010,7 +1010,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
 	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
 
 	i915_gem_execbuffer_move_to_active(vmas, params->request);
-	i915_gem_execbuffer_retire_commands(params);
 
 	return 0;
 }
@@ -2679,13 +2678,12 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
 		}
 
 		ret = engine->init_context(req);
+		i915_add_request_no_flush(req);
 		if (ret) {
 			DRM_ERROR("ring init context: %d\n",
 				ret);
-			i915_gem_request_cancel(req);
 			goto error_ringbuf;
 		}
-		i915_add_request_no_flush(req);
 	}
 	return 0;
 
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index 6694e9230cd5..bcc3b6a016d8 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -247,7 +247,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
 
 	ret = intel_ring_begin(req, 4);
 	if (ret) {
-		i915_gem_request_cancel(req);
+		i915_add_request_no_flush(req);
 		return ret;
 	}
 
@@ -290,7 +290,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
 
 	ret = intel_ring_begin(req, 2);
 	if (ret) {
-		i915_gem_request_cancel(req);
+		i915_add_request_no_flush(req);
 		return ret;
 	}
 
@@ -356,7 +356,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
 
 	ret = intel_ring_begin(req, 6);
 	if (ret) {
-		i915_gem_request_cancel(req);
+		i915_add_request_no_flush(req);
 		return ret;
 	}
 
@@ -431,7 +431,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
 
 		ret = intel_ring_begin(req, 2);
 		if (ret) {
-			i915_gem_request_cancel(req);
+			i915_add_request_no_flush(req);
 			return ret;
 		}
 
-- 
2.8.0.rc3


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
  2016-04-12 20:03 ` [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching Chris Wilson
@ 2016-04-13  9:33     ` Daniel Vetter
  0 siblings, 0 replies; 39+ messages in thread
From: Daniel Vetter @ 2016-04-13  9:33 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, Daniel Vetter, Ville Syrjälä, stable

On Tue, Apr 12, 2016 at 09:03:05PM +0100, Chris Wilson wrote:
> Two concurrent writes into the same register cacheline has the chance of
> killing the machine on Ivybridge and other gen7. This includes LRI
> emitted from the command parser.  The MI_SET_CONTEXT itself serves as
> serialising barrier and prevents the pair of register writes in the first
> packet from triggering the fault.  However, if a second switch-context
> immediately occurs then we may have two adjacent blocks of LRI to the
> same registers which may then trigger the hang. To counteract this we
> need to insert a delay after the second register write using SRM.
> 
> This is easiest to reproduce with something like
> igt/gem_ctx_switch/interruptible that triggers back-to-back context
> switches (with no operations in between them in the command stream,
> which requires the execbuf operation to be interrupted after the
> MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
> interruptible igt. No reports from the wild though, so it must be of low
> enough frequency that no one has correlated the random machine freezes
> with i915.ko
> 
> The issue was introduced with
> commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Dec 16 10:02:27 2014 +0000
> 
>     drm/i915: Disable PSMI sleep messages on all rings around context switches
> 
> Testcase: igt/gem_ctx_switch/render-interruptible #ivb
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
> Cc: stable@vger.kernel.org

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index fe580cb9501a..e5ad7b21e356 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  
>  	len = 4;
>  	if (INTEL_INFO(engine->dev)->gen >= 7)
> -		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
> +		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>  
>  	ret = intel_ring_begin(req, len);
>  	if (ret)
> @@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  	if (INTEL_INFO(engine->dev)->gen >= 7) {
>  		if (num_rings) {
>  			struct intel_engine_cs *signaller;
> +			i915_reg_t last_reg = {}; /* keep gcc quiet */
>  
>  			intel_ring_emit(engine,
>  					MI_LOAD_REGISTER_IMM(num_rings));
> @@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  				if (signaller == engine)
>  					continue;
>  
> -				intel_ring_emit_reg(engine,
> -						    RING_PSMI_CTL(signaller->mmio_base));
> +				last_reg = RING_PSMI_CTL(signaller->mmio_base);
> +				intel_ring_emit_reg(engine, last_reg);
>  				intel_ring_emit(engine,
>  						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
>  			}
> +
> +			/* Insert a delay before the next switch! */
> +			intel_ring_emit(engine,
> +					MI_STORE_REGISTER_MEM |
> +					MI_SRM_LRM_GLOBAL_GTT);
> +			intel_ring_emit_reg(engine, last_reg);
> +			intel_ring_emit(engine, engine->scratch.gtt_offset);
> +			intel_ring_emit(engine, MI_NOOP);
>  		}
>  		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>  	}
> -- 
> 2.8.0.rc3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Intel-gfx] [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0
  2016-04-12 20:03 ` [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0 Chris Wilson
@ 2016-04-13  9:34   ` Daniel Vetter
  0 siblings, 0 replies; 39+ messages in thread
From: Daniel Vetter @ 2016-04-13  9:34 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx, stable

On Tue, Apr 12, 2016 at 09:03:06PM +0100, Chris Wilson wrote:
> For reasons unknown Sandybridge GT1 (at least) will eventually hang when
> it encounters a ring wraparound at offset 0. The test case that
> reproduces the bug reliably forces a large number of interrupted context
> switches, thereby causing very frequent ring wraparounds, but there are
> similar bug reports in the wild with the same symptoms, seqno writes
> stop just before the wrap and the ringbuffer at address 0. It is also
> timing crucial, but adding various delays hasn't helped pinpoint where
> the window lies.
> 
> Whether the fault is restricted to the ringbuffer itself or the GTT
> addressing is unclear, but moving the ringbuffer fixes all the hangs I
> have been able to reproduce.
> 
> References: (e.g.) https://bugs.freedesktop.org/show_bug.cgi?id=93262
> Testcase: igt/gem_exec_whisper/render-contexts-interruptible #snb-gt1
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: stable@vger.kernel.org

Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 8391382431b2..904a8a276f6a 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2096,10 +2096,12 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>  	struct drm_i915_private *dev_priv = to_i915(dev);
>  	struct i915_ggtt *ggtt = &dev_priv->ggtt;
>  	struct drm_i915_gem_object *obj = ringbuf->obj;
> +	/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
> +	unsigned flags = PIN_OFFSET_BIAS | 4096;
>  	int ret;
>  
>  	if (HAS_LLC(dev_priv) && !obj->stolen) {
> -		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, 0);
> +		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, flags);
>  		if (ret)
>  			return ret;
>  
> @@ -2113,7 +2115,8 @@ int intel_pin_and_map_ringbuffer_obj(struct drm_device *dev,
>  			goto err_unpin;
>  		}
>  	} else {
> -		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE, PIN_MAPPABLE);
> +		ret = i915_gem_obj_ggtt_pin(obj, PAGE_SIZE,
> +					    flags | PIN_MAPPABLE);
>  		if (ret)
>  			return ret;
>  
> -- 
> 2.8.0.rc3
> 
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-12 20:03 ` [CI-ping 15/15] drm/i915: Late request cancellations are harmful Chris Wilson
@ 2016-04-13  9:57     ` Daniel Vetter
  0 siblings, 0 replies; 39+ messages in thread
From: Daniel Vetter @ 2016-04-13  9:57 UTC (permalink / raw)
  To: Chris Wilson
  Cc: intel-gfx, Daniel Vetter, Tvrtko Ursulin, stable, John Harrison

On Tue, Apr 12, 2016 at 09:03:09PM +0100, Chris Wilson wrote:
> Conceptually, each request is a record of a hardware transaction - we
> build up a list of pending commands and then either commit them to
> hardware, or cancel them. However, whilst building up the list of
> pending commands, we may modify state outside of the request and make
> references to the pending request. If we do so and then cancel that
> request, external objects then point to the deleted request leading to
> both graphical and memory corruption.
> 
> The easiest example is to consider object/VMA tracking. When we mark an
> object as active in a request, we store a pointer to this, the most
> recent request, in the object. Then we want to free that object, we wait
> for the most recent request to be idle before proceeding (otherwise the
> hardware will write to pages now owned by the system, or we will attempt
> to read from those pages before the hardware is finished writing). If
> the request was cancelled instead, that wait completes immediately. As a
> result, all requests must be committed and not cancelled if the external
> state is unknown.
> 
> All that remains of i915_gem_request_cancel() users are just a couple of
> extremely unlikely allocation failures, so remove the API entirely.
> 
> A consequence of committing all incomplete requests is that we generate
> excess breadcrumbs and fill the ring much more often with dummy work. We
> have completely undone the outstanding_last_seqno optimisation.
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: stable@vger.kernel.org

Cc: John Harrison <John.C.Harrison@Intel.com>

I'd like John's ack on this on too, but patch itself looks sound. Fast r-b
since we've discussed this a while ago already ...

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  2 --
>  drivers/gpu/drm/i915/i915_gem.c            | 50 ++++++++++++------------------
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++------
>  drivers/gpu/drm/i915/intel_display.c       |  2 +-
>  drivers/gpu/drm/i915/intel_lrc.c           |  4 +--
>  drivers/gpu/drm/i915/intel_overlay.c       |  8 ++---
>  6 files changed, 30 insertions(+), 51 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 061ecc43d935..de84dd7be971 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2331,7 +2331,6 @@ struct drm_i915_gem_request {
>  struct drm_i915_gem_request * __must_check
>  i915_gem_request_alloc(struct intel_engine_cs *engine,
>  		       struct intel_context *ctx);
> -void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>  void i915_gem_request_free(struct kref *req_ref);
>  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
>  				   struct drm_file *file);
> @@ -2883,7 +2882,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
>  			     struct drm_file *file_priv);
>  void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>  					struct drm_i915_gem_request *req);
> -void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
>  int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>  				   struct drm_i915_gem_execbuffer2 *args,
>  				   struct list_head *vmas);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index b6879d43dd74..c6f09e7839ea 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -2785,7 +2785,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
>  		 * fully prepared. Thus it can be cleaned up using the proper
>  		 * free code.
>  		 */
> -		i915_gem_request_cancel(req);
> +		intel_ring_reserved_space_cancel(req->ringbuf);
> +		i915_gem_request_unreference(req);
>  		return ret;
>  	}
>  
> @@ -2822,13 +2823,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
>  	return err ? ERR_PTR(err) : req;
>  }
>  
> -void i915_gem_request_cancel(struct drm_i915_gem_request *req)
> -{
> -	intel_ring_reserved_space_cancel(req->ringbuf);
> -
> -	i915_gem_request_unreference(req);
> -}
> -
>  struct drm_i915_gem_request *
>  i915_gem_find_active_request(struct intel_engine_cs *engine)
>  {
> @@ -3438,12 +3432,9 @@ int i915_gpu_idle(struct drm_device *dev)
>  				return PTR_ERR(req);
>  
>  			ret = i915_switch_context(req);
> -			if (ret) {
> -				i915_gem_request_cancel(req);
> -				return ret;
> -			}
> -
>  			i915_add_request_no_flush(req);
> +			if (ret)
> +				return ret;
>  		}
>  
>  		ret = intel_engine_idle(engine);
> @@ -4943,34 +4934,33 @@ i915_gem_init_hw(struct drm_device *dev)
>  		req = i915_gem_request_alloc(engine, NULL);
>  		if (IS_ERR(req)) {
>  			ret = PTR_ERR(req);
> -			i915_gem_cleanup_engines(dev);
> -			goto out;
> +			break;
>  		}
>  
>  		if (engine->id == RCS) {
> -			for (j = 0; j < NUM_L3_SLICES(dev); j++)
> -				i915_gem_l3_remap(req, j);
> +			for (j = 0; j < NUM_L3_SLICES(dev); j++) {
> +				ret = i915_gem_l3_remap(req, j);
> +				if (ret)
> +					goto err_request;
> +			}
>  		}
>  
>  		ret = i915_ppgtt_init_ring(req);
> -		if (ret && ret != -EIO) {
> -			DRM_ERROR("PPGTT enable %s failed %d\n",
> -				  engine->name, ret);
> -			i915_gem_request_cancel(req);
> -			i915_gem_cleanup_engines(dev);
> -			goto out;
> -		}
> +		if (ret)
> +			goto err_request;
>  
>  		ret = i915_gem_context_enable(req);
> -		if (ret && ret != -EIO) {
> -			DRM_ERROR("Context enable %s failed %d\n",
> +		if (ret)
> +			goto err_request;
> +
> +err_request:
> +		i915_add_request_no_flush(req);
> +		if (ret) {
> +			DRM_ERROR("Failed to enable %s, error=%d\n",
>  				  engine->name, ret);
> -			i915_gem_request_cancel(req);
>  			i915_gem_cleanup_engines(dev);
> -			goto out;
> +			break;
>  		}
> -
> -		i915_add_request_no_flush(req);
>  	}
>  
>  out:
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index 6ee4f00f620c..6f4f2a6cdf93 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1137,7 +1137,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>  	}
>  }
>  
> -void
> +static void
>  i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
>  {
>  	/* Unconditionally force add_request to emit a full flush. */
> @@ -1322,7 +1322,6 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>  	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>  
>  	i915_gem_execbuffer_move_to_active(vmas, params->request);
> -	i915_gem_execbuffer_retire_commands(params);
>  
>  	return 0;
>  }
> @@ -1624,7 +1623,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  
>  	ret = i915_gem_request_add_to_client(req, file);
>  	if (ret)
> -		goto err_batch_unpin;
> +		goto err_request;
>  
>  	/*
>  	 * Save assorted stuff away to pass through to *_submission().
> @@ -1641,6 +1640,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>  	params->request                 = req;
>  
>  	ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
> +err_request:
> +	i915_gem_execbuffer_retire_commands(params);
>  
>  err_batch_unpin:
>  	/*
> @@ -1657,14 +1658,6 @@ err:
>  	i915_gem_context_unreference(ctx);
>  	eb_destroy(eb);
>  
> -	/*
> -	 * If the request was created but not successfully submitted then it
> -	 * must be freed again. If it was submitted then it is being tracked
> -	 * on the active request list and no clean up is required here.
> -	 */
> -	if (ret && !IS_ERR_OR_NULL(req))
> -		i915_gem_request_cancel(req);
> -
>  	mutex_unlock(&dev->struct_mutex);
>  
>  pre_mutex_err:
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index b1b457864e17..3cae596d10a3 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11664,7 +11664,7 @@ cleanup_unpin:
>  	intel_unpin_fb_obj(fb, crtc->primary->state->rotation);
>  cleanup_pending:
>  	if (!IS_ERR_OR_NULL(request))
> -		i915_gem_request_cancel(request);
> +		i915_add_request_no_flush(request);
>  	atomic_dec(&intel_crtc->unpin_work_count);
>  	mutex_unlock(&dev->struct_mutex);
>  cleanup:
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index b8f6b96472a6..6fc24deaa16a 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1010,7 +1010,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>  	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>  
>  	i915_gem_execbuffer_move_to_active(vmas, params->request);
> -	i915_gem_execbuffer_retire_commands(params);
>  
>  	return 0;
>  }
> @@ -2679,13 +2678,12 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
>  		}
>  
>  		ret = engine->init_context(req);
> +		i915_add_request_no_flush(req);
>  		if (ret) {
>  			DRM_ERROR("ring init context: %d\n",
>  				ret);
> -			i915_gem_request_cancel(req);
>  			goto error_ringbuf;
>  		}
> -		i915_add_request_no_flush(req);
>  	}
>  	return 0;
>  
> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
> index 6694e9230cd5..bcc3b6a016d8 100644
> --- a/drivers/gpu/drm/i915/intel_overlay.c
> +++ b/drivers/gpu/drm/i915/intel_overlay.c
> @@ -247,7 +247,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
>  
>  	ret = intel_ring_begin(req, 4);
>  	if (ret) {
> -		i915_gem_request_cancel(req);
> +		i915_add_request_no_flush(req);
>  		return ret;
>  	}
>  
> @@ -290,7 +290,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
>  
>  	ret = intel_ring_begin(req, 2);
>  	if (ret) {
> -		i915_gem_request_cancel(req);
> +		i915_add_request_no_flush(req);
>  		return ret;
>  	}
>  
> @@ -356,7 +356,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
>  
>  	ret = intel_ring_begin(req, 6);
>  	if (ret) {
> -		i915_gem_request_cancel(req);
> +		i915_add_request_no_flush(req);
>  		return ret;
>  	}
>  
> @@ -431,7 +431,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
>  
>  		ret = intel_ring_begin(req, 2);
>  		if (ret) {
> -			i915_gem_request_cancel(req);
> +			i915_add_request_no_flush(req);
>  			return ret;
>  		}
>  
> -- 
> 2.8.0.rc3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [CI-ping 14/15] drm/i915: Reorganise legacy context switch to cope with late failure
  2016-04-12 20:03 ` [CI-ping 14/15] drm/i915: Reorganise legacy context switch to cope with late failure Chris Wilson
@ 2016-04-13  9:59   ` Daniel Vetter
  2016-04-13 10:05     ` Chris Wilson
  0 siblings, 1 reply; 39+ messages in thread
From: Daniel Vetter @ 2016-04-13  9:59 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

On Tue, Apr 12, 2016 at 09:03:08PM +0100, Chris Wilson wrote:
> After mi_set_context() succeeds, we need to update the state of the
> engine's last_context. This ensures that we hold a pin on the context
> whilst the hardware may write to it. However, since we didn't complete
> the post-switch setup of the context, we need to force the subsequent
> use of the same context to complete the setup (which means updating
> should_skip_switch()).
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>

Specializing do_rcs_switch makes sense, but shouldn't be intermingled with
the bugfix. Assuming I'm reading things correctly, the real bugfix is to
add the check for hw_ctx.initialized to should_skip_switch()? If so,
please split into 2 patches - first to add just that check, 2nd to do the
RCS specialization.
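
Concretely, I'd expect the standalone fix to be no more than adding that
check (an untested sketch, reusing the names from the quoted diff below):

	static inline bool should_skip_switch(struct intel_engine_cs *engine,
					      struct intel_context *from,
					      struct intel_context *to)
	{
		if (to->remap_slice)
			return false;

		/* An interrupted switch leaves initialized unset, forcing
		 * the next request on this context to redo the setup. */
		if (!to->legacy_hw_ctx.initialized)
			return false;

		if (to->ppgtt && from == to &&
		    !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
			return true;

		return false;
	}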
-Daniel


> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 215 ++++++++++++++++----------------
>  1 file changed, 109 insertions(+), 106 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index e5ad7b21e356..361959b1d53a 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -609,50 +609,40 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  	return ret;
>  }
>  
> -static inline bool should_skip_switch(struct intel_engine_cs *engine,
> -				      struct intel_context *from,
> +static inline bool should_skip_switch(struct intel_context *from,
>  				      struct intel_context *to)
>  {
>  	if (to->remap_slice)
>  		return false;
>  
> -	if (to->ppgtt && from == to &&
> -	    !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
> -		return true;
> +	if (!to->legacy_hw_ctx.initialized)
> +		return false;
>  
> -	return false;
> +	if (to->ppgtt && to->ppgtt->pd_dirty_rings & (1 << RCS))
> +		return false;
> +
> +	return to == from;
>  }
>  
>  static bool
> -needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
> +needs_pd_load_pre(struct intel_context *to)
>  {
> -	struct drm_i915_private *dev_priv = engine->dev->dev_private;
> -
>  	if (!to->ppgtt)
>  		return false;
>  
> -	if (INTEL_INFO(engine->dev)->gen < 8)
> -		return true;
> -
> -	if (engine != &dev_priv->engine[RCS])
> +	if (INTEL_INFO(to->i915)->gen < 8)
>  		return true;
>  
>  	return false;
>  }
>  
>  static bool
> -needs_pd_load_post(struct intel_engine_cs *engine, struct intel_context *to,
> -		   u32 hw_flags)
> +needs_pd_load_post(struct intel_context *to, u32 hw_flags)
>  {
> -	struct drm_i915_private *dev_priv = engine->dev->dev_private;
> -
>  	if (!to->ppgtt)
>  		return false;
>  
> -	if (!IS_GEN8(engine->dev))
> -		return false;
> -
> -	if (engine != &dev_priv->engine[RCS])
> +	if (!IS_GEN8(to->i915))
>  		return false;
>  
>  	if (hw_flags & MI_RESTORE_INHIBIT)
> @@ -661,60 +651,33 @@ needs_pd_load_post(struct intel_engine_cs *engine, struct intel_context *to,
>  	return false;
>  }
>  
> -static int do_switch(struct drm_i915_gem_request *req)
> +static int do_rcs_switch(struct drm_i915_gem_request *req)
>  {
>  	struct intel_context *to = req->ctx;
>  	struct intel_engine_cs *engine = req->engine;
> -	struct drm_i915_private *dev_priv = req->i915;
> -	struct intel_context *from = engine->last_context;
> -	u32 hw_flags = 0;
> -	bool uninitialized = false;
> +	struct intel_context *from;
> +	u32 hw_flags;
>  	int ret, i;
>  
> -	if (from != NULL && engine == &dev_priv->engine[RCS]) {
> -		BUG_ON(from->legacy_hw_ctx.rcs_state == NULL);
> -		BUG_ON(!i915_gem_obj_is_pinned(from->legacy_hw_ctx.rcs_state));
> -	}
> -
> -	if (should_skip_switch(engine, from, to))
> +	if (should_skip_switch(engine->last_context, to))
>  		return 0;
>  
>  	/* Trying to pin first makes error handling easier. */
> -	if (engine == &dev_priv->engine[RCS]) {
> -		ret = i915_gem_obj_ggtt_pin(to->legacy_hw_ctx.rcs_state,
> -					    get_context_alignment(engine->dev),
> -					    0);
> -		if (ret)
> -			return ret;
> -	}
> +	ret = i915_gem_obj_ggtt_pin(to->legacy_hw_ctx.rcs_state,
> +				    get_context_alignment(engine->dev),
> +				    0);
> +	if (ret)
> +		return ret;
>  
>  	/*
>  	 * Pin can switch back to the default context if we end up calling into
>  	 * evict_everything - as a last ditch gtt defrag effort that also
>  	 * switches to the default context. Hence we need to reload from here.
> +	 *
> +	 * XXX: Doing so is painfully broken!
>  	 */
>  	from = engine->last_context;
>  
> -	if (needs_pd_load_pre(engine, to)) {
> -		/* Older GENs and non render rings still want the load first,
> -		 * "PP_DCLV followed by PP_DIR_BASE register through Load
> -		 * Register Immediate commands in Ring Buffer before submitting
> -		 * a context."*/
> -		trace_switch_mm(engine, to);
> -		ret = to->ppgtt->switch_mm(to->ppgtt, req);
> -		if (ret)
> -			goto unpin_out;
> -
> -		/* Doing a PD load always reloads the page dirs */
> -		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
> -	}
> -
> -	if (engine != &dev_priv->engine[RCS]) {
> -		if (from)
> -			i915_gem_context_unreference(from);
> -		goto done;
> -	}
> -
>  	/*
>  	 * Clear this page out of any CPU caches for coherent swap-in/out. Note
>  	 * that thanks to write = false in this call and us not setting any gpu
> @@ -727,55 +690,46 @@ static int do_switch(struct drm_i915_gem_request *req)
>  	if (ret)
>  		goto unpin_out;
>  
> -	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) {
> -		hw_flags |= MI_RESTORE_INHIBIT;
> +	if (needs_pd_load_pre(to)) {
> +		/* Older GENs and non render rings still want the load first,
> +		 * "PP_DCLV followed by PP_DIR_BASE register through Load
> +		 * Register Immediate commands in Ring Buffer before submitting
> +		 * a context."*/
> +		trace_switch_mm(engine, to);
> +		ret = to->ppgtt->switch_mm(to->ppgtt, req);
> +		if (ret)
> +			goto unpin_out;
> +
> +		/* Doing a PD load always reloads the page dirs */
> +		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
> +	}
> +
> +	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
>  		/* NB: If we inhibit the restore, the context is not allowed to
>  		 * die because future work may end up depending on valid address
>  		 * space. This means we must enforce that a page table load
>  		 * occur when this occurs. */
> -	} else if (to->ppgtt &&
> -		   (intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings)) {
> -		hw_flags |= MI_FORCE_RESTORE;
> -		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
> -	}
> +		hw_flags = MI_RESTORE_INHIBIT;
> +	else if (to->ppgtt && to->ppgtt->pd_dirty_rings & (1 << RCS))
> +		hw_flags = MI_FORCE_RESTORE;
> +	else
> +		hw_flags = 0;
>  
>  	/* We should never emit switch_mm more than once */
> -	WARN_ON(needs_pd_load_pre(engine, to) &&
> -		needs_pd_load_post(engine, to, hw_flags));
> +	WARN_ON(needs_pd_load_pre(to) && needs_pd_load_post(to, hw_flags));
>  
> -	ret = mi_set_context(req, hw_flags);
> -	if (ret)
> -		goto unpin_out;
> -
> -	/* GEN8 does *not* require an explicit reload if the PDPs have been
> -	 * setup, and we do not wish to move them.
> -	 */
> -	if (needs_pd_load_post(engine, to, hw_flags)) {
> -		trace_switch_mm(engine, to);
> -		ret = to->ppgtt->switch_mm(to->ppgtt, req);
> -		/* The hardware context switch is emitted, but we haven't
> -		 * actually changed the state - so it's probably safe to bail
> -		 * here. Still, let the user know something dangerous has
> -		 * happened.
> -		 */
> +	if (to != from || (hw_flags & MI_FORCE_RESTORE)) {
> +		ret = mi_set_context(req, hw_flags);
>  		if (ret) {
> -			DRM_ERROR("Failed to change address space on context switch\n");
> +			/* Force the reload of this and the previous mm */
> +			if (needs_pd_load_pre(to))
> +				to->ppgtt->pd_dirty_rings |= 1 << RCS;
> +			if (from && needs_pd_load_pre(from))
> +				from->ppgtt->pd_dirty_rings |= 1 << RCS;
>  			goto unpin_out;
>  		}
>  	}
>  
> -	for (i = 0; i < MAX_L3_SLICES; i++) {
> -		if (!(to->remap_slice & (1<<i)))
> -			continue;
> -
> -		ret = i915_gem_l3_remap(req, i);
> -		/* If it failed, try again next round */
> -		if (ret)
> -			DRM_DEBUG_DRIVER("L3 remapping failed\n");
> -		else
> -			to->remap_slice &= ~(1<<i);
> -	}
> -
>  	/* The backing object for the context is done after switching to the
>  	 * *next* context. Therefore we cannot retire the previous context until
>  	 * the next context has already started running. In fact, the below code
> @@ -798,27 +752,52 @@ static int do_switch(struct drm_i915_gem_request *req)
>  		i915_gem_object_ggtt_unpin(from->legacy_hw_ctx.rcs_state);
>  		i915_gem_context_unreference(from);
>  	}
> -
> -	uninitialized = !to->legacy_hw_ctx.initialized;
> -	to->legacy_hw_ctx.initialized = true;
> -
> -done:
>  	i915_gem_context_reference(to);
>  	engine->last_context = to;
>  
> -	if (uninitialized) {
> +	/* GEN8 does *not* require an explicit reload if the PDPs have been
> +	 * setup, and we do not wish to move them.
> +	 */
> +	if (needs_pd_load_post(to, hw_flags)) {
> +		trace_switch_mm(engine, to);
> +		ret = to->ppgtt->switch_mm(to->ppgtt, req);
> +		/* The hardware context switch is emitted, but we haven't
> +		 * actually changed the state - so it's probably safe to bail
> +		 * here. Still, let the user know something dangerous has
> +		 * happened.
> +		 */
> +		if (ret)
> +			return ret;
> +
> +		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
> +	}
> +	if (hw_flags & MI_FORCE_RESTORE)
> +		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
> +
> +	for (i = 0; i < MAX_L3_SLICES; i++) {
> +		if (!(to->remap_slice & (1<<i)))
> +			continue;
> +
> +		ret = i915_gem_l3_remap(req, i);
> +		if (ret)
> +			return ret;
> +
> +		to->remap_slice &= ~(1<<i);
> +	}
> +
> +	if (!to->legacy_hw_ctx.initialized) {
>  		if (engine->init_context) {
>  			ret = engine->init_context(req);
>  			if (ret)
> -				DRM_ERROR("ring init context: %d\n", ret);
> +				return ret;
>  		}
> +		to->legacy_hw_ctx.initialized = true;
>  	}
>  
>  	return 0;
>  
>  unpin_out:
> -	if (engine->id == RCS)
> -		i915_gem_object_ggtt_unpin(to->legacy_hw_ctx.rcs_state);
> +	i915_gem_object_ggtt_unpin(to->legacy_hw_ctx.rcs_state);
>  	return ret;
>  }
>  
> @@ -853,7 +832,31 @@ int i915_switch_context(struct drm_i915_gem_request *req)
>  		return 0;
>  	}
>  
> -	return do_switch(req);
> +	if (engine->id != RCS) {
> +		struct intel_context *from = engine->last_context;
> +		struct intel_context *to = req->ctx;
> +
> +		if (to->ppgtt &&
> +		    (from != to ||
> +		     intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings)) {
> +			int ret;
> +
> +			trace_switch_mm(engine, to);
> +			ret = to->ppgtt->switch_mm(to->ppgtt, req);
> +			if (ret)
> +				return ret;
> +
> +			to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
> +		}
> +
> +		if (from)
> +			i915_gem_context_unreference(from);
> +		i915_gem_context_reference(to);
> +		engine->last_context = to;
> +		return 0;
> +	}
> +
> +	return do_rcs_switch(req);
>  }
>  
>  static bool contexts_enabled(struct drm_device *dev)
> -- 
> 2.8.0.rc3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [CI-ping 14/15] drm/i915: Reorganise legacy context switch to cope with late failure
  2016-04-13  9:59   ` Daniel Vetter
@ 2016-04-13 10:05     ` Chris Wilson
  2016-04-13 14:16       ` [PATCH 1/2] drm/i915: Split out !RCS legacy context switching Chris Wilson
  0 siblings, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-04-13 10:05 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On Wed, Apr 13, 2016 at 11:59:06AM +0200, Daniel Vetter wrote:
> On Tue, Apr 12, 2016 at 09:03:08PM +0100, Chris Wilson wrote:
> > After mi_set_context() succeeds, we need to update the state of the
> > engine's last_context. This ensures that we hold a pin on the context
> > whilst the hardware may write to it. However, since we didn't complete
> > the post-switch setup of the context, we need to force the subsequent
> > use of the same context to complete the setup (which means updating
> > should_skip_switch()).
> > 
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> 
> Specializing do_rcs_switch makes sense, but shouldn't be intermingled with
> the bugfix. Assuming I'm reading things correctly, the real bugfix is to
> add the check for hw_ctx.initialized to should_skip_switch()? If so,
> please split into 2 patches - first to add just that check, 2nd to do the
> RCS specialization.

Not quite. That is a critical step, but we also have to set the
engine->last_context earlier. That gets quite muddled with threading the
!rcs path through it. I felt that separating them and making it clear
what was happening to each was much easier to understand. The proviso,
as always, is that we duplicate some code (later patches will ease some
of that, but nothing can eliminate the fact that RCS is special).
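
The shape I'm aiming for is roughly this (a sketch only;
do_post_switch_setup() is a stand-in for the PD reload, L3 remapping and
init_context steps, not a real function):

	ret = mi_set_context(req, hw_flags);
	if (ret)
		goto unpin_out;

	/* The hardware may now write to the context, so it must stay
	 * pinned: make it the engine's last_context before anything
	 * else can fail. */
	i915_gem_context_reference(to);
	engine->last_context = to;

	ret = do_post_switch_setup(req);
	if (ret) {
		/* legacy_hw_ctx.initialized is left unset, so the next
		 * request on this context retries the setup. */
		return ret;
	}

	to->legacy_hw_ctx.initialized = true;
	return 0;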
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 39+ messages in thread

* [PATCH 1/2] drm/i915: Split out !RCS legacy context switching
  2016-04-13 10:05     ` Chris Wilson
@ 2016-04-13 14:16       ` Chris Wilson
  2016-04-13 14:16         ` [PATCH 2/2] drm/i915: Reorganise legacy context switch to cope with late failure Chris Wilson
  2016-04-13 14:56         ` [PATCH 1/2] drm/i915: Split out !RCS legacy context switching Daniel Vetter
  0 siblings, 2 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-13 14:16 UTC (permalink / raw)
  To: intel-gfx

Having the !RCS legacy context switch threaded through the RCS switching
code makes it much harder to follow and understand. In the next patch, I
want to fix a bug handling the incomplete switch; this is made much
simpler if we segregate the two paths now.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 74 +++++++++++++++++----------------
 1 file changed, 38 insertions(+), 36 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 453b655e86fc..c027240aacf3 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -609,9 +609,9 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 	return ret;
 }
 
-static inline bool should_skip_switch(struct intel_engine_cs *engine,
-				      struct intel_context *from,
-				      struct intel_context *to)
+static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
+				   struct intel_context *from,
+				   struct intel_context *to)
 {
 	if (to->remap_slice)
 		return false;
@@ -626,15 +626,17 @@ static inline bool should_skip_switch(struct intel_engine_cs *engine,
 static bool
 needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
 {
-	struct drm_i915_private *dev_priv = engine->dev->dev_private;
-
 	if (!to->ppgtt)
 		return false;
 
-	if (INTEL_INFO(engine->dev)->gen < 8)
+	if (engine->last_context == to &&
+	    !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
+		return false;
+
+	if (engine->id != RCS)
 		return true;
 
-	if (engine != &dev_priv->engine[RCS])
+	if (INTEL_INFO(engine->dev)->gen < 8)
 		return true;
 
 	return false;
@@ -661,32 +663,24 @@ needs_pd_load_post(struct intel_engine_cs *engine, struct intel_context *to,
 	return false;
 }
 
-static int do_switch(struct drm_i915_gem_request *req)
+static int do_rcs_switch(struct drm_i915_gem_request *req)
 {
 	struct intel_context *to = req->ctx;
 	struct intel_engine_cs *engine = req->engine;
-	struct drm_i915_private *dev_priv = req->i915;
 	struct intel_context *from = engine->last_context;
 	u32 hw_flags = 0;
 	bool uninitialized = false;
 	int ret, i;
 
-	if (from != NULL && engine == &dev_priv->engine[RCS]) {
-		BUG_ON(from->legacy_hw_ctx.rcs_state == NULL);
-		BUG_ON(!i915_gem_obj_is_pinned(from->legacy_hw_ctx.rcs_state));
-	}
-
-	if (should_skip_switch(engine, from, to))
+	if (skip_rcs_switch(engine, from, to))
 		return 0;
 
 	/* Trying to pin first makes error handling easier. */
-	if (engine == &dev_priv->engine[RCS]) {
-		ret = i915_gem_obj_ggtt_pin(to->legacy_hw_ctx.rcs_state,
-					    get_context_alignment(engine->dev),
-					    0);
-		if (ret)
-			return ret;
-	}
+	ret = i915_gem_obj_ggtt_pin(to->legacy_hw_ctx.rcs_state,
+				    get_context_alignment(engine->dev),
+				    0);
+	if (ret)
+		return ret;
 
 	/*
 	 * Pin can switch back to the default context if we end up calling into
@@ -709,12 +703,6 @@ static int do_switch(struct drm_i915_gem_request *req)
 		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
 	}
 
-	if (engine != &dev_priv->engine[RCS]) {
-		if (from)
-			i915_gem_context_unreference(from);
-		goto done;
-	}
-
 	/*
 	 * Clear this page out of any CPU caches for coherent swap-in/out. Note
 	 * that thanks to write = false in this call and us not setting any gpu
@@ -802,7 +790,6 @@ static int do_switch(struct drm_i915_gem_request *req)
 	uninitialized = !to->legacy_hw_ctx.initialized;
 	to->legacy_hw_ctx.initialized = true;
 
-done:
 	i915_gem_context_reference(to);
 	engine->last_context = to;
 
@@ -817,8 +804,7 @@ done:
 	return 0;
 
 unpin_out:
-	if (engine->id == RCS)
-		i915_gem_object_ggtt_unpin(to->legacy_hw_ctx.rcs_state);
+	i915_gem_object_ggtt_unpin(to->legacy_hw_ctx.rcs_state);
 	return ret;
 }
 
@@ -843,17 +829,33 @@ int i915_switch_context(struct drm_i915_gem_request *req)
 	WARN_ON(i915.enable_execlists);
 	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
 
-	if (req->ctx->legacy_hw_ctx.rcs_state == NULL) { /* We have the fake context */
-		if (req->ctx != engine->last_context) {
-			i915_gem_context_reference(req->ctx);
+	if (engine->id != RCS ||
+	    req->ctx->legacy_hw_ctx.rcs_state == NULL) {
+		struct intel_context *to = req->ctx;
+
+		if (needs_pd_load_pre(engine, to)) {
+			int ret;
+
+			trace_switch_mm(engine, to);
+			ret = to->ppgtt->switch_mm(to->ppgtt, req);
+			if (ret)
+				return ret;
+
+			/* Doing a PD load always reloads the page dirs */
+			to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
+		}
+
+		if (to != engine->last_context) {
+			i915_gem_context_reference(to);
 			if (engine->last_context)
 				i915_gem_context_unreference(engine->last_context);
-			engine->last_context = req->ctx;
+			engine->last_context = to;
 		}
+
 		return 0;
 	}
 
-	return do_switch(req);
+	return do_rcs_switch(req);
 }
 
 static bool contexts_enabled(struct drm_device *dev)
-- 
2.8.0.rc3


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* [PATCH 2/2] drm/i915: Reorganise legacy context switch to cope with late failure
  2016-04-13 14:16       ` [PATCH 1/2] drm/i915: Split out !RCS legacy context switching Chris Wilson
@ 2016-04-13 14:16         ` Chris Wilson
  2016-04-13 15:05           ` Daniel Vetter
  2016-04-13 14:56         ` [PATCH 1/2] drm/i915: Split out !RCS legacy context switching Daniel Vetter
  1 sibling, 1 reply; 39+ messages in thread
From: Chris Wilson @ 2016-04-13 14:16 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter

After mi_set_context() succeeds, we need to update the state of the
engine's last_context. This ensures that we hold a pin on the context
whilst the hardware may write to it. However, since we didn't complete
the post-switch setup of the context, we need to force the subsequent
use of the same context to complete the setup (which means updating
should_skip_switch()).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_gem_context.c | 147 ++++++++++++++++----------------
 1 file changed, 74 insertions(+), 73 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index c027240aacf3..702780d9af57 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -609,18 +609,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
 	return ret;
 }
 
-static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
-				   struct intel_context *from,
+static inline bool skip_rcs_switch(struct intel_context *from,
 				   struct intel_context *to)
 {
 	if (to->remap_slice)
 		return false;
 
-	if (to->ppgtt && from == to &&
-	    !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
-		return true;
+	if (!to->legacy_hw_ctx.initialized)
+		return false;
 
-	return false;
+	if (to->ppgtt && to->ppgtt->pd_dirty_rings & (1 << RCS))
+		return false;
+
+	return to == from;
 }
 
 static bool
@@ -643,18 +644,12 @@ needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
 }
 
 static bool
-needs_pd_load_post(struct intel_engine_cs *engine, struct intel_context *to,
-		   u32 hw_flags)
+needs_pd_load_post(struct intel_context *to, u32 hw_flags)
 {
-	struct drm_i915_private *dev_priv = engine->dev->dev_private;
-
 	if (!to->ppgtt)
 		return false;
 
-	if (!IS_GEN8(engine->dev))
-		return false;
-
-	if (engine != &dev_priv->engine[RCS])
+	if (!IS_GEN8(to->i915))
 		return false;
 
 	if (hw_flags & MI_RESTORE_INHIBIT)
@@ -667,12 +662,11 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 {
 	struct intel_context *to = req->ctx;
 	struct intel_engine_cs *engine = req->engine;
-	struct intel_context *from = engine->last_context;
-	u32 hw_flags = 0;
-	bool uninitialized = false;
+	struct intel_context *from;
+	u32 hw_flags;
 	int ret, i;
 
-	if (skip_rcs_switch(engine, from, to))
+	if (skip_rcs_switch(engine->last_context, to))
 		return 0;
 
 	/* Trying to pin first makes error handling easier. */
@@ -686,9 +680,23 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 	 * Pin can switch back to the default context if we end up calling into
 	 * evict_everything - as a last ditch gtt defrag effort that also
 	 * switches to the default context. Hence we need to reload from here.
+	 *
+	 * XXX: Doing so is painfully broken!
 	 */
 	from = engine->last_context;
 
+	/*
+	 * Clear this page out of any CPU caches for coherent swap-in/out. Note
+	 * that thanks to write = false in this call and us not setting any gpu
+	 * write domains when putting a context object onto the active list
+	 * (when switching away from it), this won't block.
+	 *
+	 * XXX: We need a real interface to do this instead of trickery.
+	 */
+	ret = i915_gem_object_set_to_gtt_domain(to->legacy_hw_ctx.rcs_state, false);
+	if (ret)
+		goto unpin_out;
+
 	if (needs_pd_load_pre(engine, to)) {
 		/* Older GENs and non render rings still want the load first,
 		 * "PP_DCLV followed by PP_DIR_BASE register through Load
@@ -700,70 +708,36 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 			goto unpin_out;
 
 		/* Doing a PD load always reloads the page dirs */
-		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
+		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
 	}
 
-	/*
-	 * Clear this page out of any CPU caches for coherent swap-in/out. Note
-	 * that thanks to write = false in this call and us not setting any gpu
-	 * write domains when putting a context object onto the active list
-	 * (when switching away from it), this won't block.
-	 *
-	 * XXX: We need a real interface to do this instead of trickery.
-	 */
-	ret = i915_gem_object_set_to_gtt_domain(to->legacy_hw_ctx.rcs_state, false);
-	if (ret)
-		goto unpin_out;
-
-	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) {
-		hw_flags |= MI_RESTORE_INHIBIT;
+	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
 		/* NB: If we inhibit the restore, the context is not allowed to
 		 * die because future work may end up depending on valid address
 		 * space. This means we must enforce that a page table load
 		 * occur when this occurs. */
-	} else if (to->ppgtt &&
-		   (intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings)) {
-		hw_flags |= MI_FORCE_RESTORE;
-		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
-	}
+		hw_flags = MI_RESTORE_INHIBIT;
+	else if (to->ppgtt && to->ppgtt->pd_dirty_rings & (1 << RCS))
+		hw_flags = MI_FORCE_RESTORE;
+	else
+		hw_flags = 0;
 
 	/* We should never emit switch_mm more than once */
 	WARN_ON(needs_pd_load_pre(engine, to) &&
-		needs_pd_load_post(engine, to, hw_flags));
-
-	ret = mi_set_context(req, hw_flags);
-	if (ret)
-		goto unpin_out;
+		needs_pd_load_post(to, hw_flags));
 
-	/* GEN8 does *not* require an explicit reload if the PDPs have been
-	 * setup, and we do not wish to move them.
-	 */
-	if (needs_pd_load_post(engine, to, hw_flags)) {
-		trace_switch_mm(engine, to);
-		ret = to->ppgtt->switch_mm(to->ppgtt, req);
-		/* The hardware context switch is emitted, but we haven't
-		 * actually changed the state - so it's probably safe to bail
-		 * here. Still, let the user know something dangerous has
-		 * happened.
-		 */
+	if (to != from || (hw_flags & MI_FORCE_RESTORE)) {
+		ret = mi_set_context(req, hw_flags);
 		if (ret) {
-			DRM_ERROR("Failed to change address space on context switch\n");
+			/* Force the reload of this and the previous mm */
+			if (needs_pd_load_pre(engine, to))
+				to->ppgtt->pd_dirty_rings |= 1 << RCS;
+			if (from && needs_pd_load_pre(engine, from))
+				from->ppgtt->pd_dirty_rings |= 1 << RCS;
 			goto unpin_out;
 		}
 	}
 
-	for (i = 0; i < MAX_L3_SLICES; i++) {
-		if (!(to->remap_slice & (1<<i)))
-			continue;
-
-		ret = i915_gem_l3_remap(req, i);
-		/* If it failed, try again next round */
-		if (ret)
-			DRM_DEBUG_DRIVER("L3 remapping failed\n");
-		else
-			to->remap_slice &= ~(1<<i);
-	}
-
 	/* The backing object for the context is done after switching to the
 	 * *next* context. Therefore we cannot retire the previous context until
 	 * the next context has already started running. In fact, the below code
@@ -786,19 +760,46 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
 		i915_gem_object_ggtt_unpin(from->legacy_hw_ctx.rcs_state);
 		i915_gem_context_unreference(from);
 	}
-
-	uninitialized = !to->legacy_hw_ctx.initialized;
-	to->legacy_hw_ctx.initialized = true;
-
 	i915_gem_context_reference(to);
 	engine->last_context = to;
 
-	if (uninitialized) {
+	/* GEN8 does *not* require an explicit reload if the PDPs have been
+	 * setup, and we do not wish to move them.
+	 */
+	if (needs_pd_load_post(to, hw_flags)) {
+		trace_switch_mm(engine, to);
+		ret = to->ppgtt->switch_mm(to->ppgtt, req);
+		/* The hardware context switch is emitted, but we haven't
+		 * actually changed the state - so it's probably safe to bail
+		 * here. Still, let the user know something dangerous has
+		 * happened.
+		 */
+		if (ret)
+			return ret;
+
+		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
+	}
+	if (hw_flags & MI_FORCE_RESTORE)
+		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
+
+	for (i = 0; i < MAX_L3_SLICES; i++) {
+		if (!(to->remap_slice & (1<<i)))
+			continue;
+
+		ret = i915_gem_l3_remap(req, i);
+		if (ret)
+			return ret;
+
+		to->remap_slice &= ~(1<<i);
+	}
+
+	if (!to->legacy_hw_ctx.initialized) {
 		if (engine->init_context) {
 			ret = engine->init_context(req);
 			if (ret)
-				DRM_ERROR("ring init context: %d\n", ret);
+				return ret;
 		}
+		to->legacy_hw_ctx.initialized = true;
 	}
 
 	return 0;
-- 
2.8.0.rc3


^ permalink raw reply related	[flat|nested] 39+ messages in thread

* Re: [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-13  9:57     ` Daniel Vetter
@ 2016-04-13 14:21     ` John Harrison
  2016-04-19 12:35       ` Dave Gordon
  -1 siblings, 1 reply; 39+ messages in thread
From: John Harrison @ 2016-04-13 14:21 UTC (permalink / raw)
  To: Daniel Vetter, Chris Wilson
  Cc: intel-gfx, Daniel Vetter, Tvrtko Ursulin, stable

On 13/04/2016 10:57, Daniel Vetter wrote:
> On Tue, Apr 12, 2016 at 09:03:09PM +0100, Chris Wilson wrote:
>> Conceptually, each request is a record of a hardware transaction - we
>> build up a list of pending commands and then either commit them to
>> hardware, or cancel them. However, whilst building up the list of
>> pending commands, we may modify state outside of the request and make
>> references to the pending request. If we do so and then cancel that
>> request, external objects then point to the deleted request leading to
>> both graphical and memory corruption.
>>
>> The easiest example is to consider object/VMA tracking. When we mark an
>> object as active in a request, we store a pointer to this, the most
>> recent request, in the object. Then we want to free that object, we wait
>> for the most recent request to be idle before proceeding (otherwise the
>> hardware will write to pages now owned by the system, or we will attempt
>> to read from those pages before the hardware is finished writing). If
>> the request was cancelled instead, that wait completes immediately. As a
>> result, all requests must be committed and not cancelled if the external
>> state is unknown.
>>
>> All that remains of i915_gem_request_cancel() users are just a couple of
>> extremely unlikely allocation failures, so remove the API entirely.
>>
>> A consequence of committing all incomplete requests is that we generate
>> excess breadcrumbs and fill the ring much more often with dummy work. We
>> have completely undone the outstanding_last_seqno optimisation.
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> Cc: stable@vger.kernel.org
> Cc: John Harrison <John.C.Harrison@Intel.com>
>
> I'd like John's ack on this one too, but the patch itself looks sound. Fast r-b
> since we've discussed this a while ago already ...

I think this is going to cause a problem with the scheduler. You are 
effectively saying that the execbuf call cannot fail beyond the point of 
allocating a request. If it gets that far then it must go all the way 
and submit the request to the hardware. With a scheduler, that means 
adding it to the scheduler's queues and tracking it all the way through 
the system to completion. If nothing else, that sounds like a lot of 
extra overhead for no actual work. Or worse, if the failure is at a point
where the request cannot be sent further through the system because
something critical failed, then you are really stuffed.

I'm not sure what the other option would be though, short of being able 
to undo the last read/write object tracking updates.
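
To make the constraint concrete, every submission path effectively
becomes the following (a sketch; emit_commands() is a made-up
placeholder for the real emission steps):

	req = i915_gem_request_alloc(engine, ctx);
	if (IS_ERR(req))
		return PTR_ERR(req); /* failing before emission is fine */

	/* may modify external state that now points at req */
	ret = emit_commands(req);

	/* Past this point the request must be committed, never
	 * cancelled, as external objects may already reference it. */
	i915_add_request_no_flush(req);
	return ret;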


>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>> ---
>>   drivers/gpu/drm/i915/i915_drv.h            |  2 --
>>   drivers/gpu/drm/i915/i915_gem.c            | 50 ++++++++++++------------------
>>   drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++------
>>   drivers/gpu/drm/i915/intel_display.c       |  2 +-
>>   drivers/gpu/drm/i915/intel_lrc.c           |  4 +--
>>   drivers/gpu/drm/i915/intel_overlay.c       |  8 ++---
>>   6 files changed, 30 insertions(+), 51 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 061ecc43d935..de84dd7be971 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2331,7 +2331,6 @@ struct drm_i915_gem_request {
>>   struct drm_i915_gem_request * __must_check
>>   i915_gem_request_alloc(struct intel_engine_cs *engine,
>>   		       struct intel_context *ctx);
>> -void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>>   void i915_gem_request_free(struct kref *req_ref);
>>   int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
>>   				   struct drm_file *file);
>> @@ -2883,7 +2882,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
>>   			     struct drm_file *file_priv);
>>   void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>>   					struct drm_i915_gem_request *req);
>> -void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
>>   int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>>   				   struct drm_i915_gem_execbuffer2 *args,
>>   				   struct list_head *vmas);
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index b6879d43dd74..c6f09e7839ea 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2785,7 +2785,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
>>   		 * fully prepared. Thus it can be cleaned up using the proper
>>   		 * free code.
>>   		 */
>> -		i915_gem_request_cancel(req);
>> +		intel_ring_reserved_space_cancel(req->ringbuf);
>> +		i915_gem_request_unreference(req);
>>   		return ret;
>>   	}
>>   
>> @@ -2822,13 +2823,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
>>   	return err ? ERR_PTR(err) : req;
>>   }
>>   
>> -void i915_gem_request_cancel(struct drm_i915_gem_request *req)
>> -{
>> -	intel_ring_reserved_space_cancel(req->ringbuf);
>> -
>> -	i915_gem_request_unreference(req);
>> -}
>> -
>>   struct drm_i915_gem_request *
>>   i915_gem_find_active_request(struct intel_engine_cs *engine)
>>   {
>> @@ -3438,12 +3432,9 @@ int i915_gpu_idle(struct drm_device *dev)
>>   				return PTR_ERR(req);
>>   
>>   			ret = i915_switch_context(req);
>> -			if (ret) {
>> -				i915_gem_request_cancel(req);
>> -				return ret;
>> -			}
>> -
>>   			i915_add_request_no_flush(req);
>> +			if (ret)
>> +				return ret;
>>   		}
>>   
>>   		ret = intel_engine_idle(engine);
>> @@ -4943,34 +4934,33 @@ i915_gem_init_hw(struct drm_device *dev)
>>   		req = i915_gem_request_alloc(engine, NULL);
>>   		if (IS_ERR(req)) {
>>   			ret = PTR_ERR(req);
>> -			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> +			break;
>>   		}
>>   
>>   		if (engine->id == RCS) {
>> -			for (j = 0; j < NUM_L3_SLICES(dev); j++)
>> -				i915_gem_l3_remap(req, j);
>> +			for (j = 0; j < NUM_L3_SLICES(dev); j++) {
>> +				ret = i915_gem_l3_remap(req, j);
>> +				if (ret)
>> +					goto err_request;
>> +			}
>>   		}
>>   
>>   		ret = i915_ppgtt_init_ring(req);
>> -		if (ret && ret != -EIO) {
>> -			DRM_ERROR("PPGTT enable %s failed %d\n",
>> -				  engine->name, ret);
>> -			i915_gem_request_cancel(req);
>> -			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> -		}
>> +		if (ret)
>> +			goto err_request;
>>   
>>   		ret = i915_gem_context_enable(req);
>> -		if (ret && ret != -EIO) {
>> -			DRM_ERROR("Context enable %s failed %d\n",
>> +		if (ret)
>> +			goto err_request;
>> +
>> +err_request:
>> +		i915_add_request_no_flush(req);
>> +		if (ret) {
>> +			DRM_ERROR("Failed to enable %s, error=%d\n",
>>   				  engine->name, ret);
>> -			i915_gem_request_cancel(req);
>>   			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> +			break;
>>   		}
>> -
>> -		i915_add_request_no_flush(req);
>>   	}
>>   
>>   out:
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index 6ee4f00f620c..6f4f2a6cdf93 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -1137,7 +1137,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>>   	}
>>   }
>>   
>> -void
>> +static void
>>   i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
>>   {
>>   	/* Unconditionally force add_request to emit a full flush. */
>> @@ -1322,7 +1322,6 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>>   	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>>   
>>   	i915_gem_execbuffer_move_to_active(vmas, params->request);
>> -	i915_gem_execbuffer_retire_commands(params);
>>   
>>   	return 0;
>>   }
>> @@ -1624,7 +1623,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>   
>>   	ret = i915_gem_request_add_to_client(req, file);
>>   	if (ret)
>> -		goto err_batch_unpin;
>> +		goto err_request;
>>   
>>   	/*
>>   	 * Save assorted stuff away to pass through to *_submission().
>> @@ -1641,6 +1640,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>   	params->request                 = req;
>>   
>>   	ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
>> +err_request:
>> +	i915_gem_execbuffer_retire_commands(params);
>>   
>>   err_batch_unpin:
>>   	/*
>> @@ -1657,14 +1658,6 @@ err:
>>   	i915_gem_context_unreference(ctx);
>>   	eb_destroy(eb);
>>   
>> -	/*
>> -	 * If the request was created but not successfully submitted then it
>> -	 * must be freed again. If it was submitted then it is being tracked
>> -	 * on the active request list and no clean up is required here.
>> -	 */
>> -	if (ret && !IS_ERR_OR_NULL(req))
>> -		i915_gem_request_cancel(req);
>> -
>>   	mutex_unlock(&dev->struct_mutex);
>>   
>>   pre_mutex_err:
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index b1b457864e17..3cae596d10a3 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -11664,7 +11664,7 @@ cleanup_unpin:
>>   	intel_unpin_fb_obj(fb, crtc->primary->state->rotation);
>>   cleanup_pending:
>>   	if (!IS_ERR_OR_NULL(request))
>> -		i915_gem_request_cancel(request);
>> +		i915_add_request_no_flush(request);
>>   	atomic_dec(&intel_crtc->unpin_work_count);
>>   	mutex_unlock(&dev->struct_mutex);
>>   cleanup:
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index b8f6b96472a6..6fc24deaa16a 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1010,7 +1010,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>>   	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>>   
>>   	i915_gem_execbuffer_move_to_active(vmas, params->request);
>> -	i915_gem_execbuffer_retire_commands(params);
>>   
>>   	return 0;
>>   }
>> @@ -2679,13 +2678,12 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
>>   		}
>>   
>>   		ret = engine->init_context(req);
>> +		i915_add_request_no_flush(req);
>>   		if (ret) {
>>   			DRM_ERROR("ring init context: %d\n",
>>   				ret);
>> -			i915_gem_request_cancel(req);
>>   			goto error_ringbuf;
>>   		}
>> -		i915_add_request_no_flush(req);
>>   	}
>>   	return 0;
>>   
>> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
>> index 6694e9230cd5..bcc3b6a016d8 100644
>> --- a/drivers/gpu/drm/i915/intel_overlay.c
>> +++ b/drivers/gpu/drm/i915/intel_overlay.c
>> @@ -247,7 +247,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
>>   
>>   	ret = intel_ring_begin(req, 4);
>>   	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>   		return ret;
>>   	}
>>   
>> @@ -290,7 +290,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
>>   
>>   	ret = intel_ring_begin(req, 2);
>>   	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>   		return ret;
>>   	}
>>   
>> @@ -356,7 +356,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
>>   
>>   	ret = intel_ring_begin(req, 6);
>>   	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>   		return ret;
>>   	}
>>   
>> @@ -431,7 +431,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
>>   
>>   		ret = intel_ring_begin(req, 2);
>>   		if (ret) {
>> -			i915_gem_request_cancel(req);
>> +			i915_add_request_no_flush(req);
>>   			return ret;
>>   		}
>>   
>> -- 
>> 2.8.0.rc3
>>

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] drm/i915: Split out !RCS legacy context switching
  2016-04-13 14:16       ` [PATCH 1/2] drm/i915: Split out !RCS legacy context switching Chris Wilson
  2016-04-13 14:16         ` [PATCH 2/2] drm/i915: Reorganise legacy context switch to cope with late failure Chris Wilson
@ 2016-04-13 14:56         ` Daniel Vetter
  2016-04-13 15:04           ` Chris Wilson
  1 sibling, 1 reply; 39+ messages in thread
From: Daniel Vetter @ 2016-04-13 14:56 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Wed, Apr 13, 2016 at 03:16:30PM +0100, Chris Wilson wrote:
> Having the !RCS legacy context switch threaded through the RCS switching
> code makes it much harder to follow and understand. In the next patch, I
> want to fix a bug in handling the incomplete switch; this is made much
> simpler if we segregate the two paths now.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_gem_context.c | 74 +++++++++++++++++----------------
>  1 file changed, 38 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
> index 453b655e86fc..c027240aacf3 100644
> --- a/drivers/gpu/drm/i915/i915_gem_context.c
> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
> @@ -609,9 +609,9 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  	return ret;
>  }
>  
> -static inline bool should_skip_switch(struct intel_engine_cs *engine,
> -				      struct intel_context *from,
> -				      struct intel_context *to)
> +static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
> +				   struct intel_context *from,
> +				   struct intel_context *to)
>  {
>  	if (to->remap_slice)
>  		return false;
> @@ -626,15 +626,17 @@ static inline bool should_skip_switch(struct intel_engine_cs *engine,
>  static bool
>  needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
>  {
> -	struct drm_i915_private *dev_priv = engine->dev->dev_private;
> -
>  	if (!to->ppgtt)
>  		return false;
>  
> -	if (INTEL_INFO(engine->dev)->gen < 8)
> +	if (engine->last_context == to &&
> +	    !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
> +		return false;
> +
> +	if (engine->id != RCS)
>  		return true;
>  
> -	if (engine != &dev_priv->engine[RCS])
> +	if (INTEL_INFO(engine->dev)->gen < 8)
>  		return true;
>  
>  	return false;
> @@ -661,32 +663,24 @@ needs_pd_load_post(struct intel_engine_cs *engine, struct intel_context *to,
>  	return false;
>  }
>  
> -static int do_switch(struct drm_i915_gem_request *req)
> +static int do_rcs_switch(struct drm_i915_gem_request *req)
>  {
>  	struct intel_context *to = req->ctx;
>  	struct intel_engine_cs *engine = req->engine;
> -	struct drm_i915_private *dev_priv = req->i915;
>  	struct intel_context *from = engine->last_context;
>  	u32 hw_flags = 0;
>  	bool uninitialized = false;
>  	int ret, i;
>  
> -	if (from != NULL && engine == &dev_priv->engine[RCS]) {
> -		BUG_ON(from->legacy_hw_ctx.rcs_state == NULL);
> -		BUG_ON(!i915_gem_obj_is_pinned(from->legacy_hw_ctx.rcs_state));
> -	}
> -
> -	if (should_skip_switch(engine, from, to))
> +	if (skip_rcs_switch(engine, from, to))
>  		return 0;
>  
>  	/* Trying to pin first makes error handling easier. */
> -	if (engine == &dev_priv->engine[RCS]) {
> -		ret = i915_gem_obj_ggtt_pin(to->legacy_hw_ctx.rcs_state,
> -					    get_context_alignment(engine->dev),
> -					    0);
> -		if (ret)
> -			return ret;
> -	}
> +	ret = i915_gem_obj_ggtt_pin(to->legacy_hw_ctx.rcs_state,
> +				    get_context_alignment(engine->dev),
> +				    0);
> +	if (ret)
> +		return ret;
>  
>  	/*
>  	 * Pin can switch back to the default context if we end up calling into
> @@ -709,12 +703,6 @@ static int do_switch(struct drm_i915_gem_request *req)
>  		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
>  	}
>  
> -	if (engine != &dev_priv->engine[RCS]) {
> -		if (from)
> -			i915_gem_context_unreference(from);
> -		goto done;
> -	}
> -
>  	/*
>  	 * Clear this page out of any CPU caches for coherent swap-in/out. Note
>  	 * that thanks to write = false in this call and us not setting any gpu
> @@ -802,7 +790,6 @@ static int do_switch(struct drm_i915_gem_request *req)
>  	uninitialized = !to->legacy_hw_ctx.initialized;
>  	to->legacy_hw_ctx.initialized = true;
>  
> -done:
>  	i915_gem_context_reference(to);
>  	engine->last_context = to;
>  
> @@ -817,8 +804,7 @@ done:
>  	return 0;
>  
>  unpin_out:
> -	if (engine->id == RCS)
> -		i915_gem_object_ggtt_unpin(to->legacy_hw_ctx.rcs_state);
> +	i915_gem_object_ggtt_unpin(to->legacy_hw_ctx.rcs_state);
>  	return ret;
>  }
>  
> @@ -843,17 +829,33 @@ int i915_switch_context(struct drm_i915_gem_request *req)
>  	WARN_ON(i915.enable_execlists);
>  	WARN_ON(!mutex_is_locked(&dev_priv->dev->struct_mutex));
>  
> -	if (req->ctx->legacy_hw_ctx.rcs_state == NULL) { /* We have the fake context */
> -		if (req->ctx != engine->last_context) {
> -			i915_gem_context_reference(req->ctx);
> +	if (engine->id != RCS ||
> +	    req->ctx->legacy_hw_ctx.rcs_state == NULL) {
> +		struct intel_context *to = req->ctx;
> +
> +		if (needs_pd_load_pre(engine, to)) {

Hm, I'd inline this condition now since it's a bit confusing if there's no
POST. Assuming I read the code correctly, it seems to boil down to
to->ppgtt (which reads a lot cleaner) for !RCS.
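
For illustration, the inlined form of that !RCS path might look roughly
like this (a sketch only; whether the pd_dirty_rings test is still wanted
depends on the final shape of needs_pd_load_pre()):

	/* hypothetical inlined variant, not the code in the patch */
	if (to->ppgtt &&
	    (to != engine->last_context ||
	     (intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))) {
		int ret;

		trace_switch_mm(engine, to);
		ret = to->ppgtt->switch_mm(to->ppgtt, req);
		if (ret)
			return ret;

		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
	}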

Either way (since this is just a bikeshed):

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>


> +			int ret;
> +
> +			trace_switch_mm(engine, to);
> +			ret = to->ppgtt->switch_mm(to->ppgtt, req);
> +			if (ret)
> +				return ret;
> +
> +			/* Doing a PD load always reloads the page dirs */
> +			to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
> +		}
> +
> +		if (to != engine->last_context) {
> +			i915_gem_context_reference(to);
>  			if (engine->last_context)
>  				i915_gem_context_unreference(engine->last_context);
> -			engine->last_context = req->ctx;
> +			engine->last_context = to;
>  		}
> +
>  		return 0;
>  	}
>  
> -	return do_switch(req);
> +	return do_rcs_switch(req);
>  }
>  
>  static bool contexts_enabled(struct drm_device *dev)
> -- 
> 2.8.0.rc3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 1/2] drm/i915: Split out !RCS legacy context switching
  2016-04-13 14:56         ` [PATCH 1/2] drm/i915: Split out !RCS legacy context switching Daniel Vetter
@ 2016-04-13 15:04           ` Chris Wilson
  0 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-13 15:04 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Wed, Apr 13, 2016 at 04:56:05PM +0200, Daniel Vetter wrote:
> On Wed, Apr 13, 2016 at 03:16:30PM +0100, Chris Wilson wrote:
> > +		if (needs_pd_load_pre(engine, to)) {
> 
> Hm, I'd inline this condition now since it's a bit confusing if there's no
> POST. Assuming I read code correctly it seems to boil down to to->ppgtt
> (which reads a lot cleaner) for !RCS.

I did it this way because it makes a later patch trivial ;)

Yes, it boils down to just a couple of checks. I plan a third check,
hopefully in the not-too-distant future.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/2] drm/i915: Reorganise legacy context switch to cope with late failure
  2016-04-13 14:16         ` [PATCH 2/2] drm/i915: Reorganise legacy context switch to cope with late failure Chris Wilson
@ 2016-04-13 15:05           ` Daniel Vetter
  2016-04-13 15:18             ` Chris Wilson
  0 siblings, 1 reply; 39+ messages in thread
From: Daniel Vetter @ 2016-04-13 15:05 UTC (permalink / raw)
  To: Chris Wilson; +Cc: Daniel Vetter, intel-gfx

> @@ ... @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>  	return ret;
>  }
>  
> -static inline bool skip_rcs_switch(struct intel_engine_cs *engine,
> -				   struct intel_context *from,
> +static inline bool skip_rcs_switch(struct intel_context *from,
>  				   struct intel_context *to)
>  {
>  	if (to->remap_slice)
>  		return false;
>  
> -	if (to->ppgtt && from == to &&
> -	    !(intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings))
> -		return true;
> +	if (!to->legacy_hw_ctx.initialized)
> +		return false;
>  
> -	return false;
> +	if (to->ppgtt && to->ppgtt->pd_dirty_rings & (1 << RCS))
> +		return false;
> +
> +	return to == from;
>  }
>  
>  static bool
> @@ -643,18 +644,12 @@ needs_pd_load_pre(struct intel_engine_cs *engine, struct intel_context *to)
>  }
>  
>  static bool
> -needs_pd_load_post(struct intel_engine_cs *engine, struct intel_context *to,
> -		   u32 hw_flags)
> +needs_pd_load_post(struct intel_context *to, u32 hw_flags)
>  {
> -	struct drm_i915_private *dev_priv = engine->dev->dev_private;
> -
>  	if (!to->ppgtt)
>  		return false;
>  
> -	if (!IS_GEN8(engine->dev))
> -		return false;
> -
> -	if (engine != &dev_priv->engine[RCS])
> +	if (!IS_GEN8(to->i915))
>  		return false;
>  
>  	if (hw_flags & MI_RESTORE_INHIBIT)

Above hunk might be better in the previous patch, but meh.

> @@ -667,12 +662,11 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
>  {
>  	struct intel_context *to = req->ctx;
>  	struct intel_engine_cs *engine = req->engine;
> -	struct intel_context *from = engine->last_context;
> -	u32 hw_flags = 0;
> -	bool uninitialized = false;
> +	struct intel_context *from;
> +	u32 hw_flags;
>  	int ret, i;
>  
> -	if (skip_rcs_switch(engine, from, to))
> +	if (skip_rcs_switch(engine->last_context, to))
>  		return 0;
>  
>  	/* Trying to pin first makes error handling easier. */
> @@ -686,9 +680,23 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
>  	 * Pin can switch back to the default context if we end up calling into
>  	 * evict_everything - as a last ditch gtt defrag effort that also
>  	 * switches to the default context. Hence we need to reload from here.
> +	 *
> +	 * XXX: Doing so is painfully broken!

Why the XXX addition here? I thought the point of this patch was to fix
this ...

>  	 */
>  	from = engine->last_context;
>  
> +	/*
> +	 * Clear this page out of any CPU caches for coherent swap-in/out. Note
> +	 * that thanks to write = false in this call and us not setting any gpu
> +	 * write domains when putting a context object onto the active list
> +	 * (when switching away from it), this won't block.
> +	 *
> +	 * XXX: We need a real interface to do this instead of trickery.
> +	 */
> +	ret = i915_gem_object_set_to_gtt_domain(to->legacy_hw_ctx.rcs_state, false);
> +	if (ret)
> +		goto unpin_out;
> +
>  	if (needs_pd_load_pre(engine, to)) {
>  		/* Older GENs and non render rings still want the load first,
>  		 * "PP_DCLV followed by PP_DIR_BASE register through Load
> @@ -700,70 +708,36 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
>  			goto unpin_out;
>  
>  		/* Doing a PD load always reloads the page dirs */
> -		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
> +		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);

Superflous change for the imo worse.

>  	}
>  
> -	/*
> -	 * Clear this page out of any CPU caches for coherent swap-in/out. Note
> -	 * that thanks to write = false in this call and us not setting any gpu
> -	 * write domains when putting a context object onto the active list
> -	 * (when switching away from it), this won't block.
> -	 *
> -	 * XXX: We need a real interface to do this instead of trickery.
> -	 */
> -	ret = i915_gem_object_set_to_gtt_domain(to->legacy_hw_ctx.rcs_state, false);
> -	if (ret)
> -		goto unpin_out;
> -
> -	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to)) {
> -		hw_flags |= MI_RESTORE_INHIBIT;
> +	if (!to->legacy_hw_ctx.initialized || i915_gem_context_is_default(to))
>  		/* NB: If we inhibit the restore, the context is not allowed to
>  		 * die because future work may end up depending on valid address
>  		 * space. This means we must enforce that a page table load
>  		 * occur when this occurs. */
> -	} else if (to->ppgtt &&
> -		   (intel_engine_flag(engine) & to->ppgtt->pd_dirty_rings)) {
> -		hw_flags |= MI_FORCE_RESTORE;
> -		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
> -	}
> +		hw_flags = MI_RESTORE_INHIBIT;
> +	else if (to->ppgtt && to->ppgtt->pd_dirty_rings & (1 << RCS))
> +		hw_flags = MI_FORCE_RESTORE;
> +	else
> +		hw_flags = 0;
>  
>  	/* We should never emit switch_mm more than once */
>  	WARN_ON(needs_pd_load_pre(engine, to) &&
> -		needs_pd_load_post(engine, to, hw_flags));
> -
> -	ret = mi_set_context(req, hw_flags);
> -	if (ret)
> -		goto unpin_out;
> +		needs_pd_load_post(to, hw_flags));
>  
> -	/* GEN8 does *not* require an explicit reload if the PDPs have been
> -	 * setup, and we do not wish to move them.
> -	 */
> -	if (needs_pd_load_post(engine, to, hw_flags)) {
> -		trace_switch_mm(engine, to);
> -		ret = to->ppgtt->switch_mm(to->ppgtt, req);
> -		/* The hardware context switch is emitted, but we haven't
> -		 * actually changed the state - so it's probably safe to bail
> -		 * here. Still, let the user know something dangerous has
> -		 * happened.
> -		 */
> +	if (to != from || (hw_flags & MI_FORCE_RESTORE)) {
> +		ret = mi_set_context(req, hw_flags);
>  		if (ret) {
> -			DRM_ERROR("Failed to change address space on context switch\n");
> +			/* Force the reload of this and the previous mm */
> +			if (needs_pd_load_pre(engine, to))
> +				to->ppgtt->pd_dirty_rings |= 1 << RCS;
> +			if (from && needs_pd_load_pre(engine, from))
> +				from->ppgtt->pd_dirty_rings |= 1 << RCS;
>  			goto unpin_out;
>  		}
>  	}
>  
> -	for (i = 0; i < MAX_L3_SLICES; i++) {
> -		if (!(to->remap_slice & (1<<i)))
> -			continue;
> -
> -		ret = i915_gem_l3_remap(req, i);
> -		/* If it failed, try again next round */
> -		if (ret)
> -			DRM_DEBUG_DRIVER("L3 remapping failed\n");
> -		else
> -			to->remap_slice &= ~(1<<i);
> -	}
> -
>  	/* The backing object for the context is done after switching to the
>  	 * *next* context. Therefore we cannot retire the previous context until
>  	 * the next context has already started running. In fact, the below code
> @@ -786,19 +760,46 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
>  		i915_gem_object_ggtt_unpin(from->legacy_hw_ctx.rcs_state);
>  		i915_gem_context_unreference(from);
>  	}
> -
> -	uninitialized = !to->legacy_hw_ctx.initialized;
> -	to->legacy_hw_ctx.initialized = true;
> -
>  	i915_gem_context_reference(to);
>  	engine->last_context = to;
>  
> -	if (uninitialized) {
> +	/* GEN8 does *not* require an explicit reload if the PDPs have been
> +	 * setup, and we do not wish to move them.
> +	 */
> +	if (needs_pd_load_post(to, hw_flags)) {
> +		trace_switch_mm(engine, to);
> +		ret = to->ppgtt->switch_mm(to->ppgtt, req);
> +		/* The hardware context switch is emitted, but we haven't
> +		 * actually changed the state - so it's probably safe to bail
> +		 * here. Still, let the user know something dangerous has
> +		 * happened.
> +		 */
> +		if (ret)
> +			return ret;
> +
> +		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
> +	}
> +	if (hw_flags & MI_FORCE_RESTORE)
> +		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
> +
> +	for (i = 0; i < MAX_L3_SLICES; i++) {
> +		if (!(to->remap_slice & (1<<i)))
> +			continue;
> +
> +		ret = i915_gem_l3_remap(req, i);
> +		if (ret)
> +			return ret;
> +
> +		to->remap_slice &= ~(1<<i);
> +	}
> +
> +	if (!to->legacy_hw_ctx.initialized) {
>  		if (engine->init_context) {
>  			ret = engine->init_context(req);
>  			if (ret)
> -				DRM_ERROR("ring init context: %d\n", ret);
> +				return ret;
>  		}
> +		to->legacy_hw_ctx.initialized = true;

Otherwise Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

>  	}
>  
>  	return 0;
> -- 
> 2.8.0.rc3
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [PATCH 2/2] drm/i915: Reorganise legacy context switch to cope with late failure
  2016-04-13 15:05           ` Daniel Vetter
@ 2016-04-13 15:18             ` Chris Wilson
  0 siblings, 0 replies; 39+ messages in thread
From: Chris Wilson @ 2016-04-13 15:18 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: Daniel Vetter, intel-gfx

On Wed, Apr 13, 2016 at 05:05:05PM +0200, Daniel Vetter wrote:
> >  static bool
> > -needs_pd_load_post(struct intel_engine_cs *engine, struct intel_context *to,
> > -		   u32 hw_flags)
> > +needs_pd_load_post(struct intel_context *to, u32 hw_flags)
> >  {
> > -	struct drm_i915_private *dev_priv = engine->dev->dev_private;
> > -
> >  	if (!to->ppgtt)
> >  		return false;
> >  
> > -	if (!IS_GEN8(engine->dev))
> > -		return false;
> > -
> > -	if (engine != &dev_priv->engine[RCS])
> > +	if (!IS_GEN8(to->i915))
> >  		return false;
> >  
> >  	if (hw_flags & MI_RESTORE_INHIBIT)
> 
> Above hunk might be better in the previous patch, but meh.

Previous patch changed needs_pd_load_pre() :-p

> > @@ -667,12 +662,11 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
> >  {
> >  	struct intel_context *to = req->ctx;
> >  	struct intel_engine_cs *engine = req->engine;
> > -	struct intel_context *from = engine->last_context;
> > -	u32 hw_flags = 0;
> > -	bool uninitialized = false;
> > +	struct intel_context *from;
> > +	u32 hw_flags;
> >  	int ret, i;
> >  
> > -	if (skip_rcs_switch(engine, from, to))
> > +	if (skip_rcs_switch(engine->last_context, to))
> >  		return 0;
> >  
> >  	/* Trying to pin first makes error handling easier. */
> > @@ -686,9 +680,23 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
> >  	 * Pin can switch back to the default context if we end up calling into
> >  	 * evict_everything - as a last ditch gtt defrag effort that also
> >  	 * switches to the default context. Hence we need to reload from here.
> > +	 *
> > +	 * XXX: Doing so is painfully broken!
> 
> Why the XXX addition here? I thought the point of this patch is to fix
> this ...

No, that requires my seqno->requests patches...

To fix this, we have to move the context pinning into request allocation,
so that if we do have to emit a switch to the default context, we do so
before we take ownership of the ring for ourselves.

  	/* Trying to pin first makes error handling easier. */

is not kidding.
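
A rough ordering sketch of that future direction (hypothetical; the
intel_context_pin/unpin helpers are illustrative, not existing functions):

	/* Pin the context state while constructing the request, before it
	 * takes ownership of any ring space; a forced switch back to the
	 * default context then happens before our commands are emitted.
	 */
	ret = intel_context_pin(ctx, engine);	/* hypothetical */
	if (ret)
		return ret;

	req = i915_gem_request_alloc(engine, ctx);
	if (IS_ERR(req)) {
		intel_context_unpin(ctx, engine);	/* hypothetical */
		return PTR_ERR(req);
	}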

> >  	 */
> >  	from = engine->last_context;
> >  
> > +	/*
> > +	 * Clear this page out of any CPU caches for coherent swap-in/out. Note
> > +	 * that thanks to write = false in this call and us not setting any gpu
> > +	 * write domains when putting a context object onto the active list
> > +	 * (when switching away from it), this won't block.
> > +	 *
> > +	 * XXX: We need a real interface to do this instead of trickery.
> > +	 */
> > +	ret = i915_gem_object_set_to_gtt_domain(to->legacy_hw_ctx.rcs_state, false);
> > +	if (ret)
> > +		goto unpin_out;
> > +
> >  	if (needs_pd_load_pre(engine, to)) {
> >  		/* Older GENs and non render rings still want the load first,
> >  		 * "PP_DCLV followed by PP_DIR_BASE register through Load
> > @@ -700,70 +708,36 @@ static int do_rcs_switch(struct drm_i915_gem_request *req)
> >  			goto unpin_out;
> >  
> >  		/* Doing a PD load always reloads the page dirs */
> > -		to->ppgtt->pd_dirty_rings &= ~intel_engine_flag(engine);
> > +		to->ppgtt->pd_dirty_rings &= ~(1 << RCS);
> 
> Superflous change for the imo worse.

* shrug. I was just trying to reinforce this was RCS, not any random
engine.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* ✗ Fi.CI.BAT: failure for series starting with [CI-ping,01/15] drm/i915: Force clean compilation with -Werror (rev3)
  2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
                   ` (13 preceding siblings ...)
  2016-04-12 20:03 ` [CI-ping 15/15] drm/i915: Late request cancellations are harmful Chris Wilson
@ 2016-04-14  8:45 ` Patchwork
  14 siblings, 0 replies; 39+ messages in thread
From: Patchwork @ 2016-04-14  8:45 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

== Series Details ==

Series: series starting with [CI-ping,01/15] drm/i915: Force clean compilation with -Werror (rev3)
URL   : https://patchwork.freedesktop.org/series/5612/
State : failure

== Summary ==

Series 5612v3 Series without cover letter
http://patchwork.freedesktop.org/api/1.0/series/5612/revisions/3/mbox/

Test drv_hangman:
        Subgroup error-state-basic:
                incomplete -> PASS       (bdw-nuci7)
Test kms_flip:
        Subgroup basic-flip-vs-wf_vblank:
                pass       -> FAIL       (snb-x220t)
                fail       -> PASS       (bsw-nuc-2)

bdw-nuci7        total:203  pass:191  dwarn:0   dfail:0   fail:0   skip:12 
bdw-ultra        total:203  pass:180  dwarn:0   dfail:0   fail:0   skip:23 
bsw-nuc-2        total:202  pass:163  dwarn:0   dfail:0   fail:0   skip:39 
byt-nuc          total:202  pass:164  dwarn:0   dfail:0   fail:0   skip:38 
hsw-brixbox      total:203  pass:179  dwarn:0   dfail:0   fail:0   skip:24 
ivb-t430s        total:203  pass:175  dwarn:0   dfail:0   fail:0   skip:28 
skl-i7k-2        total:203  pass:178  dwarn:0   dfail:0   fail:0   skip:25 
skl-nuci5        total:203  pass:192  dwarn:0   dfail:0   fail:0   skip:11 
snb-dellxps      total:203  pass:165  dwarn:0   dfail:0   fail:0   skip:38 
snb-x220t        total:203  pass:164  dwarn:0   dfail:0   fail:2   skip:37 

Results at /archive/results/CI_IGT_test/Patchwork_1889/

e26bcbcb66f5fe06d824fd6c2930ea933eaee62d drm-intel-nightly: 2016y-04m-14d-06h-19m-12s UTC integration manifest
2f433e1 drm/i915: Move the mb() following release-mmap into release-mmap
3134141 drm/i915: Force ringbuffers to not be at offset 0
29c671f drm/i915: Prevent machine death on Ivybridge context switching
41db75f drm/i915: Suppress error message when GPU resets are disabled
b563971 drm/i915: Prevent leaking of -EIO from i915_wait_request()
7e186aa drm/i915: Simplify reset_counter handling during atomic modesetting
86b7a7d drm/i915: Store the reset counter when constructing a request
d468a4c drm/i915: Tighten reset_counter for reset status
d091d72 drm/i915: Simplify checking of GPU reset_counter in display pageflips
a10eb4e drm/i915: Hide the atomic_read(reset_counter) behind a helper
14083ee drm/i915: Add GEM debugging Kconfig option
fbb7254 drm/i915: Disentangle i915_drv.h includes
884f851 drm/i915: Force clean compilation with -Werror

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Intel-gfx] [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-13  9:57     ` Daniel Vetter
@ 2016-04-18  9:46       ` Jani Nikula
  -1 siblings, 0 replies; 39+ messages in thread
From: Jani Nikula @ 2016-04-18  9:46 UTC (permalink / raw)
  To: Daniel Vetter, Chris Wilson; +Cc: Daniel Vetter, intel-gfx, stable

On Wed, 13 Apr 2016, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Apr 12, 2016 at 09:03:09PM +0100, Chris Wilson wrote:
>> Conceptually, each request is a record of a hardware transaction - we
>> build up a list of pending commands and then either commit them to
>> hardware, or cancel them. However, whilst building up the list of
>> pending commands, we may modify state outside of the request and make
>> references to the pending request. If we do so and then cancel that
>> request, external objects then point to the deleted request, leading to
>> both graphical and memory corruption.
>> 
>> The easiest example is to consider object/VMA tracking. When we mark an
>> object as active in a request, we store a pointer to this, the most
>> recent request, in the object. Then, when we want to free that object,
>> we wait for the most recent request to be idle before proceeding
>> (otherwise the hardware will write to pages now owned by the system, or
>> we will attempt to read from those pages before the hardware is finished
>> writing). If the request was cancelled instead, that wait completes
>> immediately. As a
>> result, all requests must be committed and not cancelled if the external
>> state is unknown.
>> 
>> All that remains of i915_gem_request_cancel() users are just a couple of
>> extremely unlikely allocation failures, so remove the API entirely.
>> 
>> A consequence of committing all incomplete requests is that we generate
>> excess breadcrumbs and fill the ring much more often with dummy work. We
>> have completely undone the outstanding_last_seqno optimisation.
>> 
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>> Cc: stable@vger.kernel.org
>
> Cc: John Harrison <John.C.Harrison@Intel.com>
>
> I'd like John's ack on this on too, but patch itself looks sound. Fast r-b
> since we've discussed this a while ago already ...
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

FYI, this (*) does not cherry-pick cleanly to drm-intel-fixes.

BR,
Jani.


(*) Well, not exactly *this* but rather
https://patchwork.freedesktop.org/patch/80961/ which was not posted on
the list so I can't reply to it.


>> ---
>>  drivers/gpu/drm/i915/i915_drv.h            |  2 --
>>  drivers/gpu/drm/i915/i915_gem.c            | 50 ++++++++++++------------------
>>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++------
>>  drivers/gpu/drm/i915/intel_display.c       |  2 +-
>>  drivers/gpu/drm/i915/intel_lrc.c           |  4 +--
>>  drivers/gpu/drm/i915/intel_overlay.c       |  8 ++---
>>  6 files changed, 30 insertions(+), 51 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
>> index 061ecc43d935..de84dd7be971 100644
>> --- a/drivers/gpu/drm/i915/i915_drv.h
>> +++ b/drivers/gpu/drm/i915/i915_drv.h
>> @@ -2331,7 +2331,6 @@ struct drm_i915_gem_request {
>>  struct drm_i915_gem_request * __must_check
>>  i915_gem_request_alloc(struct intel_engine_cs *engine,
>>  		       struct intel_context *ctx);
>> -void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>>  void i915_gem_request_free(struct kref *req_ref);
>>  int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
>>  				   struct drm_file *file);
>> @@ -2883,7 +2882,6 @@ int i915_gem_sw_finish_ioctl(struct drm_device *dev, void *data,
>>  			     struct drm_file *file_priv);
>>  void i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>>  					struct drm_i915_gem_request *req);
>> -void i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params);
>>  int i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>>  				   struct drm_i915_gem_execbuffer2 *args,
>>  				   struct list_head *vmas);
>> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
>> index b6879d43dd74..c6f09e7839ea 100644
>> --- a/drivers/gpu/drm/i915/i915_gem.c
>> +++ b/drivers/gpu/drm/i915/i915_gem.c
>> @@ -2785,7 +2785,8 @@ __i915_gem_request_alloc(struct intel_engine_cs *engine,
>>  		 * fully prepared. Thus it can be cleaned up using the proper
>>  		 * free code.
>>  		 */
>> -		i915_gem_request_cancel(req);
>> +		intel_ring_reserved_space_cancel(req->ringbuf);
>> +		i915_gem_request_unreference(req);
>>  		return ret;
>>  	}
>>  
>> @@ -2822,13 +2823,6 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
>>  	return err ? ERR_PTR(err) : req;
>>  }
>>  
>> -void i915_gem_request_cancel(struct drm_i915_gem_request *req)
>> -{
>> -	intel_ring_reserved_space_cancel(req->ringbuf);
>> -
>> -	i915_gem_request_unreference(req);
>> -}
>> -
>>  struct drm_i915_gem_request *
>>  i915_gem_find_active_request(struct intel_engine_cs *engine)
>>  {
>> @@ -3438,12 +3432,9 @@ int i915_gpu_idle(struct drm_device *dev)
>>  				return PTR_ERR(req);
>>  
>>  			ret = i915_switch_context(req);
>> -			if (ret) {
>> -				i915_gem_request_cancel(req);
>> -				return ret;
>> -			}
>> -
>>  			i915_add_request_no_flush(req);
>> +			if (ret)
>> +				return ret;
>>  		}
>>  
>>  		ret = intel_engine_idle(engine);
>> @@ -4943,34 +4934,33 @@ i915_gem_init_hw(struct drm_device *dev)
>>  		req = i915_gem_request_alloc(engine, NULL);
>>  		if (IS_ERR(req)) {
>>  			ret = PTR_ERR(req);
>> -			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> +			break;
>>  		}
>>  
>>  		if (engine->id == RCS) {
>> -			for (j = 0; j < NUM_L3_SLICES(dev); j++)
>> -				i915_gem_l3_remap(req, j);
>> +			for (j = 0; j < NUM_L3_SLICES(dev); j++) {
>> +				ret = i915_gem_l3_remap(req, j);
>> +				if (ret)
>> +					goto err_request;
>> +			}
>>  		}
>>  
>>  		ret = i915_ppgtt_init_ring(req);
>> -		if (ret && ret != -EIO) {
>> -			DRM_ERROR("PPGTT enable %s failed %d\n",
>> -				  engine->name, ret);
>> -			i915_gem_request_cancel(req);
>> -			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> -		}
>> +		if (ret)
>> +			goto err_request;
>>  
>>  		ret = i915_gem_context_enable(req);
>> -		if (ret && ret != -EIO) {
>> -			DRM_ERROR("Context enable %s failed %d\n",
>> +		if (ret)
>> +			goto err_request;
>> +
>> +err_request:
>> +		i915_add_request_no_flush(req);
>> +		if (ret) {
>> +			DRM_ERROR("Failed to enable %s, error=%d\n",
>>  				  engine->name, ret);
>> -			i915_gem_request_cancel(req);
>>  			i915_gem_cleanup_engines(dev);
>> -			goto out;
>> +			break;
>>  		}
>> -
>> -		i915_add_request_no_flush(req);
>>  	}
>>  
>>  out:
>> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> index 6ee4f00f620c..6f4f2a6cdf93 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
>> @@ -1137,7 +1137,7 @@ i915_gem_execbuffer_move_to_active(struct list_head *vmas,
>>  	}
>>  }
>>  
>> -void
>> +static void
>>  i915_gem_execbuffer_retire_commands(struct i915_execbuffer_params *params)
>>  {
>>  	/* Unconditionally force add_request to emit a full flush. */
>> @@ -1322,7 +1322,6 @@ i915_gem_ringbuffer_submission(struct i915_execbuffer_params *params,
>>  	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>>  
>>  	i915_gem_execbuffer_move_to_active(vmas, params->request);
>> -	i915_gem_execbuffer_retire_commands(params);
>>  
>>  	return 0;
>>  }
>> @@ -1624,7 +1623,7 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>  
>>  	ret = i915_gem_request_add_to_client(req, file);
>>  	if (ret)
>> -		goto err_batch_unpin;
>> +		goto err_request;
>>  
>>  	/*
>>  	 * Save assorted stuff away to pass through to *_submission().
>> @@ -1641,6 +1640,8 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
>>  	params->request                 = req;
>>  
>>  	ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
>> +err_request:
>> +	i915_gem_execbuffer_retire_commands(params);
>>  
>>  err_batch_unpin:
>>  	/*
>> @@ -1657,14 +1658,6 @@ err:
>>  	i915_gem_context_unreference(ctx);
>>  	eb_destroy(eb);
>>  
>> -	/*
>> -	 * If the request was created but not successfully submitted then it
>> -	 * must be freed again. If it was submitted then it is being tracked
>> -	 * on the active request list and no clean up is required here.
>> -	 */
>> -	if (ret && !IS_ERR_OR_NULL(req))
>> -		i915_gem_request_cancel(req);
>> -
>>  	mutex_unlock(&dev->struct_mutex);
>>  
>>  pre_mutex_err:
>> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
>> index b1b457864e17..3cae596d10a3 100644
>> --- a/drivers/gpu/drm/i915/intel_display.c
>> +++ b/drivers/gpu/drm/i915/intel_display.c
>> @@ -11664,7 +11664,7 @@ cleanup_unpin:
>>  	intel_unpin_fb_obj(fb, crtc->primary->state->rotation);
>>  cleanup_pending:
>>  	if (!IS_ERR_OR_NULL(request))
>> -		i915_gem_request_cancel(request);
>> +		i915_add_request_no_flush(request);
>>  	atomic_dec(&intel_crtc->unpin_work_count);
>>  	mutex_unlock(&dev->struct_mutex);
>>  cleanup:
>> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
>> index b8f6b96472a6..6fc24deaa16a 100644
>> --- a/drivers/gpu/drm/i915/intel_lrc.c
>> +++ b/drivers/gpu/drm/i915/intel_lrc.c
>> @@ -1010,7 +1010,6 @@ int intel_execlists_submission(struct i915_execbuffer_params *params,
>>  	trace_i915_gem_ring_dispatch(params->request, params->dispatch_flags);
>>  
>>  	i915_gem_execbuffer_move_to_active(vmas, params->request);
>> -	i915_gem_execbuffer_retire_commands(params);
>>  
>>  	return 0;
>>  }
>> @@ -2679,13 +2678,12 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
>>  		}
>>  
>>  		ret = engine->init_context(req);
>> +		i915_add_request_no_flush(req);
>>  		if (ret) {
>>  			DRM_ERROR("ring init context: %d\n",
>>  				ret);
>> -			i915_gem_request_cancel(req);
>>  			goto error_ringbuf;
>>  		}
>> -		i915_add_request_no_flush(req);
>>  	}
>>  	return 0;
>>  
>> diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
>> index 6694e9230cd5..bcc3b6a016d8 100644
>> --- a/drivers/gpu/drm/i915/intel_overlay.c
>> +++ b/drivers/gpu/drm/i915/intel_overlay.c
>> @@ -247,7 +247,7 @@ static int intel_overlay_on(struct intel_overlay *overlay)
>>  
>>  	ret = intel_ring_begin(req, 4);
>>  	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>  		return ret;
>>  	}
>>  
>> @@ -290,7 +290,7 @@ static int intel_overlay_continue(struct intel_overlay *overlay,
>>  
>>  	ret = intel_ring_begin(req, 2);
>>  	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>  		return ret;
>>  	}
>>  
>> @@ -356,7 +356,7 @@ static int intel_overlay_off(struct intel_overlay *overlay)
>>  
>>  	ret = intel_ring_begin(req, 6);
>>  	if (ret) {
>> -		i915_gem_request_cancel(req);
>> +		i915_add_request_no_flush(req);
>>  		return ret;
>>  	}
>>  
>> @@ -431,7 +431,7 @@ static int intel_overlay_release_old_vid(struct intel_overlay *overlay)
>>  
>>  		ret = intel_ring_begin(req, 2);
>>  		if (ret) {
>> -			i915_gem_request_cancel(req);
>> +			i915_add_request_no_flush(req);
>>  			return ret;
>>  		}
>>  
>> -- 
>> 2.8.0.rc3
>> 

-- 
Jani Nikula, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [Intel-gfx] [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
  2016-04-13  9:33     ` Daniel Vetter
@ 2016-04-18  9:50       ` Jani Nikula
  -1 siblings, 0 replies; 39+ messages in thread
From: Jani Nikula @ 2016-04-18  9:50 UTC (permalink / raw)
  To: Daniel Vetter, Chris Wilson; +Cc: stable, intel-gfx

On Wed, 13 Apr 2016, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Tue, Apr 12, 2016 at 09:03:05PM +0100, Chris Wilson wrote:
>> Two concurrent writes into the same register cacheline have the chance of
>> killing the machine on Ivybridge and other gen7. This includes LRI
>> emitted from the command parser.  The MI_SET_CONTEXT itself serves as a
>> serialising barrier and prevents the pair of register writes in the first
>> packet from triggering the fault.  However, if a second switch-context
>> immediately occurs then we may have two adjacent blocks of LRI to the
>> same registers, which may then trigger the hang. To counteract this, we
>> need to insert a delay after the second register write using SRM.
>> 
>> This is easiest to reproduce with something like
>> igt/gem_ctx_switch/interruptible that triggers back-to-back context
>> switches (with no operations in between them in the command stream,
>> which requires the execbuf operation to be interrupted after the
>> MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
>> interruptible igt. No reports from the wild though, so it must be of low
>> enough frequency that no one has correlated the random machine freezes
>> with i915.ko
>> 
>> The issue was introduced with
>> commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>> Date:   Tue Dec 16 10:02:27 2014 +0000
>> 
>>     drm/i915: Disable PSMI sleep messages on all rings around context switches
>> 
>> Testcase: igt/gem_ctx_switch/render-interruptible #ivb
>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>> Cc: stable@vger.kernel.org
>
> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

FYI, this (*) does not cherry-pick cleanly to drm-intel-fixes.

BR,
Jani.

(*) Well, not exactly *this* but rather
https://patchwork.freedesktop.org/patch/80952/ which was not posted on
the list so I can't reply to it.


>
>> ---
>>  drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
>>  1 file changed, 12 insertions(+), 3 deletions(-)
>> 
>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>> index fe580cb9501a..e5ad7b21e356 100644
>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>> @@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  
>>  	len = 4;
>>  	if (INTEL_INFO(engine->dev)->gen >= 7)
>> -		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
>> +		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>>  
>>  	ret = intel_ring_begin(req, len);
>>  	if (ret)
>> @@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  	if (INTEL_INFO(engine->dev)->gen >= 7) {
>>  		if (num_rings) {
>>  			struct intel_engine_cs *signaller;
>> +			i915_reg_t last_reg = {}; /* keep gcc quiet */
>>  
>>  			intel_ring_emit(engine,
>>  					MI_LOAD_REGISTER_IMM(num_rings));
>> @@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>  				if (signaller == engine)
>>  					continue;
>>  
>> -				intel_ring_emit_reg(engine,
>> -						    RING_PSMI_CTL(signaller->mmio_base));
>> +				last_reg = RING_PSMI_CTL(signaller->mmio_base);
>> +				intel_ring_emit_reg(engine, last_reg);
>>  				intel_ring_emit(engine,
>>  						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
>>  			}
>> +
>> +			/* Insert a delay before the next switch! */
>> +			intel_ring_emit(engine,
>> +					MI_STORE_REGISTER_MEM |
>> +					MI_SRM_LRM_GLOBAL_GTT);
>> +			intel_ring_emit_reg(engine, last_reg);
>> +			intel_ring_emit(engine, engine->scratch.gtt_offset);
>> +			intel_ring_emit(engine, MI_NOOP);
>>  		}
>>  		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>>  	}
>> -- 
>> 2.8.0.rc3
>> 

-- 
Jani Nikula, Intel Open Source Technology Center

^ permalink raw reply	[flat|nested] 39+ messages in thread

* Re: [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-13 14:21     ` John Harrison
@ 2016-04-19 12:35       ` Dave Gordon
  2016-04-21 13:04         ` John Harrison
  0 siblings, 1 reply; 39+ messages in thread
From: Dave Gordon @ 2016-04-19 12:35 UTC (permalink / raw)
  To: Harrison, John C; +Cc: intel-gfx

On 13/04/16 15:21, John Harrison wrote:
> On 13/04/2016 10:57, Daniel Vetter wrote:
>> On Tue, Apr 12, 2016 at 09:03:09PM +0100, Chris Wilson wrote:
>>> Conceptually, each request is a record of a hardware transaction - we
>>> build up a list of pending commands and then either commit them to
>>> hardware, or cancel them. However, whilst building up the list of
>>> pending commands, we may modify state outside of the request and make
>>> references to the pending request. If we do so and then cancel that
>>> request, external objects then point to the deleted request leading to
>>> both graphical and memory corruption.
>>>
>>> The easiest example is to consider object/VMA tracking. When we mark an
>>> object as active in a request, we store a pointer to this, the most
>>> recent request, in the object. When we want to free that object, we wait
>>> for the most recent request to be idle before proceeding (otherwise the
>>> hardware will write to pages now owned by the system, or we will attempt
>>> to read from those pages before the hardware is finished writing). If
>>> the request was cancelled instead, that wait completes immediately. As a
>>> result, all requests must be committed and not cancelled if the external
>>> state is unknown.
>>>
>>> All that remains of i915_gem_request_cancel() users are just a couple of
>>> extremely unlikely allocation failures, so remove the API entirely.
>>>
>>> A consequence of committing all incomplete requests is that we generate
>>> excess breadcrumbs and fill the ring much more often with dummy work. We
>>> have completely undone the outstanding_last_seqno optimisation.
>>>
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>> Cc: stable@vger.kernel.org
>> Cc: John Harrison <John.C.Harrison@Intel.com>
>>
>> I'd like John's ack on this one too, but the patch itself looks sound.
>> Fast r-b since we've discussed this a while ago already ...
>
> I think this is going to cause a problem with the scheduler. You are
> effectively saying that the execbuf call cannot fail beyond the point of
> allocating a request. If it gets that far then it must go all the way
> and submit the request to the hardware. With a scheduler, that means
> adding it to the scheduler's queues and tracking it all the way through
> the system to completion. If nothing else, that sounds like a lot of
> extra overhead for no actual work. Or worse if the failure is at a point
> where the request cannot be sent further through the system because it
> was something critical that failed then you are really stuffed.
>
> I'm not sure what the other option would be though, short of being able
> to undo the last read/write object tracking updates.

With the chained-ownership code we have in the scheduler, it becomes 
perfectly possible to undo the last-read/write tracking changes.

I'd much rather see any failure during submission rewound and undone, so 
we can just return -EAGAIN at any point and let someone retry if required.
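
Roughly what I have in mind, as a minimal sketch (every helper named
below is hypothetical; none of this is existing API):

    static int submit_execbuf(struct drm_i915_gem_request *req)
    {
            int ret;

            /* record last-read/write object tracking for this request */
            ret = record_object_tracking(req);
            if (ret)
                    return ret;

            ret = emit_commands(req);
            if (ret)
                    goto rollback;

            ret = flush_to_hardware(req);
            if (ret)
                    goto rollback;

            return 0;

    rollback:
            /* Undo the tracking updates so nothing external points at
             * the dead request, then let the caller retry on -EAGAIN. */
            i915_gem_request_rollback(req);
            return -EAGAIN;
    }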

This just looks like a hack to work around not having a properly 
transactional model of request submission :(

.Dave.

* Re: [Intel-gfx] [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching
  2016-04-18  9:50       ` Jani Nikula
  (?)
@ 2016-04-20 13:26       ` Jani Nikula
  -1 siblings, 0 replies; 39+ messages in thread
From: Jani Nikula @ 2016-04-20 13:26 UTC (permalink / raw)
  To: Daniel Vetter, Chris Wilson; +Cc: stable, intel-gfx

On Mon, 18 Apr 2016, Jani Nikula <jani.nikula@linux.intel.com> wrote:
> On Wed, 13 Apr 2016, Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Tue, Apr 12, 2016 at 09:03:05PM +0100, Chris Wilson wrote:
>>> Two concurrent writes into the same register cacheline has the chance of
>>> killing the machine on Ivybridge and other gen7. This includes LRI
>>> emitted from the command parser.  The MI_SET_CONTEXT itself serves as
>>> serialising barrier and prevents the pair of register writes in the first
>>> packet from triggering the fault.  However, if a second switch-context
>>> immediately occurs then we may have two adjacent blocks of LRI to the
>>> same registers which may then trigger the hang. To counteract this we
>>> need to insert a delay after the second register write using SRM.
>>> 
>>> This is easiest to reproduce with something like
>>> igt/gem_ctx_switch/interruptible that triggers back-to-back context
>>> switches (with no operations in between them in the command stream,
>>> which requires the execbuf operation to be interrupted after the
>>> MI_SET_CONTEXT) but can be observed sporadically elsewhere when running
>>> interruptible igt. No reports from the wild though, so it must be of low
>>> enough frequency that no one has correlated the random machine freezes
>>> with i915.ko
>>> 
>>> The issue was introduced with
>>> commit 2c550183476dfa25641309ae9a28d30feed14379 [v3.19]
>>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>>> Date:   Tue Dec 16 10:02:27 2014 +0000
>>> 
>>>     drm/i915: Disable PSMI sleep messages on all rings around context switches
>>> 
>>> Testcase: igt/gem_ctx_switch/render-interruptible #ivb
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
>>> Cc: stable@vger.kernel.org
>>
>> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
>
> FYI, this (*) does not cherry-pick cleanly to drm-intel-fixes.

By which I mean, regardless of cc: stable I'm not doing anything to
backport the commit to fixes. Same with the other patch in this
thread. If someone wants the fixes to 4.6 or stable, they need to do the
backporting.

BR,
Jani.


>
> BR,
> Jani.
>
> (*) Well, not exactly *this* but rather
> https://patchwork.freedesktop.org/patch/80952/ which was not posted on
> the list so I can't reply to it.
>
>
>>
>>> ---
>>>  drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++++++++---
>>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>> 
>>> diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
>>> index fe580cb9501a..e5ad7b21e356 100644
>>> --- a/drivers/gpu/drm/i915/i915_gem_context.c
>>> +++ b/drivers/gpu/drm/i915/i915_gem_context.c
>>> @@ -539,7 +539,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>>  
>>>  	len = 4;
>>>  	if (INTEL_INFO(engine->dev)->gen >= 7)
>>> -		len += 2 + (num_rings ? 4*num_rings + 2 : 0);
>>> +		len += 2 + (num_rings ? 4*num_rings + 6 : 0);
>>>  
>>>  	ret = intel_ring_begin(req, len);
>>>  	if (ret)
>>> @@ -579,6 +579,7 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>>  	if (INTEL_INFO(engine->dev)->gen >= 7) {
>>>  		if (num_rings) {
>>>  			struct intel_engine_cs *signaller;
>>> +			i915_reg_t last_reg = {}; /* keep gcc quiet */
>>>  
>>>  			intel_ring_emit(engine,
>>>  					MI_LOAD_REGISTER_IMM(num_rings));
>>> @@ -586,11 +587,19 @@ mi_set_context(struct drm_i915_gem_request *req, u32 hw_flags)
>>>  				if (signaller == engine)
>>>  					continue;
>>>  
>>> -				intel_ring_emit_reg(engine,
>>> -						    RING_PSMI_CTL(signaller->mmio_base));
>>> +				last_reg = RING_PSMI_CTL(signaller->mmio_base);
>>> +				intel_ring_emit_reg(engine, last_reg);
>>>  				intel_ring_emit(engine,
>>>  						_MASKED_BIT_DISABLE(GEN6_PSMI_SLEEP_MSG_DISABLE));
>>>  			}
>>> +
>>> +			/* Insert a delay before the next switch! */
>>> +			intel_ring_emit(engine,
>>> +					MI_STORE_REGISTER_MEM |
>>> +					MI_SRM_LRM_GLOBAL_GTT);
>>> +			intel_ring_emit_reg(engine, last_reg);
>>> +			intel_ring_emit(engine, engine->scratch.gtt_offset);
>>> +			intel_ring_emit(engine, MI_NOOP);
>>>  		}
>>>  		intel_ring_emit(engine, MI_ARB_ON_OFF | MI_ARB_ENABLE);
>>>  	}
>>> -- 
>>> 2.8.0.rc3
>>> 

-- 
Jani Nikula, Intel Open Source Technology Center


* Re: [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-19 12:35       ` Dave Gordon
@ 2016-04-21 13:04         ` John Harrison
  2016-04-22 22:57           ` John Harrison
  0 siblings, 1 reply; 39+ messages in thread
From: John Harrison @ 2016-04-21 13:04 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On 19/04/2016 13:35, Dave Gordon wrote:
> On 13/04/16 15:21, John Harrison wrote:
>> On 13/04/2016 10:57, Daniel Vetter wrote:
>>> On Tue, Apr 12, 2016 at 09:03:09PM +0100, Chris Wilson wrote:
>>>> Conceptually, each request is a record of a hardware transaction - we
>>>> build up a list of pending commands and then either commit them to
>>>> hardware, or cancel them. However, whilst building up the list of
>>>> pending commands, we may modify state outside of the request and make
>>>> references to the pending request. If we do so and then cancel that
>>>> request, external objects then point to the deleted request leading to
>>>> both graphical and memory corruption.
>>>>
>>>> The easiest example is to consider object/VMA tracking. When we 
>>>> mark an
>>>> object as active in a request, we store a pointer to this, the most
>>>> recent request, in the object. When we want to free that object, we
>>>> wait
>>>> for the most recent request to be idle before proceeding (otherwise 
>>>> the
>>>> hardware will write to pages now owned by the system, or we will 
>>>> attempt
>>>> to read from those pages before the hardware is finished writing). If
>>>> the request was cancelled instead, that wait completes immediately. 
>>>> As a
>>>> result, all requests must be committed and not cancelled if the 
>>>> external
>>>> state is unknown.
>>>>
>>>> All that remains of i915_gem_request_cancel() users are just a 
>>>> couple of
>>>> extremely unlikely allocation failures, so remove the API entirely.
>>>>
>>>> A consequence of committing all incomplete requests is that we 
>>>> generate
>>>> excess breadcrumbs and fill the ring much more often with dummy 
>>>> work. We
>>>> have completely undone the outstanding_last_seqno optimisation.
>>>>
>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>>> Cc: stable@vger.kernel.org
>>> Cc: John Harrison <John.C.Harrison@Intel.com>
>>>
>>> I'd like John's ack on this one too, but the patch itself looks sound.
>>> Fast r-b since we've discussed this a while ago already ...
>>
>> I think this is going to cause a problem with the scheduler. You are
>> effectively saying that the execbuf call cannot fail beyond the point of
>> allocating a request. If it gets that far then it must go all the way
>> and submit the request to the hardware. With a scheduler, that means
>> adding it to the scheduler's queues and tracking it all the way through
>> the system to completion. If nothing else, that sounds like a lot of
>> extra overhead for no actual work. Or worse if the failure is at a point
>> where the request cannot be sent further through the system because it
>> was something critical that failed then you are really stuffed.
>>
>> I'm not sure what the other option would be though, short of being able
>> to undo the last read/write object tracking updates.
>
> With the chained-ownership code we have in the scheduler, it becomes 
> perfectly possible to undo the last-read/write tracking changes.
>
> I'd much rather see any failure during submission rewound and undone, 
> so we can just return -EAGAIN at any point and let someone retry if 
> required.
>
> This just looks like a hack to work around not having a properly 
> transactional model of request submission :(
>
> .Dave.

I was wondering whether it would be possible to delay the tracking
updates until later in the execbuf process, i.e. only do them after all
potential failure points. That would be a much simpler change than
putting in chained ownership.

However, it seems that the patch has already been merged despite this 
discussion and Daniel Vetter wanting an ack first? Is that correct?

John.


* Re: [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-21 13:04         ` John Harrison
@ 2016-04-22 22:57           ` John Harrison
  2016-04-27 18:52             ` Dave Gordon
  0 siblings, 1 reply; 39+ messages in thread
From: John Harrison @ 2016-04-22 22:57 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On 21/04/2016 14:04, John Harrison wrote:
> On 19/04/2016 13:35, Dave Gordon wrote:
>> On 13/04/16 15:21, John Harrison wrote:
>>> On 13/04/2016 10:57, Daniel Vetter wrote:
>>>> On Tue, Apr 12, 2016 at 09:03:09PM +0100, Chris Wilson wrote:
>>>>> Conceptually, each request is a record of a hardware transaction - we
>>>>> build up a list of pending commands and then either commit them to
>>>>> hardware, or cancel them. However, whilst building up the list of
>>>>> pending commands, we may modify state outside of the request and make
>>>>> references to the pending request. If we do so and then cancel that
>>>>> request, external objects then point to the deleted request 
>>>>> leading to
>>>>> both graphical and memory corruption.
>>>>>
>>>>> The easiest example is to consider object/VMA tracking. When we 
>>>>> mark an
>>>>> object as active in a request, we store a pointer to this, the most
>>>>> recent request, in the object. When we want to free that object,
>>>>> we wait
>>>>> for the most recent request to be idle before proceeding 
>>>>> (otherwise the
>>>>> hardware will write to pages now owned by the system, or we will 
>>>>> attempt
>>>>> to read from those pages before the hardware is finished writing). If
>>>>> the request was cancelled instead, that wait completes 
>>>>> immediately. As a
>>>>> result, all requests must be committed and not cancelled if the 
>>>>> external
>>>>> state is unknown.
>>>>>
>>>>> All that remains of i915_gem_request_cancel() users are just a 
>>>>> couple of
>>>>> extremely unlikely allocation failures, so remove the API entirely.
>>>>>
>>>>> A consequence of committing all incomplete requests is that we 
>>>>> generate
>>>>> excess breadcrumbs and fill the ring much more often with dummy 
>>>>> work. We
>>>>> have completely undone the outstanding_last_seqno optimisation.
>>>>>
>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>>>> Cc: stable@vger.kernel.org
>>>> Cc: John Harrison <John.C.Harrison@Intel.com>
>>>>
>>>> I'd like John's ack on this one too, but the patch itself looks sound.
>>>> Fast r-b since we've discussed this a while ago already ...
>>>
>>> I think this is going to cause a problem with the scheduler. You are
>>> effectively saying that the execbuf call cannot fail beyond the 
>>> point of
>>> allocating a request. If it gets that far then it must go all the way
>>> and submit the request to the hardware. With a scheduler, that means
>>> adding it to the scheduler's queues and tracking it all the way through
>>> the system to completion. If nothing else, that sounds like a lot of
>>> extra overhead for no actual work. Or worse if the failure is at a 
>>> point
>>> where the request cannot be sent further through the system because it
>>> was something critical that failed then you are really stuffed.
>>>
>>> I'm not sure what the other option would be though, short of being able
>>> to undo the last read/write object tracking updates.
>>
>> With the chained-ownership code we have in the scheduler, it becomes 
>> perfectly possible to undo the last-read/write tracking changes.
>>
>> I'd much rather see any failure during submission rewound and undone, 
>> so we can just return -EAGAIN at any point and let someone retry if 
>> required.
>>
>> This just looks like a hack to work around not having a properly 
>> transactional model of request submission :(
>>
>> .Dave.
>
> I was wondering whether it would be possible to delay the tracking
> updates until later in the execbuf process, i.e. only do them after
> all potential failure points. That would be a much simpler change
> than putting in chained ownership.
>
> However, it seems that the patch has already been merged despite this 
> discussion and Daniel Vetter wanting an ack first? Is that correct?
>
> John.
>

Dave Gordon and myself had a look at the option of delaying the object 
tracking until beyond the point of last possible failure in the execbuf 
call path. As far as we can tell, it already is. The object tracking 
update occurs in i915_gem_execbuffer_move_to_active(). That function 
cannot return a failure code and is immediately followed (in both LRC 
and legacy mode) by a call to i915_gem_execbuffer_retire_commands(),
which is what flushes out the request to the hardware. So it would
appear that this patch has no effect on object tracking within the 
execbuf code path. If a request cancellation code path was taken then 
the tracking would not have been updated and so the request is 
irrelevant as it has no references to it. If the tracking was updated 
and the request is being referenced then the request was also guaranteed 
to have been submitted and not cancelled.
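
To spell out the ordering we checked, the tail of the execbuf path in
outline (simplified; the eb/params plumbing is elided):

    /* The tracking update is a void function; nothing that can fail
     * sits between it and the flush of the request to hardware. */
    i915_gem_execbuffer_move_to_active(vmas, params->request);
    i915_gem_execbuffer_retire_commands(params);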

Either we are missing something major somewhere or this patch cannot fix 
the stated bug in the stated manner?

I have tried running the failing test myself but when I try to run the 
particular gem_concurrent_blit subtest it tells me that it requires more 
'objects' and/or RAM than I have available. What does one need in order 
to run the test? The bug report also does not say whether it is a 
guaranteed failure every time or a sporadic, once in X many runs kind of 
failure?

Thanks,
John.


* Re: [CI-ping 15/15] drm/i915: Late request cancellations are harmful
  2016-04-22 22:57           ` John Harrison
@ 2016-04-27 18:52             ` Dave Gordon
  0 siblings, 0 replies; 39+ messages in thread
From: Dave Gordon @ 2016-04-27 18:52 UTC (permalink / raw)
  To: John Harrison; +Cc: intel-gfx

On 22/04/16 23:57, John Harrison wrote:
> On 21/04/2016 14:04, John Harrison wrote:
>> On 19/04/2016 13:35, Dave Gordon wrote:
>>> On 13/04/16 15:21, John Harrison wrote:
>>>> On 13/04/2016 10:57, Daniel Vetter wrote:
>>>>> On Tue, Apr 12, 2016 at 09:03:09PM +0100, Chris Wilson wrote:
>>>>>> Conceptually, each request is a record of a hardware transaction - we
>>>>>> build up a list of pending commands and then either commit them to
>>>>>> hardware, or cancel them. However, whilst building up the list of
>>>>>> pending commands, we may modify state outside of the request and make
>>>>>> references to the pending request. If we do so and then cancel that
>>>>>> request, external objects then point to the deleted request
>>>>>> leading to
>>>>>> both graphical and memory corruption.
>>>>>>
>>>>>> The easiest example is to consider object/VMA tracking. When we
>>>>>> mark an
>>>>>> object as active in a request, we store a pointer to this, the most
>>>>>> recent request, in the object. When we want to free that object,
>>>>>> we wait
>>>>>> for the most recent request to be idle before proceeding
>>>>>> (otherwise the
>>>>>> hardware will write to pages now owned by the system, or we will
>>>>>> attempt
>>>>>> to read from those pages before the hardware is finished writing). If
>>>>>> the request was cancelled instead, that wait completes
>>>>>> immediately. As a
>>>>>> result, all requests must be committed and not cancelled if the
>>>>>> external
>>>>>> state is unknown.
>>>>>>
>>>>>> All that remains of i915_gem_request_cancel() users are just a
>>>>>> couple of
>>>>>> extremely unlikely allocation failures, so remove the API entirely.
>>>>>>
>>>>>> A consequence of committing all incomplete requests is that we
>>>>>> generate
>>>>>> excess breadcrumbs and fill the ring much more often with dummy
>>>>>> work. We
>>>>>> have completely undone the outstanding_last_seqno optimisation.
>>>>>>
>>>>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93907
>>>>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>>>>> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
>>>>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>>>>> Cc: stable@vger.kernel.org
>>>>> Cc: John Harrison <John.C.Harrison@Intel.com>
>>>>>
>>>>> I'd like John's ack on this one too, but the patch itself looks sound.
>>>>> Fast r-b since we've discussed this a while ago already ...
>>>>
>>>> I think this is going to cause a problem with the scheduler. You are
>>>> effectively saying that the execbuf call cannot fail beyond the
>>>> point of
>>>> allocating a request. If it gets that far then it must go all the way
>>>> and submit the request to the hardware. With a scheduler, that means
>>>> adding it to the scheduler's queues and tracking it all the way through
>>>> the system to completion. If nothing else, that sounds like a lot of
>>>> extra overhead for no actual work. Or worse if the failure is at a
>>>> point
>>>> where the request cannot be sent further through the system because it
>>>> was something critical that failed then you are really stuffed.
>>>>
>>>> I'm not sure what the other option would be though, short of being able
>>>> to undo the last read/write object tracking updates.
>>>
>>> With the chained-ownership code we have in the scheduler, it becomes
>>> perfectly possible to undo the last-read/write tracking changes.
>>>
>>> I'd much rather see any failure during submission rewound and undone,
>>> so we can just return -EAGAIN at any point and let someone retry if
>>> required.
>>>
>>> This just looks like a hack to work around not having a properly
>>> transactional model of request submission :(
>>>
>>> .Dave.
>>
>> I was wondering whether it would be possible to delay the tracking
>> updates until later in the execbuf process, i.e. only do them after
>> all potential failure points. That would be a much simpler change
>> than putting in chained ownership.
>>
>> However, it seems that the patch has already been merged despite this
>> discussion and Daniel Vetter wanting an ack first? Is that correct?
>>
>> John.
>
> Dave Gordon and myself had a look at the option of delaying the object
> tracking until beyond the point of last possible failure in the execbuf
> call path. As far as we can tell, it already is. The object tracking
> update occurs in i915_gem_execbuffer_move_to_active(). That function
> cannot return a failure code and is immediately followed (in both LRC
> and legacy mode) by a call to i915_gem_execbuffer_retire_commands()
> which is what flushes out the request to the hardware. So it would
> appear that this patch has no effect on object tracking within the
> execbuf code path. If a request cancellation code path was taken then
> the tracking would not have been updated and so the request is
> irrelevant as it has no references to it. If the tracking was updated
> and the request is being referenced then the request was also guaranteed
> to have been submitted and not cancelled.
>
> Either we are missing something major somewhere or this patch cannot fix
> the stated bug in the stated manner?
>
> I have tried running the failing test myself but when I try to run the
> particular gem_concurrent_blit subtest it tells me that it requires more
> 'objects' and/or RAM than I have available. What does one need in order
> to run the test? The bug report also does not say whether it is a
> guaranteed failure every time or a sporadic, once in X many runs kind of
> failure?
>
> Thanks,
> John.

Maybe it's not a cancelled request at all, but a race where submission
HAS been successful, and the object tracking HAS been updated, but the
object is nonetheless not (yet) on the request list. That would leave
the object's list-head UNINITIALISED, and hence crash on retiring :(

My one-line patch for that would fix the crash, but I'd still wonder how 
it got tracked and completed before getting onto the request list.

We could try initialising the list-head to POISON rather than empty, to 
clearly flag removing-before-adding-to-list cases.
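
Something like this, say (illustrative only; ring_list stands in for
whichever per-engine list head the object is tracked on):

    #include <linux/poison.h>

    /* Rather than INIT_LIST_HEAD(&obj->ring_list), leave the links
     * poisoned so that removing an entry that was never added to a
     * request list faults loudly instead of corrupting memory. */
    obj->ring_list.next = LIST_POISON1;
    obj->ring_list.prev = LIST_POISON2;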

.Dave.

end of thread, newest: 2016-04-27 18:52 UTC

Thread overview: 39+ messages
2016-04-12 20:02 [CI-ping 01/15] drm/i915: Force clean compilation with -Werror Chris Wilson
2016-04-12 20:02 ` [CI-ping 02/15] drm/i915: Disentangle i915_drv.h includes Chris Wilson
2016-04-12 20:02 ` [CI-ping 03/15] drm/i915: Add GEM debugging Kconfig option Chris Wilson
2016-04-12 20:02 ` [CI-ping 04/15] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
2016-04-12 20:02 ` [CI-ping 05/15] drm/i915: Simplify checking of GPU reset_counter in display pageflips Chris Wilson
2016-04-12 20:03 ` [CI-ping 06/15] drm/i915: Tighten reset_counter for reset status Chris Wilson
2016-04-12 20:03 ` [CI-ping 07/15] drm/i915: Store the reset counter when constructing a request Chris Wilson
2016-04-12 20:03 ` [CI-ping 08/15] drm/i915: Simplify reset_counter handling during atomic modesetting Chris Wilson
2016-04-12 20:03 ` [CI-ping 09/15] drm/i915: Prevent leaking of -EIO from i915_wait_request() Chris Wilson
2016-04-12 20:03 ` [CI-ping 10/15] drm/i915: Suppress error message when GPU resets are disabled Chris Wilson
2016-04-12 20:03 ` [CI-ping 11/15] drm/i915: Prevent machine death on Ivybridge context switching Chris Wilson
2016-04-13  9:33   ` Daniel Vetter
2016-04-13  9:33     ` Daniel Vetter
2016-04-18  9:50     ` [Intel-gfx] " Jani Nikula
2016-04-18  9:50       ` Jani Nikula
2016-04-20 13:26       ` [Intel-gfx] " Jani Nikula
2016-04-12 20:03 ` [CI-ping 12/15] drm/i915: Force ringbuffers to not be at offset 0 Chris Wilson
2016-04-13  9:34   ` [Intel-gfx] " Daniel Vetter
2016-04-12 20:03 ` [CI-ping 13/15] drm/i915: Move the mb() following release-mmap into release-mmap Chris Wilson
2016-04-12 20:03 ` [CI-ping 14/15] drm/i915: Reorganise legacy context switch to cope with late failure Chris Wilson
2016-04-13  9:59   ` Daniel Vetter
2016-04-13 10:05     ` Chris Wilson
2016-04-13 14:16       ` [PATCH 1/2] drm/i915: Split out !RCS legacy context switching Chris Wilson
2016-04-13 14:16         ` [PATCH 2/2] drm/i915: Reorganise legacy context switch to cope with late failure Chris Wilson
2016-04-13 15:05           ` Daniel Vetter
2016-04-13 15:18             ` Chris Wilson
2016-04-13 14:56         ` [PATCH 1/2] drm/i915: Split out !RCS legacy context switching Daniel Vetter
2016-04-13 15:04           ` Chris Wilson
2016-04-12 20:03 ` [CI-ping 15/15] drm/i915: Late request cancellations are harmful Chris Wilson
2016-04-13  9:57   ` Daniel Vetter
2016-04-13  9:57     ` Daniel Vetter
2016-04-13 14:21     ` John Harrison
2016-04-19 12:35       ` Dave Gordon
2016-04-21 13:04         ` John Harrison
2016-04-22 22:57           ` John Harrison
2016-04-27 18:52             ` Dave Gordon
2016-04-18  9:46     ` [Intel-gfx] " Jani Nikula
2016-04-18  9:46       ` Jani Nikula
2016-04-14  8:45 ` ✗ Fi.CI.BAT: failure for series starting with [CI-ping,01/15] drm/i915: Force clean compilation with -Werror (rev3) Patchwork
