* [RFC 00/12] Convert requests to use struct fence
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

There is a construct in the Linux kernel called 'struct fence' that is
intended to keep track of work that is executed on hardware, i.e. it
solves the basic problem that the driver's 'struct
drm_i915_gem_request' is trying to address. The request structure does
quite a lot more than simply track execution progress, so it is very
definitely still required. However, the basic completion-status side
can be converted to use the ready-made fence implementation and gain
all the advantages that provides.
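
In outline, the conversion embeds a fence in the request and routes
completion queries through it. A minimal sketch of the shape of the
change (not the exact patch, which comes later in the series):

    struct drm_i915_gem_request {
    	struct fence fence;	/* completion status now lives here */
    	/* ...all other request state is unchanged... */
    };

    static bool i915_gem_request_completed(struct drm_i915_gem_request *req)
    {
    	return fence_is_signaled(&req->fence);
    }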

Using the struct fence object also has the advantage that the fence
can be used outside of the i915 driver (by other drivers or by
userland applications). That is the basis of the dma-buf
synchronisation API and allows asynchronous tracking of work
completion. In this case, it allows applications to be signalled
directly when a batch buffer completes, without having to make an
IOCTL call into the driver.
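
For instance, a userland process can wait for a batch with nothing
more than poll() on the fence fd (a sketch; it assumes the execbuffer
call has already returned a sync fence fd, which is the uAPI added
later in this series):

    #include <poll.h>

    /* Returns >0 once the batch has completed, 0 on timeout, -1 if
     * poll() itself failed; POLLERR in revents means the fence
     * signalled with an error status.
     */
    static int wait_for_batch(int fence_fd, int timeout_ms)
    {
    	struct pollfd pfd = { .fd = fence_fd, .events = POLLIN };

    	return poll(&pfd, 1, timeout_ms);
    }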

This is work that was planned since the conversion of the driver from
being seqno value based to being request structure based. This patch
series does that work.

An IGT test to exercise the fence support from userland is in
progress and will follow. Android already makes extensive use of
fences for display composition. Real-world Linux usage is planned in
the form of Jesse's page table sharing / bufferless execbuf support.
There is also a plan that Wayland (and others) could make use of it in
a similar manner to Android.

v2: Updated for review comments by various people and to add support
for Android style 'native sync'.

v3: Updated from review comments by Tvrtko Ursulin. Also moved sync
framework out of staging and improved request completion handling.

[Patches against drm-intel-nightly tree fetched 17/11/2015]

John Harrison (9):
  staging/android/sync: Move sync framework out of staging
  drm/i915: Convert requests to use struct fence
  drm/i915: Removed now redundant parameter to i915_gem_request_completed()
  drm/i915: Add per context timelines to fence object
  drm/i915: Delay the freeing of requests until retire time
  drm/i915: Interrupt driven fences
  drm/i915: Updated request structure tracing
  drm/i915: Add sync framework support to execbuff IOCTL
  drm/i915: Cache last IRQ seqno to reduce IRQ overhead

Maarten Lankhorst (1):
  staging/android/sync: add sync_fence_create_dma

Peter Lawthers (1):
  android/sync: Fix reversed sense of signaled fence

Tvrtko Ursulin (1):
  staging/android/sync: Support sync points created from dma-fences

 drivers/android/Kconfig                    |  28 ++
 drivers/android/Makefile                   |   2 +
 drivers/android/sw_sync.c                  | 260 ++++++++++
 drivers/android/sw_sync.h                  |  59 +++
 drivers/android/sync.c                     | 734 +++++++++++++++++++++++++++++
 drivers/android/sync.h                     | 387 +++++++++++++++
 drivers/android/sync_debug.c               | 256 ++++++++++
 drivers/android/trace/sync.h               |  82 ++++
 drivers/gpu/drm/i915/Kconfig               |   3 +
 drivers/gpu/drm/i915/i915_debugfs.c        |   7 +-
 drivers/gpu/drm/i915/i915_drv.h            |  75 +--
 drivers/gpu/drm/i915/i915_gem.c            | 437 ++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_context.c    |  15 +-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c |  95 +++-
 drivers/gpu/drm/i915/i915_irq.c            |   2 +-
 drivers/gpu/drm/i915/i915_trace.h          |  13 +-
 drivers/gpu/drm/i915/intel_display.c       |   4 +-
 drivers/gpu/drm/i915/intel_lrc.c           |  13 +
 drivers/gpu/drm/i915/intel_pm.c            |   6 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c    |   5 +
 drivers/gpu/drm/i915/intel_ringbuffer.h    |   9 +
 drivers/staging/android/Kconfig            |  28 --
 drivers/staging/android/Makefile           |   2 -
 drivers/staging/android/sw_sync.c          | 260 ----------
 drivers/staging/android/sw_sync.h          |  59 ---
 drivers/staging/android/sync.c             | 729 ----------------------------
 drivers/staging/android/sync.h             | 356 --------------
 drivers/staging/android/sync_debug.c       | 254 ----------
 drivers/staging/android/trace/sync.h       |  82 ----
 drivers/staging/android/uapi/sw_sync.h     |  32 --
 drivers/staging/android/uapi/sync.h        |  97 ----
 include/uapi/Kbuild                        |   1 +
 include/uapi/drm/i915_drm.h                |  16 +-
 include/uapi/sync/Kbuild                   |   3 +
 include/uapi/sync/sw_sync.h                |  32 ++
 include/uapi/sync/sync.h                   |  97 ++++
 36 files changed, 2569 insertions(+), 1971 deletions(-)
 create mode 100644 drivers/android/sw_sync.c
 create mode 100644 drivers/android/sw_sync.h
 create mode 100644 drivers/android/sync.c
 create mode 100644 drivers/android/sync.h
 create mode 100644 drivers/android/sync_debug.c
 create mode 100644 drivers/android/trace/sync.h
 delete mode 100644 drivers/staging/android/sw_sync.c
 delete mode 100644 drivers/staging/android/sw_sync.h
 delete mode 100644 drivers/staging/android/sync.c
 delete mode 100644 drivers/staging/android/sync.h
 delete mode 100644 drivers/staging/android/sync_debug.c
 delete mode 100644 drivers/staging/android/trace/sync.h
 delete mode 100644 drivers/staging/android/uapi/sw_sync.h
 delete mode 100644 drivers/staging/android/uapi/sync.h
 create mode 100644 include/uapi/sync/Kbuild
 create mode 100644 include/uapi/sync/sw_sync.h
 create mode 100644 include/uapi/sync/sync.h

-- 
1.9.1

* [RFC 01/12] staging/android/sync: Support sync points created from dma-fences
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX
  Cc: devel, Greg Kroah-Hartman, Arve Hjønnevåg, Riley Andrews

From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

The debug output assumes all sync points are built on top of Android sync
points; once we start creating them from dma-fences it will NULL pointer
dereference unless taught about the difference.
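
For reference, the assumption lives in sync_pt_parent(), which is a
bare container_of() on the fence's lock (copied from sync.h, with
comments added here to show the failure mode):

    static inline struct sync_timeline *sync_pt_parent(struct sync_pt *pt)
    {
    	/* Only valid when the fence really is embedded in a sync_pt
    	 * whose lock lives inside a sync_timeline. For a sync point
    	 * wrapped around a plain dma-fence the lock points into the
    	 * originating driver instead, so the computed 'parent' is
    	 * garbage and the first dereference (parent->name in
    	 * sync_print_pt()) faults.
    	 */
    	return container_of(pt->base.lock, struct sync_timeline,
    			    child_list_lock);
    }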

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: devel@driverdev.osuosl.org
Cc: Riley Andrews <riandrews@android.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arve Hjønnevåg <arve@android.com>
---
 drivers/staging/android/sync_debug.c | 42 +++++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 20 deletions(-)

diff --git a/drivers/staging/android/sync_debug.c b/drivers/staging/android/sync_debug.c
index 91ed2c4..f45d13c 100644
--- a/drivers/staging/android/sync_debug.c
+++ b/drivers/staging/android/sync_debug.c
@@ -82,36 +82,42 @@ static const char *sync_status_str(int status)
 	return "error";
 }
 
-static void sync_print_pt(struct seq_file *s, struct sync_pt *pt, bool fence)
+static void sync_print_pt(struct seq_file *s, struct fence *pt, bool fence)
 {
 	int status = 1;
-	struct sync_timeline *parent = sync_pt_parent(pt);
 
-	if (fence_is_signaled_locked(&pt->base))
-		status = pt->base.status;
+	if (fence_is_signaled_locked(pt))
+		status = pt->status;
 
 	seq_printf(s, "  %s%spt %s",
-		   fence ? parent->name : "",
+		   fence && pt->ops->get_timeline_name ?
+		   pt->ops->get_timeline_name(pt) : "",
 		   fence ? "_" : "",
 		   sync_status_str(status));
 
 	if (status <= 0) {
 		struct timespec64 ts64 =
-			ktime_to_timespec64(pt->base.timestamp);
+			ktime_to_timespec64(pt->timestamp);
 
 		seq_printf(s, "@%lld.%09ld", (s64)ts64.tv_sec, ts64.tv_nsec);
 	}
 
-	if (parent->ops->timeline_value_str &&
-	    parent->ops->pt_value_str) {
+	if ((!fence || pt->ops->timeline_value_str) &&
+	    pt->ops->fence_value_str) {
 		char value[64];
+		bool success;
 
-		parent->ops->pt_value_str(pt, value, sizeof(value));
-		seq_printf(s, ": %s", value);
-		if (fence) {
-			parent->ops->timeline_value_str(parent, value,
-						    sizeof(value));
-			seq_printf(s, " / %s", value);
+		pt->ops->fence_value_str(pt, value, sizeof(value));
+		success = strlen(value);
+
+		if (success)
+			seq_printf(s, ": %s", value);
+
+		if (success && fence) {
+			pt->ops->timeline_value_str(pt, value, sizeof(value));
+
+			if (strlen(value))
+				seq_printf(s, " / %s", value);
 		}
 	}
 
@@ -138,7 +144,7 @@ static void sync_print_obj(struct seq_file *s, struct sync_timeline *obj)
 	list_for_each(pos, &obj->child_list_head) {
 		struct sync_pt *pt =
 			container_of(pos, struct sync_pt, child_list);
-		sync_print_pt(s, pt, false);
+		sync_print_pt(s, &pt->base, false);
 	}
 	spin_unlock_irqrestore(&obj->child_list_lock, flags);
 }
@@ -153,11 +159,7 @@ static void sync_print_fence(struct seq_file *s, struct sync_fence *fence)
 		   sync_status_str(atomic_read(&fence->status)));
 
 	for (i = 0; i < fence->num_fences; ++i) {
-		struct sync_pt *pt =
-			container_of(fence->cbs[i].sync_pt,
-				     struct sync_pt, base);
-
-		sync_print_pt(s, pt, true);
+		sync_print_pt(s, fence->cbs[i].sync_pt, true);
 	}
 
 	spin_lock_irqsave(&fence->wq.lock, flags);
-- 
1.9.1

* [RFC 02/12] staging/android/sync: add sync_fence_create_dma
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX
  Cc: devel, Greg Kroah-Hartman, Arve Hjønnevåg,
	Riley Andrews, Maarten Lankhorst

From: Maarten Lankhorst <maarten.lankhorst@canonical.com>

This allows users of dma fences to create an Android fence.

v2: Added kerneldoc. (Tvrtko Ursulin).
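
Typical use, sketched (the i915 hook-up comes later in the series;
'my_fence' stands in for any struct fence the driver has already
initialised with fence_init()):

    struct sync_fence *sf;
    int fd = get_unused_fd_flags(O_CLOEXEC);

    if (fd < 0)
    	return fd;

    sf = sync_fence_create_dma("my-batch", my_fence);
    if (!sf) {
    	put_unused_fd(fd);
    	return -ENOMEM;
    }

    /* Hand the fence to userland; it can now be waited on with
     * poll() or the SYNC_IOC_WAIT ioctl.
     */
    sync_fence_install(sf, fd);
    return fd;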

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
Cc: devel@driverdev.osuosl.org
Cc: Riley Andrews <riandrews@android.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Arve Hjønnevåg <arve@android.com>
---
 drivers/staging/android/sync.c | 13 +++++++++----
 drivers/staging/android/sync.h | 12 +++++++++++-
 2 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c
index f83e00c..7f0e919 100644
--- a/drivers/staging/android/sync.c
+++ b/drivers/staging/android/sync.c
@@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f, struct fence_cb *cb)
 }
 
 /* TODO: implement a create which takes more that one sync_pt */
-struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
+struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt)
 {
 	struct sync_fence *fence;
 
@@ -199,16 +199,21 @@ struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
 	fence->num_fences = 1;
 	atomic_set(&fence->status, 1);
 
-	fence->cbs[0].sync_pt = &pt->base;
+	fence->cbs[0].sync_pt = pt;
 	fence->cbs[0].fence = fence;
-	if (fence_add_callback(&pt->base, &fence->cbs[0].cb,
-			       fence_check_cb_func))
+	if (fence_add_callback(pt, &fence->cbs[0].cb, fence_check_cb_func))
 		atomic_dec(&fence->status);
 
 	sync_fence_debug_add(fence);
 
 	return fence;
 }
+EXPORT_SYMBOL(sync_fence_create_dma);
+
+struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
+{
+	return sync_fence_create_dma(name, &pt->base);
+}
 EXPORT_SYMBOL(sync_fence_create);
 
 struct sync_fence *sync_fence_fdget(int fd)
diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h
index 61f8a3a..798cd56 100644
--- a/drivers/staging/android/sync.h
+++ b/drivers/staging/android/sync.h
@@ -250,10 +250,20 @@ void sync_pt_free(struct sync_pt *pt);
  * @pt:		sync_pt to add to the fence
  *
  * Creates a fence containg @pt.  Once this is called, the fence takes
- * ownership of @pt.
+ * a reference on @pt.
  */
 struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt);
 
+/**
+ * sync_fence_create_dma() - creates a sync fence from dma-fence
+ * @name:	name of fence to create
+ * @pt:	dma-fence to add to the fence
+ *
+ * Creates a fence containing @pt.  Once this is called, the fence takes
+ * a reference on @pt.
+ */
+struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt);
+
 /*
  * API for sync_fence consumers
  */
-- 
1.9.1

* [RFC 03/12] staging/android/sync: Move sync framework out of staging
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The sync framework is now used by the i915 driver, so it can be moved
out of staging and into the regular tree. Also, the public interfaces
can actually be made public and exported.

v3: New patch for series.
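
With the exports in place, an in-tree driver (or test) can drive the
framework directly. A sketch using the sw_sync helpers below (error
checking elided for brevity):

    /* Create a timeline, emit a sync point at value 1, wrap it in a
     * fence, then signal it by advancing the timeline to that value.
     */
    struct sw_sync_timeline *tl = sw_sync_timeline_create("example");
    struct sync_pt *pt = sw_sync_pt_create(tl, 1);
    struct sync_fence *sf = sync_fence_create("example-fence", pt);

    sw_sync_timeline_inc(tl, 1);	/* the fence now signals */

    sync_fence_put(sf);
    sync_timeline_destroy(&tl->obj);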

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Geoff Miller <geoff.miller@intel.com>
---
 drivers/android/Kconfig                |  28 ++
 drivers/android/Makefile               |   2 +
 drivers/android/sw_sync.c              | 260 ++++++++++++
 drivers/android/sw_sync.h              |  59 +++
 drivers/android/sync.c                 | 734 +++++++++++++++++++++++++++++++++
 drivers/android/sync.h                 | 366 ++++++++++++++++
 drivers/android/sync_debug.c           | 256 ++++++++++++
 drivers/android/trace/sync.h           |  82 ++++
 drivers/staging/android/Kconfig        |  28 --
 drivers/staging/android/Makefile       |   2 -
 drivers/staging/android/sw_sync.c      | 260 ------------
 drivers/staging/android/sw_sync.h      |  59 ---
 drivers/staging/android/sync.c         | 734 ---------------------------------
 drivers/staging/android/sync.h         | 366 ----------------
 drivers/staging/android/sync_debug.c   | 256 ------------
 drivers/staging/android/trace/sync.h   |  82 ----
 drivers/staging/android/uapi/sw_sync.h |  32 --
 drivers/staging/android/uapi/sync.h    |  97 -----
 include/uapi/Kbuild                    |   1 +
 include/uapi/sync/Kbuild               |   3 +
 include/uapi/sync/sw_sync.h            |  32 ++
 include/uapi/sync/sync.h               |  97 +++++
 22 files changed, 1920 insertions(+), 1916 deletions(-)
 create mode 100644 drivers/android/sw_sync.c
 create mode 100644 drivers/android/sw_sync.h
 create mode 100644 drivers/android/sync.c
 create mode 100644 drivers/android/sync.h
 create mode 100644 drivers/android/sync_debug.c
 create mode 100644 drivers/android/trace/sync.h
 delete mode 100644 drivers/staging/android/sw_sync.c
 delete mode 100644 drivers/staging/android/sw_sync.h
 delete mode 100644 drivers/staging/android/sync.c
 delete mode 100644 drivers/staging/android/sync.h
 delete mode 100644 drivers/staging/android/sync_debug.c
 delete mode 100644 drivers/staging/android/trace/sync.h
 delete mode 100644 drivers/staging/android/uapi/sw_sync.h
 delete mode 100644 drivers/staging/android/uapi/sync.h
 create mode 100644 include/uapi/sync/Kbuild
 create mode 100644 include/uapi/sync/sw_sync.h
 create mode 100644 include/uapi/sync/sync.h

diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig
index bdfc6c6..9edcd8f 100644
--- a/drivers/android/Kconfig
+++ b/drivers/android/Kconfig
@@ -32,6 +32,34 @@ config ANDROID_BINDER_IPC_32BIT
 
 	  Note that enabling this will break newer Android user-space.
 
+config SYNC
+	bool "Synchronization framework"
+	default n
+	select ANON_INODES
+	select DMA_SHARED_BUFFER
+	---help---
+	  This option enables the framework for synchronization between multiple
+	  drivers.  Sync implementations can take advantage of hardware
+	  synchronization built into devices like GPUs.
+
+config SW_SYNC
+	bool "Software synchronization objects"
+	default n
+	depends on SYNC
+	---help---
+	  A sync object driver that uses a 32bit counter to coordinate
+	  synchronization.  Useful when there is no hardware primitive backing
+	  the synchronization.
+
+config SW_SYNC_USER
+	bool "Userspace API for SW_SYNC"
+	default n
+	depends on SW_SYNC
+	---help---
+	  Provides a user space API to the sw sync object.
+	  *WARNING* improper use of this can result in deadlocking kernel
+	  drivers from userspace.
+
 endif # if ANDROID
 
 endmenu
diff --git a/drivers/android/Makefile b/drivers/android/Makefile
index 3b7e4b0..a1465dd 100644
--- a/drivers/android/Makefile
+++ b/drivers/android/Makefile
@@ -1,3 +1,5 @@
 ccflags-y += -I$(src)			# needed for trace events
 
 obj-$(CONFIG_ANDROID_BINDER_IPC)	+= binder.o
+obj-$(CONFIG_SYNC)			+= sync.o sync_debug.o
+obj-$(CONFIG_SW_SYNC)			+= sw_sync.o
diff --git a/drivers/android/sw_sync.c b/drivers/android/sw_sync.c
new file mode 100644
index 0000000..c4ff167
--- /dev/null
+++ b/drivers/android/sw_sync.c
@@ -0,0 +1,260 @@
+/*
+ * drivers/base/sw_sync.c
+ *
+ * Copyright (C) 2012 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/export.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/miscdevice.h>
+#include <linux/syscalls.h>
+#include <linux/uaccess.h>
+
+#include "sw_sync.h"
+
+static int sw_sync_cmp(u32 a, u32 b)
+{
+	if (a == b)
+		return 0;
+
+	return ((s32)a - (s32)b) < 0 ? -1 : 1;
+}
+
+struct sync_pt *sw_sync_pt_create(struct sw_sync_timeline *obj, u32 value)
+{
+	struct sw_sync_pt *pt;
+
+	pt = (struct sw_sync_pt *)
+		sync_pt_create(&obj->obj, sizeof(struct sw_sync_pt));
+
+	pt->value = value;
+
+	return (struct sync_pt *)pt;
+}
+EXPORT_SYMBOL(sw_sync_pt_create);
+
+static struct sync_pt *sw_sync_pt_dup(struct sync_pt *sync_pt)
+{
+	struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt;
+	struct sw_sync_timeline *obj =
+		(struct sw_sync_timeline *)sync_pt_parent(sync_pt);
+
+	return (struct sync_pt *)sw_sync_pt_create(obj, pt->value);
+}
+
+static int sw_sync_pt_has_signaled(struct sync_pt *sync_pt)
+{
+	struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt;
+	struct sw_sync_timeline *obj =
+		(struct sw_sync_timeline *)sync_pt_parent(sync_pt);
+
+	return sw_sync_cmp(obj->value, pt->value) >= 0;
+}
+
+static int sw_sync_pt_compare(struct sync_pt *a, struct sync_pt *b)
+{
+	struct sw_sync_pt *pt_a = (struct sw_sync_pt *)a;
+	struct sw_sync_pt *pt_b = (struct sw_sync_pt *)b;
+
+	return sw_sync_cmp(pt_a->value, pt_b->value);
+}
+
+static int sw_sync_fill_driver_data(struct sync_pt *sync_pt,
+				    void *data, int size)
+{
+	struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt;
+
+	if (size < sizeof(pt->value))
+		return -ENOMEM;
+
+	memcpy(data, &pt->value, sizeof(pt->value));
+
+	return sizeof(pt->value);
+}
+
+static void sw_sync_timeline_value_str(struct sync_timeline *sync_timeline,
+				       char *str, int size)
+{
+	struct sw_sync_timeline *timeline =
+		(struct sw_sync_timeline *)sync_timeline;
+	snprintf(str, size, "%d", timeline->value);
+}
+
+static void sw_sync_pt_value_str(struct sync_pt *sync_pt,
+				 char *str, int size)
+{
+	struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt;
+
+	snprintf(str, size, "%d", pt->value);
+}
+
+static struct sync_timeline_ops sw_sync_timeline_ops = {
+	.driver_name = "sw_sync",
+	.dup = sw_sync_pt_dup,
+	.has_signaled = sw_sync_pt_has_signaled,
+	.compare = sw_sync_pt_compare,
+	.fill_driver_data = sw_sync_fill_driver_data,
+	.timeline_value_str = sw_sync_timeline_value_str,
+	.pt_value_str = sw_sync_pt_value_str,
+};
+
+struct sw_sync_timeline *sw_sync_timeline_create(const char *name)
+{
+	struct sw_sync_timeline *obj = (struct sw_sync_timeline *)
+		sync_timeline_create(&sw_sync_timeline_ops,
+				     sizeof(struct sw_sync_timeline),
+				     name);
+
+	return obj;
+}
+EXPORT_SYMBOL(sw_sync_timeline_create);
+
+void sw_sync_timeline_inc(struct sw_sync_timeline *obj, u32 inc)
+{
+	obj->value += inc;
+
+	sync_timeline_signal(&obj->obj);
+}
+EXPORT_SYMBOL(sw_sync_timeline_inc);
+
+#ifdef CONFIG_SW_SYNC_USER
+/* *WARNING*
+ *
+ * improper use of this can result in deadlocking kernel drivers from userspace.
+ */
+
+/* opening sw_sync creates a new sync obj */
+static int sw_sync_open(struct inode *inode, struct file *file)
+{
+	struct sw_sync_timeline *obj;
+	char task_comm[TASK_COMM_LEN];
+
+	get_task_comm(task_comm, current);
+
+	obj = sw_sync_timeline_create(task_comm);
+	if (!obj)
+		return -ENOMEM;
+
+	file->private_data = obj;
+
+	return 0;
+}
+
+static int sw_sync_release(struct inode *inode, struct file *file)
+{
+	struct sw_sync_timeline *obj = file->private_data;
+
+	sync_timeline_destroy(&obj->obj);
+	return 0;
+}
+
+static long sw_sync_ioctl_create_fence(struct sw_sync_timeline *obj,
+				       unsigned long arg)
+{
+	int fd = get_unused_fd_flags(O_CLOEXEC);
+	int err;
+	struct sync_pt *pt;
+	struct sync_fence *fence;
+	struct sw_sync_create_fence_data data;
+
+	if (fd < 0)
+		return fd;
+
+	if (copy_from_user(&data, (void __user *)arg, sizeof(data))) {
+		err = -EFAULT;
+		goto err;
+	}
+
+	pt = sw_sync_pt_create(obj, data.value);
+	if (!pt) {
+		err = -ENOMEM;
+		goto err;
+	}
+
+	data.name[sizeof(data.name) - 1] = '\0';
+	fence = sync_fence_create(data.name, pt);
+	if (!fence) {
+		sync_pt_free(pt);
+		err = -ENOMEM;
+		goto err;
+	}
+
+	data.fence = fd;
+	if (copy_to_user((void __user *)arg, &data, sizeof(data))) {
+		sync_fence_put(fence);
+		err = -EFAULT;
+		goto err;
+	}
+
+	sync_fence_install(fence, fd);
+
+	return 0;
+
+err:
+	put_unused_fd(fd);
+	return err;
+}
+
+static long sw_sync_ioctl_inc(struct sw_sync_timeline *obj, unsigned long arg)
+{
+	u32 value;
+
+	if (copy_from_user(&value, (void __user *)arg, sizeof(value)))
+		return -EFAULT;
+
+	sw_sync_timeline_inc(obj, value);
+
+	return 0;
+}
+
+static long sw_sync_ioctl(struct file *file, unsigned int cmd,
+			  unsigned long arg)
+{
+	struct sw_sync_timeline *obj = file->private_data;
+
+	switch (cmd) {
+	case SW_SYNC_IOC_CREATE_FENCE:
+		return sw_sync_ioctl_create_fence(obj, arg);
+
+	case SW_SYNC_IOC_INC:
+		return sw_sync_ioctl_inc(obj, arg);
+
+	default:
+		return -ENOTTY;
+	}
+}
+
+static const struct file_operations sw_sync_fops = {
+	.owner = THIS_MODULE,
+	.open = sw_sync_open,
+	.release = sw_sync_release,
+	.unlocked_ioctl = sw_sync_ioctl,
+	.compat_ioctl = sw_sync_ioctl,
+};
+
+static struct miscdevice sw_sync_dev = {
+	.minor	= MISC_DYNAMIC_MINOR,
+	.name	= "sw_sync",
+	.fops	= &sw_sync_fops,
+};
+
+static int __init sw_sync_device_init(void)
+{
+	return misc_register(&sw_sync_dev);
+}
+device_initcall(sw_sync_device_init);
+
+#endif /* CONFIG_SW_SYNC_USER */
diff --git a/drivers/android/sw_sync.h b/drivers/android/sw_sync.h
new file mode 100644
index 0000000..4bf8b86
--- /dev/null
+++ b/drivers/android/sw_sync.h
@@ -0,0 +1,59 @@
+/*
+ * include/linux/sw_sync.h
+ *
+ * Copyright (C) 2012 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef _LINUX_SW_SYNC_H
+#define _LINUX_SW_SYNC_H
+
+#include <linux/types.h>
+#include <linux/kconfig.h>
+#include <uapi/sync/sw_sync.h>
+#include "sync.h"
+
+struct sw_sync_timeline {
+	struct	sync_timeline	obj;
+
+	u32			value;
+};
+
+struct sw_sync_pt {
+	struct sync_pt		pt;
+
+	u32			value;
+};
+
+#if IS_ENABLED(CONFIG_SW_SYNC)
+struct sw_sync_timeline *sw_sync_timeline_create(const char *name);
+void sw_sync_timeline_inc(struct sw_sync_timeline *obj, u32 inc);
+
+struct sync_pt *sw_sync_pt_create(struct sw_sync_timeline *obj, u32 value);
+#else
+static inline struct sw_sync_timeline *sw_sync_timeline_create(const char *name)
+{
+	return NULL;
+}
+
+static inline void sw_sync_timeline_inc(struct sw_sync_timeline *obj, u32 inc)
+{
+}
+
+static inline struct sync_pt *sw_sync_pt_create(struct sw_sync_timeline *obj,
+						u32 value)
+{
+	return NULL;
+}
+#endif /* IS_ENABLED(CONFIG_SW_SYNC) */
+
+#endif /* _LINUX_SW_SYNC_H */
diff --git a/drivers/android/sync.c b/drivers/android/sync.c
new file mode 100644
index 0000000..7f0e919
--- /dev/null
+++ b/drivers/android/sync.c
@@ -0,0 +1,734 @@
+/*
+ * drivers/base/sync.c
+ *
+ * Copyright (C) 2012 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/debugfs.h>
+#include <linux/export.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/kernel.h>
+#include <linux/poll.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/anon_inodes.h>
+
+#include "sync.h"
+
+#define CREATE_TRACE_POINTS
+#include "trace/sync.h"
+
+static const struct fence_ops android_fence_ops;
+static const struct file_operations sync_fence_fops;
+
+struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops,
+					   int size, const char *name)
+{
+	struct sync_timeline *obj;
+
+	if (size < sizeof(struct sync_timeline))
+		return NULL;
+
+	obj = kzalloc(size, GFP_KERNEL);
+	if (obj == NULL)
+		return NULL;
+
+	kref_init(&obj->kref);
+	obj->ops = ops;
+	obj->context = fence_context_alloc(1);
+	strlcpy(obj->name, name, sizeof(obj->name));
+
+	INIT_LIST_HEAD(&obj->child_list_head);
+	INIT_LIST_HEAD(&obj->active_list_head);
+	spin_lock_init(&obj->child_list_lock);
+
+	sync_timeline_debug_add(obj);
+
+	return obj;
+}
+EXPORT_SYMBOL(sync_timeline_create);
+
+static void sync_timeline_free(struct kref *kref)
+{
+	struct sync_timeline *obj =
+		container_of(kref, struct sync_timeline, kref);
+
+	sync_timeline_debug_remove(obj);
+
+	if (obj->ops->release_obj)
+		obj->ops->release_obj(obj);
+
+	kfree(obj);
+}
+
+static void sync_timeline_get(struct sync_timeline *obj)
+{
+	kref_get(&obj->kref);
+}
+
+static void sync_timeline_put(struct sync_timeline *obj)
+{
+	kref_put(&obj->kref, sync_timeline_free);
+}
+
+void sync_timeline_destroy(struct sync_timeline *obj)
+{
+	obj->destroyed = true;
+	/*
+	 * Ensure timeline is marked as destroyed before
+	 * changing timeline's fences status.
+	 */
+	smp_wmb();
+
+	/*
+	 * signal any children that their parent is going away.
+	 */
+	sync_timeline_signal(obj);
+	sync_timeline_put(obj);
+}
+EXPORT_SYMBOL(sync_timeline_destroy);
+
+void sync_timeline_signal(struct sync_timeline *obj)
+{
+	unsigned long flags;
+	LIST_HEAD(signaled_pts);
+	struct sync_pt *pt, *next;
+
+	trace_sync_timeline(obj);
+
+	spin_lock_irqsave(&obj->child_list_lock, flags);
+
+	list_for_each_entry_safe(pt, next, &obj->active_list_head,
+				 active_list) {
+		if (fence_is_signaled_locked(&pt->base))
+			list_del_init(&pt->active_list);
+	}
+
+	spin_unlock_irqrestore(&obj->child_list_lock, flags);
+}
+EXPORT_SYMBOL(sync_timeline_signal);
+
+struct sync_pt *sync_pt_create(struct sync_timeline *obj, int size)
+{
+	unsigned long flags;
+	struct sync_pt *pt;
+
+	if (size < sizeof(struct sync_pt))
+		return NULL;
+
+	pt = kzalloc(size, GFP_KERNEL);
+	if (pt == NULL)
+		return NULL;
+
+	spin_lock_irqsave(&obj->child_list_lock, flags);
+	sync_timeline_get(obj);
+	fence_init(&pt->base, &android_fence_ops, &obj->child_list_lock,
+		   obj->context, ++obj->value);
+	list_add_tail(&pt->child_list, &obj->child_list_head);
+	INIT_LIST_HEAD(&pt->active_list);
+	spin_unlock_irqrestore(&obj->child_list_lock, flags);
+	return pt;
+}
+EXPORT_SYMBOL(sync_pt_create);
+
+void sync_pt_free(struct sync_pt *pt)
+{
+	fence_put(&pt->base);
+}
+EXPORT_SYMBOL(sync_pt_free);
+
+static struct sync_fence *sync_fence_alloc(int size, const char *name)
+{
+	struct sync_fence *fence;
+
+	fence = kzalloc(size, GFP_KERNEL);
+	if (fence == NULL)
+		return NULL;
+
+	fence->file = anon_inode_getfile("sync_fence", &sync_fence_fops,
+					 fence, 0);
+	if (IS_ERR(fence->file))
+		goto err;
+
+	kref_init(&fence->kref);
+	strlcpy(fence->name, name, sizeof(fence->name));
+
+	init_waitqueue_head(&fence->wq);
+
+	return fence;
+
+err:
+	kfree(fence);
+	return NULL;
+}
+
+static void fence_check_cb_func(struct fence *f, struct fence_cb *cb)
+{
+	struct sync_fence_cb *check;
+	struct sync_fence *fence;
+
+	check = container_of(cb, struct sync_fence_cb, cb);
+	fence = check->fence;
+
+	if (atomic_dec_and_test(&fence->status))
+		wake_up_all(&fence->wq);
+}
+
+/* TODO: implement a create which takes more than one sync_pt */
+struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt)
+{
+	struct sync_fence *fence;
+
+	fence = sync_fence_alloc(offsetof(struct sync_fence, cbs[1]), name);
+	if (fence == NULL)
+		return NULL;
+
+	fence->num_fences = 1;
+	atomic_set(&fence->status, 1);
+
+	fence->cbs[0].sync_pt = pt;
+	fence->cbs[0].fence = fence;
+	if (fence_add_callback(pt, &fence->cbs[0].cb, fence_check_cb_func))
+		atomic_dec(&fence->status);
+
+	sync_fence_debug_add(fence);
+
+	return fence;
+}
+EXPORT_SYMBOL(sync_fence_create_dma);
+
+struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
+{
+	return sync_fence_create_dma(name, &pt->base);
+}
+EXPORT_SYMBOL(sync_fence_create);
+
+struct sync_fence *sync_fence_fdget(int fd)
+{
+	struct file *file = fget(fd);
+
+	if (file == NULL)
+		return NULL;
+
+	if (file->f_op != &sync_fence_fops)
+		goto err;
+
+	return file->private_data;
+
+err:
+	fput(file);
+	return NULL;
+}
+EXPORT_SYMBOL(sync_fence_fdget);
+
+void sync_fence_put(struct sync_fence *fence)
+{
+	fput(fence->file);
+}
+EXPORT_SYMBOL(sync_fence_put);
+
+void sync_fence_install(struct sync_fence *fence, int fd)
+{
+	fd_install(fd, fence->file);
+}
+EXPORT_SYMBOL(sync_fence_install);
+
+static void sync_fence_add_pt(struct sync_fence *fence,
+			      int *i, struct fence *pt)
+{
+	fence->cbs[*i].sync_pt = pt;
+	fence->cbs[*i].fence = fence;
+
+	if (!fence_add_callback(pt, &fence->cbs[*i].cb, fence_check_cb_func)) {
+		fence_get(pt);
+		(*i)++;
+	}
+}
+
+struct sync_fence *sync_fence_merge(const char *name,
+				    struct sync_fence *a, struct sync_fence *b)
+{
+	int num_fences = a->num_fences + b->num_fences;
+	struct sync_fence *fence;
+	int i, i_a, i_b;
+	unsigned long size = offsetof(struct sync_fence, cbs[num_fences]);
+
+	fence = sync_fence_alloc(size, name);
+	if (fence == NULL)
+		return NULL;
+
+	atomic_set(&fence->status, num_fences);
+
+	/*
+	 * Assume sync_fence a and b are both ordered and have no
+	 * duplicates with the same context.
+	 *
+	 * If a sync_fence can only be created with sync_fence_merge
+	 * and sync_fence_create, this is a reasonable assumption.
+	 */
+	for (i = i_a = i_b = 0; i_a < a->num_fences && i_b < b->num_fences; ) {
+		struct fence *pt_a = a->cbs[i_a].sync_pt;
+		struct fence *pt_b = b->cbs[i_b].sync_pt;
+
+		if (pt_a->context < pt_b->context) {
+			sync_fence_add_pt(fence, &i, pt_a);
+
+			i_a++;
+		} else if (pt_a->context > pt_b->context) {
+			sync_fence_add_pt(fence, &i, pt_b);
+
+			i_b++;
+		} else {
+			if (pt_a->seqno - pt_b->seqno <= INT_MAX)
+				sync_fence_add_pt(fence, &i, pt_a);
+			else
+				sync_fence_add_pt(fence, &i, pt_b);
+
+			i_a++;
+			i_b++;
+		}
+	}
+
+	for (; i_a < a->num_fences; i_a++)
+		sync_fence_add_pt(fence, &i, a->cbs[i_a].sync_pt);
+
+	for (; i_b < b->num_fences; i_b++)
+		sync_fence_add_pt(fence, &i, b->cbs[i_b].sync_pt);
+
+	if (num_fences > i)
+		atomic_sub(num_fences - i, &fence->status);
+	fence->num_fences = i;
+
+	sync_fence_debug_add(fence);
+	return fence;
+}
+EXPORT_SYMBOL(sync_fence_merge);
+
+int sync_fence_wake_up_wq(wait_queue_t *curr, unsigned mode,
+				 int wake_flags, void *key)
+{
+	struct sync_fence_waiter *wait;
+
+	wait = container_of(curr, struct sync_fence_waiter, work);
+	list_del_init(&wait->work.task_list);
+
+	wait->callback(wait->work.private, wait);
+	return 1;
+}
+
+int sync_fence_wait_async(struct sync_fence *fence,
+			  struct sync_fence_waiter *waiter)
+{
+	int err = atomic_read(&fence->status);
+	unsigned long flags;
+
+	if (err < 0)
+		return err;
+
+	if (!err)
+		return 1;
+
+	init_waitqueue_func_entry(&waiter->work, sync_fence_wake_up_wq);
+	waiter->work.private = fence;
+
+	spin_lock_irqsave(&fence->wq.lock, flags);
+	err = atomic_read(&fence->status);
+	if (err > 0)
+		__add_wait_queue_tail(&fence->wq, &waiter->work);
+	spin_unlock_irqrestore(&fence->wq.lock, flags);
+
+	if (err < 0)
+		return err;
+
+	return !err;
+}
+EXPORT_SYMBOL(sync_fence_wait_async);
+
+int sync_fence_cancel_async(struct sync_fence *fence,
+			     struct sync_fence_waiter *waiter)
+{
+	unsigned long flags;
+	int ret = 0;
+
+	spin_lock_irqsave(&fence->wq.lock, flags);
+	if (!list_empty(&waiter->work.task_list))
+		list_del_init(&waiter->work.task_list);
+	else
+		ret = -ENOENT;
+	spin_unlock_irqrestore(&fence->wq.lock, flags);
+	return ret;
+}
+EXPORT_SYMBOL(sync_fence_cancel_async);
+
+int sync_fence_wait(struct sync_fence *fence, long timeout)
+{
+	long ret;
+	int i;
+
+	if (timeout < 0)
+		timeout = MAX_SCHEDULE_TIMEOUT;
+	else
+		timeout = msecs_to_jiffies(timeout);
+
+	trace_sync_wait(fence, 1);
+	for (i = 0; i < fence->num_fences; ++i)
+		trace_sync_pt(fence->cbs[i].sync_pt);
+	ret = wait_event_interruptible_timeout(fence->wq,
+					       atomic_read(&fence->status) <= 0,
+					       timeout);
+	trace_sync_wait(fence, 0);
+
+	if (ret < 0) {
+		return ret;
+	} else if (ret == 0) {
+		if (timeout) {
+			pr_info("fence timeout on [%p] after %dms\n", fence,
+				jiffies_to_msecs(timeout));
+			sync_dump();
+		}
+		return -ETIME;
+	}
+
+	ret = atomic_read(&fence->status);
+	if (ret) {
+		pr_info("fence error %ld on [%p]\n", ret, fence);
+		sync_dump();
+	}
+	return ret;
+}
+EXPORT_SYMBOL(sync_fence_wait);
+
+static const char *android_fence_get_driver_name(struct fence *fence)
+{
+	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
+	struct sync_timeline *parent = sync_pt_parent(pt);
+
+	return parent->ops->driver_name;
+}
+
+static const char *android_fence_get_timeline_name(struct fence *fence)
+{
+	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
+	struct sync_timeline *parent = sync_pt_parent(pt);
+
+	return parent->name;
+}
+
+static void android_fence_release(struct fence *fence)
+{
+	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
+	struct sync_timeline *parent = sync_pt_parent(pt);
+	unsigned long flags;
+
+	spin_lock_irqsave(fence->lock, flags);
+	list_del(&pt->child_list);
+	if (WARN_ON_ONCE(!list_empty(&pt->active_list)))
+		list_del(&pt->active_list);
+	spin_unlock_irqrestore(fence->lock, flags);
+
+	if (parent->ops->free_pt)
+		parent->ops->free_pt(pt);
+
+	sync_timeline_put(parent);
+	fence_free(&pt->base);
+}
+
+static bool android_fence_signaled(struct fence *fence)
+{
+	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
+	struct sync_timeline *parent = sync_pt_parent(pt);
+	int ret;
+
+	ret = parent->ops->has_signaled(pt);
+	if (ret < 0)
+		fence->status = ret;
+	return ret;
+}
+
+static bool android_fence_enable_signaling(struct fence *fence)
+{
+	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
+	struct sync_timeline *parent = sync_pt_parent(pt);
+
+	if (android_fence_signaled(fence))
+		return false;
+
+	list_add_tail(&pt->active_list, &parent->active_list_head);
+	return true;
+}
+
+static int android_fence_fill_driver_data(struct fence *fence,
+					  void *data, int size)
+{
+	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
+	struct sync_timeline *parent = sync_pt_parent(pt);
+
+	if (!parent->ops->fill_driver_data)
+		return 0;
+	return parent->ops->fill_driver_data(pt, data, size);
+}
+
+static void android_fence_value_str(struct fence *fence,
+				    char *str, int size)
+{
+	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
+	struct sync_timeline *parent = sync_pt_parent(pt);
+
+	if (!parent->ops->pt_value_str) {
+		if (size)
+			*str = 0;
+		return;
+	}
+	parent->ops->pt_value_str(pt, str, size);
+}
+
+static void android_fence_timeline_value_str(struct fence *fence,
+					     char *str, int size)
+{
+	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
+	struct sync_timeline *parent = sync_pt_parent(pt);
+
+	if (!parent->ops->timeline_value_str) {
+		if (size)
+			*str = 0;
+		return;
+	}
+	parent->ops->timeline_value_str(parent, str, size);
+}
+
+static const struct fence_ops android_fence_ops = {
+	.get_driver_name = android_fence_get_driver_name,
+	.get_timeline_name = android_fence_get_timeline_name,
+	.enable_signaling = android_fence_enable_signaling,
+	.signaled = android_fence_signaled,
+	.wait = fence_default_wait,
+	.release = android_fence_release,
+	.fill_driver_data = android_fence_fill_driver_data,
+	.fence_value_str = android_fence_value_str,
+	.timeline_value_str = android_fence_timeline_value_str,
+};
+
+static void sync_fence_free(struct kref *kref)
+{
+	struct sync_fence *fence = container_of(kref, struct sync_fence, kref);
+	int i, status = atomic_read(&fence->status);
+
+	for (i = 0; i < fence->num_fences; ++i) {
+		if (status)
+			fence_remove_callback(fence->cbs[i].sync_pt,
+					      &fence->cbs[i].cb);
+		fence_put(fence->cbs[i].sync_pt);
+	}
+
+	kfree(fence);
+}
+
+static int sync_fence_release(struct inode *inode, struct file *file)
+{
+	struct sync_fence *fence = file->private_data;
+
+	sync_fence_debug_remove(fence);
+
+	kref_put(&fence->kref, sync_fence_free);
+	return 0;
+}
+
+static unsigned int sync_fence_poll(struct file *file, poll_table *wait)
+{
+	struct sync_fence *fence = file->private_data;
+	int status;
+
+	poll_wait(file, &fence->wq, wait);
+
+	status = atomic_read(&fence->status);
+
+	if (!status)
+		return POLLIN;
+	else if (status < 0)
+		return POLLERR;
+	return 0;
+}
+
+static long sync_fence_ioctl_wait(struct sync_fence *fence, unsigned long arg)
+{
+	__s32 value;
+
+	if (copy_from_user(&value, (void __user *)arg, sizeof(value)))
+		return -EFAULT;
+
+	return sync_fence_wait(fence, value);
+}
+
+static long sync_fence_ioctl_merge(struct sync_fence *fence, unsigned long arg)
+{
+	int fd = get_unused_fd_flags(O_CLOEXEC);
+	int err;
+	struct sync_fence *fence2, *fence3;
+	struct sync_merge_data data;
+
+	if (fd < 0)
+		return fd;
+
+	if (copy_from_user(&data, (void __user *)arg, sizeof(data))) {
+		err = -EFAULT;
+		goto err_put_fd;
+	}
+
+	fence2 = sync_fence_fdget(data.fd2);
+	if (fence2 == NULL) {
+		err = -ENOENT;
+		goto err_put_fd;
+	}
+
+	data.name[sizeof(data.name) - 1] = '\0';
+	fence3 = sync_fence_merge(data.name, fence, fence2);
+	if (fence3 == NULL) {
+		err = -ENOMEM;
+		goto err_put_fence2;
+	}
+
+	data.fence = fd;
+	if (copy_to_user((void __user *)arg, &data, sizeof(data))) {
+		err = -EFAULT;
+		goto err_put_fence3;
+	}
+
+	sync_fence_install(fence3, fd);
+	sync_fence_put(fence2);
+	return 0;
+
+err_put_fence3:
+	sync_fence_put(fence3);
+
+err_put_fence2:
+	sync_fence_put(fence2);
+
+err_put_fd:
+	put_unused_fd(fd);
+	return err;
+}
+
+static int sync_fill_pt_info(struct fence *fence, void *data, int size)
+{
+	struct sync_pt_info *info = data;
+	int ret;
+
+	if (size < sizeof(struct sync_pt_info))
+		return -ENOMEM;
+
+	info->len = sizeof(struct sync_pt_info);
+
+	if (fence->ops->fill_driver_data) {
+		ret = fence->ops->fill_driver_data(fence, info->driver_data,
+						   size - sizeof(*info));
+		if (ret < 0)
+			return ret;
+
+		info->len += ret;
+	}
+
+	strlcpy(info->obj_name, fence->ops->get_timeline_name(fence),
+		sizeof(info->obj_name));
+	strlcpy(info->driver_name, fence->ops->get_driver_name(fence),
+		sizeof(info->driver_name));
+	if (fence_is_signaled(fence))
+		info->status = fence->status >= 0 ? 1 : fence->status;
+	else
+		info->status = 0;
+	info->timestamp_ns = ktime_to_ns(fence->timestamp);
+
+	return info->len;
+}
+
+static long sync_fence_ioctl_fence_info(struct sync_fence *fence,
+					unsigned long arg)
+{
+	struct sync_fence_info_data *data;
+	__u32 size;
+	__u32 len = 0;
+	int ret, i;
+
+	if (copy_from_user(&size, (void __user *)arg, sizeof(size)))
+		return -EFAULT;
+
+	if (size < sizeof(struct sync_fence_info_data))
+		return -EINVAL;
+
+	if (size > 4096)
+		size = 4096;
+
+	data = kzalloc(size, GFP_KERNEL);
+	if (data == NULL)
+		return -ENOMEM;
+
+	strlcpy(data->name, fence->name, sizeof(data->name));
+	data->status = atomic_read(&fence->status);
+	if (data->status >= 0)
+		data->status = !data->status;
+
+	len = sizeof(struct sync_fence_info_data);
+
+	for (i = 0; i < fence->num_fences; ++i) {
+		struct fence *pt = fence->cbs[i].sync_pt;
+
+		ret = sync_fill_pt_info(pt, (u8 *)data + len, size - len);
+
+		if (ret < 0)
+			goto out;
+
+		len += ret;
+	}
+
+	data->len = len;
+
+	if (copy_to_user((void __user *)arg, data, len))
+		ret = -EFAULT;
+	else
+		ret = 0;
+
+out:
+	kfree(data);
+
+	return ret;
+}
+
+static long sync_fence_ioctl(struct file *file, unsigned int cmd,
+			     unsigned long arg)
+{
+	struct sync_fence *fence = file->private_data;
+
+	switch (cmd) {
+	case SYNC_IOC_WAIT:
+		return sync_fence_ioctl_wait(fence, arg);
+
+	case SYNC_IOC_MERGE:
+		return sync_fence_ioctl_merge(fence, arg);
+
+	case SYNC_IOC_FENCE_INFO:
+		return sync_fence_ioctl_fence_info(fence, arg);
+
+	default:
+		return -ENOTTY;
+	}
+}
+
+static const struct file_operations sync_fence_fops = {
+	.release = sync_fence_release,
+	.poll = sync_fence_poll,
+	.unlocked_ioctl = sync_fence_ioctl,
+	.compat_ioctl = sync_fence_ioctl,
+};
+
diff --git a/drivers/android/sync.h b/drivers/android/sync.h
new file mode 100644
index 0000000..8449323
--- /dev/null
+++ b/drivers/android/sync.h
@@ -0,0 +1,366 @@
+/*
+ * include/linux/sync.h
+ *
+ * Copyright (C) 2012 Google, Inc.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef _LINUX_SYNC_H
+#define _LINUX_SYNC_H
+
+#include <linux/types.h>
+#include <linux/kref.h>
+#include <linux/ktime.h>
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+#include <linux/fence.h>
+
+#include <uapi/sync/sync.h>
+
+struct sync_timeline;
+struct sync_pt;
+struct sync_fence;
+
+/**
+ * struct sync_timeline_ops - sync object implementation ops
+ * @driver_name:	name of the implementation
+ * @dup:		duplicate a sync_pt
+ * @has_signaled:	returns:
+ *			  1 if pt has signaled
+ *			  0 if pt has not signaled
+ *			 <0 on error
+ * @compare:		returns:
+ *			  1 if b will signal before a
+ *			  0 if a and b will signal at the same time
+ *			 -1 if a will signal before b
+ * @free_pt:		called before sync_pt is freed
+ * @release_obj:	called before sync_timeline is freed
+ * @fill_driver_data:	write implementation specific driver data to data.
+ *			  should return an error if there is not enough room
+ *			  as specified by size.  This information is returned
+ *			  to userspace by SYNC_IOC_FENCE_INFO.
+ * @timeline_value_str: fill str with the value of the sync_timeline's counter
+ * @pt_value_str:	fill str with the value of the sync_pt
+ */
+struct sync_timeline_ops {
+	const char *driver_name;
+
+	/* required */
+	struct sync_pt * (*dup)(struct sync_pt *pt);
+
+	/* required */
+	int (*has_signaled)(struct sync_pt *pt);
+
+	/* required */
+	int (*compare)(struct sync_pt *a, struct sync_pt *b);
+
+	/* optional */
+	void (*free_pt)(struct sync_pt *sync_pt);
+
+	/* optional */
+	void (*release_obj)(struct sync_timeline *sync_timeline);
+
+	/* optional */
+	int (*fill_driver_data)(struct sync_pt *syncpt, void *data, int size);
+
+	/* optional */
+	void (*timeline_value_str)(struct sync_timeline *timeline, char *str,
+				   int size);
+
+	/* optional */
+	void (*pt_value_str)(struct sync_pt *pt, char *str, int size);
+};
+
+/**
+ * struct sync_timeline - sync object
+ * @kref:		reference count on fence.
+ * @ops:		ops that define the implementation of the sync_timeline
+ * @name:		name of the sync_timeline. Useful for debugging
+ * @destroyed:		set when sync_timeline is destroyed
+ * @child_list_head:	list of children sync_pts for this sync_timeline
+ * @child_list_lock:	lock protecting @child_list_head, destroyed, and
+ *			  sync_pt.status
+ * @active_list_head:	list of active (unsignaled/errored) sync_pts
+ * @sync_timeline_list:	membership in global sync_timeline_list
+ */
+struct sync_timeline {
+	struct kref		kref;
+	const struct sync_timeline_ops	*ops;
+	char			name[32];
+
+	/* protected by child_list_lock */
+	bool			destroyed;
+	int			context, value;
+
+	struct list_head	child_list_head;
+	spinlock_t		child_list_lock;
+
+	struct list_head	active_list_head;
+
+#ifdef CONFIG_DEBUG_FS
+	struct list_head	sync_timeline_list;
+#endif
+};
+
+/**
+ * struct sync_pt - sync point
+ * @fence:		base fence class
+ * @child_list:		membership in sync_timeline.child_list_head
+ * @active_list:	membership in sync_timeline.active_list_head
+ * @signaled_list:	membership in temporary signaled_list on stack
+ * @fence:		sync_fence to which the sync_pt belongs
+ * @pt_list:		membership in sync_fence.pt_list_head
+ * @status:		1: signaled, 0:active, <0: error
+ * @timestamp:		time which sync_pt status transitioned from active to
+ *			  signaled or error.
+ */
+struct sync_pt {
+	struct fence base;
+
+	struct list_head	child_list;
+	struct list_head	active_list;
+};
+
+static inline struct sync_timeline *sync_pt_parent(struct sync_pt *pt)
+{
+	return container_of(pt->base.lock, struct sync_timeline,
+			    child_list_lock);
+}
+
+struct sync_fence_cb {
+	struct fence_cb cb;
+	struct fence *sync_pt;
+	struct sync_fence *fence;
+};
+
+/**
+ * struct sync_fence - sync fence
+ * @file:		file representing this fence
+ * @kref:		reference count on fence.
+ * @name:		name of sync_fence.  Useful for debugging
+ * @pt_list_head:	list of sync_pts in the fence.  immutable once fence
+ *			  is created
+ * @status:		0: signaled, >0:active, <0: error
+ *
+ * @wq:			wait queue for fence signaling
+ * @sync_fence_list:	membership in global fence list
+ */
+struct sync_fence {
+	struct file		*file;
+	struct kref		kref;
+	char			name[32];
+#ifdef CONFIG_DEBUG_FS
+	struct list_head	sync_fence_list;
+#endif
+	int num_fences;
+
+	wait_queue_head_t	wq;
+	atomic_t		status;
+
+	struct sync_fence_cb	cbs[];
+};
+
+struct sync_fence_waiter;
+typedef void (*sync_callback_t)(struct sync_fence *fence,
+				struct sync_fence_waiter *waiter);
+
+/**
+ * struct sync_fence_waiter - metadata for asynchronous waiter on a fence
+ * @waiter_list:	membership in sync_fence.waiter_list_head
+ * @callback:		function pointer to call when fence signals
+ * @callback_data:	pointer to pass to @callback
+ */
+struct sync_fence_waiter {
+	wait_queue_t work;
+	sync_callback_t callback;
+};
+
+static inline void sync_fence_waiter_init(struct sync_fence_waiter *waiter,
+					  sync_callback_t callback)
+{
+	INIT_LIST_HEAD(&waiter->work.task_list);
+	waiter->callback = callback;
+}
+
+/*
+ * API for sync_timeline implementers
+ */
+
+/**
+ * sync_timeline_create() - creates a sync object
+ * @ops:	specifies the implementation ops for the object
+ * @size:	size to allocate for this obj
+ * @name:	sync_timeline name
+ *
+ * Creates a new sync_timeline which will use the implementation specified by
+ * @ops.  @size bytes will be allocated allowing for implementation specific
+ * data to be kept after the generic sync_timeline struct.
+ */
+struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops,
+					   int size, const char *name);
+
+/**
+ * sync_timeline_destroy() - destroys a sync object
+ * @obj:	sync_timeline to destroy
+ *
+ * A sync implementation should call this when the @obj is going away
+ * (i.e. module unload.)  @obj won't actually be freed until all its children
+ * sync_pts are freed.
+ */
+void sync_timeline_destroy(struct sync_timeline *obj);
+
+/**
+ * sync_timeline_signal() - signal a status change on a sync_timeline
+ * @obj:	sync_timeline to signal
+ *
+ * A sync implementation should call this any time one of its sync_pts
+ * has signaled or has an error condition.
+ */
+void sync_timeline_signal(struct sync_timeline *obj);
+
+/**
+ * sync_pt_create() - creates a sync pt
+ * @parent:	sync_pt's parent sync_timeline
+ * @size:	size to allocate for this pt
+ *
+ * Creates a new sync_pt as a child of @parent.  @size bytes will be
+ * allocated allowing for implementation specific data to be kept after
+ * the generic sync_timeline struct.
+ */
+struct sync_pt *sync_pt_create(struct sync_timeline *parent, int size);
+
+/**
+ * sync_pt_free() - frees a sync pt
+ * @pt:		sync_pt to free
+ *
+ * This should only be called on sync_pts which have been created but
+ * not added to a fence.
+ */
+void sync_pt_free(struct sync_pt *pt);
+
+/**
+ * sync_fence_create() - creates a sync fence
+ * @name:	name of fence to create
+ * @pt:		sync_pt to add to the fence
+ *
+ * Creates a fence containing @pt.  Once this is called, the fence takes
+ * a reference on @pt.
+ */
+struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt);
+
+/**
+ * sync_fence_create_dma() - creates a sync fence from dma-fence
+ * @name:	name of fence to create
+ * @pt:	dma-fence to add to the fence
+ *
+ * Creates a fence containing @pt.  Once this is called, the fence takes
+ * a reference on @pt.
+ */
+struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt);
+
+/*
+ * API for sync_fence consumers
+ */
+
+/**
+ * sync_fence_merge() - merge two fences
+ * @name:	name of new fence
+ * @a:		fence a
+ * @b:		fence b
+ *
+ * Creates a new fence which contains copies of all the sync_pts in both
+ * @a and @b.  @a and @b remain valid, independent fences.
+ */
+struct sync_fence *sync_fence_merge(const char *name,
+				    struct sync_fence *a, struct sync_fence *b);
+
+/**
+ * sync_fence_fdget() - get a fence from an fd
+ * @fd:		fd referencing a fence
+ *
+ * Ensures @fd references a valid fence, increments the refcount of the backing
+ * file, and returns the fence.
+ */
+struct sync_fence *sync_fence_fdget(int fd);
+
+/**
+ * sync_fence_put() - puts a reference of a sync fence
+ * @fence:	fence to put
+ *
+ * Puts a reference on @fence.  If this is the last reference, the fence and
+ * all its sync_pts will be freed
+ */
+void sync_fence_put(struct sync_fence *fence);
+
+/**
+ * sync_fence_install() - installs a fence into a file descriptor
+ * @fence:	fence to install
+ * @fd:		file descriptor in which to install the fence
+ *
+ * Installs @fence into @fd.  @fd should be acquired through
+ * get_unused_fd_flags(O_CLOEXEC).
+ */
+void sync_fence_install(struct sync_fence *fence, int fd);
+
+/**
+ * sync_fence_wait_async() - registers an async wait on the fence
+ * @fence:		fence to wait on
+ * @waiter:		waiter callback struct
+ *
+ * Returns 1 if @fence has already signaled.
+ *
+ * Registers a callback to be called when @fence signals or has an error.
+ * @waiter should be initialized with sync_fence_waiter_init().
+ */
+int sync_fence_wait_async(struct sync_fence *fence,
+			  struct sync_fence_waiter *waiter);
+
+/**
+ * sync_fence_cancel_async() - cancels an async wait
+ * @fence:		fence to wait on
+ * @waiter:		waiter callback struct
+ *
+ * returns 0 if waiter was removed from fence's async waiter list.
+ * returns -ENOENT if waiter was not found on fence's async waiter list.
+ *
+ * Cancels a previously registered async wait.  Will fail gracefully if
+ * @waiter was never registered or if @fence has already signaled @waiter.
+ */
+int sync_fence_cancel_async(struct sync_fence *fence,
+			    struct sync_fence_waiter *waiter);
+
+/**
+ * sync_fence_wait() - wait on fence
+ * @fence:	fence to wait on
+ * @timeout:	timeout in ms
+ *
+ * Wait for @fence to be signaled or have an error.  Waits indefinitely
+ * if @timeout < 0
+ */
+int sync_fence_wait(struct sync_fence *fence, long timeout);
+
+#ifdef CONFIG_DEBUG_FS
+
+void sync_timeline_debug_add(struct sync_timeline *obj);
+void sync_timeline_debug_remove(struct sync_timeline *obj);
+void sync_fence_debug_add(struct sync_fence *fence);
+void sync_fence_debug_remove(struct sync_fence *fence);
+void sync_dump(void);
+
+#else
+# define sync_timeline_debug_add(obj)
+# define sync_timeline_debug_remove(obj)
+# define sync_fence_debug_add(fence)
+# define sync_fence_debug_remove(fence)
+# define sync_dump()
+#endif
+int sync_fence_wake_up_wq(wait_queue_t *curr, unsigned mode,
+				 int wake_flags, void *key);
+
+#endif /* _LINUX_SYNC_H */
diff --git a/drivers/android/sync_debug.c b/drivers/android/sync_debug.c
new file mode 100644
index 0000000..f45d13c
--- /dev/null
+++ b/drivers/android/sync_debug.c
@@ -0,0 +1,256 @@
+/*
+ * drivers/base/sync.c
+ *
+ * Copyright (C) 2012 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include <linux/debugfs.h>
+#include <linux/export.h>
+#include <linux/file.h>
+#include <linux/fs.h>
+#include <linux/kernel.h>
+#include <linux/poll.h>
+#include <linux/sched.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/anon_inodes.h>
+#include <linux/time64.h>
+#include "sync.h"
+
+#ifdef CONFIG_DEBUG_FS
+
+static LIST_HEAD(sync_timeline_list_head);
+static DEFINE_SPINLOCK(sync_timeline_list_lock);
+static LIST_HEAD(sync_fence_list_head);
+static DEFINE_SPINLOCK(sync_fence_list_lock);
+
+void sync_timeline_debug_add(struct sync_timeline *obj)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sync_timeline_list_lock, flags);
+	list_add_tail(&obj->sync_timeline_list, &sync_timeline_list_head);
+	spin_unlock_irqrestore(&sync_timeline_list_lock, flags);
+}
+
+void sync_timeline_debug_remove(struct sync_timeline *obj)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sync_timeline_list_lock, flags);
+	list_del(&obj->sync_timeline_list);
+	spin_unlock_irqrestore(&sync_timeline_list_lock, flags);
+}
+
+void sync_fence_debug_add(struct sync_fence *fence)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sync_fence_list_lock, flags);
+	list_add_tail(&fence->sync_fence_list, &sync_fence_list_head);
+	spin_unlock_irqrestore(&sync_fence_list_lock, flags);
+}
+
+void sync_fence_debug_remove(struct sync_fence *fence)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&sync_fence_list_lock, flags);
+	list_del(&fence->sync_fence_list);
+	spin_unlock_irqrestore(&sync_fence_list_lock, flags);
+}
+
+static const char *sync_status_str(int status)
+{
+	if (status == 0)
+		return "signaled";
+
+	if (status > 0)
+		return "active";
+
+	return "error";
+}
+
+static void sync_print_pt(struct seq_file *s, struct fence *pt, bool fence)
+{
+	int status = 1;
+
+	if (fence_is_signaled_locked(pt))
+		status = pt->status;
+
+	seq_printf(s, "  %s%spt %s",
+		   fence && pt->ops->get_timeline_name ?
+		   pt->ops->get_timeline_name(pt) : "",
+		   fence ? "_" : "",
+		   sync_status_str(status));
+
+	if (status <= 0) {
+		struct timespec64 ts64 =
+			ktime_to_timespec64(pt->timestamp);
+
+		seq_printf(s, "@%lld.%09ld", (s64)ts64.tv_sec, ts64.tv_nsec);
+	}
+
+	if ((!fence || pt->ops->timeline_value_str) &&
+	    pt->ops->fence_value_str) {
+		char value[64];
+		bool success;
+
+		pt->ops->fence_value_str(pt, value, sizeof(value));
+		success = strlen(value);
+
+		if (success)
+			seq_printf(s, ": %s", value);
+
+		if (success && fence) {
+			pt->ops->timeline_value_str(pt, value, sizeof(value));
+
+			if (strlen(value))
+				seq_printf(s, " / %s", value);
+		}
+	}
+
+	seq_puts(s, "\n");
+}
+
+static void sync_print_obj(struct seq_file *s, struct sync_timeline *obj)
+{
+	struct list_head *pos;
+	unsigned long flags;
+
+	seq_printf(s, "%s %s", obj->name, obj->ops->driver_name);
+
+	if (obj->ops->timeline_value_str) {
+		char value[64];
+
+		obj->ops->timeline_value_str(obj, value, sizeof(value));
+		seq_printf(s, ": %s", value);
+	}
+
+	seq_puts(s, "\n");
+
+	spin_lock_irqsave(&obj->child_list_lock, flags);
+	list_for_each(pos, &obj->child_list_head) {
+		struct sync_pt *pt =
+			container_of(pos, struct sync_pt, child_list);
+		sync_print_pt(s, &pt->base, false);
+	}
+	spin_unlock_irqrestore(&obj->child_list_lock, flags);
+}
+
+static void sync_print_fence(struct seq_file *s, struct sync_fence *fence)
+{
+	wait_queue_t *pos;
+	unsigned long flags;
+	int i;
+
+	seq_printf(s, "[%p] %s: %s\n", fence, fence->name,
+		   sync_status_str(atomic_read(&fence->status)));
+
+	for (i = 0; i < fence->num_fences; ++i) {
+		sync_print_pt(s, fence->cbs[i].sync_pt, true);
+	}
+
+	spin_lock_irqsave(&fence->wq.lock, flags);
+	list_for_each_entry(pos, &fence->wq.task_list, task_list) {
+		struct sync_fence_waiter *waiter;
+
+		if (pos->func != &sync_fence_wake_up_wq)
+			continue;
+
+		waiter = container_of(pos, struct sync_fence_waiter, work);
+
+		seq_printf(s, "waiter %pF\n", waiter->callback);
+	}
+	spin_unlock_irqrestore(&fence->wq.lock, flags);
+}
+
+static int sync_debugfs_show(struct seq_file *s, void *unused)
+{
+	unsigned long flags;
+	struct list_head *pos;
+
+	seq_puts(s, "objs:\n--------------\n");
+
+	spin_lock_irqsave(&sync_timeline_list_lock, flags);
+	list_for_each(pos, &sync_timeline_list_head) {
+		struct sync_timeline *obj =
+			container_of(pos, struct sync_timeline,
+				     sync_timeline_list);
+
+		sync_print_obj(s, obj);
+		seq_puts(s, "\n");
+	}
+	spin_unlock_irqrestore(&sync_timeline_list_lock, flags);
+
+	seq_puts(s, "fences:\n--------------\n");
+
+	spin_lock_irqsave(&sync_fence_list_lock, flags);
+	list_for_each(pos, &sync_fence_list_head) {
+		struct sync_fence *fence =
+			container_of(pos, struct sync_fence, sync_fence_list);
+
+		sync_print_fence(s, fence);
+		seq_puts(s, "\n");
+	}
+	spin_unlock_irqrestore(&sync_fence_list_lock, flags);
+	return 0;
+}
+
+static int sync_debugfs_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, sync_debugfs_show, inode->i_private);
+}
+
+static const struct file_operations sync_debugfs_fops = {
+	.open           = sync_debugfs_open,
+	.read           = seq_read,
+	.llseek         = seq_lseek,
+	.release        = single_release,
+};
+
+static __init int sync_debugfs_init(void)
+{
+	debugfs_create_file("sync", S_IRUGO, NULL, NULL, &sync_debugfs_fops);
+	return 0;
+}
+late_initcall(sync_debugfs_init);
+
+#define DUMP_CHUNK 256
+static char sync_dump_buf[64 * 1024];
+void sync_dump(void)
+{
+	struct seq_file s = {
+		.buf = sync_dump_buf,
+		.size = sizeof(sync_dump_buf) - 1,
+	};
+	int i;
+
+	sync_debugfs_show(&s, NULL);
+
+	for (i = 0; i < s.count; i += DUMP_CHUNK) {
+		if ((s.count - i) > DUMP_CHUNK) {
+			char c = s.buf[i + DUMP_CHUNK];
+
+			s.buf[i + DUMP_CHUNK] = 0;
+			pr_cont("%s", s.buf + i);
+			s.buf[i + DUMP_CHUNK] = c;
+		} else {
+			s.buf[s.count] = 0;
+			pr_cont("%s", s.buf + i);
+		}
+	}
+}
+
+#endif
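
For reference, the node created above lands at /sys/kernel/debug/sync
(assuming debugfs is mounted in the usual place); a trivial userspace
dump, shown only as an illustration, is:

	#include <stdio.h>

	int main(void)
	{
		char line[256];
		FILE *f = fopen("/sys/kernel/debug/sync", "r");

		if (!f)
			return 1;
		while (fgets(line, sizeof(line), f))
			fputs(line, stdout);
		fclose(f);
		return 0;
	}
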
diff --git a/drivers/android/trace/sync.h b/drivers/android/trace/sync.h
new file mode 100644
index 0000000..7dcf2fe
--- /dev/null
+++ b/drivers/android/trace/sync.h
@@ -0,0 +1,82 @@
+#undef TRACE_SYSTEM
+#define TRACE_INCLUDE_PATH ../../drivers/android/trace
+#define TRACE_SYSTEM sync
+
+#if !defined(_TRACE_SYNC_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_SYNC_H
+
+#include "../sync.h"
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(sync_timeline,
+	TP_PROTO(struct sync_timeline *timeline),
+
+	TP_ARGS(timeline),
+
+	TP_STRUCT__entry(
+			__string(name, timeline->name)
+			__array(char, value, 32)
+	),
+
+	TP_fast_assign(
+			__assign_str(name, timeline->name);
+			if (timeline->ops->timeline_value_str) {
+				timeline->ops->timeline_value_str(timeline,
+							__entry->value,
+							sizeof(__entry->value));
+			} else {
+				__entry->value[0] = '\0';
+			}
+	),
+
+	TP_printk("name=%s value=%s", __get_str(name), __entry->value)
+);
+
+TRACE_EVENT(sync_wait,
+	TP_PROTO(struct sync_fence *fence, int begin),
+
+	TP_ARGS(fence, begin),
+
+	TP_STRUCT__entry(
+			__string(name, fence->name)
+			__field(s32, status)
+			__field(u32, begin)
+	),
+
+	TP_fast_assign(
+			__assign_str(name, fence->name);
+			__entry->status = atomic_read(&fence->status);
+			__entry->begin = begin;
+	),
+
+	TP_printk("%s name=%s state=%d", __entry->begin ? "begin" : "end",
+			__get_str(name), __entry->status)
+);
+
+TRACE_EVENT(sync_pt,
+	TP_PROTO(struct fence *pt),
+
+	TP_ARGS(pt),
+
+	TP_STRUCT__entry(
+		__string(timeline, pt->ops->get_timeline_name(pt))
+		__array(char, value, 32)
+	),
+
+	TP_fast_assign(
+		__assign_str(timeline, pt->ops->get_timeline_name(pt));
+		if (pt->ops->fence_value_str) {
+			pt->ops->fence_value_str(pt, __entry->value,
+							sizeof(__entry->value));
+		} else {
+			__entry->value[0] = '\0';
+		}
+	),
+
+	TP_printk("name=%s value=%s", __get_str(timeline), __entry->value)
+);
+
+#endif /* if !defined(_TRACE_SYNC_H) || defined(TRACE_HEADER_MULTI_READ) */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
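
For reviewers following the TRACE_EVENT definitions: exactly one
translation unit defines CREATE_TRACE_POINTS before including this
header (sync.c does so in this series); all other users include it
plain and call the generated trace_*() helpers. An illustrative sketch:

	#include "trace/sync.h"

	static void example_trace(struct sync_timeline *obj,
				  struct sync_fence *fence)
	{
		trace_sync_timeline(obj);	/* "name=... value=..." */
		trace_sync_wait(fence, 1);	/* "begin name=... state=..." */
		trace_sync_wait(fence, 0);	/* "end name=... state=..." */
	}
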
diff --git a/drivers/staging/android/Kconfig b/drivers/staging/android/Kconfig
index 42b1512..4b18fee 100644
--- a/drivers/staging/android/Kconfig
+++ b/drivers/staging/android/Kconfig
@@ -38,34 +38,6 @@ config ANDROID_LOW_MEMORY_KILLER
 	  scripts (/init.rc), and it defines priority values with minimum free memory size
 	  for each priority.
 
-config SYNC
-	bool "Synchronization framework"
-	default n
-	select ANON_INODES
-	select DMA_SHARED_BUFFER
-	---help---
-	  This option enables the framework for synchronization between multiple
-	  drivers.  Sync implementations can take advantage of hardware
-	  synchronization built into devices like GPUs.
-
-config SW_SYNC
-	bool "Software synchronization objects"
-	default n
-	depends on SYNC
-	---help---
-	  A sync object driver that uses a 32bit counter to coordinate
-	  synchronization.  Useful when there is no hardware primitive backing
-	  the synchronization.
-
-config SW_SYNC_USER
-	bool "Userspace API for SW_SYNC"
-	default n
-	depends on SW_SYNC
-	---help---
-	  Provides a user space API to the sw sync object.
-	  *WARNING* improper use of this can result in deadlocking kernel
-	  drivers from userspace.
-
 source "drivers/staging/android/ion/Kconfig"
 
 endif # if ANDROID
diff --git a/drivers/staging/android/Makefile b/drivers/staging/android/Makefile
index c7b6c99..355ad0e 100644
--- a/drivers/staging/android/Makefile
+++ b/drivers/staging/android/Makefile
@@ -6,5 +6,3 @@ obj-$(CONFIG_ASHMEM)			+= ashmem.o
 obj-$(CONFIG_ANDROID_TIMED_OUTPUT)	+= timed_output.o
 obj-$(CONFIG_ANDROID_TIMED_GPIO)	+= timed_gpio.o
 obj-$(CONFIG_ANDROID_LOW_MEMORY_KILLER)	+= lowmemorykiller.o
-obj-$(CONFIG_SYNC)			+= sync.o sync_debug.o
-obj-$(CONFIG_SW_SYNC)			+= sw_sync.o
diff --git a/drivers/staging/android/sw_sync.c b/drivers/staging/android/sw_sync.c
deleted file mode 100644
index c4ff167..0000000
--- a/drivers/staging/android/sw_sync.c
+++ /dev/null
@@ -1,260 +0,0 @@
-/*
- * drivers/base/sw_sync.c
- *
- * Copyright (C) 2012 Google, Inc.
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- */
-
-#include <linux/kernel.h>
-#include <linux/init.h>
-#include <linux/export.h>
-#include <linux/file.h>
-#include <linux/fs.h>
-#include <linux/miscdevice.h>
-#include <linux/syscalls.h>
-#include <linux/uaccess.h>
-
-#include "sw_sync.h"
-
-static int sw_sync_cmp(u32 a, u32 b)
-{
-	if (a == b)
-		return 0;
-
-	return ((s32)a - (s32)b) < 0 ? -1 : 1;
-}
-
-struct sync_pt *sw_sync_pt_create(struct sw_sync_timeline *obj, u32 value)
-{
-	struct sw_sync_pt *pt;
-
-	pt = (struct sw_sync_pt *)
-		sync_pt_create(&obj->obj, sizeof(struct sw_sync_pt));
-
-	pt->value = value;
-
-	return (struct sync_pt *)pt;
-}
-EXPORT_SYMBOL(sw_sync_pt_create);
-
-static struct sync_pt *sw_sync_pt_dup(struct sync_pt *sync_pt)
-{
-	struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt;
-	struct sw_sync_timeline *obj =
-		(struct sw_sync_timeline *)sync_pt_parent(sync_pt);
-
-	return (struct sync_pt *)sw_sync_pt_create(obj, pt->value);
-}
-
-static int sw_sync_pt_has_signaled(struct sync_pt *sync_pt)
-{
-	struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt;
-	struct sw_sync_timeline *obj =
-		(struct sw_sync_timeline *)sync_pt_parent(sync_pt);
-
-	return sw_sync_cmp(obj->value, pt->value) >= 0;
-}
-
-static int sw_sync_pt_compare(struct sync_pt *a, struct sync_pt *b)
-{
-	struct sw_sync_pt *pt_a = (struct sw_sync_pt *)a;
-	struct sw_sync_pt *pt_b = (struct sw_sync_pt *)b;
-
-	return sw_sync_cmp(pt_a->value, pt_b->value);
-}
-
-static int sw_sync_fill_driver_data(struct sync_pt *sync_pt,
-				    void *data, int size)
-{
-	struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt;
-
-	if (size < sizeof(pt->value))
-		return -ENOMEM;
-
-	memcpy(data, &pt->value, sizeof(pt->value));
-
-	return sizeof(pt->value);
-}
-
-static void sw_sync_timeline_value_str(struct sync_timeline *sync_timeline,
-				       char *str, int size)
-{
-	struct sw_sync_timeline *timeline =
-		(struct sw_sync_timeline *)sync_timeline;
-	snprintf(str, size, "%d", timeline->value);
-}
-
-static void sw_sync_pt_value_str(struct sync_pt *sync_pt,
-				 char *str, int size)
-{
-	struct sw_sync_pt *pt = (struct sw_sync_pt *)sync_pt;
-
-	snprintf(str, size, "%d", pt->value);
-}
-
-static struct sync_timeline_ops sw_sync_timeline_ops = {
-	.driver_name = "sw_sync",
-	.dup = sw_sync_pt_dup,
-	.has_signaled = sw_sync_pt_has_signaled,
-	.compare = sw_sync_pt_compare,
-	.fill_driver_data = sw_sync_fill_driver_data,
-	.timeline_value_str = sw_sync_timeline_value_str,
-	.pt_value_str = sw_sync_pt_value_str,
-};
-
-struct sw_sync_timeline *sw_sync_timeline_create(const char *name)
-{
-	struct sw_sync_timeline *obj = (struct sw_sync_timeline *)
-		sync_timeline_create(&sw_sync_timeline_ops,
-				     sizeof(struct sw_sync_timeline),
-				     name);
-
-	return obj;
-}
-EXPORT_SYMBOL(sw_sync_timeline_create);
-
-void sw_sync_timeline_inc(struct sw_sync_timeline *obj, u32 inc)
-{
-	obj->value += inc;
-
-	sync_timeline_signal(&obj->obj);
-}
-EXPORT_SYMBOL(sw_sync_timeline_inc);
-
-#ifdef CONFIG_SW_SYNC_USER
-/* *WARNING*
- *
- * improper use of this can result in deadlocking kernel drivers from userspace.
- */
-
-/* opening sw_sync create a new sync obj */
-static int sw_sync_open(struct inode *inode, struct file *file)
-{
-	struct sw_sync_timeline *obj;
-	char task_comm[TASK_COMM_LEN];
-
-	get_task_comm(task_comm, current);
-
-	obj = sw_sync_timeline_create(task_comm);
-	if (!obj)
-		return -ENOMEM;
-
-	file->private_data = obj;
-
-	return 0;
-}
-
-static int sw_sync_release(struct inode *inode, struct file *file)
-{
-	struct sw_sync_timeline *obj = file->private_data;
-
-	sync_timeline_destroy(&obj->obj);
-	return 0;
-}
-
-static long sw_sync_ioctl_create_fence(struct sw_sync_timeline *obj,
-				       unsigned long arg)
-{
-	int fd = get_unused_fd_flags(O_CLOEXEC);
-	int err;
-	struct sync_pt *pt;
-	struct sync_fence *fence;
-	struct sw_sync_create_fence_data data;
-
-	if (fd < 0)
-		return fd;
-
-	if (copy_from_user(&data, (void __user *)arg, sizeof(data))) {
-		err = -EFAULT;
-		goto err;
-	}
-
-	pt = sw_sync_pt_create(obj, data.value);
-	if (!pt) {
-		err = -ENOMEM;
-		goto err;
-	}
-
-	data.name[sizeof(data.name) - 1] = '\0';
-	fence = sync_fence_create(data.name, pt);
-	if (!fence) {
-		sync_pt_free(pt);
-		err = -ENOMEM;
-		goto err;
-	}
-
-	data.fence = fd;
-	if (copy_to_user((void __user *)arg, &data, sizeof(data))) {
-		sync_fence_put(fence);
-		err = -EFAULT;
-		goto err;
-	}
-
-	sync_fence_install(fence, fd);
-
-	return 0;
-
-err:
-	put_unused_fd(fd);
-	return err;
-}
-
-static long sw_sync_ioctl_inc(struct sw_sync_timeline *obj, unsigned long arg)
-{
-	u32 value;
-
-	if (copy_from_user(&value, (void __user *)arg, sizeof(value)))
-		return -EFAULT;
-
-	sw_sync_timeline_inc(obj, value);
-
-	return 0;
-}
-
-static long sw_sync_ioctl(struct file *file, unsigned int cmd,
-			  unsigned long arg)
-{
-	struct sw_sync_timeline *obj = file->private_data;
-
-	switch (cmd) {
-	case SW_SYNC_IOC_CREATE_FENCE:
-		return sw_sync_ioctl_create_fence(obj, arg);
-
-	case SW_SYNC_IOC_INC:
-		return sw_sync_ioctl_inc(obj, arg);
-
-	default:
-		return -ENOTTY;
-	}
-}
-
-static const struct file_operations sw_sync_fops = {
-	.owner = THIS_MODULE,
-	.open = sw_sync_open,
-	.release = sw_sync_release,
-	.unlocked_ioctl = sw_sync_ioctl,
-	.compat_ioctl = sw_sync_ioctl,
-};
-
-static struct miscdevice sw_sync_dev = {
-	.minor	= MISC_DYNAMIC_MINOR,
-	.name	= "sw_sync",
-	.fops	= &sw_sync_fops,
-};
-
-static int __init sw_sync_device_init(void)
-{
-	return misc_register(&sw_sync_dev);
-}
-device_initcall(sw_sync_device_init);
-
-#endif /* CONFIG_SW_SYNC_USER */
diff --git a/drivers/staging/android/sw_sync.h b/drivers/staging/android/sw_sync.h
deleted file mode 100644
index c87ae9e..0000000
--- a/drivers/staging/android/sw_sync.h
+++ /dev/null
@@ -1,59 +0,0 @@
-/*
- * include/linux/sw_sync.h
- *
- * Copyright (C) 2012 Google, Inc.
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- */
-
-#ifndef _LINUX_SW_SYNC_H
-#define _LINUX_SW_SYNC_H
-
-#include <linux/types.h>
-#include <linux/kconfig.h>
-#include "sync.h"
-#include "uapi/sw_sync.h"
-
-struct sw_sync_timeline {
-	struct	sync_timeline	obj;
-
-	u32			value;
-};
-
-struct sw_sync_pt {
-	struct sync_pt		pt;
-
-	u32			value;
-};
-
-#if IS_ENABLED(CONFIG_SW_SYNC)
-struct sw_sync_timeline *sw_sync_timeline_create(const char *name);
-void sw_sync_timeline_inc(struct sw_sync_timeline *obj, u32 inc);
-
-struct sync_pt *sw_sync_pt_create(struct sw_sync_timeline *obj, u32 value);
-#else
-static inline struct sw_sync_timeline *sw_sync_timeline_create(const char *name)
-{
-	return NULL;
-}
-
-static inline void sw_sync_timeline_inc(struct sw_sync_timeline *obj, u32 inc)
-{
-}
-
-static inline struct sync_pt *sw_sync_pt_create(struct sw_sync_timeline *obj,
-						u32 value)
-{
-	return NULL;
-}
-#endif /* IS_ENABLED(CONFIG_SW_SYNC) */
-
-#endif /* _LINUX_SW_SYNC_H */
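
The sw_sync API is unchanged by the move out of staging; as an
illustration only (the sw_sync_selftest() name is hypothetical), a
kernel-side smoke test of the relocated primitives would look like:

	#include "sw_sync.h"

	static int sw_sync_selftest(void)
	{
		struct sw_sync_timeline *tl;
		struct sync_fence *fence;
		struct sync_pt *pt;
		int ret = -ENOMEM;

		tl = sw_sync_timeline_create("selftest");
		if (!tl)
			return -ENOMEM;

		pt = sw_sync_pt_create(tl, 1);	/* signals once value >= 1 */
		if (!pt)
			goto out;

		fence = sync_fence_create("selftest-fence", pt);
		if (!fence) {
			sync_pt_free(pt);
			goto out;
		}

		sw_sync_timeline_inc(tl, 1);	/* advance the timeline to 1 */
		ret = sync_fence_wait(fence, 100);	/* 0 == signaled */

		sync_fence_put(fence);
	out:
		sync_timeline_destroy(&tl->obj);
		return ret;
	}
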
diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c
deleted file mode 100644
index 7f0e919..0000000
--- a/drivers/staging/android/sync.c
+++ /dev/null
@@ -1,734 +0,0 @@
-/*
- * drivers/base/sync.c
- *
- * Copyright (C) 2012 Google, Inc.
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- */
-
-#include <linux/debugfs.h>
-#include <linux/export.h>
-#include <linux/file.h>
-#include <linux/fs.h>
-#include <linux/kernel.h>
-#include <linux/poll.h>
-#include <linux/sched.h>
-#include <linux/seq_file.h>
-#include <linux/slab.h>
-#include <linux/uaccess.h>
-#include <linux/anon_inodes.h>
-
-#include "sync.h"
-
-#define CREATE_TRACE_POINTS
-#include "trace/sync.h"
-
-static const struct fence_ops android_fence_ops;
-static const struct file_operations sync_fence_fops;
-
-struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops,
-					   int size, const char *name)
-{
-	struct sync_timeline *obj;
-
-	if (size < sizeof(struct sync_timeline))
-		return NULL;
-
-	obj = kzalloc(size, GFP_KERNEL);
-	if (obj == NULL)
-		return NULL;
-
-	kref_init(&obj->kref);
-	obj->ops = ops;
-	obj->context = fence_context_alloc(1);
-	strlcpy(obj->name, name, sizeof(obj->name));
-
-	INIT_LIST_HEAD(&obj->child_list_head);
-	INIT_LIST_HEAD(&obj->active_list_head);
-	spin_lock_init(&obj->child_list_lock);
-
-	sync_timeline_debug_add(obj);
-
-	return obj;
-}
-EXPORT_SYMBOL(sync_timeline_create);
-
-static void sync_timeline_free(struct kref *kref)
-{
-	struct sync_timeline *obj =
-		container_of(kref, struct sync_timeline, kref);
-
-	sync_timeline_debug_remove(obj);
-
-	if (obj->ops->release_obj)
-		obj->ops->release_obj(obj);
-
-	kfree(obj);
-}
-
-static void sync_timeline_get(struct sync_timeline *obj)
-{
-	kref_get(&obj->kref);
-}
-
-static void sync_timeline_put(struct sync_timeline *obj)
-{
-	kref_put(&obj->kref, sync_timeline_free);
-}
-
-void sync_timeline_destroy(struct sync_timeline *obj)
-{
-	obj->destroyed = true;
-	/*
-	 * Ensure timeline is marked as destroyed before
-	 * changing timeline's fences status.
-	 */
-	smp_wmb();
-
-	/*
-	 * signal any children that their parent is going away.
-	 */
-	sync_timeline_signal(obj);
-	sync_timeline_put(obj);
-}
-EXPORT_SYMBOL(sync_timeline_destroy);
-
-void sync_timeline_signal(struct sync_timeline *obj)
-{
-	unsigned long flags;
-	LIST_HEAD(signaled_pts);
-	struct sync_pt *pt, *next;
-
-	trace_sync_timeline(obj);
-
-	spin_lock_irqsave(&obj->child_list_lock, flags);
-
-	list_for_each_entry_safe(pt, next, &obj->active_list_head,
-				 active_list) {
-		if (fence_is_signaled_locked(&pt->base))
-			list_del_init(&pt->active_list);
-	}
-
-	spin_unlock_irqrestore(&obj->child_list_lock, flags);
-}
-EXPORT_SYMBOL(sync_timeline_signal);
-
-struct sync_pt *sync_pt_create(struct sync_timeline *obj, int size)
-{
-	unsigned long flags;
-	struct sync_pt *pt;
-
-	if (size < sizeof(struct sync_pt))
-		return NULL;
-
-	pt = kzalloc(size, GFP_KERNEL);
-	if (pt == NULL)
-		return NULL;
-
-	spin_lock_irqsave(&obj->child_list_lock, flags);
-	sync_timeline_get(obj);
-	fence_init(&pt->base, &android_fence_ops, &obj->child_list_lock,
-		   obj->context, ++obj->value);
-	list_add_tail(&pt->child_list, &obj->child_list_head);
-	INIT_LIST_HEAD(&pt->active_list);
-	spin_unlock_irqrestore(&obj->child_list_lock, flags);
-	return pt;
-}
-EXPORT_SYMBOL(sync_pt_create);
-
-void sync_pt_free(struct sync_pt *pt)
-{
-	fence_put(&pt->base);
-}
-EXPORT_SYMBOL(sync_pt_free);
-
-static struct sync_fence *sync_fence_alloc(int size, const char *name)
-{
-	struct sync_fence *fence;
-
-	fence = kzalloc(size, GFP_KERNEL);
-	if (fence == NULL)
-		return NULL;
-
-	fence->file = anon_inode_getfile("sync_fence", &sync_fence_fops,
-					 fence, 0);
-	if (IS_ERR(fence->file))
-		goto err;
-
-	kref_init(&fence->kref);
-	strlcpy(fence->name, name, sizeof(fence->name));
-
-	init_waitqueue_head(&fence->wq);
-
-	return fence;
-
-err:
-	kfree(fence);
-	return NULL;
-}
-
-static void fence_check_cb_func(struct fence *f, struct fence_cb *cb)
-{
-	struct sync_fence_cb *check;
-	struct sync_fence *fence;
-
-	check = container_of(cb, struct sync_fence_cb, cb);
-	fence = check->fence;
-
-	if (atomic_dec_and_test(&fence->status))
-		wake_up_all(&fence->wq);
-}
-
-/* TODO: implement a create which takes more that one sync_pt */
-struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt)
-{
-	struct sync_fence *fence;
-
-	fence = sync_fence_alloc(offsetof(struct sync_fence, cbs[1]), name);
-	if (fence == NULL)
-		return NULL;
-
-	fence->num_fences = 1;
-	atomic_set(&fence->status, 1);
-
-	fence->cbs[0].sync_pt = pt;
-	fence->cbs[0].fence = fence;
-	if (fence_add_callback(pt, &fence->cbs[0].cb, fence_check_cb_func))
-		atomic_dec(&fence->status);
-
-	sync_fence_debug_add(fence);
-
-	return fence;
-}
-EXPORT_SYMBOL(sync_fence_create_dma);
-
-struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
-{
-	return sync_fence_create_dma(name, &pt->base);
-}
-EXPORT_SYMBOL(sync_fence_create);
-
-struct sync_fence *sync_fence_fdget(int fd)
-{
-	struct file *file = fget(fd);
-
-	if (file == NULL)
-		return NULL;
-
-	if (file->f_op != &sync_fence_fops)
-		goto err;
-
-	return file->private_data;
-
-err:
-	fput(file);
-	return NULL;
-}
-EXPORT_SYMBOL(sync_fence_fdget);
-
-void sync_fence_put(struct sync_fence *fence)
-{
-	fput(fence->file);
-}
-EXPORT_SYMBOL(sync_fence_put);
-
-void sync_fence_install(struct sync_fence *fence, int fd)
-{
-	fd_install(fd, fence->file);
-}
-EXPORT_SYMBOL(sync_fence_install);
-
-static void sync_fence_add_pt(struct sync_fence *fence,
-			      int *i, struct fence *pt)
-{
-	fence->cbs[*i].sync_pt = pt;
-	fence->cbs[*i].fence = fence;
-
-	if (!fence_add_callback(pt, &fence->cbs[*i].cb, fence_check_cb_func)) {
-		fence_get(pt);
-		(*i)++;
-	}
-}
-
-struct sync_fence *sync_fence_merge(const char *name,
-				    struct sync_fence *a, struct sync_fence *b)
-{
-	int num_fences = a->num_fences + b->num_fences;
-	struct sync_fence *fence;
-	int i, i_a, i_b;
-	unsigned long size = offsetof(struct sync_fence, cbs[num_fences]);
-
-	fence = sync_fence_alloc(size, name);
-	if (fence == NULL)
-		return NULL;
-
-	atomic_set(&fence->status, num_fences);
-
-	/*
-	 * Assume sync_fence a and b are both ordered and have no
-	 * duplicates with the same context.
-	 *
-	 * If a sync_fence can only be created with sync_fence_merge
-	 * and sync_fence_create, this is a reasonable assumption.
-	 */
-	for (i = i_a = i_b = 0; i_a < a->num_fences && i_b < b->num_fences; ) {
-		struct fence *pt_a = a->cbs[i_a].sync_pt;
-		struct fence *pt_b = b->cbs[i_b].sync_pt;
-
-		if (pt_a->context < pt_b->context) {
-			sync_fence_add_pt(fence, &i, pt_a);
-
-			i_a++;
-		} else if (pt_a->context > pt_b->context) {
-			sync_fence_add_pt(fence, &i, pt_b);
-
-			i_b++;
-		} else {
-			if (pt_a->seqno - pt_b->seqno <= INT_MAX)
-				sync_fence_add_pt(fence, &i, pt_a);
-			else
-				sync_fence_add_pt(fence, &i, pt_b);
-
-			i_a++;
-			i_b++;
-		}
-	}
-
-	for (; i_a < a->num_fences; i_a++)
-		sync_fence_add_pt(fence, &i, a->cbs[i_a].sync_pt);
-
-	for (; i_b < b->num_fences; i_b++)
-		sync_fence_add_pt(fence, &i, b->cbs[i_b].sync_pt);
-
-	if (num_fences > i)
-		atomic_sub(num_fences - i, &fence->status);
-	fence->num_fences = i;
-
-	sync_fence_debug_add(fence);
-	return fence;
-}
-EXPORT_SYMBOL(sync_fence_merge);
-
-int sync_fence_wake_up_wq(wait_queue_t *curr, unsigned mode,
-				 int wake_flags, void *key)
-{
-	struct sync_fence_waiter *wait;
-
-	wait = container_of(curr, struct sync_fence_waiter, work);
-	list_del_init(&wait->work.task_list);
-
-	wait->callback(wait->work.private, wait);
-	return 1;
-}
-
-int sync_fence_wait_async(struct sync_fence *fence,
-			  struct sync_fence_waiter *waiter)
-{
-	int err = atomic_read(&fence->status);
-	unsigned long flags;
-
-	if (err < 0)
-		return err;
-
-	if (!err)
-		return 1;
-
-	init_waitqueue_func_entry(&waiter->work, sync_fence_wake_up_wq);
-	waiter->work.private = fence;
-
-	spin_lock_irqsave(&fence->wq.lock, flags);
-	err = atomic_read(&fence->status);
-	if (err > 0)
-		__add_wait_queue_tail(&fence->wq, &waiter->work);
-	spin_unlock_irqrestore(&fence->wq.lock, flags);
-
-	if (err < 0)
-		return err;
-
-	return !err;
-}
-EXPORT_SYMBOL(sync_fence_wait_async);
-
-int sync_fence_cancel_async(struct sync_fence *fence,
-			     struct sync_fence_waiter *waiter)
-{
-	unsigned long flags;
-	int ret = 0;
-
-	spin_lock_irqsave(&fence->wq.lock, flags);
-	if (!list_empty(&waiter->work.task_list))
-		list_del_init(&waiter->work.task_list);
-	else
-		ret = -ENOENT;
-	spin_unlock_irqrestore(&fence->wq.lock, flags);
-	return ret;
-}
-EXPORT_SYMBOL(sync_fence_cancel_async);
-
-int sync_fence_wait(struct sync_fence *fence, long timeout)
-{
-	long ret;
-	int i;
-
-	if (timeout < 0)
-		timeout = MAX_SCHEDULE_TIMEOUT;
-	else
-		timeout = msecs_to_jiffies(timeout);
-
-	trace_sync_wait(fence, 1);
-	for (i = 0; i < fence->num_fences; ++i)
-		trace_sync_pt(fence->cbs[i].sync_pt);
-	ret = wait_event_interruptible_timeout(fence->wq,
-					       atomic_read(&fence->status) <= 0,
-					       timeout);
-	trace_sync_wait(fence, 0);
-
-	if (ret < 0) {
-		return ret;
-	} else if (ret == 0) {
-		if (timeout) {
-			pr_info("fence timeout on [%p] after %dms\n", fence,
-				jiffies_to_msecs(timeout));
-			sync_dump();
-		}
-		return -ETIME;
-	}
-
-	ret = atomic_read(&fence->status);
-	if (ret) {
-		pr_info("fence error %ld on [%p]\n", ret, fence);
-		sync_dump();
-	}
-	return ret;
-}
-EXPORT_SYMBOL(sync_fence_wait);
-
-static const char *android_fence_get_driver_name(struct fence *fence)
-{
-	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
-	struct sync_timeline *parent = sync_pt_parent(pt);
-
-	return parent->ops->driver_name;
-}
-
-static const char *android_fence_get_timeline_name(struct fence *fence)
-{
-	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
-	struct sync_timeline *parent = sync_pt_parent(pt);
-
-	return parent->name;
-}
-
-static void android_fence_release(struct fence *fence)
-{
-	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
-	struct sync_timeline *parent = sync_pt_parent(pt);
-	unsigned long flags;
-
-	spin_lock_irqsave(fence->lock, flags);
-	list_del(&pt->child_list);
-	if (WARN_ON_ONCE(!list_empty(&pt->active_list)))
-		list_del(&pt->active_list);
-	spin_unlock_irqrestore(fence->lock, flags);
-
-	if (parent->ops->free_pt)
-		parent->ops->free_pt(pt);
-
-	sync_timeline_put(parent);
-	fence_free(&pt->base);
-}
-
-static bool android_fence_signaled(struct fence *fence)
-{
-	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
-	struct sync_timeline *parent = sync_pt_parent(pt);
-	int ret;
-
-	ret = parent->ops->has_signaled(pt);
-	if (ret < 0)
-		fence->status = ret;
-	return ret;
-}
-
-static bool android_fence_enable_signaling(struct fence *fence)
-{
-	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
-	struct sync_timeline *parent = sync_pt_parent(pt);
-
-	if (android_fence_signaled(fence))
-		return false;
-
-	list_add_tail(&pt->active_list, &parent->active_list_head);
-	return true;
-}
-
-static int android_fence_fill_driver_data(struct fence *fence,
-					  void *data, int size)
-{
-	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
-	struct sync_timeline *parent = sync_pt_parent(pt);
-
-	if (!parent->ops->fill_driver_data)
-		return 0;
-	return parent->ops->fill_driver_data(pt, data, size);
-}
-
-static void android_fence_value_str(struct fence *fence,
-				    char *str, int size)
-{
-	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
-	struct sync_timeline *parent = sync_pt_parent(pt);
-
-	if (!parent->ops->pt_value_str) {
-		if (size)
-			*str = 0;
-		return;
-	}
-	parent->ops->pt_value_str(pt, str, size);
-}
-
-static void android_fence_timeline_value_str(struct fence *fence,
-					     char *str, int size)
-{
-	struct sync_pt *pt = container_of(fence, struct sync_pt, base);
-	struct sync_timeline *parent = sync_pt_parent(pt);
-
-	if (!parent->ops->timeline_value_str) {
-		if (size)
-			*str = 0;
-		return;
-	}
-	parent->ops->timeline_value_str(parent, str, size);
-}
-
-static const struct fence_ops android_fence_ops = {
-	.get_driver_name = android_fence_get_driver_name,
-	.get_timeline_name = android_fence_get_timeline_name,
-	.enable_signaling = android_fence_enable_signaling,
-	.signaled = android_fence_signaled,
-	.wait = fence_default_wait,
-	.release = android_fence_release,
-	.fill_driver_data = android_fence_fill_driver_data,
-	.fence_value_str = android_fence_value_str,
-	.timeline_value_str = android_fence_timeline_value_str,
-};
-
-static void sync_fence_free(struct kref *kref)
-{
-	struct sync_fence *fence = container_of(kref, struct sync_fence, kref);
-	int i, status = atomic_read(&fence->status);
-
-	for (i = 0; i < fence->num_fences; ++i) {
-		if (status)
-			fence_remove_callback(fence->cbs[i].sync_pt,
-					      &fence->cbs[i].cb);
-		fence_put(fence->cbs[i].sync_pt);
-	}
-
-	kfree(fence);
-}
-
-static int sync_fence_release(struct inode *inode, struct file *file)
-{
-	struct sync_fence *fence = file->private_data;
-
-	sync_fence_debug_remove(fence);
-
-	kref_put(&fence->kref, sync_fence_free);
-	return 0;
-}
-
-static unsigned int sync_fence_poll(struct file *file, poll_table *wait)
-{
-	struct sync_fence *fence = file->private_data;
-	int status;
-
-	poll_wait(file, &fence->wq, wait);
-
-	status = atomic_read(&fence->status);
-
-	if (!status)
-		return POLLIN;
-	else if (status < 0)
-		return POLLERR;
-	return 0;
-}
-
-static long sync_fence_ioctl_wait(struct sync_fence *fence, unsigned long arg)
-{
-	__s32 value;
-
-	if (copy_from_user(&value, (void __user *)arg, sizeof(value)))
-		return -EFAULT;
-
-	return sync_fence_wait(fence, value);
-}
-
-static long sync_fence_ioctl_merge(struct sync_fence *fence, unsigned long arg)
-{
-	int fd = get_unused_fd_flags(O_CLOEXEC);
-	int err;
-	struct sync_fence *fence2, *fence3;
-	struct sync_merge_data data;
-
-	if (fd < 0)
-		return fd;
-
-	if (copy_from_user(&data, (void __user *)arg, sizeof(data))) {
-		err = -EFAULT;
-		goto err_put_fd;
-	}
-
-	fence2 = sync_fence_fdget(data.fd2);
-	if (fence2 == NULL) {
-		err = -ENOENT;
-		goto err_put_fd;
-	}
-
-	data.name[sizeof(data.name) - 1] = '\0';
-	fence3 = sync_fence_merge(data.name, fence, fence2);
-	if (fence3 == NULL) {
-		err = -ENOMEM;
-		goto err_put_fence2;
-	}
-
-	data.fence = fd;
-	if (copy_to_user((void __user *)arg, &data, sizeof(data))) {
-		err = -EFAULT;
-		goto err_put_fence3;
-	}
-
-	sync_fence_install(fence3, fd);
-	sync_fence_put(fence2);
-	return 0;
-
-err_put_fence3:
-	sync_fence_put(fence3);
-
-err_put_fence2:
-	sync_fence_put(fence2);
-
-err_put_fd:
-	put_unused_fd(fd);
-	return err;
-}
-
-static int sync_fill_pt_info(struct fence *fence, void *data, int size)
-{
-	struct sync_pt_info *info = data;
-	int ret;
-
-	if (size < sizeof(struct sync_pt_info))
-		return -ENOMEM;
-
-	info->len = sizeof(struct sync_pt_info);
-
-	if (fence->ops->fill_driver_data) {
-		ret = fence->ops->fill_driver_data(fence, info->driver_data,
-						   size - sizeof(*info));
-		if (ret < 0)
-			return ret;
-
-		info->len += ret;
-	}
-
-	strlcpy(info->obj_name, fence->ops->get_timeline_name(fence),
-		sizeof(info->obj_name));
-	strlcpy(info->driver_name, fence->ops->get_driver_name(fence),
-		sizeof(info->driver_name));
-	if (fence_is_signaled(fence))
-		info->status = fence->status >= 0 ? 1 : fence->status;
-	else
-		info->status = 0;
-	info->timestamp_ns = ktime_to_ns(fence->timestamp);
-
-	return info->len;
-}
-
-static long sync_fence_ioctl_fence_info(struct sync_fence *fence,
-					unsigned long arg)
-{
-	struct sync_fence_info_data *data;
-	__u32 size;
-	__u32 len = 0;
-	int ret, i;
-
-	if (copy_from_user(&size, (void __user *)arg, sizeof(size)))
-		return -EFAULT;
-
-	if (size < sizeof(struct sync_fence_info_data))
-		return -EINVAL;
-
-	if (size > 4096)
-		size = 4096;
-
-	data = kzalloc(size, GFP_KERNEL);
-	if (data == NULL)
-		return -ENOMEM;
-
-	strlcpy(data->name, fence->name, sizeof(data->name));
-	data->status = atomic_read(&fence->status);
-	if (data->status >= 0)
-		data->status = !data->status;
-
-	len = sizeof(struct sync_fence_info_data);
-
-	for (i = 0; i < fence->num_fences; ++i) {
-		struct fence *pt = fence->cbs[i].sync_pt;
-
-		ret = sync_fill_pt_info(pt, (u8 *)data + len, size - len);
-
-		if (ret < 0)
-			goto out;
-
-		len += ret;
-	}
-
-	data->len = len;
-
-	if (copy_to_user((void __user *)arg, data, len))
-		ret = -EFAULT;
-	else
-		ret = 0;
-
-out:
-	kfree(data);
-
-	return ret;
-}
-
-static long sync_fence_ioctl(struct file *file, unsigned int cmd,
-			     unsigned long arg)
-{
-	struct sync_fence *fence = file->private_data;
-
-	switch (cmd) {
-	case SYNC_IOC_WAIT:
-		return sync_fence_ioctl_wait(fence, arg);
-
-	case SYNC_IOC_MERGE:
-		return sync_fence_ioctl_merge(fence, arg);
-
-	case SYNC_IOC_FENCE_INFO:
-		return sync_fence_ioctl_fence_info(fence, arg);
-
-	default:
-		return -ENOTTY;
-	}
-}
-
-static const struct file_operations sync_fence_fops = {
-	.release = sync_fence_release,
-	.poll = sync_fence_poll,
-	.unlocked_ioctl = sync_fence_ioctl,
-	.compat_ioctl = sync_fence_ioctl,
-};
-
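
One point worth calling out, since this code moves verbatim: the merge
loop above keeps every sync point from a context that appears in only
one fence, and for a context present in both fences it keeps the later
point, compared with wraparound-safe u32 arithmetic on seqno.
Schematically:

	a = { ctx1/seq5, ctx2/seq3 }
	b = { ctx1/seq7 }
	sync_fence_merge("m", a, b) => { ctx1/seq7, ctx2/seq3 }
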
diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h
deleted file mode 100644
index 798cd56..0000000
--- a/drivers/staging/android/sync.h
+++ /dev/null
@@ -1,366 +0,0 @@
-/*
- * include/linux/sync.h
- *
- * Copyright (C) 2012 Google, Inc.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- */
-
-#ifndef _LINUX_SYNC_H
-#define _LINUX_SYNC_H
-
-#include <linux/types.h>
-#include <linux/kref.h>
-#include <linux/ktime.h>
-#include <linux/list.h>
-#include <linux/spinlock.h>
-#include <linux/wait.h>
-#include <linux/fence.h>
-
-#include "uapi/sync.h"
-
-struct sync_timeline;
-struct sync_pt;
-struct sync_fence;
-
-/**
- * struct sync_timeline_ops - sync object implementation ops
- * @driver_name:	name of the implementation
- * @dup:		duplicate a sync_pt
- * @has_signaled:	returns:
- *			  1 if pt has signaled
- *			  0 if pt has not signaled
- *			 <0 on error
- * @compare:		returns:
- *			  1 if b will signal before a
- *			  0 if a and b will signal at the same time
- *			 -1 if a will signal before b
- * @free_pt:		called before sync_pt is freed
- * @release_obj:	called before sync_timeline is freed
- * @fill_driver_data:	write implementation specific driver data to data.
- *			  should return an error if there is not enough room
- *			  as specified by size.  This information is returned
- *			  to userspace by SYNC_IOC_FENCE_INFO.
- * @timeline_value_str: fill str with the value of the sync_timeline's counter
- * @pt_value_str:	fill str with the value of the sync_pt
- */
-struct sync_timeline_ops {
-	const char *driver_name;
-
-	/* required */
-	struct sync_pt * (*dup)(struct sync_pt *pt);
-
-	/* required */
-	int (*has_signaled)(struct sync_pt *pt);
-
-	/* required */
-	int (*compare)(struct sync_pt *a, struct sync_pt *b);
-
-	/* optional */
-	void (*free_pt)(struct sync_pt *sync_pt);
-
-	/* optional */
-	void (*release_obj)(struct sync_timeline *sync_timeline);
-
-	/* optional */
-	int (*fill_driver_data)(struct sync_pt *syncpt, void *data, int size);
-
-	/* optional */
-	void (*timeline_value_str)(struct sync_timeline *timeline, char *str,
-				   int size);
-
-	/* optional */
-	void (*pt_value_str)(struct sync_pt *pt, char *str, int size);
-};
-
-/**
- * struct sync_timeline - sync object
- * @kref:		reference count on fence.
- * @ops:		ops that define the implementation of the sync_timeline
- * @name:		name of the sync_timeline. Useful for debugging
- * @destroyed:		set when sync_timeline is destroyed
- * @child_list_head:	list of children sync_pts for this sync_timeline
- * @child_list_lock:	lock protecting @child_list_head, destroyed, and
- *			  sync_pt.status
- * @active_list_head:	list of active (unsignaled/errored) sync_pts
- * @sync_timeline_list:	membership in global sync_timeline_list
- */
-struct sync_timeline {
-	struct kref		kref;
-	const struct sync_timeline_ops	*ops;
-	char			name[32];
-
-	/* protected by child_list_lock */
-	bool			destroyed;
-	int			context, value;
-
-	struct list_head	child_list_head;
-	spinlock_t		child_list_lock;
-
-	struct list_head	active_list_head;
-
-#ifdef CONFIG_DEBUG_FS
-	struct list_head	sync_timeline_list;
-#endif
-};
-
-/**
- * struct sync_pt - sync point
- * @fence:		base fence class
- * @child_list:		membership in sync_timeline.child_list_head
- * @active_list:	membership in sync_timeline.active_list_head
- * @signaled_list:	membership in temporary signaled_list on stack
- * @fence:		sync_fence to which the sync_pt belongs
- * @pt_list:		membership in sync_fence.pt_list_head
- * @status:		1: signaled, 0:active, <0: error
- * @timestamp:		time which sync_pt status transitioned from active to
- *			  signaled or error.
- */
-struct sync_pt {
-	struct fence base;
-
-	struct list_head	child_list;
-	struct list_head	active_list;
-};
-
-static inline struct sync_timeline *sync_pt_parent(struct sync_pt *pt)
-{
-	return container_of(pt->base.lock, struct sync_timeline,
-			    child_list_lock);
-}
-
-struct sync_fence_cb {
-	struct fence_cb cb;
-	struct fence *sync_pt;
-	struct sync_fence *fence;
-};
-
-/**
- * struct sync_fence - sync fence
- * @file:		file representing this fence
- * @kref:		reference count on fence.
- * @name:		name of sync_fence.  Useful for debugging
- * @pt_list_head:	list of sync_pts in the fence.  immutable once fence
- *			  is created
- * @status:		0: signaled, >0:active, <0: error
- *
- * @wq:			wait queue for fence signaling
- * @sync_fence_list:	membership in global fence list
- */
-struct sync_fence {
-	struct file		*file;
-	struct kref		kref;
-	char			name[32];
-#ifdef CONFIG_DEBUG_FS
-	struct list_head	sync_fence_list;
-#endif
-	int num_fences;
-
-	wait_queue_head_t	wq;
-	atomic_t		status;
-
-	struct sync_fence_cb	cbs[];
-};
-
-struct sync_fence_waiter;
-typedef void (*sync_callback_t)(struct sync_fence *fence,
-				struct sync_fence_waiter *waiter);
-
-/**
- * struct sync_fence_waiter - metadata for asynchronous waiter on a fence
- * @waiter_list:	membership in sync_fence.waiter_list_head
- * @callback:		function pointer to call when fence signals
- * @callback_data:	pointer to pass to @callback
- */
-struct sync_fence_waiter {
-	wait_queue_t work;
-	sync_callback_t callback;
-};
-
-static inline void sync_fence_waiter_init(struct sync_fence_waiter *waiter,
-					  sync_callback_t callback)
-{
-	INIT_LIST_HEAD(&waiter->work.task_list);
-	waiter->callback = callback;
-}
-
-/*
- * API for sync_timeline implementers
- */
-
-/**
- * sync_timeline_create() - creates a sync object
- * @ops:	specifies the implementation ops for the object
- * @size:	size to allocate for this obj
- * @name:	sync_timeline name
- *
- * Creates a new sync_timeline which will use the implementation specified by
- * @ops.  @size bytes will be allocated allowing for implementation specific
- * data to be kept after the generic sync_timeline struct.
- */
-struct sync_timeline *sync_timeline_create(const struct sync_timeline_ops *ops,
-					   int size, const char *name);
-
-/**
- * sync_timeline_destroy() - destroys a sync object
- * @obj:	sync_timeline to destroy
- *
- * A sync implementation should call this when the @obj is going away
- * (i.e. module unload.)  @obj won't actually be freed until all its children
- * sync_pts are freed.
- */
-void sync_timeline_destroy(struct sync_timeline *obj);
-
-/**
- * sync_timeline_signal() - signal a status change on a sync_timeline
- * @obj:	sync_timeline to signal
- *
- * A sync implementation should call this any time one of it's sync_pts
- * has signaled or has an error condition.
- */
-void sync_timeline_signal(struct sync_timeline *obj);
-
-/**
- * sync_pt_create() - creates a sync pt
- * @parent:	sync_pt's parent sync_timeline
- * @size:	size to allocate for this pt
- *
- * Creates a new sync_pt as a child of @parent.  @size bytes will be
- * allocated allowing for implementation specific data to be kept after
- * the generic sync_timeline struct.
- */
-struct sync_pt *sync_pt_create(struct sync_timeline *parent, int size);
-
-/**
- * sync_pt_free() - frees a sync pt
- * @pt:		sync_pt to free
- *
- * This should only be called on sync_pts which have been created but
- * not added to a fence.
- */
-void sync_pt_free(struct sync_pt *pt);
-
-/**
- * sync_fence_create() - creates a sync fence
- * @name:	name of fence to create
- * @pt:		sync_pt to add to the fence
- *
- * Creates a fence containg @pt.  Once this is called, the fence takes
- * a reference on @pt.
- */
-struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt);
-
-/**
- * sync_fence_create_dma() - creates a sync fence from dma-fence
- * @name:	name of fence to create
- * @pt:	dma-fence to add to the fence
- *
- * Creates a fence containg @pt.  Once this is called, the fence takes
- * a reference on @pt.
- */
-struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt);
-
-/*
- * API for sync_fence consumers
- */
-
-/**
- * sync_fence_merge() - merge two fences
- * @name:	name of new fence
- * @a:		fence a
- * @b:		fence b
- *
- * Creates a new fence which contains copies of all the sync_pts in both
- * @a and @b.  @a and @b remain valid, independent fences.
- */
-struct sync_fence *sync_fence_merge(const char *name,
-				    struct sync_fence *a, struct sync_fence *b);
-
-/**
- * sync_fence_fdget() - get a fence from an fd
- * @fd:		fd referencing a fence
- *
- * Ensures @fd references a valid fence, increments the refcount of the backing
- * file, and returns the fence.
- */
-struct sync_fence *sync_fence_fdget(int fd);
-
-/**
- * sync_fence_put() - puts a reference of a sync fence
- * @fence:	fence to put
- *
- * Puts a reference on @fence.  If this is the last reference, the fence and
- * all it's sync_pts will be freed
- */
-void sync_fence_put(struct sync_fence *fence);
-
-/**
- * sync_fence_install() - installs a fence into a file descriptor
- * @fence:	fence to install
- * @fd:		file descriptor in which to install the fence
- *
- * Installs @fence into @fd.  @fd's should be acquired through
- * get_unused_fd_flags(O_CLOEXEC).
- */
-void sync_fence_install(struct sync_fence *fence, int fd);
-
-/**
- * sync_fence_wait_async() - registers and async wait on the fence
- * @fence:		fence to wait on
- * @waiter:		waiter callback struck
- *
- * Returns 1 if @fence has already signaled.
- *
- * Registers a callback to be called when @fence signals or has an error.
- * @waiter should be initialized with sync_fence_waiter_init().
- */
-int sync_fence_wait_async(struct sync_fence *fence,
-			  struct sync_fence_waiter *waiter);
-
-/**
- * sync_fence_cancel_async() - cancels an async wait
- * @fence:		fence to wait on
- * @waiter:		waiter callback struck
- *
- * returns 0 if waiter was removed from fence's async waiter list.
- * returns -ENOENT if waiter was not found on fence's async waiter list.
- *
- * Cancels a previously registered async wait.  Will fail gracefully if
- * @waiter was never registered or if @fence has already signaled @waiter.
- */
-int sync_fence_cancel_async(struct sync_fence *fence,
-			    struct sync_fence_waiter *waiter);
-
-/**
- * sync_fence_wait() - wait on fence
- * @fence:	fence to wait on
- * @tiemout:	timeout in ms
- *
- * Wait for @fence to be signaled or have an error.  Waits indefinitely
- * if @timeout < 0
- */
-int sync_fence_wait(struct sync_fence *fence, long timeout);
-
-#ifdef CONFIG_DEBUG_FS
-
-void sync_timeline_debug_add(struct sync_timeline *obj);
-void sync_timeline_debug_remove(struct sync_timeline *obj);
-void sync_fence_debug_add(struct sync_fence *fence);
-void sync_fence_debug_remove(struct sync_fence *fence);
-void sync_dump(void);
-
-#else
-# define sync_timeline_debug_add(obj)
-# define sync_timeline_debug_remove(obj)
-# define sync_fence_debug_add(fence)
-# define sync_fence_debug_remove(fence)
-# define sync_dump()
-#endif
-int sync_fence_wake_up_wq(wait_queue_t *curr, unsigned mode,
-				 int wake_flags, void *key);
-
-#endif /* _LINUX_SYNC_H */
diff --git a/drivers/staging/android/sync_debug.c b/drivers/staging/android/sync_debug.c
deleted file mode 100644
index f45d13c..0000000
--- a/drivers/staging/android/sync_debug.c
+++ /dev/null
@@ -1,256 +0,0 @@
-/*
- * drivers/base/sync.c
- *
- * Copyright (C) 2012 Google, Inc.
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- */
-
-#include <linux/debugfs.h>
-#include <linux/export.h>
-#include <linux/file.h>
-#include <linux/fs.h>
-#include <linux/kernel.h>
-#include <linux/poll.h>
-#include <linux/sched.h>
-#include <linux/seq_file.h>
-#include <linux/slab.h>
-#include <linux/uaccess.h>
-#include <linux/anon_inodes.h>
-#include <linux/time64.h>
-#include "sync.h"
-
-#ifdef CONFIG_DEBUG_FS
-
-static LIST_HEAD(sync_timeline_list_head);
-static DEFINE_SPINLOCK(sync_timeline_list_lock);
-static LIST_HEAD(sync_fence_list_head);
-static DEFINE_SPINLOCK(sync_fence_list_lock);
-
-void sync_timeline_debug_add(struct sync_timeline *obj)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&sync_timeline_list_lock, flags);
-	list_add_tail(&obj->sync_timeline_list, &sync_timeline_list_head);
-	spin_unlock_irqrestore(&sync_timeline_list_lock, flags);
-}
-
-void sync_timeline_debug_remove(struct sync_timeline *obj)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&sync_timeline_list_lock, flags);
-	list_del(&obj->sync_timeline_list);
-	spin_unlock_irqrestore(&sync_timeline_list_lock, flags);
-}
-
-void sync_fence_debug_add(struct sync_fence *fence)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&sync_fence_list_lock, flags);
-	list_add_tail(&fence->sync_fence_list, &sync_fence_list_head);
-	spin_unlock_irqrestore(&sync_fence_list_lock, flags);
-}
-
-void sync_fence_debug_remove(struct sync_fence *fence)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&sync_fence_list_lock, flags);
-	list_del(&fence->sync_fence_list);
-	spin_unlock_irqrestore(&sync_fence_list_lock, flags);
-}
-
-static const char *sync_status_str(int status)
-{
-	if (status == 0)
-		return "signaled";
-
-	if (status > 0)
-		return "active";
-
-	return "error";
-}
-
-static void sync_print_pt(struct seq_file *s, struct fence *pt, bool fence)
-{
-	int status = 1;
-
-	if (fence_is_signaled_locked(pt))
-		status = pt->status;
-
-	seq_printf(s, "  %s%spt %s",
-		   fence && pt->ops->get_timeline_name ?
-		   pt->ops->get_timeline_name(pt) : "",
-		   fence ? "_" : "",
-		   sync_status_str(status));
-
-	if (status <= 0) {
-		struct timespec64 ts64 =
-			ktime_to_timespec64(pt->timestamp);
-
-		seq_printf(s, "@%lld.%09ld", (s64)ts64.tv_sec, ts64.tv_nsec);
-	}
-
-	if ((!fence || pt->ops->timeline_value_str) &&
-	    pt->ops->fence_value_str) {
-		char value[64];
-		bool success;
-
-		pt->ops->fence_value_str(pt, value, sizeof(value));
-		success = strlen(value);
-
-		if (success)
-			seq_printf(s, ": %s", value);
-
-		if (success && fence) {
-			pt->ops->timeline_value_str(pt, value, sizeof(value));
-
-			if (strlen(value))
-				seq_printf(s, " / %s", value);
-		}
-	}
-
-	seq_puts(s, "\n");
-}
-
-static void sync_print_obj(struct seq_file *s, struct sync_timeline *obj)
-{
-	struct list_head *pos;
-	unsigned long flags;
-
-	seq_printf(s, "%s %s", obj->name, obj->ops->driver_name);
-
-	if (obj->ops->timeline_value_str) {
-		char value[64];
-
-		obj->ops->timeline_value_str(obj, value, sizeof(value));
-		seq_printf(s, ": %s", value);
-	}
-
-	seq_puts(s, "\n");
-
-	spin_lock_irqsave(&obj->child_list_lock, flags);
-	list_for_each(pos, &obj->child_list_head) {
-		struct sync_pt *pt =
-			container_of(pos, struct sync_pt, child_list);
-		sync_print_pt(s, &pt->base, false);
-	}
-	spin_unlock_irqrestore(&obj->child_list_lock, flags);
-}
-
-static void sync_print_fence(struct seq_file *s, struct sync_fence *fence)
-{
-	wait_queue_t *pos;
-	unsigned long flags;
-	int i;
-
-	seq_printf(s, "[%p] %s: %s\n", fence, fence->name,
-		   sync_status_str(atomic_read(&fence->status)));
-
-	for (i = 0; i < fence->num_fences; ++i) {
-		sync_print_pt(s, fence->cbs[i].sync_pt, true);
-	}
-
-	spin_lock_irqsave(&fence->wq.lock, flags);
-	list_for_each_entry(pos, &fence->wq.task_list, task_list) {
-		struct sync_fence_waiter *waiter;
-
-		if (pos->func != &sync_fence_wake_up_wq)
-			continue;
-
-		waiter = container_of(pos, struct sync_fence_waiter, work);
-
-		seq_printf(s, "waiter %pF\n", waiter->callback);
-	}
-	spin_unlock_irqrestore(&fence->wq.lock, flags);
-}
-
-static int sync_debugfs_show(struct seq_file *s, void *unused)
-{
-	unsigned long flags;
-	struct list_head *pos;
-
-	seq_puts(s, "objs:\n--------------\n");
-
-	spin_lock_irqsave(&sync_timeline_list_lock, flags);
-	list_for_each(pos, &sync_timeline_list_head) {
-		struct sync_timeline *obj =
-			container_of(pos, struct sync_timeline,
-				     sync_timeline_list);
-
-		sync_print_obj(s, obj);
-		seq_puts(s, "\n");
-	}
-	spin_unlock_irqrestore(&sync_timeline_list_lock, flags);
-
-	seq_puts(s, "fences:\n--------------\n");
-
-	spin_lock_irqsave(&sync_fence_list_lock, flags);
-	list_for_each(pos, &sync_fence_list_head) {
-		struct sync_fence *fence =
-			container_of(pos, struct sync_fence, sync_fence_list);
-
-		sync_print_fence(s, fence);
-		seq_puts(s, "\n");
-	}
-	spin_unlock_irqrestore(&sync_fence_list_lock, flags);
-	return 0;
-}
-
-static int sync_debugfs_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, sync_debugfs_show, inode->i_private);
-}
-
-static const struct file_operations sync_debugfs_fops = {
-	.open           = sync_debugfs_open,
-	.read           = seq_read,
-	.llseek         = seq_lseek,
-	.release        = single_release,
-};
-
-static __init int sync_debugfs_init(void)
-{
-	debugfs_create_file("sync", S_IRUGO, NULL, NULL, &sync_debugfs_fops);
-	return 0;
-}
-late_initcall(sync_debugfs_init);
-
-#define DUMP_CHUNK 256
-static char sync_dump_buf[64 * 1024];
-void sync_dump(void)
-{
-	struct seq_file s = {
-		.buf = sync_dump_buf,
-		.size = sizeof(sync_dump_buf) - 1,
-	};
-	int i;
-
-	sync_debugfs_show(&s, NULL);
-
-	for (i = 0; i < s.count; i += DUMP_CHUNK) {
-		if ((s.count - i) > DUMP_CHUNK) {
-			char c = s.buf[i + DUMP_CHUNK];
-
-			s.buf[i + DUMP_CHUNK] = 0;
-			pr_cont("%s", s.buf + i);
-			s.buf[i + DUMP_CHUNK] = c;
-		} else {
-			s.buf[s.count] = 0;
-			pr_cont("%s", s.buf + i);
-		}
-	}
-}
-
-#endif
diff --git a/drivers/staging/android/trace/sync.h b/drivers/staging/android/trace/sync.h
deleted file mode 100644
index 77edb97..0000000
--- a/drivers/staging/android/trace/sync.h
+++ /dev/null
@@ -1,82 +0,0 @@
-#undef TRACE_SYSTEM
-#define TRACE_INCLUDE_PATH ../../drivers/staging/android/trace
-#define TRACE_SYSTEM sync
-
-#if !defined(_TRACE_SYNC_H) || defined(TRACE_HEADER_MULTI_READ)
-#define _TRACE_SYNC_H
-
-#include "../sync.h"
-#include <linux/tracepoint.h>
-
-TRACE_EVENT(sync_timeline,
-	TP_PROTO(struct sync_timeline *timeline),
-
-	TP_ARGS(timeline),
-
-	TP_STRUCT__entry(
-			__string(name, timeline->name)
-			__array(char, value, 32)
-	),
-
-	TP_fast_assign(
-			__assign_str(name, timeline->name);
-			if (timeline->ops->timeline_value_str) {
-				timeline->ops->timeline_value_str(timeline,
-							__entry->value,
-							sizeof(__entry->value));
-			} else {
-				__entry->value[0] = '\0';
-			}
-	),
-
-	TP_printk("name=%s value=%s", __get_str(name), __entry->value)
-);
-
-TRACE_EVENT(sync_wait,
-	TP_PROTO(struct sync_fence *fence, int begin),
-
-	TP_ARGS(fence, begin),
-
-	TP_STRUCT__entry(
-			__string(name, fence->name)
-			__field(s32, status)
-			__field(u32, begin)
-	),
-
-	TP_fast_assign(
-			__assign_str(name, fence->name);
-			__entry->status = atomic_read(&fence->status);
-			__entry->begin = begin;
-	),
-
-	TP_printk("%s name=%s state=%d", __entry->begin ? "begin" : "end",
-			__get_str(name), __entry->status)
-);
-
-TRACE_EVENT(sync_pt,
-	TP_PROTO(struct fence *pt),
-
-	TP_ARGS(pt),
-
-	TP_STRUCT__entry(
-		__string(timeline, pt->ops->get_timeline_name(pt))
-		__array(char, value, 32)
-	),
-
-	TP_fast_assign(
-		__assign_str(timeline, pt->ops->get_timeline_name(pt));
-		if (pt->ops->fence_value_str) {
-			pt->ops->fence_value_str(pt, __entry->value,
-							sizeof(__entry->value));
-		} else {
-			__entry->value[0] = '\0';
-		}
-	),
-
-	TP_printk("name=%s value=%s", __get_str(timeline), __entry->value)
-);
-
-#endif /* if !defined(_TRACE_SYNC_H) || defined(TRACE_HEADER_MULTI_READ) */
-
-/* This part must be outside protection */
-#include <trace/define_trace.h>
diff --git a/drivers/staging/android/uapi/sw_sync.h b/drivers/staging/android/uapi/sw_sync.h
deleted file mode 100644
index 9b5d486..0000000
--- a/drivers/staging/android/uapi/sw_sync.h
+++ /dev/null
@@ -1,32 +0,0 @@
-/*
- * Copyright (C) 2012 Google, Inc.
- *
- * This software is licensed under the terms of the GNU General Public
- * License version 2, as published by the Free Software Foundation, and
- * may be copied, distributed, and modified under those terms.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- */
-
-#ifndef _UAPI_LINUX_SW_SYNC_H
-#define _UAPI_LINUX_SW_SYNC_H
-
-#include <linux/types.h>
-
-struct sw_sync_create_fence_data {
-	__u32	value;
-	char	name[32];
-	__s32	fence; /* fd of new fence */
-};
-
-#define SW_SYNC_IOC_MAGIC	'W'
-
-#define SW_SYNC_IOC_CREATE_FENCE	_IOWR(SW_SYNC_IOC_MAGIC, 0,\
-		struct sw_sync_create_fence_data)
-#define SW_SYNC_IOC_INC			_IOW(SW_SYNC_IOC_MAGIC, 1, __u32)
-
-#endif /* _UAPI_LINUX_SW_SYNC_H */
diff --git a/drivers/staging/android/uapi/sync.h b/drivers/staging/android/uapi/sync.h
deleted file mode 100644
index e964c75..0000000
--- a/drivers/staging/android/uapi/sync.h
+++ /dev/null
@@ -1,97 +0,0 @@
-/*
- * Copyright (C) 2012 Google, Inc.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- */
-
-#ifndef _UAPI_LINUX_SYNC_H
-#define _UAPI_LINUX_SYNC_H
-
-#include <linux/ioctl.h>
-#include <linux/types.h>
-
-/**
- * struct sync_merge_data - data passed to merge ioctl
- * @fd2:	file descriptor of second fence
- * @name:	name of new fence
- * @fence:	returns the fd of the new fence to userspace
- */
-struct sync_merge_data {
-	__s32	fd2; /* fd of second fence */
-	char	name[32]; /* name of new fence */
-	__s32	fence; /* fd on newly created fence */
-};
-
-/**
- * struct sync_pt_info - detailed sync_pt information
- * @len:		length of sync_pt_info including any driver_data
- * @obj_name:		name of parent sync_timeline
- * @driver_name:	name of driver implementing the parent
- * @status:		status of the sync_pt 0:active 1:signaled <0:error
- * @timestamp_ns:	timestamp of status change in nanoseconds
- * @driver_data:	any driver dependent data
- */
-struct sync_pt_info {
-	__u32	len;
-	char	obj_name[32];
-	char	driver_name[32];
-	__s32	status;
-	__u64	timestamp_ns;
-
-	__u8	driver_data[0];
-};
-
-/**
- * struct sync_fence_info_data - data returned from fence info ioctl
- * @len:	ioctl caller writes the size of the buffer it is passing in.
- *		ioctl returns length of sync_fence_data returned to userspace
- *		including pt_info.
- * @name:	name of fence
- * @status:	status of fence. 1: signaled 0:active <0:error
- * @pt_info:	a sync_pt_info struct for every sync_pt in the fence
- */
-struct sync_fence_info_data {
-	__u32	len;
-	char	name[32];
-	__s32	status;
-
-	__u8	pt_info[0];
-};
-
-#define SYNC_IOC_MAGIC		'>'
-
-/**
- * DOC: SYNC_IOC_WAIT - wait for a fence to signal
- *
- * pass timeout in milliseconds.  Waits indefinitely timeout < 0.
- */
-#define SYNC_IOC_WAIT		_IOW(SYNC_IOC_MAGIC, 0, __s32)
-
-/**
- * DOC: SYNC_IOC_MERGE - merge two fences
- *
- * Takes a struct sync_merge_data.  Creates a new fence containing copies of
- * the sync_pts in both the calling fd and sync_merge_data.fd2.  Returns the
- * new fence's fd in sync_merge_data.fence
- */
-#define SYNC_IOC_MERGE		_IOWR(SYNC_IOC_MAGIC, 1, struct sync_merge_data)
-
-/**
- * DOC: SYNC_IOC_FENCE_INFO - get detailed information on a fence
- *
- * Takes a struct sync_fence_info_data with extra space allocated for pt_info.
- * Caller should write the size of the buffer into len.  On return, len is
- * updated to reflect the total size of the sync_fence_info_data including
- * pt_info.
- *
- * pt_info is a buffer containing sync_pt_infos for every sync_pt in the fence.
- * To iterate over the sync_pt_infos, use the sync_pt_info.len field.
- */
-#define SYNC_IOC_FENCE_INFO	_IOWR(SYNC_IOC_MAGIC, 2,\
-	struct sync_fence_info_data)
-
-#endif /* _UAPI_LINUX_SYNC_H */
diff --git a/include/uapi/Kbuild b/include/uapi/Kbuild
index 245aa6e..7c415d0 100644
--- a/include/uapi/Kbuild
+++ b/include/uapi/Kbuild
@@ -13,3 +13,4 @@ header-y += drm/
 header-y += xen/
 header-y += scsi/
 header-y += misc/
+header-y += sync/
diff --git a/include/uapi/sync/Kbuild b/include/uapi/sync/Kbuild
new file mode 100644
index 0000000..2716ffe
--- /dev/null
+++ b/include/uapi/sync/Kbuild
@@ -0,0 +1,3 @@
+# sync Header export list
+header-y += sw_sync.h
+header-y += sync.h
diff --git a/include/uapi/sync/sw_sync.h b/include/uapi/sync/sw_sync.h
new file mode 100644
index 0000000..9b5d486
--- /dev/null
+++ b/include/uapi/sync/sw_sync.h
@@ -0,0 +1,32 @@
+/*
+ * Copyright (C) 2012 Google, Inc.
+ *
+ * This software is licensed under the terms of the GNU General Public
+ * License version 2, as published by the Free Software Foundation, and
+ * may be copied, distributed, and modified under those terms.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef _UAPI_LINUX_SW_SYNC_H
+#define _UAPI_LINUX_SW_SYNC_H
+
+#include <linux/types.h>
+
+struct sw_sync_create_fence_data {
+	__u32	value;
+	char	name[32];
+	__s32	fence; /* fd of new fence */
+};
+
+#define SW_SYNC_IOC_MAGIC	'W'
+
+#define SW_SYNC_IOC_CREATE_FENCE	_IOWR(SW_SYNC_IOC_MAGIC, 0,\
+		struct sw_sync_create_fence_data)
+#define SW_SYNC_IOC_INC			_IOW(SW_SYNC_IOC_MAGIC, 1, __u32)
+
+#endif /* _UAPI_LINUX_SW_SYNC_H */
diff --git a/include/uapi/sync/sync.h b/include/uapi/sync/sync.h
new file mode 100644
index 0000000..e964c75
--- /dev/null
+++ b/include/uapi/sync/sync.h
@@ -0,0 +1,97 @@
+/*
+ * Copyright (C) 2012 Google, Inc.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#ifndef _UAPI_LINUX_SYNC_H
+#define _UAPI_LINUX_SYNC_H
+
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/**
+ * struct sync_merge_data - data passed to merge ioctl
+ * @fd2:	file descriptor of second fence
+ * @name:	name of new fence
+ * @fence:	returns the fd of the new fence to userspace
+ */
+struct sync_merge_data {
+	__s32	fd2; /* fd of second fence */
+	char	name[32]; /* name of new fence */
+	__s32	fence; /* fd on newly created fence */
+};
+
+/**
+ * struct sync_pt_info - detailed sync_pt information
+ * @len:		length of sync_pt_info including any driver_data
+ * @obj_name:		name of parent sync_timeline
+ * @driver_name:	name of driver implementing the parent
+ * @status:		status of the sync_pt 0:active 1:signaled <0:error
+ * @timestamp_ns:	timestamp of status change in nanoseconds
+ * @driver_data:	any driver dependent data
+ */
+struct sync_pt_info {
+	__u32	len;
+	char	obj_name[32];
+	char	driver_name[32];
+	__s32	status;
+	__u64	timestamp_ns;
+
+	__u8	driver_data[0];
+};
+
+/**
+ * struct sync_fence_info_data - data returned from fence info ioctl
+ * @len:	ioctl caller writes the size of the buffer it is passing in.
+ *		ioctl returns length of sync_fence_data returned to userspace
+ *		including pt_info.
+ * @name:	name of fence
+ * @status:	status of fence. 1: signaled 0:active <0:error
+ * @pt_info:	a sync_pt_info struct for every sync_pt in the fence
+ */
+struct sync_fence_info_data {
+	__u32	len;
+	char	name[32];
+	__s32	status;
+
+	__u8	pt_info[0];
+};
+
+#define SYNC_IOC_MAGIC		'>'
+
+/**
+ * DOC: SYNC_IOC_WAIT - wait for a fence to signal
+ *
+ * pass timeout in milliseconds.  Waits indefinitely timeout < 0.
+ */
+#define SYNC_IOC_WAIT		_IOW(SYNC_IOC_MAGIC, 0, __s32)
+
+/**
+ * DOC: SYNC_IOC_MERGE - merge two fences
+ *
+ * Takes a struct sync_merge_data.  Creates a new fence containing copies of
+ * the sync_pts in both the calling fd and sync_merge_data.fd2.  Returns the
+ * new fence's fd in sync_merge_data.fence
+ */
+#define SYNC_IOC_MERGE		_IOWR(SYNC_IOC_MAGIC, 1, struct sync_merge_data)
+
+/**
+ * DOC: SYNC_IOC_FENCE_INFO - get detailed information on a fence
+ *
+ * Takes a struct sync_fence_info_data with extra space allocated for pt_info.
+ * Caller should write the size of the buffer into len.  On return, len is
+ * updated to reflect the total size of the sync_fence_info_data including
+ * pt_info.
+ *
+ * pt_info is a buffer containing sync_pt_infos for every sync_pt in the fence.
+ * To iterate over the sync_pt_infos, use the sync_pt_info.len field.
+ */
+#define SYNC_IOC_FENCE_INFO	_IOWR(SYNC_IOC_MAGIC, 2,\
+	struct sync_fence_info_data)
+
+#endif /* _UAPI_LINUX_SYNC_H */
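
For reference, a minimal user land sketch of driving these ioctls from
a test program. This assumes CONFIG_SW_SYNC_USER is enabled so that
the sw_sync timeline is exposed as /dev/sw_sync, and that the exported
headers above land under <sync/>; error handling is mostly trimmed. A
real consumer would normally be handed its fence fd by a driver rather
than create one via the sw_sync test timeline, but the wait semantics
are identical:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

#include <sync/sw_sync.h>
#include <sync/sync.h>

int main(void)
{
	struct sw_sync_create_fence_data data;
	__s32 timeout;
	__u32 inc = 1;
	int timeline, fence;

	timeline = open("/dev/sw_sync", O_RDWR);
	if (timeline < 0)
		return 1;

	/* Create a fence that signals when the timeline reaches 1 */
	memset(&data, 0, sizeof(data));
	data.value = 1;
	strcpy(data.name, "example");
	if (ioctl(timeline, SW_SYNC_IOC_CREATE_FENCE, &data) < 0)
		return 1;
	fence = data.fence;

	/* timeout = 0 polls: the fence has not signalled yet */
	timeout = 0;
	printf("signalled? %s\n",
	       ioctl(fence, SYNC_IOC_WAIT, &timeout) == 0 ? "yes" : "no");

	/* Advance the timeline by 1; the fence now signals */
	ioctl(timeline, SW_SYNC_IOC_INC, &inc);

	timeout = 100; /* milliseconds; < 0 would wait forever */
	if (ioctl(fence, SYNC_IOC_WAIT, &timeout) == 0)
		printf("fence signalled\n");

	close(fence);
	close(timeline);
	return 0;
}
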
-- 
1.9.1


* [RFC 04/12] drm/i915: Convert requests to use struct fence
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (2 preceding siblings ...)
  2015-11-23 11:34                   ` [RFC 03/12] staging/android/sync: Move sync framework out of staging John.C.Harrison
@ 2015-11-23 11:34                   ` John.C.Harrison
  2015-11-23 11:34                   ` [RFC 05/12] drm/i915: Removed now redundant parameter to i915_gem_request_completed() John.C.Harrison
                                     ` (9 subsequent siblings)
  13 siblings, 0 replies; 92+ messages in thread
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

There is a construct in the linux kernel called 'struct fence' that is
intended to keep track of work that is executed on hardware, i.e. it
solves the basic problem that the driver's 'struct
drm_i915_gem_request' is trying to address. The request structure does
quite a lot more than simply track execution progress, so it is still
very much required. However, the basic completion status side could be
updated to use the ready-made fence implementation and gain all the
advantages that it provides.

This patch takes the first step of integrating a struct fence into the
request. It replaces the explicit reference count with that of the
fence. It also replaces the 'is completed' test with the fence's
equivalent. Currently, that simply chains on to the original request
implementation. A future patch will improve this.
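
In outline, the conversion amounts to the following shape. This is a
condensed sketch with illustrative 'my_*' names rather than the actual
patch below:

#include <linux/fence.h>

struct my_request {
	struct fence fence;	/* replaces the old 'struct kref ref' */
	u32 seqno;
};

static u32 my_hw_seqno;		/* stand-in for the ring's seqno read */

static const char *my_driver_name(struct fence *f)   { return "my"; }
static const char *my_timeline_name(struct fence *f) { return "my-ring"; }
static bool my_enable_signaling(struct fence *f)     { return true; }

static bool my_signaled(struct fence *f)
{
	struct my_request *req = container_of(f, struct my_request, fence);

	/* Chain on to the existing seqno comparison for now */
	return (s32)(my_hw_seqno - req->seqno) >= 0;
}

static const struct fence_ops my_fence_ops = {
	.get_driver_name	= my_driver_name,
	.get_timeline_name	= my_timeline_name,
	.enable_signaling	= my_enable_signaling,
	.signaled		= my_signaled,
	.wait			= fence_default_wait,
};

/*
 * Reference counting and completion tests then go through the fence:
 *   fence_init(&req->fence, &my_fence_ops, &lock, context, seqno);
 *   fence_get(&req->fence);          was kref_get(&req->ref)
 *   fence_is_signaled(&req->fence);  was the open-coded seqno test
 *   fence_put(&req->fence);          was kref_put(&req->ref, free_fn)
 */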

v3: Updated after review comments by Tvrtko Ursulin. Added fence
context/seqno pair to the debugfs request info. Renamed fence 'driver
name' to just 'i915'. Removed BUG_ONs.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  5 +--
 drivers/gpu/drm/i915/i915_drv.h         | 45 +++++++++++++-------------
 drivers/gpu/drm/i915/i915_gem.c         | 56 ++++++++++++++++++++++++++++++---
 drivers/gpu/drm/i915/intel_lrc.c        |  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |  1 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |  3 ++
 6 files changed, 81 insertions(+), 30 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 7415606..5b31186 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -709,11 +709,12 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 			task = NULL;
 			if (req->pid)
 				task = pid_task(req->pid, PIDTYPE_PID);
-			seq_printf(m, "    %x @ %d: %s [%d]\n",
+			seq_printf(m, "    %x @ %d: %s [%d], fence = %u.%u\n",
 				   req->seqno,
 				   (int) (jiffies - req->emitted_jiffies),
 				   task ? task->comm : "<unknown>",
-				   task ? task->pid : -1);
+				   task ? task->pid : -1,
+				   req->fence.context, req->fence.seqno);
 			rcu_read_unlock();
 		}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 436149e..aa5cba7 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -51,6 +51,7 @@
 #include <linux/kref.h>
 #include <linux/pm_qos.h>
 #include "intel_guc.h"
+#include <linux/fence.h>
 
 /* General customization:
  */
@@ -2174,7 +2175,17 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
  * initial reference taken using kref_init
  */
 struct drm_i915_gem_request {
-	struct kref ref;
+	/**
+	 * Underlying object for implementing the signal/wait stuff.
+	 * NB: Never call fence_later() or return this fence object to user
+	 * land! Due to lazy allocation, scheduler re-ordering, pre-emption,
+	 * etc., there is no guarantee at all about the validity or
+	 * sequentiality of the fence's seqno! It is also unsafe to let
+	 * anything outside of the i915 driver get hold of the fence object
+	 * as the clean up when decrementing the reference count requires
+	 * holding the driver mutex lock.
+	 */
+	struct fence fence;
 
 	/** On Which ring this request was generated */
 	struct drm_i915_private *i915;
@@ -2251,7 +2262,13 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct intel_context *ctx,
 			   struct drm_i915_gem_request **req_out);
 void i915_gem_request_cancel(struct drm_i915_gem_request *req);
-void i915_gem_request_free(struct kref *req_ref);
+
+static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
+					      bool lazy_coherency)
+{
+	return fence_is_signaled(&req->fence);
+}
+
 int i915_gem_request_add_to_client(struct drm_i915_gem_request *req,
 				   struct drm_file *file);
 
@@ -2271,7 +2288,7 @@ static inline struct drm_i915_gem_request *
 i915_gem_request_reference(struct drm_i915_gem_request *req)
 {
 	if (req)
-		kref_get(&req->ref);
+		fence_get(&req->fence);
 	return req;
 }
 
@@ -2279,7 +2296,7 @@ static inline void
 i915_gem_request_unreference(struct drm_i915_gem_request *req)
 {
 	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
-	kref_put(&req->ref, i915_gem_request_free);
+	fence_put(&req->fence);
 }
 
 static inline void
@@ -2291,7 +2308,7 @@ i915_gem_request_unreference__unlocked(struct drm_i915_gem_request *req)
 		return;
 
 	dev = req->ring->dev;
-	if (kref_put_mutex(&req->ref, i915_gem_request_free, &dev->struct_mutex))
+	if (kref_put_mutex(&req->fence.refcount, fence_release, &dev->struct_mutex))
 		mutex_unlock(&dev->struct_mutex);
 }
 
@@ -2308,12 +2325,6 @@ static inline void i915_gem_request_assign(struct drm_i915_gem_request **pdst,
 }
 
 /*
- * XXX: i915_gem_request_completed should be here but currently needs the
- * definition of i915_seqno_passed() which is below. It will be moved in
- * a later patch when the call to i915_seqno_passed() is obsoleted...
- */
-
-/*
  * A command that requires special handling by the command parser.
  */
 struct drm_i915_cmd_descriptor {
@@ -2916,18 +2927,6 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
 	return (int32_t)(seq1 - seq2) >= 0;
 }
 
-static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
-					      bool lazy_coherency)
-{
-	u32 seqno;
-
-	BUG_ON(req == NULL);
-
-	seqno = req->ring->get_seqno(req->ring, lazy_coherency);
-
-	return i915_seqno_passed(seqno, req->seqno);
-}
-
 int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 *seqno);
 int __must_check i915_gem_set_seqno(struct drm_device *dev, u32 seqno);
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e4056a3..a1b4dbd 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2617,12 +2617,14 @@ static void i915_set_reset_status(struct drm_i915_private *dev_priv,
 	}
 }
 
-void i915_gem_request_free(struct kref *req_ref)
+static void i915_gem_request_free(struct fence *req_fence)
 {
-	struct drm_i915_gem_request *req = container_of(req_ref,
-						 typeof(*req), ref);
+	struct drm_i915_gem_request *req = container_of(req_fence,
+						 typeof(*req), fence);
 	struct intel_context *ctx = req->ctx;
 
+	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
+
 	if (req->file_priv)
 		i915_gem_request_remove_from_client(req);
 
@@ -2638,6 +2640,45 @@ void i915_gem_request_free(struct kref *req_ref)
 	kmem_cache_free(req->i915->requests, req);
 }
 
+static bool i915_gem_request_enable_signaling(struct fence *req_fence)
+{
+	/* Interrupt driven fences are not implemented yet.*/
+	WARN(true, "This should not be called!");
+	return true;
+}
+
+static bool i915_gem_request_is_completed(struct fence *req_fence)
+{
+	struct drm_i915_gem_request *req = container_of(req_fence,
+						 typeof(*req), fence);
+	u32 seqno;
+
+	seqno = req->ring->get_seqno(req->ring, false/*lazy_coherency*/);
+
+	return i915_seqno_passed(seqno, req->seqno);
+}
+
+static const char *i915_gem_request_get_driver_name(struct fence *req_fence)
+{
+	return "i915";
+}
+
+static const char *i915_gem_request_get_timeline_name(struct fence *req_fence)
+{
+	struct drm_i915_gem_request *req = container_of(req_fence,
+						 typeof(*req), fence);
+	return req->ring->name;
+}
+
+static const struct fence_ops i915_gem_request_fops = {
+	.enable_signaling	= i915_gem_request_enable_signaling,
+	.signaled		= i915_gem_request_is_completed,
+	.wait			= fence_default_wait,
+	.release		= i915_gem_request_free,
+	.get_driver_name	= i915_gem_request_get_driver_name,
+	.get_timeline_name	= i915_gem_request_get_timeline_name,
+};
+
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct intel_context *ctx,
 			   struct drm_i915_gem_request **req_out)
@@ -2659,7 +2700,6 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	if (ret)
 		goto err;
 
-	kref_init(&req->ref);
 	req->i915 = dev_priv;
 	req->ring = ring;
 	req->ctx  = ctx;
@@ -2674,6 +2714,8 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 		goto err;
 	}
 
+	fence_init(&req->fence, &i915_gem_request_fops, &ring->fence_lock, ring->fence_context, req->seqno);
+
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
 	 * eventually emit this request. This is to guarantee that the
@@ -4723,7 +4765,7 @@ i915_gem_init_hw(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring;
-	int ret, i, j;
+	int ret, i, j, fence_base;
 
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
@@ -4793,12 +4835,16 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (ret)
 		goto out;
 
+	fence_base = fence_context_alloc(I915_NUM_RINGS);
+
 	/* Now it is safe to go back round and do everything else: */
 	for_each_ring(ring, dev_priv, i) {
 		struct drm_i915_gem_request *req;
 
 		WARN_ON(!ring->default_context);
 
+		ring->fence_context = fence_base + i;
+
 		ret = i915_gem_request_alloc(ring, ring->default_context, &req);
 		if (ret) {
 			i915_gem_cleanup_ringbuffer(dev);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 06180dc..b8c8f9b 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1920,6 +1920,7 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 	ring->dev = dev;
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
+	spin_lock_init(&ring->fence_lock);
 	i915_gem_batch_pool_init(dev, &ring->batch_pool);
 	init_waitqueue_head(&ring->irq_queue);
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index c9b081f..f4a6403 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2158,6 +2158,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	INIT_LIST_HEAD(&ring->request_list);
 	INIT_LIST_HEAD(&ring->execlist_queue);
 	INIT_LIST_HEAD(&ring->buffers);
+	spin_lock_init(&ring->fence_lock);
 	i915_gem_batch_pool_init(dev, &ring->batch_pool);
 	memset(ring->semaphore.sync_seqno, 0, sizeof(ring->semaphore.sync_seqno));
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 58b1976..4547645 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -348,6 +348,9 @@ struct  intel_engine_cs {
 	 * to encode the command length in the header).
 	 */
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
+
+	unsigned fence_context;
+	spinlock_t fence_lock;
 };
 
 bool intel_ring_initialized(struct intel_engine_cs *ring);
-- 
1.9.1


* [RFC 05/12] drm/i915: Removed now redundant parameter to i915_gem_request_completed()
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (3 preceding siblings ...)
  2015-11-23 11:34                   ` [RFC 04/12] drm/i915: Convert requests to use struct fence John.C.Harrison
@ 2015-11-23 11:34                   ` John.C.Harrison
  2015-11-23 11:34                   ` [RFC 06/12] drm/i915: Add per context timelines to fence object John.C.Harrison
                                     ` (8 subsequent siblings)
  13 siblings, 0 replies; 92+ messages in thread
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The change to the implementation of i915_gem_request_completed() means
that the lazy coherency flag is no longer used. The parameter can now
be removed to simplify the interface.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_debugfs.c  |  2 +-
 drivers/gpu/drm/i915/i915_drv.h      |  3 +--
 drivers/gpu/drm/i915/i915_gem.c      | 18 +++++++++---------
 drivers/gpu/drm/i915/intel_display.c |  2 +-
 drivers/gpu/drm/i915/intel_pm.c      |  4 ++--
 5 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 5b31186..18dfb56 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -601,7 +601,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data)
 					   i915_gem_request_get_seqno(work->flip_queued_req),
 					   dev_priv->next_seqno,
 					   ring->get_seqno(ring, true),
-					   i915_gem_request_completed(work->flip_queued_req, true));
+					   i915_gem_request_completed(work->flip_queued_req));
 			} else
 				seq_printf(m, "Flip not associated with any ring\n");
 			seq_printf(m, "Flip queued on frame %d, (was ready on frame %d), now %d\n",
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index aa5cba7..caf7897 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2263,8 +2263,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct drm_i915_gem_request **req_out);
 void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 
-static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
-					      bool lazy_coherency)
+static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req)
 {
 	return fence_is_signaled(&req->fence);
 }
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a1b4dbd..0801738 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1165,7 +1165,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req)
 
 	timeout = jiffies + 1;
 	while (!need_resched()) {
-		if (i915_gem_request_completed(req, true))
+		if (i915_gem_request_completed(req))
 			return 0;
 
 		if (time_after_eq(jiffies, timeout))
@@ -1173,7 +1173,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req)
 
 		cpu_relax_lowlatency();
 	}
-	if (i915_gem_request_completed(req, false))
+	if (i915_gem_request_completed(req))
 		return 0;
 
 	return -EAGAIN;
@@ -1217,7 +1217,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	if (list_empty(&req->list))
 		return 0;
 
-	if (i915_gem_request_completed(req, true))
+	if (i915_gem_request_completed(req))
 		return 0;
 
 	timeout_expire = timeout ?
@@ -1257,7 +1257,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			break;
 		}
 
-		if (i915_gem_request_completed(req, false)) {
+		if (i915_gem_request_completed(req)) {
 			ret = 0;
 			break;
 		}
@@ -2758,7 +2758,7 @@ i915_gem_find_active_request(struct intel_engine_cs *ring)
 	struct drm_i915_gem_request *request;
 
 	list_for_each_entry(request, &ring->request_list, list) {
-		if (i915_gem_request_completed(request, false))
+		if (i915_gem_request_completed(request))
 			continue;
 
 		return request;
@@ -2899,7 +2899,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 					   struct drm_i915_gem_request,
 					   list);
 
-		if (!i915_gem_request_completed(request, true))
+		if (!i915_gem_request_completed(request))
 			break;
 
 		i915_gem_request_retire(request);
@@ -2923,7 +2923,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 	}
 
 	if (unlikely(ring->trace_irq_req &&
-		     i915_gem_request_completed(ring->trace_irq_req, true))) {
+		     i915_gem_request_completed(ring->trace_irq_req))) {
 		ring->irq_put(ring);
 		i915_gem_request_assign(&ring->trace_irq_req, NULL);
 	}
@@ -3029,7 +3029,7 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
 		if (list_empty(&req->list))
 			goto retire;
 
-		if (i915_gem_request_completed(req, true)) {
+		if (i915_gem_request_completed(req)) {
 			__i915_gem_request_retire__upto(req);
 retire:
 			i915_gem_object_retire__read(obj, i);
@@ -3141,7 +3141,7 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 	if (to == from)
 		return 0;
 
-	if (i915_gem_request_completed(from_req, true))
+	if (i915_gem_request_completed(from_req))
 		return 0;
 
 	if (!i915_semaphore_is_enabled(obj->base.dev)) {
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index a5dd528..510365e 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11313,7 +11313,7 @@ static bool __intel_pageflip_stall_check(struct drm_device *dev,
 
 	if (work->flip_ready_vblank == 0) {
 		if (work->flip_queued_req &&
-		    !i915_gem_request_completed(work->flip_queued_req, true))
+		    !i915_gem_request_completed(work->flip_queued_req))
 			return false;
 
 		work->flip_ready_vblank = drm_crtc_vblank_count(crtc);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index ebd6735..c207a3a 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7170,7 +7170,7 @@ static void __intel_rps_boost_work(struct work_struct *work)
 	struct request_boost *boost = container_of(work, struct request_boost, work);
 	struct drm_i915_gem_request *req = boost->req;
 
-	if (!i915_gem_request_completed(req, true))
+	if (!i915_gem_request_completed(req))
 		gen6_rps_boost(to_i915(req->ring->dev), NULL,
 			       req->emitted_jiffies);
 
@@ -7186,7 +7186,7 @@ void intel_queue_rps_boost_for_request(struct drm_device *dev,
 	if (req == NULL || INTEL_INFO(dev)->gen < 6)
 		return;
 
-	if (i915_gem_request_completed(req, true))
+	if (i915_gem_request_completed(req))
 		return;
 
 	boost = kmalloc(sizeof(*boost), GFP_ATOMIC);
-- 
1.9.1


* [RFC 06/12] drm/i915: Add per context timelines to fence object
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (4 preceding siblings ...)
  2015-11-23 11:34                   ` [RFC 05/12] drm/i915: Removed now redundant parameter to i915_gem_request_completed() John.C.Harrison
@ 2015-11-23 11:34                   ` John.C.Harrison
  2015-11-23 11:34                   ` [RFC 07/12] drm/i915: Delay the freeing of requests until retire time John.C.Harrison
                                     ` (7 subsequent siblings)
  13 siblings, 0 replies; 92+ messages in thread
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The fence object used inside the request structure requires a sequence
number. Although this is not used by the i915 driver itself, it could
potentially be used by non-i915 code if the fence is passed outside of
the driver. This is the intention as it allows external kernel drivers
and user applications to wait on batch buffer completion
asynchronously via the dma-buf fence API.

To ensure that such external users are not confused by strange things
happening with the seqno, this patch adds a per-context timeline
that can provide a guaranteed in-order seqno value for the fence. This
is safe because the scheduler will not re-order batch buffers within a
context - they are considered to be mutually dependent.
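
The heart of the change is a per-timeline monotonic counter that skips
the reserved value of zero. In miniature (illustrative names,
mirroring the i915_fence_timeline_get_next_seqno() added below):

struct my_timeline {
	unsigned fence_context;	/* from fence_context_alloc(1) */
	unsigned next;		/* starts at 1; 0 means 'invalid' */
};

static unsigned my_timeline_next_seqno(struct my_timeline *tl)
{
	unsigned seqno = tl->next;

	/* wrap around, skipping the reserved value of zero */
	if (++tl->next == 0)
		tl->next = 1;

	return seqno;
}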

v2: New patch in series.

v3: Renamed/retyped timeline structure fields after review comments by
Tvrtko Ursulin.

Added context information to the timeline's name string for better
identification in debugfs output.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 25 ++++++++---
 drivers/gpu/drm/i915/i915_gem.c         | 80 +++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem_context.c | 15 ++++++-
 drivers/gpu/drm/i915/intel_lrc.c        |  8 ++++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 -
 5 files changed, 111 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index caf7897..7d6a7c0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -841,6 +841,15 @@ struct i915_ctx_hang_stats {
 	bool banned;
 };
 
+struct i915_fence_timeline {
+	char        name[32];
+	unsigned    fence_context;
+	unsigned    next;
+
+	struct intel_context *ctx;
+	struct intel_engine_cs *ring;
+};
+
 /* This must match up with the value previously used for execbuf2.rsvd1. */
 #define DEFAULT_CONTEXT_HANDLE 0
 
@@ -885,6 +894,7 @@ struct intel_context {
 		struct drm_i915_gem_object *state;
 		struct intel_ringbuffer *ringbuf;
 		int pin_count;
+		struct i915_fence_timeline fence_timeline;
 	} engine[I915_NUM_RINGS];
 
 	struct list_head link;
@@ -2177,13 +2187,10 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
 struct drm_i915_gem_request {
 	/**
 	 * Underlying object for implementing the signal/wait stuff.
-	 * NB: Never call fence_later() or return this fence object to user
-	 * land! Due to lazy allocation, scheduler re-ordering, pre-emption,
-	 * etc., there is no guarantee at all about the validity or
-	 * sequentiality of the fence's seqno! It is also unsafe to let
-	 * anything outside of the i915 driver get hold of the fence object
-	 * as the clean up when decrementing the reference count requires
-	 * holding the driver mutex lock.
+	 * NB: Never return this fence object to user land! It is unsafe to
+	 * let anything outside of the i915 driver get hold of the fence
+	 * object as the clean up when decrementing the reference count
+	 * requires holding the driver mutex lock.
 	 */
 	struct fence fence;
 
@@ -2263,6 +2270,10 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct drm_i915_gem_request **req_out);
 void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 
+int i915_create_fence_timeline(struct drm_device *dev,
+			       struct intel_context *ctx,
+			       struct intel_engine_cs *ring);
+
 static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req)
 {
 	return fence_is_signaled(&req->fence);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0801738..7a37fb7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2665,9 +2665,32 @@ static const char *i915_gem_request_get_driver_name(struct fence *req_fence)
 
 static const char *i915_gem_request_get_timeline_name(struct fence *req_fence)
 {
-	struct drm_i915_gem_request *req = container_of(req_fence,
-						 typeof(*req), fence);
-	return req->ring->name;
+	struct drm_i915_gem_request *req;
+	struct i915_fence_timeline *timeline;
+
+	req = container_of(req_fence, typeof(*req), fence);
+	timeline = &req->ctx->engine[req->ring->id].fence_timeline;
+
+	return timeline->name;
+}
+
+static void i915_gem_request_timeline_value_str(struct fence *req_fence, char *str, int size)
+{
+	struct drm_i915_gem_request *req;
+
+	req = container_of(req_fence, typeof(*req), fence);
+
+	/* Last signalled timeline value ??? */
+	snprintf(str, size, "? [%d]"/*, timeline->value*/, req->ring->get_seqno(req->ring, true));
+}
+
+static void i915_gem_request_fence_value_str(struct fence *req_fence, char *str, int size)
+{
+	struct drm_i915_gem_request *req;
+
+	req = container_of(req_fence, typeof(*req), fence);
+
+	snprintf(str, size, "%d [%d]", req->fence.seqno, req->seqno);
 }
 
 static const struct fence_ops i915_gem_request_fops = {
@@ -2677,8 +2700,49 @@ static const struct fence_ops i915_gem_request_fops = {
 	.release		= i915_gem_request_free,
 	.get_driver_name	= i915_gem_request_get_driver_name,
 	.get_timeline_name	= i915_gem_request_get_timeline_name,
+	.fence_value_str	= i915_gem_request_fence_value_str,
+	.timeline_value_str	= i915_gem_request_timeline_value_str,
 };
 
+int i915_create_fence_timeline(struct drm_device *dev,
+			       struct intel_context *ctx,
+			       struct intel_engine_cs *ring)
+{
+	struct i915_fence_timeline *timeline;
+
+	timeline = &ctx->engine[ring->id].fence_timeline;
+
+	if (timeline->ring)
+		return 0;
+
+	timeline->fence_context = fence_context_alloc(1);
+
+	/*
+	 * Start the timeline from seqno 0 as this is a special value
+	 * that is reserved for invalid sync points.
+	 */
+	timeline->next       = 1;
+	timeline->ctx        = ctx;
+	timeline->ring       = ring;
+
+	snprintf(timeline->name, sizeof(timeline->name), "%d>%s:%d", timeline->fence_context, ring->name, ctx->user_handle);
+
+	return 0;
+}
+
+static unsigned i915_fence_timeline_get_next_seqno(struct i915_fence_timeline *timeline)
+{
+	unsigned seqno;
+
+	seqno = timeline->next;
+
+	/* Reserve zero for invalid */
+	if (++timeline->next == 0)
+		timeline->next = 1;
+
+	return seqno;
+}
+
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct intel_context *ctx,
 			   struct drm_i915_gem_request **req_out)
@@ -2714,7 +2778,9 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 		goto err;
 	}
 
-	fence_init(&req->fence, &i915_gem_request_fops, &ring->fence_lock, ring->fence_context, req->seqno);
+	fence_init(&req->fence, &i915_gem_request_fops, &ring->fence_lock,
+		   ctx->engine[ring->id].fence_timeline.fence_context,
+		   i915_fence_timeline_get_next_seqno(&ctx->engine[ring->id].fence_timeline));
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
@@ -4765,7 +4831,7 @@ i915_gem_init_hw(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_engine_cs *ring;
-	int ret, i, j, fence_base;
+	int ret, i, j;
 
 	if (INTEL_INFO(dev)->gen < 6 && !intel_enable_gtt())
 		return -EIO;
@@ -4835,16 +4901,12 @@ i915_gem_init_hw(struct drm_device *dev)
 	if (ret)
 		goto out;
 
-	fence_base = fence_context_alloc(I915_NUM_RINGS);
-
 	/* Now it is safe to go back round and do everything else: */
 	for_each_ring(ring, dev_priv, i) {
 		struct drm_i915_gem_request *req;
 
 		WARN_ON(!ring->default_context);
 
-		ring->fence_context = fence_base + i;
-
 		ret = i915_gem_request_alloc(ring, ring->default_context, &req);
 		if (ret) {
 			i915_gem_cleanup_ringbuffer(dev);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index 43b1c73..2798ddc 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -266,7 +266,7 @@ i915_gem_create_context(struct drm_device *dev,
 {
 	const bool is_global_default_ctx = file_priv == NULL;
 	struct intel_context *ctx;
-	int ret = 0;
+	int i, ret = 0;
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 
@@ -274,6 +274,19 @@ i915_gem_create_context(struct drm_device *dev,
 	if (IS_ERR(ctx))
 		return ctx;
 
+	if (!i915.enable_execlists) {
+		struct intel_engine_cs *ring;
+
+		/* Create a per context timeline for fences */
+		for_each_ring(ring, to_i915(dev), i) {
+			ret = i915_create_fence_timeline(dev, ctx, ring);
+			if (ret) {
+				DRM_ERROR("Fence timeline creation failed for legacy %s: %p\n", ring->name, ctx);
+				goto err_destroy;
+			}
+		}
+	}
+
 	if (is_global_default_ctx && ctx->legacy_hw_ctx.rcs_state) {
 		/* We may need to do things with the shrinker which
 		 * require us to immediately switch back to the default
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b8c8f9b..2b56651 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2489,6 +2489,14 @@ int intel_lr_context_deferred_alloc(struct intel_context *ctx,
 		goto error_ringbuf;
 	}
 
+	/* Create a per context timeline for fences */
+	ret = i915_create_fence_timeline(dev, ctx, ring);
+	if (ret) {
+		DRM_ERROR("Fence timeline creation failed for ring %s, ctx %p\n",
+			  ring->name, ctx);
+		goto error_ringbuf;
+	}
+
 	ctx->engine[ring->id].ringbuf = ringbuf;
 	ctx->engine[ring->id].state = ctx_obj;
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 4547645..356b6a8 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -349,7 +349,6 @@ struct  intel_engine_cs {
 	 */
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 
-	unsigned fence_context;
 	spinlock_t fence_lock;
 };
 
-- 
1.9.1


* [RFC 07/12] drm/i915: Delay the freeing of requests until retire time
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (5 preceding siblings ...)
  2015-11-23 11:34                   ` [RFC 06/12] drm/i915: Add per context timelines to fence object John.C.Harrison
@ 2015-11-23 11:34                   ` John.C.Harrison
  2015-11-23 11:34                   ` [RFC 08/12] drm/i915: Interrupt driven fences John.C.Harrison
                                     ` (6 subsequent siblings)
  13 siblings, 0 replies; 92+ messages in thread
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The request structure is reference counted. When the count reached
zero, the request was immediately freed and all associated objects
were unreferenced/deallocated. This meant that the driver mutex lock
had to be held at the point where the count reached zero. This was fine
while all references were held internally to the driver. However, the
plan is to allow the underlying fence object (and hence the request
itself) to be returned to other drivers and to userland. External
users cannot be expected to acquire a driver private mutex lock.

Rather than attempt to disentangle the request structure from the
driver mutex lock, the decision was to defer the free code until a
later (safer) point. Hence this patch changes the unreference callback
to merely move the request onto a delayed free list. The driver's
retire worker thread will then process the list and actually call the
free function on the requests.
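
Reduced to its essentials the pattern is as below, with illustrative
'my_*' names; the real lists and locks are the ones added to the
engine structure in this patch:

/* fence release callback - may run without struct_mutex held */
static void my_request_release(struct fence *f)
{
	struct my_request *req = container_of(f, struct my_request, fence);
	unsigned long flags;

	/* Defer the real free to the retire worker */
	spin_lock_irqsave(&req->ring->delayed_free_lock, flags);
	list_add_tail(&req->delayed_free_link, &req->ring->delayed_free_list);
	spin_unlock_irqrestore(&req->ring->delayed_free_lock, flags);
}

/* retire worker - runs with struct_mutex held */
static void my_retire(struct my_ring *ring)
{
	struct my_request *req, *next;
	unsigned long flags;
	LIST_HEAD(list_head);

	/* Grab the whole list in one go to minimise time under the lock */
	spin_lock_irqsave(&ring->delayed_free_lock, flags);
	list_splice_init(&ring->delayed_free_list, &list_head);
	spin_unlock_irqrestore(&ring->delayed_free_lock, flags);

	list_for_each_entry_safe(req, next, &list_head, delayed_free_link) {
		list_del(&req->delayed_free_link);
		my_request_free(req);	/* the real free, now mutex safe */
	}
}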

v2: New patch in series.

v3: Updated after review comments by Tvrtko Ursulin. Rename list nodes
to 'link' rather than 'list'. Update list processing to be more
efficient/safer with respect to spinlocks.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 22 +++----------------
 drivers/gpu/drm/i915/i915_gem.c         | 39 +++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/intel_display.c    |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c        |  2 ++
 drivers/gpu/drm/i915/intel_pm.c         |  2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c |  2 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 ++++
 7 files changed, 48 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 7d6a7c0..fbf591f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2185,14 +2185,9 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
  * initial reference taken using kref_init
  */
 struct drm_i915_gem_request {
-	/**
-	 * Underlying object for implementing the signal/wait stuff.
-	 * NB: Never return this fence object to user land! It is unsafe to
-	 * let anything outside of the i915 driver get hold of the fence
-	 * object as the clean up when decrementing the reference count
-	 * requires holding the driver mutex lock.
-	 */
+	/** Underlying object for implementing the signal/wait stuff. */
 	struct fence fence;
+	struct list_head delayed_free_link;
 
 	/** On Which ring this request was generated */
 	struct drm_i915_private *i915;
@@ -2305,21 +2300,10 @@ i915_gem_request_reference(struct drm_i915_gem_request *req)
 static inline void
 i915_gem_request_unreference(struct drm_i915_gem_request *req)
 {
-	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
-	fence_put(&req->fence);
-}
-
-static inline void
-i915_gem_request_unreference__unlocked(struct drm_i915_gem_request *req)
-{
-	struct drm_device *dev;
-
 	if (!req)
 		return;
 
-	dev = req->ring->dev;
-	if (kref_put_mutex(&req->fence.refcount, fence_release, &dev->struct_mutex))
-		mutex_unlock(&dev->struct_mutex);
+	fence_put(&req->fence);
 }
 
 static inline void i915_gem_request_assign(struct drm_i915_gem_request **pdst,
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 7a37fb7..171ae5f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2617,10 +2617,27 @@ static void i915_set_reset_status(struct drm_i915_private *dev_priv,
 	}
 }
 
-static void i915_gem_request_free(struct fence *req_fence)
+static void i915_gem_request_release(struct fence *req_fence)
 {
 	struct drm_i915_gem_request *req = container_of(req_fence,
 						 typeof(*req), fence);
+	struct intel_engine_cs *ring = req->ring;
+	struct drm_i915_private *dev_priv = to_i915(ring->dev);
+	unsigned long flags;
+
+	/*
+	 * Need to add the request to a deferred dereference list to be
+	 * processed at a mutex lock safe time.
+	 */
+	spin_lock_irqsave(&ring->delayed_free_lock, flags);
+	list_add_tail(&req->delayed_free_link, &ring->delayed_free_list);
+	spin_unlock_irqrestore(&ring->delayed_free_lock, flags);
+
+	queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
+}
+
+static void i915_gem_request_free(struct drm_i915_gem_request *req)
+{
 	struct intel_context *ctx = req->ctx;
 
 	WARN_ON(!mutex_is_locked(&req->ring->dev->struct_mutex));
@@ -2697,7 +2714,7 @@ static const struct fence_ops i915_gem_request_fops = {
 	.enable_signaling	= i915_gem_request_enable_signaling,
 	.signaled		= i915_gem_request_is_completed,
 	.wait			= fence_default_wait,
-	.release		= i915_gem_request_free,
+	.release		= i915_gem_request_release,
 	.get_driver_name	= i915_gem_request_get_driver_name,
 	.get_timeline_name	= i915_gem_request_get_timeline_name,
 	.fence_value_str	= i915_gem_request_fence_value_str,
@@ -2951,6 +2968,10 @@ void i915_gem_reset(struct drm_device *dev)
 void
 i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 {
+	struct drm_i915_gem_request *req, *req_next;
+	LIST_HEAD(list_head);
+	unsigned long flags;
+
 	WARN_ON(i915_verify_lists(ring->dev));
 
 	/* Retire requests first as we use it above for the early return.
@@ -2994,6 +3015,15 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 		i915_gem_request_assign(&ring->trace_irq_req, NULL);
 	}
 
+	/* Really free any requests that were recently unreferenced */
+	spin_lock_irqsave(&ring->delayed_free_lock, flags);
+	list_splice_init(&ring->delayed_free_list, &list_head);
+	spin_unlock_irqrestore(&ring->delayed_free_lock, flags);
+	list_for_each_entry_safe(req, req_next, &list_head, delayed_free_link) {
+		list_del(&req->delayed_free_link);
+		i915_gem_request_free(req);
+	}
+
 	WARN_ON(i915_verify_lists(ring->dev));
 }
 
@@ -3184,7 +3214,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 			ret = __i915_wait_request(req[i], reset_counter, true,
 						  args->timeout_ns > 0 ? &args->timeout_ns : NULL,
 						  file->driver_priv);
-		i915_gem_request_unreference__unlocked(req[i]);
+		i915_gem_request_unreference(req[i]);
 	}
 	return ret;
 
@@ -4179,7 +4209,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (ret == 0)
 		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
 
-	i915_gem_request_unreference__unlocked(target);
+	i915_gem_request_unreference(target);
 
 	return ret;
 }
@@ -5036,6 +5066,7 @@ init_ring_lists(struct intel_engine_cs *ring)
 {
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
+	INIT_LIST_HEAD(&ring->delayed_free_list);
 }
 
 void
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 510365e..9291a1d 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11256,7 +11256,7 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
 					    mmio_flip->crtc->reset_counter,
 					    false, NULL,
 					    &mmio_flip->i915->rps.mmioflips));
-		i915_gem_request_unreference__unlocked(mmio_flip->req);
+		i915_gem_request_unreference(mmio_flip->req);
 	}
 
 	intel_do_mmio_flip(mmio_flip);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 2b56651..06a398a 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1920,7 +1920,9 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 	ring->dev = dev;
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
+	INIT_LIST_HEAD(&ring->delayed_free_list);
 	spin_lock_init(&ring->fence_lock);
+	spin_lock_init(&ring->delayed_free_lock);
 	i915_gem_batch_pool_init(dev, &ring->batch_pool);
 	init_waitqueue_head(&ring->irq_queue);
 
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index c207a3a..e2d34a6 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7174,7 +7174,7 @@ static void __intel_rps_boost_work(struct work_struct *work)
 		gen6_rps_boost(to_i915(req->ring->dev), NULL,
 			       req->emitted_jiffies);
 
-	i915_gem_request_unreference__unlocked(req);
+	i915_gem_request_unreference(req);
 	kfree(boost);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index f4a6403..e5573e7 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2158,7 +2158,9 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	INIT_LIST_HEAD(&ring->request_list);
 	INIT_LIST_HEAD(&ring->execlist_queue);
 	INIT_LIST_HEAD(&ring->buffers);
+	INIT_LIST_HEAD(&ring->delayed_free_list);
 	spin_lock_init(&ring->fence_lock);
+	spin_lock_init(&ring->delayed_free_lock);
 	i915_gem_batch_pool_init(dev, &ring->batch_pool);
 	memset(ring->semaphore.sync_seqno, 0, sizeof(ring->semaphore.sync_seqno));
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 356b6a8..77384ed 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -301,6 +301,10 @@ struct  intel_engine_cs {
 	 */
 	u32 last_submitted_seqno;
 
+	/* deferred free list to allow unreferencing requests outside the driver */
+	struct list_head delayed_free_list;
+	spinlock_t delayed_free_lock;
+
 	bool gpu_caches_dirty;
 
 	wait_queue_head_t irq_queue;
-- 
1.9.1


* [RFC 08/12] drm/i915: Interrupt driven fences
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (6 preceding siblings ...)
  2015-11-23 11:34                   ` [RFC 07/12] drm/i915: Delay the freeing of requests until retire time John.C.Harrison
@ 2015-11-23 11:34                   ` John.C.Harrison
  2015-12-11 12:17                     ` Tvrtko Ursulin
  2015-11-23 11:34                   ` [RFC 09/12] drm/i915: Updated request structure tracing John.C.Harrison
                                     ` (5 subsequent siblings)
  13 siblings, 1 reply; 92+ messages in thread
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The intended usage model for struct fence is that the signalled status
should be set on demand rather than polled. That is, there should not
be a need for a 'signaled' function to be called every time the status
is queried. Instead, 'something' should be done to enable a signal
callback from the hardware which will update the state directly. In
the case of requests, this is the seqno update interrupt. The idea is
that this callback will only be enabled on demand when something
actually tries to wait on the fence.

This change removes the polling test and replaces it with the callback
scheme. Each fence is added to a 'please poke me' list at the start of
i915_add_request(). The interrupt handler then scans through the 'poke
me' list when a new seqno pops out and signals any matching
fence/request. The fence is then removed from the list so the entire
request stack does not need to be scanned every time. Note that the
fence is added to the list before the commands to generate the seqno
interrupt are added to the ring. Thus the sequence is guaranteed to be
race free if the interrupt is already enabled.

Note that the interrupt is only enabled on demand (i.e. when
__wait_request() is called). Thus there is still a potential race when
enabling the interrupt as the request may already have completed.
However, this is simply solved by calling the interrupt processing
code immediately after enabling the interrupt and thereby checking for
already completed requests.

Lastly, the ring clean-up code may need to cancel outstanding
requests (e.g. because TDR has reset the ring). These requests will
never get signalled and so must be removed from the signal list
manually. This is done by setting a 'cancelled' flag and then calling
the regular notify/retire code path rather than attempting to
duplicate the list manipulation and clean-up code in multiple places.
This also avoids any race condition where the cancellation might
occur during or after the completion interrupt actually arrives.
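
Condensed, the submit side and wait side ordering looks like the
sketch below (illustrative 'my_*' names; the signal list, lock and
irq_enabled flag are the ones added in this patch):

/*
 * Submission: make the request findable by the interrupt handler
 * BEFORE the commands that will raise the interrupt are emitted.
 */
static void my_submit(struct my_ring *ring, struct my_request *req)
{
	unsigned long flags;

	spin_lock_irqsave(&ring->fence_lock, flags);
	list_add_tail(&req->signal_link, &ring->fence_signal_list);
	spin_unlock_irqrestore(&ring->fence_lock, flags);

	my_emit_seqno_and_user_interrupt(ring, req);
}

/*
 * Wait: enable the interrupt on demand, then immediately scan for a
 * completion that may have raced past before the enable took effect.
 */
static void my_enable_wait_irq(struct my_ring *ring, struct my_request *req)
{
	if (req->irq_enabled)
		return;

	ring->irq_get(ring);
	req->irq_enabled = true;

	my_notify(ring);	/* signals anything already completed */
}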

v2: Updated to take advantage of the request unreference no longer
requiring the mutex lock.

v3: Move the signal list processing around to prevent unsubmitted
requests being added to the list. This was occurring on Android
because the native sync implementation calls the
fence->enable_signalling API immediately on fence creation.

Updated after review comments by Tvrtko Ursulin. Renamed list nodes to
'link' instead of 'list'. Added support for returning an error code on
a cancelled fence. Update list processing to be more efficient/safer
with respect to spinlocks.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         |  10 ++
 drivers/gpu/drm/i915/i915_gem.c         | 187 ++++++++++++++++++++++++++++++--
 drivers/gpu/drm/i915/i915_irq.c         |   2 +
 drivers/gpu/drm/i915/intel_lrc.c        |   2 +
 drivers/gpu/drm/i915/intel_ringbuffer.c |   2 +
 drivers/gpu/drm/i915/intel_ringbuffer.h |   2 +
 6 files changed, 196 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index fbf591f..d013c6d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2187,7 +2187,12 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
 struct drm_i915_gem_request {
 	/** Underlying object for implementing the signal/wait stuff. */
 	struct fence fence;
+	struct list_head signal_link;
+	struct list_head unsignal_link;
 	struct list_head delayed_free_link;
+	bool cancelled;
+	bool irq_enabled;
+	bool signal_requested;
 
 	/** On Which ring this request was generated */
 	struct drm_i915_private *i915;
@@ -2265,6 +2270,11 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct drm_i915_gem_request **req_out);
 void i915_gem_request_cancel(struct drm_i915_gem_request *req);
 
+void i915_gem_request_submit(struct drm_i915_gem_request *req);
+void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req,
+				       bool fence_locked);
+void i915_gem_request_notify(struct intel_engine_cs *ring, bool fence_locked);
+
 int i915_create_fence_timeline(struct drm_device *dev,
 			       struct intel_context *ctx,
 			       struct intel_engine_cs *ring);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 171ae5f..2a0b346 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1165,6 +1165,8 @@ static int __i915_spin_request(struct drm_i915_gem_request *req)
 
 	timeout = jiffies + 1;
 	while (!need_resched()) {
+		i915_gem_request_notify(req->ring, false);
+
 		if (i915_gem_request_completed(req))
 			return 0;
 
@@ -1173,6 +1175,9 @@ static int __i915_spin_request(struct drm_i915_gem_request *req)
 
 		cpu_relax_lowlatency();
 	}
+
+	i915_gem_request_notify(req->ring, false);
+
 	if (i915_gem_request_completed(req))
 		return 0;
 
@@ -1214,9 +1219,14 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 
 	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
 
-	if (list_empty(&req->list))
+	if (i915_gem_request_completed(req))
 		return 0;
 
+	/*
+	 * Enable interrupt completion of the request.
+	 */
+	fence_enable_sw_signaling(&req->fence);
+
 	if (i915_gem_request_completed(req))
 		return 0;
 
@@ -1377,6 +1387,19 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 	list_del_init(&request->list);
 	i915_gem_request_remove_from_client(request);
 
+	/* In case the request is still in the signal pending list */
+	if (!list_empty(&request->signal_link)) {
+		/*
+		 * The request must be marked as cancelled and the underlying
+		 * fence as both failed. NB: There is no explicit fence fail
+		 * API, there is only a manual poke and signal.
+		 */
+		request->cancelled = true;
+		/* How to propagate to any associated sync_fence??? */
+		request->fence.status = -EIO;
+		fence_signal_locked(&request->fence);
+	}
+
 	i915_gem_request_unreference(request);
 }
 
@@ -2535,6 +2558,12 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 	 */
 	request->postfix = intel_ring_get_tail(ringbuf);
 
+	/*
+	 * Add the fence to the pending list before emitting the commands to
+	 * generate a seqno notification interrupt.
+	 */
+	i915_gem_request_submit(request);
+
 	if (i915.enable_execlists)
 		ret = ring->emit_request(request);
 	else {
@@ -2654,25 +2683,135 @@ static void i915_gem_request_free(struct drm_i915_gem_request *req)
 		i915_gem_context_unreference(ctx);
 	}
 
+	if (req->irq_enabled)
+		req->ring->irq_put(req->ring);
+
 	kmem_cache_free(req->i915->requests, req);
 }
 
-static bool i915_gem_request_enable_signaling(struct fence *req_fence)
+/*
+ * The request is about to be submitted to the hardware so add the fence to
+ * the list of signalable fences.
+ *
+ * NB: This does not necessarily enable interrupts yet. That only occurs on
+ * demand when the request is actually waited on. However, adding it to the
+ * list early ensures that there is no race condition where the interrupt
+ * could pop out prematurely and thus be completely lost. The race is merely
+ * that the interrupt must be manually checked for after being enabled.
+ */
+void i915_gem_request_submit(struct drm_i915_gem_request *req)
 {
-	/* Interrupt driven fences are not implemented yet.*/
-	WARN(true, "This should not be called!");
-	return true;
+	unsigned long flags;
+
+	/*
+	 * Always enable signal processing for the request's fence object
+	 * before that request is submitted to the hardware. Thus there is no
+	 * race condition whereby the interrupt could pop out before the
+	 * request has been added to the signal list. Hence no need to check
+	 * for completion, undo the list add and return false.
+	 */
+	i915_gem_request_reference(req);
+	spin_lock_irqsave(&req->ring->fence_lock, flags);
+	WARN_ON(!list_empty(&req->signal_link));
+	list_add_tail(&req->signal_link, &req->ring->fence_signal_list);
+	spin_unlock_irqrestore(&req->ring->fence_lock, flags);
+
+	/*
+	 * NB: Interrupts are only enabled on demand. Thus there is still a
+	 * race where the request could complete before the interrupt has
+	 * been enabled. Thus care must be taken at that point.
+	 */
+
+	 /* Have interrupts already been requested? */
+	 if (req->signal_requested)
+		i915_gem_request_enable_interrupt(req, false);
+}
+
+/*
+ * The request is being actively waited on, so enable interrupt based
+ * completion signalling.
+ */
+void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req,
+				       bool fence_locked)
+{
+	if (req->irq_enabled)
+		return;
+
+	WARN_ON(!req->ring->irq_get(req->ring));
+	req->irq_enabled = true;
+
+	/*
+	 * Because the interrupt is only enabled on demand, there is a race
+	 * where the interrupt can fire before anyone is looking for it. So
+	 * do an explicit check for missed interrupts.
+	 */
+	i915_gem_request_notify(req->ring, fence_locked);
 }
 
-static bool i915_gem_request_is_completed(struct fence *req_fence)
+static bool i915_gem_request_enable_signaling(struct fence *req_fence)
 {
 	struct drm_i915_gem_request *req = container_of(req_fence,
 						 typeof(*req), fence);
+
+	/*
+	 * No need to actually enable interrupt based processing until the
+	 * request has been submitted to the hardware. At which point
+	 * 'i915_gem_request_submit()' is called. So only really enable
+	 * signalling in there. Just set a flag to say that interrupts are
+	 * wanted when the request is eventually submitted. On the other hand
+	 * if the request has already been submitted then interrupts do need
+	 * to be enabled now.
+	 */
+
+	req->signal_requested = true;
+
+	if (!list_empty(&req->signal_link))
+		i915_gem_request_enable_interrupt(req, true);
+
+	return true;
+}
+
+void i915_gem_request_notify(struct intel_engine_cs *ring, bool fence_locked)
+{
+	struct drm_i915_gem_request *req, *req_next;
+	unsigned long flags;
 	u32 seqno;
 
-	seqno = req->ring->get_seqno(req->ring, false/*lazy_coherency*/);
+	if (list_empty(&ring->fence_signal_list))
+		return;
+
+	if (!fence_locked)
+		spin_lock_irqsave(&ring->fence_lock, flags);
+
+	seqno = ring->get_seqno(ring, false);
+
+	list_for_each_entry_safe(req, req_next, &ring->fence_signal_list, signal_link) {
+		if (!req->cancelled) {
+			if (!i915_seqno_passed(seqno, req->seqno))
+				break;
+		}
+
+		/*
+		 * Start by removing the fence from the signal list otherwise
+		 * the retire code can run concurrently and get confused.
+		 */
+		list_del_init(&req->signal_link);
+
+		if (!req->cancelled) {
+			fence_signal_locked(&req->fence);
+		}
+
+		if (req->irq_enabled) {
+			req->ring->irq_put(req->ring);
+			req->irq_enabled = false;
+		}
+
+		/* Can't unreference here because that might grab fence_lock */
+		list_add_tail(&req->unsignal_link, &ring->fence_unsignal_list);
+	}
 
-	return i915_seqno_passed(seqno, req->seqno);
+	if (!fence_locked)
+		spin_unlock_irqrestore(&ring->fence_lock, flags);
 }
 
 static const char *i915_gem_request_get_driver_name(struct fence *req_fence)
@@ -2712,7 +2851,6 @@ static void i915_gem_request_fence_value_str(struct fence *req_fence, char *str,
 
 static const struct fence_ops i915_gem_request_fops = {
 	.enable_signaling	= i915_gem_request_enable_signaling,
-	.signaled		= i915_gem_request_is_completed,
 	.wait			= fence_default_wait,
 	.release		= i915_gem_request_release,
 	.get_driver_name	= i915_gem_request_get_driver_name,
@@ -2795,6 +2933,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 		goto err;
 	}
 
+	INIT_LIST_HEAD(&req->signal_link);
 	fence_init(&req->fence, &i915_gem_request_fops, &ring->fence_lock,
 		   ctx->engine[ring->id].fence_timeline.fence_context,
 		   i915_fence_timeline_get_next_seqno(&ctx->engine[ring->id].fence_timeline));
@@ -2832,6 +2971,11 @@ void i915_gem_request_cancel(struct drm_i915_gem_request *req)
 {
 	intel_ring_reserved_space_cancel(req->ringbuf);
 
+	req->cancelled = true;
+	/* How to propagate to any associated sync_fence??? */
+	req->fence.status = -EINVAL;
+	fence_signal_locked(&req->fence);
+
 	i915_gem_request_unreference(req);
 }
 
@@ -2925,6 +3069,13 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
 		i915_gem_request_retire(request);
 	}
 
+	/*
+	 * Tidy up anything left over. This includes a call to
+	 * i915_gem_request_notify() which will make sure that any requests
+	 * that were on the signal pending list also get cleaned up.
+	 */
+	i915_gem_retire_requests_ring(ring);
+
 	/* Having flushed all requests from all queues, we know that all
 	 * ringbuffers must now be empty. However, since we do not reclaim
 	 * all space when retiring the request (to prevent HEADs colliding
@@ -2974,6 +3125,13 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 
 	WARN_ON(i915_verify_lists(ring->dev));
 
+	/*
+	 * If no-one has waited on a request recently then interrupts will
+	 * not have been enabled and thus no requests will ever be marked as
+	 * completed. So do an interrupt check now.
+	 */
+	i915_gem_request_notify(ring, false);
+
 	/* Retire requests first as we use it above for the early return.
 	 * If we retire requests last, we may use a later seqno and so clear
 	 * the requests lists without clearing the active list, leading to
@@ -3015,6 +3173,15 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 		i915_gem_request_assign(&ring->trace_irq_req, NULL);
 	}
 
+	/* Tidy up any requests that were recently signalled */
+	spin_lock_irqsave(&ring->fence_lock, flags);
+	list_splice_init(&ring->fence_unsignal_list, &list_head);
+	spin_unlock_irqrestore(&ring->fence_lock, flags);
+	list_for_each_entry_safe(req, req_next, &list_head, unsignal_link) {
+		list_del(&req->unsignal_link);
+		i915_gem_request_unreference(req);
+	}
+
 	/* Really free any requests that were recently unreferenced */
 	spin_lock_irqsave(&ring->delayed_free_lock, flags);
 	list_splice_init(&ring->delayed_free_list, &list_head);
@@ -5066,6 +5233,8 @@ init_ring_lists(struct intel_engine_cs *ring)
 {
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
+	INIT_LIST_HEAD(&ring->fence_signal_list);
+	INIT_LIST_HEAD(&ring->fence_unsignal_list);
 	INIT_LIST_HEAD(&ring->delayed_free_list);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 68b094b..74f8552 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -981,6 +981,8 @@ static void notify_ring(struct intel_engine_cs *ring)
 
 	trace_i915_gem_request_notify(ring);
 
+	i915_gem_request_notify(ring, false);
+
 	wake_up_all(&ring->irq_queue);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 06a398a..76fc245 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1920,6 +1920,8 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 	ring->dev = dev;
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
+	INIT_LIST_HEAD(&ring->fence_signal_list);
+	INIT_LIST_HEAD(&ring->fence_unsignal_list);
 	INIT_LIST_HEAD(&ring->delayed_free_list);
 	spin_lock_init(&ring->fence_lock);
 	spin_lock_init(&ring->delayed_free_lock);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index e5573e7..1dec252 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2158,6 +2158,8 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	INIT_LIST_HEAD(&ring->request_list);
 	INIT_LIST_HEAD(&ring->execlist_queue);
 	INIT_LIST_HEAD(&ring->buffers);
+	INIT_LIST_HEAD(&ring->fence_signal_list);
+	INIT_LIST_HEAD(&ring->fence_unsignal_list);
 	INIT_LIST_HEAD(&ring->delayed_free_list);
 	spin_lock_init(&ring->fence_lock);
 	spin_lock_init(&ring->delayed_free_lock);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 77384ed..9d09edb 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -354,6 +354,8 @@ struct  intel_engine_cs {
 	u32 (*get_cmd_length_mask)(u32 cmd_header);
 
 	spinlock_t fence_lock;
+	struct list_head fence_signal_list;
+	struct list_head fence_unsignal_list;
 };
 
 bool intel_ring_initialized(struct intel_engine_cs *ring);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [RFC 09/12] drm/i915: Updated request structure tracing
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (7 preceding siblings ...)
  2015-11-23 11:34                   ` [RFC 08/12] drm/i915: Interrupt driven fences John.C.Harrison
@ 2015-11-23 11:34                   ` John.C.Harrison
  2015-11-23 11:34                   ` [RFC 10/12] android/sync: Fix reversed sense of signaled fence John.C.Harrison
                                     ` (4 subsequent siblings)
  13 siblings, 0 replies; 92+ messages in thread
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Added the '_complete' trace event which occurs when a fence/request is
signaled as complete. Also moved the notify event from the IRQ handler
code to inside the notify function itself.

v3: Added the current ring seqno to the notify trace point.
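
For reference, with this change the notify event in a captured trace
takes the following form (field layout per the updated TP_printk in
the diff below; the values are illustrative only):

  i915_gem_request_notify: dev=0, ring=0, seqno=4321, empty=0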

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c   |  6 +++++-
 drivers/gpu/drm/i915/i915_irq.c   |  2 --
 drivers/gpu/drm/i915/i915_trace.h | 13 ++++++++-----
 3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2a0b346..72cf5dc 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2777,13 +2777,16 @@ void i915_gem_request_notify(struct intel_engine_cs *ring, bool fence_locked)
 	unsigned long flags;
 	u32 seqno;
 
-	if (list_empty(&ring->fence_signal_list))
+	if (list_empty(&ring->fence_signal_list)) {
+		trace_i915_gem_request_notify(ring, 0);
 		return;
+	}
 
 	if (!fence_locked)
 		spin_lock_irqsave(&ring->fence_lock, flags);
 
 	seqno = ring->get_seqno(ring, false);
+	trace_i915_gem_request_notify(ring, seqno);
 
 	list_for_each_entry_safe(req, req_next, &ring->fence_signal_list, signal_link) {
 		if (!req->cancelled) {
@@ -2799,6 +2802,7 @@ void i915_gem_request_notify(struct intel_engine_cs *ring, bool fence_locked)
 
 		if (!req->cancelled) {
 			fence_signal_locked(&req->fence);
+			trace_i915_gem_request_complete(req);
 		}
 
 		if (req->irq_enabled) {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 74f8552..d280e05 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -979,8 +979,6 @@ static void notify_ring(struct intel_engine_cs *ring)
 	if (!intel_ring_initialized(ring))
 		return;
 
-	trace_i915_gem_request_notify(ring);
-
 	i915_gem_request_notify(ring, false);
 
 	wake_up_all(&ring->irq_queue);
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 04fe849..41a026d 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -561,23 +561,26 @@ DEFINE_EVENT(i915_gem_request, i915_gem_request_add,
 );
 
 TRACE_EVENT(i915_gem_request_notify,
-	    TP_PROTO(struct intel_engine_cs *ring),
-	    TP_ARGS(ring),
+	    TP_PROTO(struct intel_engine_cs *ring, uint32_t seqno),
+	    TP_ARGS(ring, seqno),
 
 	    TP_STRUCT__entry(
 			     __field(u32, dev)
 			     __field(u32, ring)
 			     __field(u32, seqno)
+			     __field(bool, is_empty)
 			     ),
 
 	    TP_fast_assign(
 			   __entry->dev = ring->dev->primary->index;
 			   __entry->ring = ring->id;
-			   __entry->seqno = ring->get_seqno(ring, false);
+			   __entry->seqno = seqno;
+			   __entry->is_empty = list_empty(&ring->fence_signal_list);
 			   ),
 
-	    TP_printk("dev=%u, ring=%u, seqno=%u",
-		      __entry->dev, __entry->ring, __entry->seqno)
+	    TP_printk("dev=%u, ring=%u, seqno=%u, empty=%d",
+		      __entry->dev, __entry->ring, __entry->seqno,
+		      __entry->is_empty)
 );
 
 DEFINE_EVENT(i915_gem_request, i915_gem_request_retire,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [RFC 10/12] android/sync: Fix reversed sense of signaled fence
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (8 preceding siblings ...)
  2015-11-23 11:34                   ` [RFC 09/12] drm/i915: Updated request structure tracing John.C.Harrison
@ 2015-11-23 11:34                   ` John.C.Harrison
  2015-11-23 11:34                   ` [RFC 11/12] drm/i915: Add sync framework support to execbuff IOCTL John.C.Harrison
                                     ` (3 subsequent siblings)
  13 siblings, 0 replies; 92+ messages in thread
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: Peter Lawthers <peter.lawthers@intel.com>

In the 3.14 kernel, a signaled fence was indicated by the status field
== 1. In 4.x, a status == 0 indicates signaled, status < 0 indicates error,
and status > 0 indicates active.

This patch wraps the check for a signaled fence in a function so that
callers no longer need to know the underlying implementation.

v3: New patch for series.
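
As an illustrative sketch (a hypothetical caller, not part of this
patch), code that previously peeked at fence->status directly can now
be written as:

	/*
	 * Poll a fence without knowing how fence->status encodes its
	 * state internally.
	 */
	static int example_poll_fence(struct sync_fence *fence)
	{
		int ret = sync_fence_is_signaled(fence);

		if (ret < 0)
			return ret;		/* fence is in an error state */

		return ret ? 0 : -EBUSY;	/* 1 = signaled, 0 = still active */
	}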

Change-Id: I8e565e49683e3efeb9474656cd84cf4add6ad6a2
Tracked-On: https://jira01.devtools.intel.com/browse/ACD-308
Signed-off-by: Peter Lawthers <peter.lawthers@intel.com>
---
 drivers/android/sync.h | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/drivers/android/sync.h b/drivers/android/sync.h
index 8449323..5a8133e 100644
--- a/drivers/android/sync.h
+++ b/drivers/android/sync.h
@@ -345,6 +345,27 @@ int sync_fence_cancel_async(struct sync_fence *fence,
  */
 int sync_fence_wait(struct sync_fence *fence, long timeout);
 
+/**
+ * sync_fence_is_signaled() - Return an indication of whether the fence is signaled
+ * @fence:	fence to check
+ *
+ * returns 1 if fence is signaled
+ * returns 0 if fence is not signaled
+ * returns < 0 if fence is in error state
+ */
+static inline int
+sync_fence_is_signaled(struct sync_fence *fence)
+{
+	int status;
+
+	status = atomic_read(&fence->status);
+	if (status == 0)
+		return 1;
+	if (status > 0)
+		return 0;
+	return status;
+}
+
 #ifdef CONFIG_DEBUG_FS
 
 void sync_timeline_debug_add(struct sync_timeline *obj);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [RFC 11/12] drm/i915: Add sync framework support to execbuff IOCTL
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (9 preceding siblings ...)
  2015-11-23 11:34                   ` [RFC 10/12] android/sync: Fix reversed sense of signaled fence John.C.Harrison
@ 2015-11-23 11:34                   ` John.C.Harrison
  2015-11-23 11:34                   ` [RFC 12/12] drm/i915: Cache last IRQ seqno to reduce IRQ overhead John.C.Harrison
                                     ` (2 subsequent siblings)
  13 siblings, 0 replies; 92+ messages in thread
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

Various projects desire a mechanism for managing dependencies between
work items asynchronously. This can also include work items across
completely different and independent systems. For example, an
application wants to retrieve a frame from a video-in device, use it
for rendering on a GPU and then send it to the video-out device for
display, all without having to stall waiting for completion along the
way. The sync framework allows this. It encapsulates
synchronisation events in file descriptors. The application can
request a sync point for the completion of each piece of work. Drivers
should also take sync points in with each new work request and not
schedule the work to start until the sync has been signalled.

This patch adds sync framework support to the exec buffer IOCTL. A
sync point can be passed in to stall execution of the batch buffer
until signalled. And a sync point can be returned after each batch
buffer submission which will be signalled upon that batch buffer's
completion.

At present, the input sync point is simply waited on synchronously
inside the exec buffer IOCTL call. Once the GPU scheduler arrives,
this will be handled asynchronously inside the scheduler and the IOCTL
can return without having to wait.
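
To illustrate the intended flow from user land, here is a minimal
sketch (assuming libdrm's drmIoctl() and an already open DRM device in
'fd'; buffer/relocation setup and all error handling are omitted):

	#include <unistd.h>
	#include <xf86drm.h>
	#include <drm/i915_drm.h>

	struct drm_i915_gem_execbuffer2 eb = { 0 };
	/* ... fill in buffers_ptr, buffer_count, batch length, etc ... */

	/* Submit batch 1 and request a fence fd, returned in rsvd2. */
	eb.flags |= I915_EXEC_CREATE_FENCE;
	drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &eb);
	int fence_fd = (int)eb.rsvd2;

	/* Submit batch 2, stalled until batch 1's fence has signalled. */
	eb.flags = (eb.flags & ~I915_EXEC_CREATE_FENCE) | I915_EXEC_WAIT_FENCE;
	eb.rsvd2 = (__u64)fence_fd;
	drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &eb);

	close(fence_fd);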

Note also that the scheduler will re-order the execution of batch
buffers, e.g. because a batch buffer is stalled on a sync point and
cannot be submitted yet but other, independent, batch buffers are
being presented to the driver. This means that the timeline within the
sync points returned cannot be global to the engine. Instead they must
be kept per context per engine (the scheduler may not re-order batches
within a context). Hence the timeline cannot be based on the existing
seqno values but must be a new implementation.

This patch is a port of work by several people that has been pulled
across from Android. It has been updated several times across several
patches. Rather than attempt to port each individual patch, this
version is the finished product as a single patch. The various
contributors/authors along the way (in addition to myself) were:
  Satyanantha RamaGopal M <rama.gopal.m.satyanantha@intel.com>
  Tvrtko Ursulin <tvrtko.ursulin@intel.com>
  Michel Thierry <michel.thierry@intel.com>
  Arun Siluvery <arun.siluvery@linux.intel.com>

v2: New patch in series.

v3: Updated to use the new 'sync_fence_is_signaled' API rather than
having to know about the internal meaning of the 'fence::status' field
(which recently got inverted!) [work by Peter Lawthers].

Updated after review comments by Daniel Vetter. Removed '#ifdef
CONFIG_SYNC' and added 'select SYNC' to the Kconfig instead. Moved the
fd installation of fences to the end of the execbuff call in order to
remove the need to use 'sys_close' to clean up on failure.

Updated after review comments by Tvrtko Ursulin. Removed the
'fence_external' flag as redundant. Converted DRM_ERRORs to
DRM_DEBUGs. Changed the one second wait to a wait forever when waiting
on incoming fences.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Peter Lawthers <peter.lawthers@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
---
 drivers/gpu/drm/i915/Kconfig               |  3 +
 drivers/gpu/drm/i915/i915_drv.h            |  6 ++
 drivers/gpu/drm/i915/i915_gem.c            | 87 ++++++++++++++++++++++++++-
 drivers/gpu/drm/i915/i915_gem_execbuffer.c | 95 ++++++++++++++++++++++++++++--
 include/uapi/drm/i915_drm.h                | 16 ++++-
 5 files changed, 198 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index 1d96fe1..cb5d5b2 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -22,6 +22,9 @@ config DRM_I915
 	select ACPI_VIDEO if ACPI
 	select ACPI_BUTTON if ACPI
 	select MMU_NOTIFIER
+	# ANDROID is required for SYNC
+	select ANDROID
+	select SYNC
 	help
 	  Choose this option if you have a system that has "Intel Graphics
 	  Media Accelerator" or "HD Graphics" integrated graphics,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d013c6d..194bca0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2278,6 +2278,12 @@ void i915_gem_request_notify(struct intel_engine_cs *ring, bool fence_locked);
 int i915_create_fence_timeline(struct drm_device *dev,
 			       struct intel_context *ctx,
 			       struct intel_engine_cs *ring);
+struct sync_fence;
+int i915_create_sync_fence(struct drm_i915_gem_request *req,
+			   struct sync_fence **sync_fence, int *fence_fd);
+void i915_install_sync_fence_fd(struct drm_i915_gem_request *req,
+				struct sync_fence *sync_fence, int fence_fd);
+bool i915_safe_to_ignore_fence(struct intel_engine_cs *ring, struct sync_fence *fence);
 
 static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req)
 {
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 72cf5dc..5c84866 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -37,6 +37,7 @@
 #include <linux/swap.h>
 #include <linux/pci.h>
 #include <linux/dma-buf.h>
+#include <../drivers/android/sync.h>
 
 #define RQ_BUG_ON(expr)
 
@@ -2560,7 +2561,13 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 
 	/*
 	 * Add the fence to the pending list before emitting the commands to
-	 * generate a seqno notification interrupt.
+	 * generate a seqno notification interrupt. This will also enable
+	 * interrupts if 'signal_requested' has been set.
+	 *
+	 * For example, if an exported sync point has been requested for this
+	 * request then it can be waited on without the driver's knowledge,
+	 * i.e. without calling __i915_wait_request(). Thus interrupts must
+	 * be enabled from the start rather than only on demand.
 	 */
 	i915_gem_request_submit(request);
 
@@ -2902,6 +2909,84 @@ static unsigned i915_fence_timeline_get_next_seqno(struct i915_fence_timeline *t
 	return seqno;
 }
 
+int i915_create_sync_fence(struct drm_i915_gem_request *req,
+			   struct sync_fence **sync_fence, int *fence_fd)
+{
+	char ring_name[] = "i915_ring0";
+	int fd;
+
+	fd = get_unused_fd_flags(O_CLOEXEC);
+	if (fd < 0) {
+		DRM_DEBUG("No available file descriptors!\n");
+		*fence_fd = -1;
+		return fd;
+	}
+
+	ring_name[9] += req->ring->id;
+	*sync_fence = sync_fence_create_dma(ring_name, &req->fence);
+	if (!*sync_fence) {
+		put_unused_fd(fd);
+		*fence_fd = -1;
+		return -ENOMEM;
+	}
+
+	return 0;
+}
+
+void i915_install_sync_fence_fd(struct drm_i915_gem_request *req,
+				struct sync_fence *sync_fence, int fence_fd)
+{
+	sync_fence_install(sync_fence, fence_fd);
+
+	/*
+	 * NB: The corresponding put happens automatically on file close
+	 * from sync_fence_release() via the fops callback.
+	 */
+	fence_get(&req->fence);
+
+	/*
+	 * The sync framework adds a callback to the fence. The fence
+	 * framework calls 'enable_signalling' when a callback is added.
+	 * Thus this flag should have been set by now. If not then
+	 * 'enable_signalling' must be called explicitly because exporting
+	 * a fence to user land means it can be waited on asynchronously and
+	 * thus must be signalled asynchronously.
+	 */
+	WARN_ON(!req->signal_requested);
+}
+
+bool i915_safe_to_ignore_fence(struct intel_engine_cs *ring, struct sync_fence *sync_fence)
+{
+	struct fence *dma_fence;
+	struct drm_i915_gem_request *req;
+	int i;
+
+	if (sync_fence_is_signaled(sync_fence))
+		return true;
+
+	for (i = 0; i < sync_fence->num_fences; i++) {
+		dma_fence = sync_fence->cbs[i].sync_pt;
+
+		/* No need to worry about dead points: */
+		if (fence_is_signaled(dma_fence))
+			continue;
+
+		/* Can't ignore other people's points: */
+		if (dma_fence->ops != &i915_gem_request_fops)
+			return false;
+
+		req = container_of(dma_fence, typeof(*req), fence);
+
+		/* Can't ignore points on other rings: */
+		if (req->ring != ring)
+			return false;
+
+		/* Same ring means guaranteed to be in order so ignore it. */
+	}
+
+	return true;
+}
+
 int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct intel_context *ctx,
 			   struct drm_i915_gem_request **req_out)
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index bfc4c17..5f629f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -26,6 +26,7 @@
  *
  */
 
+#include <linux/syscalls.h>
 #include <drm/drmP.h>
 #include <drm/i915_drm.h>
 #include "i915_drv.h"
@@ -33,6 +34,7 @@
 #include "intel_drv.h"
 #include <linux/dma_remapping.h>
 #include <linux/uaccess.h>
+#include <../drivers/android/sync.h>
 
 #define  __EXEC_OBJECT_HAS_PIN (1<<31)
 #define  __EXEC_OBJECT_HAS_FENCE (1<<30)
@@ -1322,6 +1324,38 @@ eb_get_batch(struct eb_vmas *eb)
 	return vma->obj;
 }
 
+static int i915_early_fence_wait(struct intel_engine_cs *ring, int fence_fd)
+{
+	struct sync_fence *fence;
+	int ret = 0;
+
+	if (fence_fd < 0) {
+		DRM_DEBUG("Invalid wait fence fd %d on ring %d\n", fence_fd,
+			  (int) ring->id);
+		return 1;
+	}
+
+	fence = sync_fence_fdget(fence_fd);
+	if (fence == NULL) {
+		DRM_DEBUG("Invalid wait fence %d on ring %d\n", fence_fd,
+			  (int) ring->id);
+		return 1;
+	}
+
+	if (!sync_fence_is_signaled(fence)) {
+		/*
+		 * Wait forever for the fence to be signalled. This is safe
+		 * because the mutex lock has not yet been acquired and
+		 * the wait is interruptible.
+		 */
+		if (!i915_safe_to_ignore_fence(ring, fence))
+			ret = sync_fence_wait(fence, -1);
+	}
+
+	sync_fence_put(fence);
+	return ret;
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		       struct drm_file *file,
@@ -1341,6 +1375,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	u32 dispatch_flags;
 	int ret;
 	bool need_relocs;
+	int fd_fence_complete = -1;
+	int fd_fence_wait = lower_32_bits(args->rsvd2);
+	struct sync_fence *sync_fence;
+
+	/*
+	 * Make sure a broken fence handle is not returned no matter
+	 * how early an error might be hit. Note that rsvd2 has to be
+	 * saved away first because it is also an input parameter!
+	 */
+	if (args->flags & I915_EXEC_CREATE_FENCE)
+		args->rsvd2 = (__u64) -1;
 
 	if (!i915_gem_check_execbuffer(args))
 		return -EINVAL;
@@ -1424,6 +1469,17 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 		dispatch_flags |= I915_DISPATCH_RS;
 	}
 
+	/*
+	 * Without a GPU scheduler, any fence waits must be done up front.
+	 */
+	if (args->flags & I915_EXEC_WAIT_FENCE) {
+		ret = i915_early_fence_wait(ring, fd_fence_wait);
+		if (ret < 0)
+			return ret;
+
+		args->flags &= ~I915_EXEC_WAIT_FENCE;
+	}
+
 	intel_runtime_pm_get(dev_priv);
 
 	ret = i915_mutex_lock_interruptible(dev);
@@ -1571,8 +1627,41 @@ i915_gem_do_execbuffer(struct drm_device *dev, void *data,
 	params->batch_obj               = batch_obj;
 	params->ctx                     = ctx;
 
+	if (args->flags & I915_EXEC_CREATE_FENCE) {
+		/*
+		 * Caller has requested a sync fence.
+		 * User interrupts will be enabled to make sure that
+		 * the timeline is signalled on completion.
+		 */
+		ret = i915_create_sync_fence(params->request, &sync_fence,
+					     &fd_fence_complete);
+		if (ret) {
+			DRM_ERROR("Fence creation failed for ring %d, ctx %p\n",
+				  ring->id, ctx);
+			goto err_batch_unpin;
+		}
+	}
+
 	ret = dev_priv->gt.execbuf_submit(params, args, &eb->vmas);
 
+	if (fd_fence_complete != -1) {
+		if (ret) {
+			sync_fence_put(sync_fence);
+			put_unused_fd(fd_fence_complete);
+		} else {
+			/*
+			 * Install the fence into the pre-allocated file
+			 * descriptor to the fence object so that user land
+			 * can wait on it...
+			 */
+			i915_install_sync_fence_fd(params->request,
+						   sync_fence, fd_fence_complete);
+
+			/* Return the fence through the rsvd2 field */
+			args->rsvd2 = (__u64) fd_fence_complete;
+		}
+	}
+
 err_batch_unpin:
 	/*
 	 * FIXME: We crucially rely upon the active tracking for the (ppgtt)
@@ -1602,6 +1691,7 @@ pre_mutex_err:
 	/* intel_gpu_busy should also get a ref, so it will free when the device
 	 * is really idle. */
 	intel_runtime_pm_put(dev_priv);
+
 	return ret;
 }
 
@@ -1707,11 +1797,6 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
-	if (args->rsvd2 != 0) {
-		DRM_DEBUG("dirty rvsd2 field\n");
-		return -EINVAL;
-	}
-
 	exec2_list = kmalloc(sizeof(*exec2_list)*args->buffer_count,
 			     GFP_TEMPORARY | __GFP_NOWARN | __GFP_NORETRY);
 	if (exec2_list == NULL)
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 67cebe6..86f7921 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -250,7 +250,7 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_HWS_ADDR		DRM_IOW(DRM_COMMAND_BASE + DRM_I915_HWS_ADDR, struct drm_i915_gem_init)
 #define DRM_IOCTL_I915_GEM_INIT		DRM_IOW(DRM_COMMAND_BASE + DRM_I915_GEM_INIT, struct drm_i915_gem_init)
 #define DRM_IOCTL_I915_GEM_EXECBUFFER	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER, struct drm_i915_gem_execbuffer)
-#define DRM_IOCTL_I915_GEM_EXECBUFFER2	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER2, struct drm_i915_gem_execbuffer2)
+#define DRM_IOCTL_I915_GEM_EXECBUFFER2	DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_EXECBUFFER2, struct drm_i915_gem_execbuffer2)
 #define DRM_IOCTL_I915_GEM_PIN		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_PIN, struct drm_i915_gem_pin)
 #define DRM_IOCTL_I915_GEM_UNPIN	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_GEM_UNPIN, struct drm_i915_gem_unpin)
 #define DRM_IOCTL_I915_GEM_BUSY		DRM_IOWR(DRM_COMMAND_BASE + DRM_I915_GEM_BUSY, struct drm_i915_gem_busy)
@@ -695,7 +695,7 @@ struct drm_i915_gem_exec_object2 {
 	__u64 flags;
 
 	__u64 rsvd1;
-	__u64 rsvd2;
+	__u64 rsvd2;	/* Used for fence fd */
 };
 
 struct drm_i915_gem_execbuffer2 {
@@ -776,7 +776,17 @@ struct drm_i915_gem_execbuffer2 {
  */
 #define I915_EXEC_RESOURCE_STREAMER     (1<<15)
 
-#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_RESOURCE_STREAMER<<1)
+/** Caller supplies a sync fence fd in the rsvd2 field.
+ * Wait for it to be signalled before starting the work
+ */
+#define I915_EXEC_WAIT_FENCE		(1<<16)
+
+/** Caller wants a sync fence fd for this execbuffer.
+ *  It will be returned in rsvd2
+ */
+#define I915_EXEC_CREATE_FENCE		(1<<17)
+
+#define __I915_EXEC_UNKNOWN_FLAGS -(I915_EXEC_CREATE_FENCE<<1)
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [RFC 12/12] drm/i915: Cache last IRQ seqno to reduce IRQ overhead
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (10 preceding siblings ...)
  2015-11-23 11:34                   ` [RFC 11/12] drm/i915: Add sync framework support to execbuff IOCTL John.C.Harrison
@ 2015-11-23 11:34                   ` John.C.Harrison
  2015-11-23 11:38                   ` [RFC 00/12] Convert requests to use struct fence John Harrison
  2015-12-08 14:53                   ` [PATCH v7] drm/i915: Slaughter the thundering i915_wait_request herd Dave Gordon
  13 siblings, 0 replies; 92+ messages in thread
From: John.C.Harrison @ 2015-11-23 11:34 UTC (permalink / raw)
  To: Intel-GFX

From: John Harrison <John.C.Harrison@Intel.com>

The notify function can be called many times without the seqno
changing. Many of these duplicate calls exist to prevent races caused
by interrupts not being enabled until they are requested. In addition,
once interrupts are enabled the IRQ handler can be called multiple
times without the ring's seqno value changing. This patch reduces the
overhead of these extra calls by caching the last processed seqno
value and exiting early if it has not changed.
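
The pattern is simply a cached early exit at the top of the notify
path; roughly (a sketch of the idea, not the exact driver code):

	void notify(struct intel_engine_cs *ring)
	{
		u32 seqno = ring->get_seqno(ring, false);

		/* The IRQ fired again but the hardware made no progress. */
		if (seqno == ring->last_irq_seqno)
			return;
		ring->last_irq_seqno = seqno;

		/* ... otherwise walk fence_signal_list as before ... */
	}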

v3: New patch for series.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
---
 drivers/gpu/drm/i915/i915_gem.c         | 14 +++++++++++---
 drivers/gpu/drm/i915/intel_ringbuffer.h |  1 +
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 5c84866..0e41386 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2457,6 +2457,8 @@ i915_gem_init_seqno(struct drm_device *dev, u32 seqno)
 
 		for (j = 0; j < ARRAY_SIZE(ring->semaphore.sync_seqno); j++)
 			ring->semaphore.sync_seqno[j] = 0;
+
+		ring->last_irq_seqno = 0;
 	}
 
 	return 0;
@@ -2789,11 +2791,14 @@ void i915_gem_request_notify(struct intel_engine_cs *ring, bool fence_locked)
 		return;
 	}
 
-	if (!fence_locked)
-		spin_lock_irqsave(&ring->fence_lock, flags);
-
 	seqno = ring->get_seqno(ring, false);
 	trace_i915_gem_request_notify(ring, seqno);
+	if (seqno == ring->last_irq_seqno)
+		return;
+	ring->last_irq_seqno = seqno;
+
+	if (!fence_locked)
+		spin_lock_irqsave(&ring->fence_lock, flags);
 
 	list_for_each_entry_safe(req, req_next, &ring->fence_signal_list, signal_link) {
 		if (!req->cancelled) {
@@ -3162,7 +3167,10 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
 	 * Tidy up anything left over. This includes a call to
 	 * i915_gem_request_notify() which will make sure that any requests
 	 * that were on the signal pending list get also cleaned up.
+	 * NB: The seqno cache must be cleared otherwise the notify call will
+	 * simply return immediately.
 	 */
+	ring->last_irq_seqno = 0;
 	i915_gem_retire_requests_ring(ring);
 
 	/* Having flushed all requests from all queues, we know that all
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 9d09edb..1987abd 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -356,6 +356,7 @@ struct  intel_engine_cs {
 	spinlock_t fence_lock;
 	struct list_head fence_signal_list;
 	struct list_head fence_unsignal_list;
+	uint32_t last_irq_seqno;
 };
 
 bool intel_ring_initialized(struct intel_engine_cs *ring);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* Re: [RFC 00/12] Convert requests to use struct fence
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (11 preceding siblings ...)
  2015-11-23 11:34                   ` [RFC 12/12] drm/i915: Cache last IRQ seqno to reduce IRQ overhead John.C.Harrison
@ 2015-11-23 11:38                   ` John Harrison
  2015-12-08 14:53                   ` [PATCH v7] drm/i915: Slaughter the thundering i915_wait_request herd Dave Gordon
  13 siblings, 0 replies; 92+ messages in thread
From: John Harrison @ 2015-11-23 11:38 UTC (permalink / raw)
  To: Intel-GFX@Lists.FreeDesktop.Org

Oops, forgot to update the tag. These should be PATCH not RFC.

On 23/11/2015 11:34, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> There is a construct in the linux kernel called 'struct fence' that is
> intended to keep track of work that is executed on hardware. I.e. it
> solves the basic problem that the drivers 'struct
> drm_i915_gem_request' is trying to address. The request structure does
> quite a lot more than simply track the execution progress so is very
> definitely still required. However, the basic completion status side
> could be updated to use the ready made fence implementation and gain
> all the advantages that provides.
>
> Using the struct fence object also has the advantage that the fence
> can be used outside of the i915 driver (by other drivers or by
> userland applications). That is the basis of the dma-buff
> synchronisation API and allows asynchronous tracking of work
> completion. In this case, it allows applications to be signalled
> directly when a batch buffer completes without having to make an IOCTL
> call into the driver.
>
> This is work that was planned since the conversion of the driver from
> being seqno value based to being request structure based. This patch
> series does that work.
>
> An IGT test to exercise the fence support from user land is in
> progress and will follow. Android already makes extensive use of
> fences for display composition. Real world linux usage is planned in
> the form of Jesse's page table sharing / bufferless execbuf support.
> There is also a plan that Wayland (and others) could make use of it in
> a similar manner to Android.
>
> v2: Updated for review comments by various people and to add support
> for Android style 'native sync'.
>
> v3: Updated from review comments by Tvrtko Ursulin. Also moved sync
> framework out of staging and improved request completion handling.
>
> [Patches against drm-intel-nightly tree fetched 17/11/2015]
>
> John Harrison (9):
>    staging/android/sync: Move sync framework out of staging
>    drm/i915: Convert requests to use struct fence
>    drm/i915: Removed now redudant parameter to i915_gem_request_completed()
>    drm/i915: Add per context timelines to fence object
>    drm/i915: Delay the freeing of requests until retire time
>    drm/i915: Interrupt driven fences
>    drm/i915: Updated request structure tracing
>    drm/i915: Add sync framework support to execbuff IOCTL
>    drm/i915: Cache last IRQ seqno to reduce IRQ overhead
>
> Maarten Lankhorst (1):
>    staging/android/sync: add sync_fence_create_dma
>
> Peter Lawthers (1):
>    android/sync: Fix reversed sense of signaled fence
>
> Tvrtko Ursulin (1):
>    staging/android/sync: Support sync points created from dma-fences
>
>   drivers/android/Kconfig                    |  28 ++
>   drivers/android/Makefile                   |   2 +
>   drivers/android/sw_sync.c                  | 260 ++++++++++
>   drivers/android/sw_sync.h                  |  59 +++
>   drivers/android/sync.c                     | 734 +++++++++++++++++++++++++++++
>   drivers/android/sync.h                     | 387 +++++++++++++++
>   drivers/android/sync_debug.c               | 256 ++++++++++
>   drivers/android/trace/sync.h               |  82 ++++
>   drivers/gpu/drm/i915/Kconfig               |   3 +
>   drivers/gpu/drm/i915/i915_debugfs.c        |   7 +-
>   drivers/gpu/drm/i915/i915_drv.h            |  75 +--
>   drivers/gpu/drm/i915/i915_gem.c            | 437 ++++++++++++++++-
>   drivers/gpu/drm/i915/i915_gem_context.c    |  15 +-
>   drivers/gpu/drm/i915/i915_gem_execbuffer.c |  95 +++-
>   drivers/gpu/drm/i915/i915_irq.c            |   2 +-
>   drivers/gpu/drm/i915/i915_trace.h          |  13 +-
>   drivers/gpu/drm/i915/intel_display.c       |   4 +-
>   drivers/gpu/drm/i915/intel_lrc.c           |  13 +
>   drivers/gpu/drm/i915/intel_pm.c            |   6 +-
>   drivers/gpu/drm/i915/intel_ringbuffer.c    |   5 +
>   drivers/gpu/drm/i915/intel_ringbuffer.h    |   9 +
>   drivers/staging/android/Kconfig            |  28 --
>   drivers/staging/android/Makefile           |   2 -
>   drivers/staging/android/sw_sync.c          | 260 ----------
>   drivers/staging/android/sw_sync.h          |  59 ---
>   drivers/staging/android/sync.c             | 729 ----------------------------
>   drivers/staging/android/sync.h             | 356 --------------
>   drivers/staging/android/sync_debug.c       | 254 ----------
>   drivers/staging/android/trace/sync.h       |  82 ----
>   drivers/staging/android/uapi/sw_sync.h     |  32 --
>   drivers/staging/android/uapi/sync.h        |  97 ----
>   include/uapi/Kbuild                        |   1 +
>   include/uapi/drm/i915_drm.h                |  16 +-
>   include/uapi/sync/Kbuild                   |   3 +
>   include/uapi/sync/sw_sync.h                |  32 ++
>   include/uapi/sync/sync.h                   |  97 ++++
>   36 files changed, 2569 insertions(+), 1971 deletions(-)
>   create mode 100644 drivers/android/sw_sync.c
>   create mode 100644 drivers/android/sw_sync.h
>   create mode 100644 drivers/android/sync.c
>   create mode 100644 drivers/android/sync.h
>   create mode 100644 drivers/android/sync_debug.c
>   create mode 100644 drivers/android/trace/sync.h
>   delete mode 100644 drivers/staging/android/sw_sync.c
>   delete mode 100644 drivers/staging/android/sw_sync.h
>   delete mode 100644 drivers/staging/android/sync.c
>   delete mode 100644 drivers/staging/android/sync.h
>   delete mode 100644 drivers/staging/android/sync_debug.c
>   delete mode 100644 drivers/staging/android/trace/sync.h
>   delete mode 100644 drivers/staging/android/uapi/sw_sync.h
>   delete mode 100644 drivers/staging/android/uapi/sync.h
>   create mode 100644 include/uapi/sync/Kbuild
>   create mode 100644 include/uapi/sync/sw_sync.h
>   create mode 100644 include/uapi/sync/sync.h
>

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [RFC 02/12] staging/android/sync: add sync_fence_create_dma
  2015-11-23 11:34                   ` [RFC 02/12] staging/android/sync: add sync_fence_create_dma John.C.Harrison
@ 2015-11-23 13:27                     ` Maarten Lankhorst
  2015-11-23 13:38                       ` John Harrison
  0 siblings, 1 reply; 92+ messages in thread
From: Maarten Lankhorst @ 2015-11-23 13:27 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX
  Cc: devel, Tvrtko Ursulin, Greg Kroah-Hartman,
	Arve Hjønnevåg, Jesse Barnes, Riley Andrews,
	Daniel Vetter, Maarten Lankhorst

Op 23-11-15 om 12:34 schreef John.C.Harrison@Intel.com:
> From: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>
> This allows users of dma fences to create a android fence.
>
> v2: Added kerneldoc. (Tvrtko Ursulin).
>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
> Cc: devel@driverdev.osuosl.org
> Cc: Riley Andrews <riandrews@android.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Arve Hjønnevåg <arve@android.com>
> ---
>  drivers/staging/android/sync.c | 13 +++++++++----
>  drivers/staging/android/sync.h | 12 +++++++++++-
>  2 files changed, 20 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c
> index f83e00c..7f0e919 100644
> --- a/drivers/staging/android/sync.c
> +++ b/drivers/staging/android/sync.c
> @@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f, struct fence_cb *cb)
>  }
>  
>  /* TODO: implement a create which takes more that one sync_pt */
> -struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
> +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt)
>  {
>  	struct sync_fence *fence;
>  
> @@ -199,16 +199,21 @@ struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
>  	fence->num_fences = 1;
>  	atomic_set(&fence->status, 1);
>  
> -	fence->cbs[0].sync_pt = &pt->base;
> +	fence->cbs[0].sync_pt = pt;
>  	fence->cbs[0].fence = fence;
> -	if (fence_add_callback(&pt->base, &fence->cbs[0].cb,
> -			       fence_check_cb_func))
> +	if (fence_add_callback(pt, &fence->cbs[0].cb, fence_check_cb_func))
>  		atomic_dec(&fence->status);
>  
>  	sync_fence_debug_add(fence);
>  
>  	return fence;
>  }
> +EXPORT_SYMBOL(sync_fence_create_dma);
> +
> +struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
> +{
> +	return sync_fence_create_dma(name, &pt->base);
> +}
>  EXPORT_SYMBOL(sync_fence_create);
>  
>  struct sync_fence *sync_fence_fdget(int fd)
> diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h
> index 61f8a3a..798cd56 100644
> --- a/drivers/staging/android/sync.h
> +++ b/drivers/staging/android/sync.h
> @@ -250,10 +250,20 @@ void sync_pt_free(struct sync_pt *pt);
>   * @pt:		sync_pt to add to the fence
>   *
>   * Creates a fence containg @pt.  Once this is called, the fence takes
> - * ownership of @pt.
> + * a reference on @pt.
>   */
>  struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt);
No it doesn't.
> +/**
> + * sync_fence_create_dma() - creates a sync fence from dma-fence
> + * @name:	name of fence to create
> + * @pt:	dma-fence to add to the fence
> + *
> + * Creates a fence containg @pt.  Once this is called, the fence takes
> + * a reference on @pt.
> + */
No it doesn't.
> +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt);
> +
>  /*
>   * API for sync_fence consumers
>   */

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [RFC 01/12] staging/android/sync: Support sync points created from dma-fences
  2015-11-23 11:34                   ` [RFC 01/12] staging/android/sync: Support sync points created from dma-fences John.C.Harrison
@ 2015-11-23 13:29                     ` Maarten Lankhorst
  2015-11-23 13:31                     ` [Intel-gfx] " Tvrtko Ursulin
  1 sibling, 0 replies; 92+ messages in thread
From: Maarten Lankhorst @ 2015-11-23 13:29 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX
  Cc: devel, Greg Kroah-Hartman, Riley Andrews, Arve Hjønnevåg

Op 23-11-15 om 12:34 schreef John.C.Harrison@Intel.com:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>
> Debug output assumes all sync points are built on top of Android sync points
> and when we start creating them from dma-fences will NULL ptr deref unless
> taught about this.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: devel@driverdev.osuosl.org
> Cc: Riley Andrews <riandrews@android.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Arve Hjønnevåg <arve@android.com>
>
Could this be upstreamed already? It makes the second patch possible and doesn't break current staging.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [Intel-gfx] [RFC 01/12] staging/android/sync: Support sync points created from dma-fences
  2015-11-23 11:34                   ` [RFC 01/12] staging/android/sync: Support sync points created from dma-fences John.C.Harrison
  2015-11-23 13:29                     ` Maarten Lankhorst
@ 2015-11-23 13:31                     ` Tvrtko Ursulin
  1 sibling, 0 replies; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-23 13:31 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX
  Cc: devel, Greg Kroah-Hartman, Arve Hjønnevåg, Riley Andrews


Hi,

On 23/11/15 11:34, John.C.Harrison@Intel.com wrote:
> From: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

This is actually Maarten's patch, I guess authorship got lost over time 
moving through different trees.

Tvrtko

> Debug output assumes all sync points are built on top of Android sync points
> and when we start creating them from dma-fences will NULL ptr deref unless
> taught about this.
>
> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
> Cc: devel@driverdev.osuosl.org
> Cc: Riley Andrews <riandrews@android.com>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Arve Hjønnevåg <arve@android.com>
> ---
>   drivers/staging/android/sync_debug.c | 42 +++++++++++++++++++-----------------
>   1 file changed, 22 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/staging/android/sync_debug.c b/drivers/staging/android/sync_debug.c
> index 91ed2c4..f45d13c 100644
> --- a/drivers/staging/android/sync_debug.c
> +++ b/drivers/staging/android/sync_debug.c
> @@ -82,36 +82,42 @@ static const char *sync_status_str(int status)
>   	return "error";
>   }
>
> -static void sync_print_pt(struct seq_file *s, struct sync_pt *pt, bool fence)
> +static void sync_print_pt(struct seq_file *s, struct fence *pt, bool fence)
>   {
>   	int status = 1;
> -	struct sync_timeline *parent = sync_pt_parent(pt);
>
> -	if (fence_is_signaled_locked(&pt->base))
> -		status = pt->base.status;
> +	if (fence_is_signaled_locked(pt))
> +		status = pt->status;
>
>   	seq_printf(s, "  %s%spt %s",
> -		   fence ? parent->name : "",
> +		   fence && pt->ops->get_timeline_name ?
> +		   pt->ops->get_timeline_name(pt) : "",
>   		   fence ? "_" : "",
>   		   sync_status_str(status));
>
>   	if (status <= 0) {
>   		struct timespec64 ts64 =
> -			ktime_to_timespec64(pt->base.timestamp);
> +			ktime_to_timespec64(pt->timestamp);
>
>   		seq_printf(s, "@%lld.%09ld", (s64)ts64.tv_sec, ts64.tv_nsec);
>   	}
>
> -	if (parent->ops->timeline_value_str &&
> -	    parent->ops->pt_value_str) {
> +	if ((!fence || pt->ops->timeline_value_str) &&
> +	    pt->ops->fence_value_str) {
>   		char value[64];
> +		bool success;
>
> -		parent->ops->pt_value_str(pt, value, sizeof(value));
> -		seq_printf(s, ": %s", value);
> -		if (fence) {
> -			parent->ops->timeline_value_str(parent, value,
> -						    sizeof(value));
> -			seq_printf(s, " / %s", value);
> +		pt->ops->fence_value_str(pt, value, sizeof(value));
> +		success = strlen(value);
> +
> +		if (success)
> +			seq_printf(s, ": %s", value);
> +
> +		if (success && fence) {
> +			pt->ops->timeline_value_str(pt, value, sizeof(value));
> +
> +			if (strlen(value))
> +				seq_printf(s, " / %s", value);
>   		}
>   	}
>
> @@ -138,7 +144,7 @@ static void sync_print_obj(struct seq_file *s, struct sync_timeline *obj)
>   	list_for_each(pos, &obj->child_list_head) {
>   		struct sync_pt *pt =
>   			container_of(pos, struct sync_pt, child_list);
> -		sync_print_pt(s, pt, false);
> +		sync_print_pt(s, &pt->base, false);
>   	}
>   	spin_unlock_irqrestore(&obj->child_list_lock, flags);
>   }
> @@ -153,11 +159,7 @@ static void sync_print_fence(struct seq_file *s, struct sync_fence *fence)
>   		   sync_status_str(atomic_read(&fence->status)));
>
>   	for (i = 0; i < fence->num_fences; ++i) {
> -		struct sync_pt *pt =
> -			container_of(fence->cbs[i].sync_pt,
> -				     struct sync_pt, base);
> -
> -		sync_print_pt(s, pt, true);
> +		sync_print_pt(s, fence->cbs[i].sync_pt, true);
>   	}
>
>   	spin_lock_irqsave(&fence->wq.lock, flags);
>

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [RFC 02/12] staging/android/sync: add sync_fence_create_dma
  2015-11-23 13:27                     ` Maarten Lankhorst
@ 2015-11-23 13:38                       ` John Harrison
  2015-11-23 13:44                         ` Tvrtko Ursulin
  0 siblings, 1 reply; 92+ messages in thread
From: John Harrison @ 2015-11-23 13:38 UTC (permalink / raw)
  To: Maarten Lankhorst, Intel-GFX
  Cc: devel, Tvrtko Ursulin, Greg Kroah-Hartman,
	Arve Hjønnevåg, Jesse Barnes, Riley Andrews,
	Daniel Vetter, Maarten Lankhorst

On 23/11/2015 13:27, Maarten Lankhorst wrote:
> Op 23-11-15 om 12:34 schreef John.C.Harrison@Intel.com:
>> From: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>
>> This allows users of dma fences to create a android fence.
>>
>> v2: Added kerneldoc. (Tvrtko Ursulin).
>>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>> Cc: Daniel Vetter <daniel@ffwll.ch>
>> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
>> Cc: devel@driverdev.osuosl.org
>> Cc: Riley Andrews <riandrews@android.com>
>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>> Cc: Arve Hjønnevåg <arve@android.com>
>> ---
>>   drivers/staging/android/sync.c | 13 +++++++++----
>>   drivers/staging/android/sync.h | 12 +++++++++++-
>>   2 files changed, 20 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/staging/android/sync.c b/drivers/staging/android/sync.c
>> index f83e00c..7f0e919 100644
>> --- a/drivers/staging/android/sync.c
>> +++ b/drivers/staging/android/sync.c
>> @@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f, struct fence_cb *cb)
>>   }
>>   
>>   /* TODO: implement a create which takes more that one sync_pt */
>> -struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
>> +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt)
>>   {
>>   	struct sync_fence *fence;
>>   
>> @@ -199,16 +199,21 @@ struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
>>   	fence->num_fences = 1;
>>   	atomic_set(&fence->status, 1);
>>   
>> -	fence->cbs[0].sync_pt = &pt->base;
>> +	fence->cbs[0].sync_pt = pt;
>>   	fence->cbs[0].fence = fence;
>> -	if (fence_add_callback(&pt->base, &fence->cbs[0].cb,
>> -			       fence_check_cb_func))
>> +	if (fence_add_callback(pt, &fence->cbs[0].cb, fence_check_cb_func))
>>   		atomic_dec(&fence->status);
>>   
>>   	sync_fence_debug_add(fence);
>>   
>>   	return fence;
>>   }
>> +EXPORT_SYMBOL(sync_fence_create_dma);
>> +
>> +struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt)
>> +{
>> +	return sync_fence_create_dma(name, &pt->base);
>> +}
>>   EXPORT_SYMBOL(sync_fence_create);
>>   
>>   struct sync_fence *sync_fence_fdget(int fd)
>> diff --git a/drivers/staging/android/sync.h b/drivers/staging/android/sync.h
>> index 61f8a3a..798cd56 100644
>> --- a/drivers/staging/android/sync.h
>> +++ b/drivers/staging/android/sync.h
>> @@ -250,10 +250,20 @@ void sync_pt_free(struct sync_pt *pt);
>>    * @pt:		sync_pt to add to the fence
>>    *
>>    * Creates a fence containg @pt.  Once this is called, the fence takes
>> - * ownership of @pt.
>> + * a reference on @pt.
>>    */
>>   struct sync_fence *sync_fence_create(const char *name, struct sync_pt *pt);
> No it doesn't.
>> +/**
>> + * sync_fence_create_dma() - creates a sync fence from dma-fence
>> + * @name:	name of fence to create
>> + * @pt:	dma-fence to add to the fence
>> + *
>> + * Creates a fence containg @pt.  Once this is called, the fence takes
>> + * a reference on @pt.
>> + */
> No it doesn't.
This is your patch, isn't it? Or is this something Tvrtko added on the 
way past? Either way, what should the correct description be? The code 
takes a copy of the pointer to 'pt'. Does the comment mean a reference 
in the sense of 'pass by reference', i.e. a pointer, whereas you mean a 
reference in the sense of incrementing a usage count?

>> +struct sync_fence *sync_fence_create_dma(const char *name, struct fence *pt);
>> +
>>   /*
>>    * API for sync_fence consumers
>>    */

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [RFC 02/12] staging/android/sync: add sync_fence_create_dma
  2015-11-23 13:38                       ` John Harrison
@ 2015-11-23 13:44                         ` Tvrtko Ursulin
  2015-11-23 13:48                           ` Maarten Lankhorst
  0 siblings, 1 reply; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-23 13:44 UTC (permalink / raw)
  To: John Harrison, Maarten Lankhorst, Intel-GFX
  Cc: devel, Greg Kroah-Hartman, Arve Hjønnevåg,
	Maarten Lankhorst, Riley Andrews


On 23/11/15 13:38, John Harrison wrote:
> On 23/11/2015 13:27, Maarten Lankhorst wrote:
>> Op 23-11-15 om 12:34 schreef John.C.Harrison@Intel.com:
>>> From: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>>
>>> This allows users of dma fences to create a android fence.
>>>
>>> v2: Added kerneldoc. (Tvrtko Ursulin).
>>>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
>>> Cc: devel@driverdev.osuosl.org
>>> Cc: Riley Andrews <riandrews@android.com>
>>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>> Cc: Arve Hjønnevåg <arve@android.com>
>>> ---
>>>   drivers/staging/android/sync.c | 13 +++++++++----
>>>   drivers/staging/android/sync.h | 12 +++++++++++-
>>>   2 files changed, 20 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/drivers/staging/android/sync.c
>>> b/drivers/staging/android/sync.c
>>> index f83e00c..7f0e919 100644
>>> --- a/drivers/staging/android/sync.c
>>> +++ b/drivers/staging/android/sync.c
>>> @@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f,
>>> struct fence_cb *cb)
>>>   }
>>>   /* TODO: implement a create which takes more that one sync_pt */
>>> -struct sync_fence *sync_fence_create(const char *name, struct
>>> sync_pt *pt)
>>> +struct sync_fence *sync_fence_create_dma(const char *name, struct
>>> fence *pt)
>>>   {
>>>       struct sync_fence *fence;
>>> @@ -199,16 +199,21 @@ struct sync_fence *sync_fence_create(const char
>>> *name, struct sync_pt *pt)
>>>       fence->num_fences = 1;
>>>       atomic_set(&fence->status, 1);
>>> -    fence->cbs[0].sync_pt = &pt->base;
>>> +    fence->cbs[0].sync_pt = pt;
>>>       fence->cbs[0].fence = fence;
>>> -    if (fence_add_callback(&pt->base, &fence->cbs[0].cb,
>>> -                   fence_check_cb_func))
>>> +    if (fence_add_callback(pt, &fence->cbs[0].cb, fence_check_cb_func))
>>>           atomic_dec(&fence->status);
>>>       sync_fence_debug_add(fence);
>>>       return fence;
>>>   }
>>> +EXPORT_SYMBOL(sync_fence_create_dma);
>>> +
>>> +struct sync_fence *sync_fence_create(const char *name, struct
>>> sync_pt *pt)
>>> +{
>>> +    return sync_fence_create_dma(name, &pt->base);
>>> +}
>>>   EXPORT_SYMBOL(sync_fence_create);
>>>   struct sync_fence *sync_fence_fdget(int fd)
>>> diff --git a/drivers/staging/android/sync.h
>>> b/drivers/staging/android/sync.h
>>> index 61f8a3a..798cd56 100644
>>> --- a/drivers/staging/android/sync.h
>>> +++ b/drivers/staging/android/sync.h
>>> @@ -250,10 +250,20 @@ void sync_pt_free(struct sync_pt *pt);
>>>    * @pt:        sync_pt to add to the fence
>>>    *
>>>    * Creates a fence containing @pt.  Once this is called, the fence takes
>>> - * ownership of @pt.
>>> + * a reference on @pt.
>>>    */
>>>   struct sync_fence *sync_fence_create(const char *name, struct
>>> sync_pt *pt);
>> No it doesn't.
>>> +/**
>>> + * sync_fence_create_dma() - creates a sync fence from dma-fence
>>> + * @name:    name of fence to create
>>> + * @pt:    dma-fence to add to the fence
>>> + *
>>> + * Creates a fence containing @pt.  Once this is called, the fence takes
>>> + * a reference on @pt.
>>> + */
>> No it doesn't.
> This is your patch, isn't it? Or is this something Tvrtko added on the
> way past? Either way, what should the correct description be? It takes a

It is Maarten's patch, I just copied the comment over from one function 
to another. The one I picked up was 
https://patchwork.freedesktop.org/patch/38074/, December 2014.

> copy of the pointer to 'pt'. Is the comment meaning a reference in the
> sense of 'pass by reference', i.e. a pointer? But you are meaning a
> reference in the sense of incrementing a usage count?

Maybe he means it is still taking ownership, not reference?

Regards,

Tvrtko
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [RFC 02/12] staging/android/sync: add sync_fence_create_dma
  2015-11-23 13:44                         ` Tvrtko Ursulin
@ 2015-11-23 13:48                           ` Maarten Lankhorst
  0 siblings, 0 replies; 92+ messages in thread
From: Maarten Lankhorst @ 2015-11-23 13:48 UTC (permalink / raw)
  To: Tvrtko Ursulin, John Harrison, Intel-GFX
  Cc: devel, Greg Kroah-Hartman, Arve Hjønnevåg,
	Maarten Lankhorst, Riley Andrews

Op 23-11-15 om 14:44 schreef Tvrtko Ursulin:
>
> On 23/11/15 13:38, John Harrison wrote:
>> On 23/11/2015 13:27, Maarten Lankhorst wrote:
>>> Op 23-11-15 om 12:34 schreef John.C.Harrison@Intel.com:
>>>> From: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>>>
>>>> This allows users of dma fences to create an android fence.
>>>>
>>>> v2: Added kerneldoc. (Tvrtko Ursulin).
>>>>
>>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
>>>> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
>>>> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
>>>> Cc: Daniel Vetter <daniel@ffwll.ch>
>>>> Cc: Jesse Barnes <jbarnes@virtuousgeek.org>
>>>> Cc: devel@driverdev.osuosl.org
>>>> Cc: Riley Andrews <riandrews@android.com>
>>>> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>> Cc: Arve Hjønnevåg <arve@android.com>
>>>> ---
>>>>   drivers/staging/android/sync.c | 13 +++++++++----
>>>>   drivers/staging/android/sync.h | 12 +++++++++++-
>>>>   2 files changed, 20 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/drivers/staging/android/sync.c
>>>> b/drivers/staging/android/sync.c
>>>> index f83e00c..7f0e919 100644
>>>> --- a/drivers/staging/android/sync.c
>>>> +++ b/drivers/staging/android/sync.c
>>>> @@ -188,7 +188,7 @@ static void fence_check_cb_func(struct fence *f,
>>>> struct fence_cb *cb)
>>>>   }
>>>>   /* TODO: implement a create which takes more than one sync_pt */
>>>> -struct sync_fence *sync_fence_create(const char *name, struct
>>>> sync_pt *pt)
>>>> +struct sync_fence *sync_fence_create_dma(const char *name, struct
>>>> fence *pt)
>>>>   {
>>>>       struct sync_fence *fence;
>>>> @@ -199,16 +199,21 @@ struct sync_fence *sync_fence_create(const char
>>>> *name, struct sync_pt *pt)
>>>>       fence->num_fences = 1;
>>>>       atomic_set(&fence->status, 1);
>>>> -    fence->cbs[0].sync_pt = &pt->base;
>>>> +    fence->cbs[0].sync_pt = pt;
>>>>       fence->cbs[0].fence = fence;
>>>> -    if (fence_add_callback(&pt->base, &fence->cbs[0].cb,
>>>> -                   fence_check_cb_func))
>>>> +    if (fence_add_callback(pt, &fence->cbs[0].cb, fence_check_cb_func))
>>>>           atomic_dec(&fence->status);
>>>>       sync_fence_debug_add(fence);
>>>>       return fence;
>>>>   }
>>>> +EXPORT_SYMBOL(sync_fence_create_dma);
>>>> +
>>>> +struct sync_fence *sync_fence_create(const char *name, struct
>>>> sync_pt *pt)
>>>> +{
>>>> +    return sync_fence_create_dma(name, &pt->base);
>>>> +}
>>>>   EXPORT_SYMBOL(sync_fence_create);
>>>>   struct sync_fence *sync_fence_fdget(int fd)
>>>> diff --git a/drivers/staging/android/sync.h
>>>> b/drivers/staging/android/sync.h
>>>> index 61f8a3a..798cd56 100644
>>>> --- a/drivers/staging/android/sync.h
>>>> +++ b/drivers/staging/android/sync.h
>>>> @@ -250,10 +250,20 @@ void sync_pt_free(struct sync_pt *pt);
>>>>    * @pt:        sync_pt to add to the fence
>>>>    *
>>>>    * Creates a fence containing @pt.  Once this is called, the fence takes
>>>> - * ownership of @pt.
>>>> + * a reference on @pt.
>>>>    */
>>>>   struct sync_fence *sync_fence_create(const char *name, struct
>>>> sync_pt *pt);
>>> No it doesn't.
>>>> +/**
>>>> + * sync_fence_create_dma() - creates a sync fence from dma-fence
>>>> + * @name:    name of fence to create
>>>> + * @pt:    dma-fence to add to the fence
>>>> + *
>>>> + * Creates a fence containing @pt.  Once this is called, the fence takes
>>>> + * a reference on @pt.
>>>> + */
>>> No it doesn't.
>> This is your patch, isn't it? Or is this something Tvrtko added on the
>> way past? Either way, what should the correct description be? It takes a
>
> It is Maarten's patch, I just copied the comment over from one function to another. The one I picked up was https://patchwork.freedesktop.org/patch/38074/, December 2014.
>
>> copy of the pointer to 'pt'. Is the comment meaning a reference in the
>> sense of 'pass by reference', i.e. a pointer? But you are meaning a
>> reference in the sense of incrementing a usage count?
>
> Maybe he means it is still taking ownership, not reference?
Indeed. See commit 3ea411c56ef "android: fix reference leak in sync_fence_create".

This broke compatibility with android by creating a ref leak. :)
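
To make the contract under discussion concrete, here is a minimal usage
sketch (the wrapper function is hypothetical; sync_fence_create_dma(),
fence_get() and fence_put() are the real entry points):

#include "sync.h"	/* staging/android */

static struct sync_fence *wrap_fence(struct fence *pt)
{
	/* pt arrives here with one reference held by the caller */
	struct sync_fence *f = sync_fence_create_dma("example", pt);

	/* "takes ownership" reading: the call above consumed the
	 * caller's reference, so there must be no fence_put(pt) here.
	 * "takes a reference" reading: the call did fence_get(pt)
	 * internally, and the caller must still drop its own reference
	 * with fence_put(pt) when it is done with pt. */
	return f;
}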

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 92+ messages in thread

* i915_wait_request scaling
@ 2015-11-29  8:47 Chris Wilson
  2015-11-29  8:47 ` [PATCH 01/15] drm/i915: Break busywaiting for requests on pending signals Chris Wilson
                   ` (14 more replies)
  0 siblings, 15 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:47 UTC (permalink / raw)
  To: intel-gfx

The first 3 patches are cc:stable for reducing the negative
side-effects of busywaiting. Following on from them, we have a set of
patches to prevent the thundering herd issue with multiple concurrent
waiters and to optimise the waiters. Since this has been flagged as
severely impacting some client loads, I've moved this ahead of the
series that cuts the execbuffer overheads.

Available as:
http://cgit.freedesktop.org/~ickle/linux-2.6/log/?h=breadcrumbs
-Chris

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH 01/15] drm/i915: Break busywaiting for requests on pending signals
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
@ 2015-11-29  8:47 ` Chris Wilson
  2015-11-30 10:01   ` Tvrtko Ursulin
  2015-11-29  8:48 ` [PATCH 02/15] drm/i915: Limit the busy wait on requests to 10us not 10ms! Chris Wilson
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:47 UTC (permalink / raw)
  To: intel-gfx
  Cc: Jens Axboe, Daniel Vetter, Eero Tamminen, stable, Rantala, Valtteri

The busywait in __i915_spin_request() does not respect pending signals
and so may consume the entire timeslice for the task instead of
returning to userspace to handle the signal.

In the worst case this could cause a delay in signal processing of 20ms,
which would be a noticeable jitter in cursor tracking. If a higher
resolution signal was being used, for example to provide fair sharing of
server timeslices between clients, we could expect to detect some
unfairness between clients (i.e. some windows not updating as fast as
others). This issue was noticed when inspecting a report of poor
interactivity resulting from excessively high __i915_spin_request usage.
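
The shape of the fix, as a kernel-style sketch (the completion predicate
is hypothetical; signal_pending_state(), need_resched() and
cpu_relax_lowlatency() are the real primitives the diff below uses):

#include <linux/errno.h>
#include <linux/jiffies.h>
#include <linux/sched.h>

static int spin_for_completion(bool (*done)(void *data), void *data,
			       int state)
{
	unsigned long timeout = jiffies + 1;

	while (!need_resched()) {
		if (done(data))
			return 0;

		/* Give up the spin as soon as a signal is pending for
		 * the task state we intend to sleep in; the caller then
		 * sleeps and handles the signal as usual. */
		if (signal_pending_state(state, current))
			break;

		if (time_after_eq(jiffies, timeout))
			break;

		cpu_relax_lowlatency();
	}

	return done(data) ? 0 : -EAGAIN;
}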

Fixes regression from
commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Apr 7 16:20:41 2015 +0100

     drm/i915: Optimistically spin for the request completion

v2: Try to assess the impact of the bug

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jens Axboe <axboe@kernel.dk>
Cc; "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 33adc8f8ab20..87fc34f5899f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1146,7 +1146,7 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
 	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
 }
 
-static int __i915_spin_request(struct drm_i915_gem_request *req)
+static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 {
 	unsigned long timeout;
 
@@ -1158,6 +1158,9 @@ static int __i915_spin_request(struct drm_i915_gem_request *req)
 		if (i915_gem_request_completed(req, true))
 			return 0;
 
+		if (signal_pending_state(state, current))
+			break;
+
 		if (time_after_eq(jiffies, timeout))
 			break;
 
@@ -1197,6 +1200,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	const bool irq_test_in_progress =
 		ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
+	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
 	DEFINE_WAIT(wait);
 	unsigned long timeout_expire;
 	s64 before, now;
@@ -1221,7 +1225,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	before = ktime_get_raw_ns();
 
 	/* Optimistic spin for the next jiffie before touching IRQs */
-	ret = __i915_spin_request(req);
+	ret = __i915_spin_request(req, state);
 	if (ret == 0)
 		goto out;
 
@@ -1233,8 +1237,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	for (;;) {
 		struct timer_list timer;
 
-		prepare_to_wait(&ring->irq_queue, &wait,
-				interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
+		prepare_to_wait(&ring->irq_queue, &wait, state);
 
 		/* We need to check whether any gpu reset happened in between
 		 * the caller grabbing the seqno and now ... */
@@ -1252,7 +1255,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			break;
 		}
 
-		if (interruptible && signal_pending(current)) {
+		if (signal_pending_state(state, current)) {
 			ret = -ERESTARTSYS;
 			break;
 		}
-- 
2.6.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 02/15] drm/i915: Limit the busy wait on requests to 10us not 10ms!
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
  2015-11-29  8:47 ` [PATCH 01/15] drm/i915: Break busywaiting for requests on pending signals Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-11-30 10:02   ` Tvrtko Ursulin
  2015-11-29  8:48 ` [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request Chris Wilson
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Eero Tamminen, stable, Rantala, Valtteri

When waiting for high frequency requests, the finite amount of time
required to set up the irq and wait upon it limits the response rate. By
busywaiting on the request completion for a short while we can service
the high frequency waits as quickly as possible. However, if it is a slow
request, we want to sleep as quickly as possible. The tradeoff between
waiting and sleeping is roughly the time it takes to sleep on a request,
on the order of a microsecond. Based on measurements of synchronous
workloads from across big core and little atom, I have set the limit for
busywaiting as 10 microseconds. In most of the synchronous cases, we can
reduce the limit down to as little as 2 microseconds, but that leaves
quite a few test cases regressing by factors of 3 and more.

The code currently uses the jiffie clock, but that is far too coarse (on
the order of 10 milliseconds) and results in poor interactivity as the
CPU ends up being hogged by slow requests. To get microsecond resolution
we need to use a high resolution timer. The cheapest of which is polling
local_clock(), but that is only valid on the same CPU. If we switch CPUs
because the task was preempted, we can also use that as an indicator that
the system is too busy to waste cycles on spinning and we should sleep
instead.
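
For reference, the local_clock_us() helper added below approximates the
nanosecond-to-microsecond conversion with a shift; a tiny standalone
program showing the size of that error:

#include <stdio.h>

int main(void)
{
	unsigned long ns = 10 * 1000;	/* a 10us budget, in ns */

	/* >> 10 divides by 1024 rather than 1000, under-reporting the
	 * elapsed time by ~2.3% - harmless for a heuristic timeout */
	printf("shifted: %lu us, exact: %lu us\n", ns >> 10, ns / 1000);
	return 0;
}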

__i915_spin_request was introduced in
commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2]
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Tue Apr 7 16:20:41 2015 +0100

     drm/i915: Optimistically spin for the request completion

v2: Drop full u64 for unsigned long - the timer is 32bit wraparound safe,
so we can use native register sizes on smaller architectures. Mention
the approximate microseconds units for elapsed time and add some extra
comments describing the reason for busywaiting.

v3: Raise the limit to 10us

Reported-by: Jens Axboe <axboe@kernel.dk>
Link: https://lkml.org/lkml/2015/11/12/621
Cc; "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_gem.c | 47 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 87fc34f5899f..bad112abb16b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1146,14 +1146,57 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
 	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
 }
 
+static unsigned long local_clock_us(unsigned *cpu)
+{
+	unsigned long t;
+
+	/* Cheaply and approximately convert from nanoseconds to microseconds.
+	 * The result and subsequent calculations are also defined in the same
+	 * approximate microseconds units. The principal source of timing
+	 * error here is from the simple truncation.
+	 *
+	 * Note that local_clock() is only defined wrt to the current CPU;
+	 * the comparisons are no longer valid if we switch CPUs. Instead of
+	 * blocking preemption for the entire busywait, we can detect the CPU
+	 * switch and use that as indicator of system load and a reason to
+	 * stop busywaiting, see busywait_stop().
+	 */
+	*cpu = get_cpu();
+	t = local_clock() >> 10;
+	put_cpu();
+
+	return t;
+}
+
+static bool busywait_stop(unsigned long timeout, unsigned cpu)
+{
+	unsigned this_cpu;
+
+	if (time_after(local_clock_us(&this_cpu), timeout))
+		return true;
+
+	return this_cpu != cpu;
+}
+
 static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 {
 	unsigned long timeout;
+	unsigned cpu;
+
+	/* When waiting for high frequency requests, e.g. during synchronous
+	 * rendering split between the CPU and GPU, the finite amount of time
+	 * required to set up the irq and wait upon it limits the response
+	 * rate. By busywaiting on the request completion for a short while we
+	 * can service the high frequency waits as quickly as possible. However,
+	 * if it is a slow request, we want to sleep as quickly as possible.
+	 * The tradeoff between waiting and sleeping is roughly the time it
+	 * takes to sleep on a request, on the order of a microsecond.
+	 */
 
 	if (i915_gem_request_get_ring(req)->irq_refcount)
 		return -EBUSY;
 
-	timeout = jiffies + 1;
+	timeout = local_clock_us(&cpu) + 10;
 	while (!need_resched()) {
 		if (i915_gem_request_completed(req, true))
 			return 0;
@@ -1161,7 +1204,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 		if (signal_pending_state(state, current))
 			break;
 
-		if (time_after_eq(jiffies, timeout))
+		if (busywait_stop(timeout, cpu))
 			break;
 
 		cpu_relax_lowlatency();
-- 
2.6.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
  2015-11-29  8:47 ` [PATCH 01/15] drm/i915: Break busywaiting for requests on pending signals Chris Wilson
  2015-11-29  8:48 ` [PATCH 02/15] drm/i915: Limit the busy wait on requests to 10us not 10ms! Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-11-30 10:06   ` Tvrtko Ursulin
  2015-11-29  8:48 ` [PATCH 04/15] drm/i915: Cache the reset_counter for the request Chris Wilson
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: Daniel Vetter, Eero Tamminen, stable, Rantala, Valtteri

Limit busywaiting only to the request currently being processed by the
GPU. If the request is not currently being processed by the GPU, there
is a very low likelihood of it being completed within the 2 microsecond
spin timeout and so we will just be wasting CPU cycles.
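
The test for "currently being processed" falls out of the per-ring seqno
ordering; a self-contained sketch of the wrap-safe comparison, mirroring
i915_seqno_passed() and the new i915_gem_request_started() in the diff
below:

#include <stdbool.h>
#include <stdint.h>

/* wrap-safe: true once seq1 has caught up with or passed seq2 */
static bool seqno_passed(uint32_t seq1, uint32_t seq2)
{
	return (int32_t)(seq1 - seq2) >= 0;
}

/* the GPU has *started* a request once the breadcrumb in the hardware
 * status page reaches the seqno of the previous request on the ring */
static bool request_started(uint32_t hws_seqno, uint32_t previous_seqno)
{
	return seqno_passed(hws_seqno, previous_seqno);
}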

v2: Check for logical inversion when rebasing - we were incorrectly
checking for this request being active, and instead busywaiting for
when the GPU was not yet processing the request of interest.

v3: Try another colour for the seqno names.
v4: Another colour for the function names.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Eero Tamminen <eero.t.tamminen@intel.com>
Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/i915/i915_drv.h | 27 +++++++++++++++++++--------
 drivers/gpu/drm/i915/i915_gem.c |  8 +++++++-
 2 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index bc865e234df2..d07041c1729d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2182,8 +2182,17 @@ struct drm_i915_gem_request {
 	struct drm_i915_private *i915;
 	struct intel_engine_cs *ring;
 
-	/** GEM sequence number associated with this request. */
-	uint32_t seqno;
+	 /** GEM sequence number associated with the previous request,
+	  * when the HWS breadcrumb is equal to this the GPU is processing
+	  * this request.
+	  */
+	u32 previous_seqno;
+
+	 /** GEM sequence number associated with this request,
+	  * when the HWS breadcrumb is equal or greater than this the GPU
+	  * has finished processing this request.
+	  */
+	u32 seqno;
 
 	/** Position in the ringbuffer of the start of the request */
 	u32 head;
@@ -2945,15 +2954,17 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
 	return (int32_t)(seq1 - seq2) >= 0;
 }
 
+static inline bool i915_gem_request_started(struct drm_i915_gem_request *req,
+					   bool lazy_coherency)
+{
+	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
+	return i915_seqno_passed(seqno, req->previous_seqno);
+}
+
 static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
 					      bool lazy_coherency)
 {
-	u32 seqno;
-
-	BUG_ON(req == NULL);
-
-	seqno = req->ring->get_seqno(req->ring, lazy_coherency);
-
+	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
 	return i915_seqno_passed(seqno, req->seqno);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bad112abb16b..ada461e02718 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1193,9 +1193,13 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 	 * takes to sleep on a request, on the order of a microsecond.
 	 */
 
-	if (i915_gem_request_get_ring(req)->irq_refcount)
+	if (req->ring->irq_refcount)
 		return -EBUSY;
 
+	/* Only spin if we know the GPU is processing this request */
+	if (!i915_gem_request_started(req, false))
+		return -EAGAIN;
+
 	timeout = local_clock_us(&cpu) + 10;
 	while (!need_resched()) {
 		if (i915_gem_request_completed(req, true))
@@ -1209,6 +1213,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 
 		cpu_relax_lowlatency();
 	}
+
 	if (i915_gem_request_completed(req, false))
 		return 0;
 
@@ -2592,6 +2597,7 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 	request->batch_obj = obj;
 
 	request->emitted_jiffies = jiffies;
+	request->previous_seqno = ring->last_submitted_seqno;
 	ring->last_submitted_seqno = request->seqno;
 	list_add_tail(&request->list, &ring->request_list);
 
-- 
2.6.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 04/15] drm/i915: Cache the reset_counter for the request
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (2 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-12-01  8:31   ` Daniel Vetter
  2015-11-29  8:48 ` [PATCH 05/15] drm/i915: Suppress error message when GPU resets are disabled Chris Wilson
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

Instead of querying the reset counter before every access to the ring,
query it the first time we touch the ring, and do a final compare when
submitting the request. For correctness, we need to then sanitize how
the reset_counter is incremented to prevent broken submission and
waiting across resets, in the process fixing the persistent -EIO we
still see today on failed waits.
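
In outline (hypothetical structures, not the driver's): sample the
counter once when the request is created, and compare it wherever the
request is waited upon; any difference means a reset happened in between
and the request is effectively complete.

#include <stdatomic.h>
#include <stdbool.h>

struct gpu_error { atomic_uint reset_counter; };
struct request   { unsigned int reset_counter; };

/* at request allocation */
static void request_init(struct request *rq, struct gpu_error *e)
{
	rq->reset_counter = atomic_load(&e->reset_counter);
}

/* at wait time */
static bool reset_in_between(struct request *rq, struct gpu_error *e)
{
	return rq->reset_counter != atomic_load(&e->reset_counter);
}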

v2: Rebase
v3: Now with added testcase
v4: Rebase

Testcase: igt/gem_eio
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c     |  2 +-
 drivers/gpu/drm/i915/i915_drv.c         | 32 +++++++-----
 drivers/gpu/drm/i915/i915_drv.h         | 39 ++++++++++----
 drivers/gpu/drm/i915/i915_gem.c         | 90 +++++++++++++++------------------
 drivers/gpu/drm/i915/i915_irq.c         | 21 +-------
 drivers/gpu/drm/i915/intel_display.c    | 13 ++---
 drivers/gpu/drm/i915/intel_lrc.c        | 13 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.c | 14 ++---
 8 files changed, 111 insertions(+), 113 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index a728ff11e389..8458447ddc17 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4660,7 +4660,7 @@ i915_wedged_get(void *data, u64 *val)
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	*val = atomic_read(&dev_priv->gpu_error.reset_counter);
+	*val = i915_terminally_wedged(&dev_priv->gpu_error);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 90faa8e03fca..eb893a5e00b1 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -923,23 +923,31 @@ int i915_resume_switcheroo(struct drm_device *dev)
 int i915_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_gpu_error *error = &dev_priv->gpu_error;
+	unsigned reset_counter;
 	bool simulated;
 	int ret;
 
 	intel_reset_gt_powersave(dev);
 
 	mutex_lock(&dev->struct_mutex);
+	atomic_andnot(I915_WEDGED, &error->reset_counter);
+	reset_counter = atomic_inc_return(&error->reset_counter);
+	if (WARN_ON(__i915_reset_in_progress(reset_counter))) {
+		ret = -EIO;
+		goto error;
+	}
 
 	i915_gem_reset(dev);
 
-	simulated = dev_priv->gpu_error.stop_rings != 0;
+	simulated = error->stop_rings != 0;
 
 	ret = intel_gpu_reset(dev);
 
 	/* Also reset the gpu hangman. */
 	if (simulated) {
 		DRM_INFO("Simulated gpu hang, resetting stop_rings\n");
-		dev_priv->gpu_error.stop_rings = 0;
+		error->stop_rings = 0;
 		if (ret == -ENODEV) {
 			DRM_INFO("Reset not implemented, but ignoring "
 				 "error for simulated gpu hangs\n");
@@ -952,8 +960,7 @@ int i915_reset(struct drm_device *dev)
 
 	if (ret) {
 		DRM_ERROR("Failed to reset chip: %i\n", ret);
-		mutex_unlock(&dev->struct_mutex);
-		return ret;
+		goto error;
 	}
 
 	intel_overlay_reset(dev_priv);
@@ -972,20 +979,14 @@ int i915_reset(struct drm_device *dev)
 	 * was running at the time of the reset (i.e. we weren't VT
 	 * switched away).
 	 */
-
-	/* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */
-	dev_priv->gpu_error.reload_in_reset = true;
-
 	ret = i915_gem_init_hw(dev);
-
-	dev_priv->gpu_error.reload_in_reset = false;
-
-	mutex_unlock(&dev->struct_mutex);
 	if (ret) {
 		DRM_ERROR("Failed hw init on reset %d\n", ret);
-		return ret;
+		goto error;
 	}
 
+	mutex_unlock(&dev->struct_mutex);
+
 	/*
 	 * rps/rc6 re-init is necessary to restore state lost after the
 	 * reset and the re-install of gt irqs. Skip for ironlake per
@@ -996,6 +997,11 @@ int i915_reset(struct drm_device *dev)
 		intel_enable_gt_powersave(dev);
 
 	return 0;
+
+error:
+	atomic_or(I915_WEDGED, &error->reset_counter);
+	mutex_unlock(&dev->struct_mutex);
+	return ret;
 }
 
 static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d07041c1729d..c24c23d8a0c0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1384,9 +1384,6 @@ struct i915_gpu_error {
 
 	/* For missed irq/seqno simulation. */
 	unsigned int test_irq_rings;
-
-	/* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset   */
-	bool reload_in_reset;
 };
 
 enum modeset_restore {
@@ -2181,6 +2178,7 @@ struct drm_i915_gem_request {
 	/** On Which ring this request was generated */
 	struct drm_i915_private *i915;
 	struct intel_engine_cs *ring;
+	unsigned reset_counter;
 
 	 /** GEM sequence number associated with the previous request,
 	  * when the HWS breadcrumb is equal to this the GPU is processing
@@ -2976,23 +2974,47 @@ i915_gem_find_active_request(struct intel_engine_cs *ring);
 
 bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
-int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
+int __must_check i915_gem_check_wedge(unsigned reset_counter,
 				      bool interruptible);
 
+static inline u32 i915_reset_counter(struct i915_gpu_error *error)
+{
+	return atomic_read(&error->reset_counter);
+}
+
+static inline bool __i915_reset_in_progress(u32 reset)
+{
+	return unlikely(reset & I915_RESET_IN_PROGRESS_FLAG);
+}
+
+static inline bool __i915_reset_in_progress_or_wedged(u32 reset)
+{
+	return unlikely(reset & (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
+}
+
+static inline bool __i915_terminally_wedged(u32 reset)
+{
+	return unlikely(reset & I915_WEDGED);
+}
+
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
-	return unlikely(atomic_read(&error->reset_counter)
-			& (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
+	return __i915_reset_in_progress(i915_reset_counter(error));
+}
+
+static inline bool i915_reset_in_progress_or_wedged(struct i915_gpu_error *error)
+{
+	return __i915_reset_in_progress_or_wedged(i915_reset_counter(error));
 }
 
 static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
 {
-	return atomic_read(&error->reset_counter) & I915_WEDGED;
+	return __i915_terminally_wedged(i915_reset_counter(error));
 }
 
 static inline u32 i915_reset_count(struct i915_gpu_error *error)
 {
-	return ((atomic_read(&error->reset_counter) & ~I915_WEDGED) + 1) / 2;
+	return ((i915_reset_counter(error) & ~I915_WEDGED) + 1) / 2;
 }
 
 static inline bool i915_stop_ring_allow_ban(struct drm_i915_private *dev_priv)
@@ -3025,7 +3047,6 @@ void __i915_add_request(struct drm_i915_gem_request *req,
 #define i915_add_request_no_flush(req) \
 	__i915_add_request(req, NULL, false)
 int __i915_wait_request(struct drm_i915_gem_request *req,
-			unsigned reset_counter,
 			bool interruptible,
 			s64 *timeout,
 			struct intel_rps_client *rps);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index ada461e02718..646d189e23a1 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -85,9 +85,7 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
 {
 	int ret;
 
-#define EXIT_COND (!i915_reset_in_progress(error) || \
-		   i915_terminally_wedged(error))
-	if (EXIT_COND)
+	if (!i915_reset_in_progress(error))
 		return 0;
 
 	/*
@@ -96,17 +94,16 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
 	 * we should simply try to bail out and fail as gracefully as possible.
 	 */
 	ret = wait_event_interruptible_timeout(error->reset_queue,
-					       EXIT_COND,
+					       !i915_reset_in_progress(error),
 					       10*HZ);
 	if (ret == 0) {
 		DRM_ERROR("Timed out waiting for the gpu reset to complete\n");
 		return -EIO;
 	} else if (ret < 0) {
 		return ret;
+	} else {
+		return 0;
 	}
-#undef EXIT_COND
-
-	return 0;
 }
 
 int i915_mutex_lock_interruptible(struct drm_device *dev)
@@ -1110,26 +1107,20 @@ put_rpm:
 }
 
 int
-i915_gem_check_wedge(struct i915_gpu_error *error,
+i915_gem_check_wedge(unsigned reset_counter,
 		     bool interruptible)
 {
-	if (i915_reset_in_progress(error)) {
+	if (__i915_reset_in_progress_or_wedged(reset_counter)) {
 		/* Non-interruptible callers can't handle -EAGAIN, hence return
 		 * -EIO unconditionally for these. */
 		if (!interruptible)
 			return -EIO;
 
 		/* Recovery complete, but the reset failed ... */
-		if (i915_terminally_wedged(error))
+		if (__i915_terminally_wedged(reset_counter))
 			return -EIO;
 
-		/*
-		 * Check if GPU Reset is in progress - we need intel_ring_begin
-		 * to work properly to reinit the hw state while the gpu is
-		 * still marked as reset-in-progress. Handle this with a flag.
-		 */
-		if (!error->reload_in_reset)
-			return -EAGAIN;
+		return -EAGAIN;
 	}
 
 	return 0;
@@ -1223,7 +1214,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 /**
  * __i915_wait_request - wait until execution of request has finished
  * @req: duh!
- * @reset_counter: reset sequence associated with the given request
  * @interruptible: do an interruptible wait (normally yes)
  * @timeout: in - how long to wait (NULL forever); out - how much time remaining
  *
@@ -1238,7 +1228,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
  * errno with remaining time filled in timeout argument.
  */
 int __i915_wait_request(struct drm_i915_gem_request *req,
-			unsigned reset_counter,
 			bool interruptible,
 			s64 *timeout,
 			struct intel_rps_client *rps)
@@ -1289,12 +1278,12 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 
 		/* We need to check whether any gpu reset happened in between
 		 * the caller grabbing the seqno and now ... */
-		if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) {
-			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
-			 * is truely gone. */
-			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
-			if (ret == 0)
-				ret = -EAGAIN;
+		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
+			/* As we do not requeue the request over a GPU reset,
+			 * if one does occur we know that the request is
+			 * effectively complete.
+			 */
+			ret = 0;
 			break;
 		}
 
@@ -1462,13 +1451,7 @@ i915_wait_request(struct drm_i915_gem_request *req)
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
-	if (ret)
-		return ret;
-
-	ret = __i915_wait_request(req,
-				  atomic_read(&dev_priv->gpu_error.reset_counter),
-				  interruptible, NULL, NULL);
+	ret = __i915_wait_request(req, interruptible, NULL, NULL);
 	if (ret)
 		return ret;
 
@@ -1543,7 +1526,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_request *requests[I915_NUM_RINGS];
-	unsigned reset_counter;
 	int ret, i, n = 0;
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
@@ -1552,12 +1534,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	if (!obj->active)
 		return 0;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, true);
-	if (ret)
-		return ret;
-
-	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
-
 	if (readonly) {
 		struct drm_i915_gem_request *req;
 
@@ -1579,9 +1555,9 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	}
 
 	mutex_unlock(&dev->struct_mutex);
+	ret = 0;
 	for (i = 0; ret == 0 && i < n; i++)
-		ret = __i915_wait_request(requests[i], reset_counter, true,
-					  NULL, rps);
+		ret = __i915_wait_request(requests[i], true, NULL, rps);
 	mutex_lock(&dev->struct_mutex);
 
 	for (i = 0; i < n; i++) {
@@ -2685,6 +2661,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct drm_i915_gem_request **req_out)
 {
 	struct drm_i915_private *dev_priv = to_i915(ring->dev);
+	unsigned reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 	struct drm_i915_gem_request *req;
 	int ret;
 
@@ -2693,6 +2670,10 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 
 	*req_out = NULL;
 
+	ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
+	if (ret)
+		return ret;
+
 	req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
 	if (req == NULL)
 		return -ENOMEM;
@@ -2704,6 +2685,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	kref_init(&req->ref);
 	req->i915 = dev_priv;
 	req->ring = ring;
+	req->reset_counter = reset_counter;
 	req->ctx  = ctx;
 	i915_gem_context_reference(req->ctx);
 
@@ -3064,11 +3046,9 @@ retire:
 int
 i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_wait *args = data;
 	struct drm_i915_gem_object *obj;
 	struct drm_i915_gem_request *req[I915_NUM_RINGS];
-	unsigned reset_counter;
 	int i, n = 0;
 	int ret;
 
@@ -3102,7 +3082,6 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	}
 
 	drm_gem_object_unreference(&obj->base);
-	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
 
 	for (i = 0; i < I915_NUM_RINGS; i++) {
 		if (obj->last_read_req[i] == NULL)
@@ -3115,11 +3094,23 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 
 	for (i = 0; i < n; i++) {
 		if (ret == 0)
-			ret = __i915_wait_request(req[i], reset_counter, true,
+			ret = __i915_wait_request(req[i], true,
 						  args->timeout_ns > 0 ? &args->timeout_ns : NULL,
 						  file->driver_priv);
+
+		/* If the GPU hung before this, report it. Ideally we only
+		 * report if this request cannot be completed. Currently
+		 * when we don't mark the guilty party and abort all
+		 * requests on reset, so just mark all as EIO.
+		 */
+		if (ret == 0 &&
+		    req[i]->reset_counter != i915_reset_counter(&req[i]->i915->gpu_error))
+			ret = -EIO;
+
 		i915_gem_request_unreference__unlocked(req[i]);
 	}
+
+
 	return ret;
 
 out:
@@ -3147,7 +3138,6 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 	if (!i915_semaphore_is_enabled(obj->base.dev)) {
 		struct drm_i915_private *i915 = to_i915(obj->base.dev);
 		ret = __i915_wait_request(from_req,
-					  atomic_read(&i915->gpu_error.reset_counter),
 					  i915->mm.interruptible,
 					  NULL,
 					  &i915->rps.semaphores);
@@ -4076,14 +4066,15 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	unsigned long recent_enough = jiffies - DRM_I915_THROTTLE_JIFFIES;
 	struct drm_i915_gem_request *request, *target = NULL;
-	unsigned reset_counter;
 	int ret;
 
 	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
 	if (ret)
 		return ret;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
+	/* ABI: return -EIO if wedged */
+	ret = i915_gem_check_wedge(i915_reset_counter(&dev_priv->gpu_error),
+				   false);
 	if (ret)
 		return ret;
 
@@ -4101,7 +4092,6 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 
 		target = request;
 	}
-	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
 	if (target)
 		i915_gem_request_reference(target);
 	spin_unlock(&file_priv->mm.lock);
@@ -4109,7 +4099,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (target == NULL)
 		return 0;
 
-	ret = __i915_wait_request(target, reset_counter, true, NULL, NULL);
+	ret = __i915_wait_request(target, true, NULL, NULL);
 	if (ret == 0)
 		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
 
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index e88d692583a5..3b2a8fbe0392 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2434,7 +2434,6 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 static void i915_reset_and_wakeup(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct i915_gpu_error *error = &dev_priv->gpu_error;
 	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
 	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
 	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
@@ -2452,7 +2451,7 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 	 * the reset in-progress bit is only ever set by code outside of this
 	 * work we don't need to worry about any other races.
 	 */
-	if (i915_reset_in_progress(error) && !i915_terminally_wedged(error)) {
+	if (i915_reset_in_progress(&dev_priv->gpu_error)) {
 		DRM_DEBUG_DRIVER("resetting chip\n");
 		kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE,
 				   reset_event);
@@ -2480,25 +2479,9 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 
 		intel_runtime_pm_put(dev_priv);
 
-		if (ret == 0) {
-			/*
-			 * After all the gem state is reset, increment the reset
-			 * counter and wake up everyone waiting for the reset to
-			 * complete.
-			 *
-			 * Since unlock operations are a one-sided barrier only,
-			 * we need to insert a barrier here to order any seqno
-			 * updates before
-			 * the counter increment.
-			 */
-			smp_mb__before_atomic();
-			atomic_inc(&dev_priv->gpu_error.reset_counter);
-
+		if (ret == 0)
 			kobject_uevent_env(&dev->primary->kdev->kobj,
 					   KOBJ_CHANGE, reset_done_event);
-		} else {
-			atomic_or(I915_WEDGED, &error->reset_counter);
-		}
 
 		/*
 		 * Note: The wake_up also serves as a memory barrier so that
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 0743337ac9e8..d77a17bb94d0 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3289,8 +3289,7 @@ static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	bool pending;
 
-	if (i915_reset_in_progress(&dev_priv->gpu_error) ||
-	    intel_crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter))
+	if (intel_crtc->reset_counter != i915_reset_counter(&dev_priv->gpu_error))
 		return false;
 
 	spin_lock_irq(&dev->event_lock);
@@ -10879,8 +10878,7 @@ static bool page_flip_finished(struct intel_crtc *crtc)
 	struct drm_device *dev = crtc->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	if (i915_reset_in_progress(&dev_priv->gpu_error) ||
-	    crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter))
+	if (crtc->reset_counter != i915_reset_counter(&dev_priv->gpu_error))
 		return true;
 
 	/*
@@ -11322,7 +11320,6 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
 
 	if (mmio_flip->req) {
 		WARN_ON(__i915_wait_request(mmio_flip->req,
-					    mmio_flip->crtc->reset_counter,
 					    false, NULL,
 					    &mmio_flip->i915->rps.mmioflips));
 		i915_gem_request_unreference__unlocked(mmio_flip->req);
@@ -13312,9 +13309,6 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 
 	ret = drm_atomic_helper_prepare_planes(dev, state);
 	if (!ret && !async && !i915_reset_in_progress(&dev_priv->gpu_error)) {
-		u32 reset_counter;
-
-		reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
 		mutex_unlock(&dev->struct_mutex);
 
 		for_each_plane_in_state(state, plane, plane_state, i) {
@@ -13325,8 +13319,7 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 				continue;
 
 			ret = __i915_wait_request(intel_plane_state->wait_req,
-						  reset_counter, true,
-						  NULL, NULL);
+						  true, NULL, NULL);
 
 			/* Swallow -EIO errors to allow updates during hw lockup. */
 			if (ret == -EIO)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4ebafab53f30..b40ffc3607e7 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -711,6 +711,14 @@ static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
 	if (ret)
 		return ret;
 
+	/* If the request was completed due to a GPU hang, we want to
+	 * error out before we continue to emit more commands to the GPU.
+	 */
+	ret = i915_gem_check_wedge(i915_reset_counter(&req->i915->gpu_error),
+				   req->i915->mm.interruptible);
+	if (ret)
+		return ret;
+
 	ringbuf->space = space;
 	return 0;
 }
@@ -825,11 +833,6 @@ int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
 	WARN_ON(req == NULL);
 	dev_priv = req->ring->dev->dev_private;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
-				   dev_priv->mm.interruptible);
-	if (ret)
-		return ret;
-
 	ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 57d78f264b53..511efe556d73 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2254,6 +2254,14 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
 	if (ret)
 		return ret;
 
+	/* If the request was completed due to a GPU hang, we want to
+	 * error out before we continue to emit more commands to the GPU.
+	 */
+	ret = i915_gem_check_wedge(i915_reset_counter(&to_i915(ring->dev)->gpu_error),
+				   to_i915(ring->dev)->mm.interruptible);
+	if (ret)
+		return ret;
+
 	ringbuf->space = space;
 	return 0;
 }
@@ -2286,7 +2294,6 @@ int intel_ring_idle(struct intel_engine_cs *ring)
 
 	/* Make sure we do not trigger any retires */
 	return __i915_wait_request(req,
-				   atomic_read(&to_i915(ring->dev)->gpu_error.reset_counter),
 				   to_i915(ring->dev)->mm.interruptible,
 				   NULL, NULL);
 }
@@ -2417,11 +2424,6 @@ int intel_ring_begin(struct drm_i915_gem_request *req,
 	ring = req->ring;
 	dev_priv = ring->dev->dev_private;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
-				   dev_priv->mm.interruptible);
-	if (ret)
-		return ret;
-
 	ret = __intel_ring_prepare(ring, num_dwords * sizeof(uint32_t));
 	if (ret)
 		return ret;
-- 
2.6.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 05/15] drm/i915: Suppress error message when GPU resets are disabled
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (3 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 04/15] drm/i915: Cache the reset_counter for the request Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-12-01  8:30   ` Daniel Vetter
  2015-11-29  8:48 ` [PATCH 06/15] drm/i915: Delay queuing hangcheck to wait-request Chris Wilson
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

If we do not have low-level support for resetting the GPU, or if the user
has explicitly disabled resetting the device, the failure is expected.
Since it is an expected failure, we should be using a lower priority
message than *ERROR*, perhaps NOTICE. In the absence of DRM_NOTICE, just
emit the expected failure as a DEBUG message.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index eb893a5e00b1..476bb696dbcb 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -959,7 +959,10 @@ int i915_reset(struct drm_device *dev)
 		pr_notice("drm/i915: Resetting chip after gpu hang\n");
 
 	if (ret) {
-		DRM_ERROR("Failed to reset chip: %i\n", ret);
+		if (ret != -ENODEV)
+			DRM_ERROR("Failed to reset chip: %i\n", ret);
+		else
+			DRM_DEBUG_DRIVER("GPU reset disabled\n");
 		goto error;
 	}
 
-- 
2.6.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 06/15] drm/i915: Delay queuing hangcheck to wait-request
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (4 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 05/15] drm/i915: Suppress error message when GPU resets are disabled Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-11-29  8:48 ` [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request Chris Wilson
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

We can forgo queuing the hangcheck at the start of every request and
instead defer it until we wait upon a request. This reduces the overhead
of every request, but may increase the latency of detecting a hang.
However, if nothing ever waits upon a hang, did it ever hang? It also improves the
robustness of the wait-request by ensuring that the hangchecker is
indeed running before we sleep indefinitely (and thereby ensuring that
we never actually sleep forever waiting for a dead GPU).
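
Schematically (all helpers here are hypothetical; the real change is in
the diff below):

#include <stdbool.h>

struct request;

void queue_hangcheck(void);		/* idempotent (re-)arm */
bool completed(struct request *rq);
void sleep_on(struct request *rq);	/* wakes on irq or timeout */

/* Before: __i915_add_request() armed the hangcheck for every request,
 * waited upon or not.  After: each waiter re-arms it just before
 * sleeping, so a task about to sleep indefinitely is guaranteed a
 * running hangchecker. */
int wait_request(struct request *rq)
{
	while (!completed(rq)) {
		queue_hangcheck();
		sleep_on(rq);
	}
	return 0;
}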

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h | 2 +-
 drivers/gpu/drm/i915/i915_gem.c | 4 ++--
 drivers/gpu/drm/i915/i915_irq.c | 9 ++++-----
 3 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index c24c23d8a0c0..e1980212ba37 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2711,7 +2711,7 @@ void intel_hpd_cancel_work(struct drm_i915_private *dev_priv);
 bool intel_hpd_pin_to_port(enum hpd_pin pin, enum port *port);
 
 /* i915_irq.c */
-void i915_queue_hangcheck(struct drm_device *dev);
+void i915_queue_hangcheck(struct drm_i915_private *dev_priv);
 __printf(3, 4)
 void i915_handle_error(struct drm_device *dev, bool wedged,
 		       const char *fmt, ...);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 646d189e23a1..f33c35c6130f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1302,6 +1302,8 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			break;
 		}
 
+		i915_queue_hangcheck(dev_priv);
+
 		timer.function = NULL;
 		if (timeout || missed_irq(dev_priv, ring)) {
 			unsigned long expire;
@@ -2579,8 +2581,6 @@ void __i915_add_request(struct drm_i915_gem_request *request,
 
 	trace_i915_gem_request_add(request);
 
-	i915_queue_hangcheck(ring->dev);
-
 	queue_delayed_work(dev_priv->wq,
 			   &dev_priv->mm.retire_work,
 			   round_jiffies_up_relative(HZ));
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 3b2a8fbe0392..28c0329f8281 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -3066,15 +3066,14 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 	if (rings_hung)
 		return i915_handle_error(dev, true, "Ring hung");
 
+	/* Reset timer in case GPU hangs without another request being added */
 	if (busy_count)
-		/* Reset timer case chip hangs without another request
-		 * being added */
-		i915_queue_hangcheck(dev);
+		i915_queue_hangcheck(dev_priv);
 }
 
-void i915_queue_hangcheck(struct drm_device *dev)
+void i915_queue_hangcheck(struct drm_i915_private *dev_priv)
 {
-	struct i915_gpu_error *e = &to_i915(dev)->gpu_error;
+	struct i915_gpu_error *e = &dev_priv->gpu_error;
 
 	if (!i915.enable_hangcheck)
 		return;
-- 
2.6.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (5 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 06/15] drm/i915: Delay queuing hangcheck to wait-request Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-11-30 10:14   ` Tvrtko Ursulin
  2015-11-30 10:22   ` Chris Wilson
  2015-11-29  8:48 ` [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd Chris Wilson
                   ` (7 subsequent siblings)
  14 siblings, 2 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lionel Landwerlin, Daniel Vetter

We have relied upon the sole caller (wait_ioctl) validating the timeout
argument. However, when waiting for multiple requests I forgot to ensure
that the timeout was still positive on the later requests. This is more
simply done inside __i915_wait_request.

Fixes a minor regression introduced in

commit b47161858ba13c9c7e03333132230d66e008dd55
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date:   Mon Apr 27 13:41:17 2015 +0100

    drm/i915: Implement inter-engine read-read optimisations

where we may end up waiting for an extra jiffie for each active ring
after consuming all of the user's timeout.
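
A sketch of the multi-request pattern that needed the check (wait_one()
is hypothetical; the real validation now lives in __i915_wait_request):

struct request;

/* hypothetical: waits on one request, decrements *timeout_ns by the
 * time consumed, and must fail with -ETIME once the budget reaches 0 */
int wait_one(struct request *rq, long long *timeout_ns);

int wait_all(struct request **rq, int n, long long timeout_ns)
{
	int i, ret = 0;

	for (i = 0; i < n && ret == 0; i++) {
		/* without the check, a budget of exactly 0 here still
		 * armed a one-jiffie wait, sleeping up to an extra
		 * jiffie per remaining ring */
		ret = wait_one(rq[i], &timeout_ns);
	}

	return ret;
}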

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
---
 drivers/gpu/drm/i915/i915_gem.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index f33c35c6130f..65d101b47d8e 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1251,8 +1251,16 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	if (i915_gem_request_completed(req, true))
 		return 0;
 
-	timeout_expire = timeout ?
-		jiffies + nsecs_to_jiffies_timeout((u64)*timeout) : 0;
+	timeout_expire = 0;
+	if (timeout) {
+		if (WARN_ON(*timeout < 0))
+			return -EINVAL;
+
+		if (*timeout == 0)
+			return -ETIME;
+
+		timeout_expire = jiffies + nsecs_to_jiffies_timeout(*timeout);
+	}
 
 	if (INTEL_INFO(dev_priv)->gen >= 6)
 		gen6_rps_boost(dev_priv, rps, req->emitted_jiffies);
-- 
2.6.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (6 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-11-30 10:53   ` Chris Wilson
                     ` (3 more replies)
  2015-11-29  8:48 ` [PATCH 09/15] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson
                   ` (6 subsequent siblings)
  14 siblings, 4 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

One particularly stressful scenario consists of many independent tasks
all competing for GPU time and waiting upon the results (e.g. realtime
transcoding of many, many streams). One bottleneck in particular is that
each client waits on its own results, but every client is woken up after
every batchbuffer - hence the thunder of hooves as every client must then
do its heavyweight dance to read a coherent seqno to see if it is the
lucky one. Alternatively, we can have one kthread responsible for waking
after an interrupt, checking the seqno and only waking up the waiting
clients who are complete. The disadvantage is that in the uncontended
scenario (i.e. only one waiter) we incur an extra context switch in the
wakeup path - though that should be mitigated somewhat by the busy-wait
we do first before sleeping.
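
As a userspace analogue (pthreads; the structures are hypothetical) of
the single bottom-half scheme - one task does the coherent seqno read
and wakes only the waiters whose seqno has passed:

#include <pthread.h>
#include <stdint.h>

struct waiter {
	uint32_t seqno;
	pthread_cond_t cond;
	struct waiter *next;
};

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static struct waiter *waiters;

/* called once per "user interrupt" by the single bottom-half */
static void bottom_half_wake(uint32_t hws_seqno)
{
	struct waiter **p;

	pthread_mutex_lock(&lock);
	for (p = &waiters; *p; ) {
		struct waiter *w = *p;

		if ((int32_t)(hws_seqno - w->seqno) >= 0) {
			*p = w->next;			/* unlink */
			pthread_cond_signal(&w->cond);	/* wake just one */
		} else {
			p = &w->next;
		}
	}
	pthread_mutex_unlock(&lock);
}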

v2: Convert from a kworker per engine into a dedicated kthread for the
bottom-half.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile            |   1 +
 drivers/gpu/drm/i915/i915_dma.c          |   9 +-
 drivers/gpu/drm/i915/i915_drv.h          |  35 ++++-
 drivers/gpu/drm/i915/i915_gem.c          |  99 +++---------
 drivers/gpu/drm/i915/i915_gpu_error.c    |   8 +-
 drivers/gpu/drm/i915/i915_irq.c          |  17 +--
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 253 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |   3 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   3 +-
 10 files changed, 333 insertions(+), 97 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_breadcrumbs.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 0851de07bd13..d3b9d3618719 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
 	  i915_gem_userptr.o \
 	  i915_gpu_error.o \
 	  i915_trace_points.o \
+	  intel_breadcrumbs.o \
 	  intel_lrc.o \
 	  intel_mocs.o \
 	  intel_ringbuffer.o \
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a81c76603544..93c6762d87b4 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -388,12 +388,16 @@ static int i915_load_modeset_init(struct drm_device *dev)
 	if (ret)
 		goto cleanup_vga_client;
 
+	ret = intel_breadcrumbs_init(dev_priv);
+	if (ret)
+		goto cleanup_vga_switcheroo;
+
 	/* Initialise stolen first so that we may reserve preallocated
 	 * objects for the BIOS to KMS transition.
 	 */
 	ret = i915_gem_init_stolen(dev);
 	if (ret)
-		goto cleanup_vga_switcheroo;
+		goto cleanup_breadcrumbs;
 
 	intel_power_domains_init_hw(dev_priv, false);
 
@@ -454,6 +458,8 @@ cleanup_irq:
 	drm_irq_uninstall(dev);
 cleanup_gem_stolen:
 	i915_gem_cleanup_stolen(dev);
+cleanup_breadcrumbs:
+	intel_breadcrumbs_fini(dev_priv);
 cleanup_vga_switcheroo:
 	vga_switcheroo_unregister_client(dev->pdev);
 cleanup_vga_client:
@@ -1193,6 +1199,7 @@ int i915_driver_unload(struct drm_device *dev)
 	mutex_unlock(&dev->struct_mutex);
 	intel_fbc_cleanup_cfb(dev_priv);
 	i915_gem_cleanup_stolen(dev);
+	intel_breadcrumbs_fini(dev_priv);
 
 	intel_csr_ucode_fini(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e1980212ba37..782d08ab6264 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -564,6 +564,7 @@ struct drm_i915_error_state {
 			long jiffies;
 			u32 seqno;
 			u32 tail;
+			u32 waiters;
 		} *requests;
 
 		struct {
@@ -1383,7 +1384,7 @@ struct i915_gpu_error {
 #define I915_STOP_RING_ALLOW_WARN      (1 << 30)
 
 	/* For missed irq/seqno simulation. */
-	unsigned int test_irq_rings;
+	unsigned long test_irq_rings;
 };
 
 enum modeset_restore {
@@ -1695,6 +1696,27 @@ struct drm_i915_private {
 
 	struct intel_uncore uncore;
 
+	/* Rather than have every client wait upon all user interrupts,
+	 * with the herd waking after every interrupt and each doing the
+	 * heavyweight seqno dance, we delegate the task to a bottom-half
+	 * for the user interrupt (one bh handling all engines). Now,
+	 * just one task is woken up and does the coherent seqno read and
+	 * then wakes up all the waiters registered with the bottom-half
+	 * that are complete. This avoids the thundering herd problem
+	 * with lots of concurrent waiters, at the expense of an extra
+	 * context switch between the interrupt and the client. That should
+	 * be mitigated somewhat by the busyspin in the client before we go to
+	 * sleep waiting for an interrupt from the bottom-half.
+	 */
+	struct intel_breadcrumbs {
+		spinlock_t lock; /* protects the per-engine request list */
+		struct task_struct *task; /* bh for all user interrupts */
+		struct intel_breadcrumbs_engine {
+			struct rb_root requests; /* sorted by retirement */
+			struct drm_i915_gem_request *first;
+		} engine[I915_NUM_RINGS];
+	} breadcrumbs;
+
 	struct i915_virtual_gpu vgpu;
 
 	struct intel_guc guc;
@@ -2228,6 +2250,11 @@ struct drm_i915_gem_request {
 	/** global list entry for this request */
 	struct list_head list;
 
+	/** List of clients waiting for completion of this request */
+	wait_queue_head_t wait;
+	struct rb_node irq_node;
+	unsigned irq_count;
+
 	struct drm_i915_file_private *file_priv;
 	/** file_priv list entry for this request */
 	struct list_head client_list;
@@ -2800,6 +2827,12 @@ ibx_disable_display_interrupt(struct drm_i915_private *dev_priv, uint32_t bits)
 }
 
 
+/* intel_breadcrumbs.c -- user interrupt bottom-half for waiters */
+int intel_breadcrumbs_init(struct drm_i915_private *i915);
+bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request);
+void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request);
+void intel_breadcrumbs_fini(struct drm_i915_private *i915);
+
 /* i915_gem.c */
 int i915_gem_create_ioctl(struct drm_device *dev, void *data,
 			  struct drm_file *file_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 65d101b47d8e..969592fb0969 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1126,17 +1126,6 @@ i915_gem_check_wedge(unsigned reset_counter,
 	return 0;
 }
 
-static void fake_irq(unsigned long data)
-{
-	wake_up_process((struct task_struct *)data);
-}
-
-static bool missed_irq(struct drm_i915_private *dev_priv,
-		       struct intel_engine_cs *ring)
-{
-	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
-}
-
 static unsigned long local_clock_us(unsigned *cpu)
 {
 	unsigned long t;
@@ -1184,9 +1173,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 	 * takes to sleep on a request, on the order of a microsecond.
 	 */
 
-	if (req->ring->irq_refcount)
-		return -EBUSY;
-
 	/* Only spin if we know the GPU is processing this request */
 	if (!i915_gem_request_started(req, false))
 		return -EAGAIN;
@@ -1232,26 +1218,19 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			s64 *timeout,
 			struct intel_rps_client *rps)
 {
-	struct intel_engine_cs *ring = i915_gem_request_get_ring(req);
-	struct drm_device *dev = ring->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const bool irq_test_in_progress =
-		ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
 	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
 	DEFINE_WAIT(wait);
-	unsigned long timeout_expire;
+	unsigned long timeout_remain;
 	s64 before, now;
 	int ret;
 
-	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
-
 	if (list_empty(&req->list))
 		return 0;
 
 	if (i915_gem_request_completed(req, true))
 		return 0;
 
-	timeout_expire = 0;
+	timeout_remain = 0;
 	if (timeout) {
 		if (WARN_ON(*timeout < 0))
 			return -EINVAL;
@@ -1259,43 +1238,28 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 		if (*timeout == 0)
 			return -ETIME;
 
-		timeout_expire = jiffies + nsecs_to_jiffies_timeout(*timeout);
+		timeout_remain = nsecs_to_jiffies_timeout(*timeout);
 	}
 
-	if (INTEL_INFO(dev_priv)->gen >= 6)
-		gen6_rps_boost(dev_priv, rps, req->emitted_jiffies);
-
 	/* Record current time in case interrupted by signal, or wedged */
 	trace_i915_gem_request_wait_begin(req);
 	before = ktime_get_raw_ns();
 
-	/* Optimistic spin for the next jiffie before touching IRQs */
-	ret = __i915_spin_request(req, state);
-	if (ret == 0)
-		goto out;
+	if (INTEL_INFO(req->i915)->gen >= 6)
+		gen6_rps_boost(req->i915, rps, req->emitted_jiffies);
 
-	if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) {
-		ret = -ENODEV;
-		goto out;
+	/* Optimistic spin for the next jiffie before touching IRQs */
+	if (intel_breadcrumbs_add_waiter(req)) {
+		ret = __i915_spin_request(req, state);
+		if (ret == 0)
+			goto out;
 	}
 
 	for (;;) {
-		struct timer_list timer;
-
-		prepare_to_wait(&ring->irq_queue, &wait, state);
+		prepare_to_wait(&req->wait, &wait, state);
 
-		/* We need to check whether any gpu reset happened in between
-		 * the caller grabbing the seqno and now ... */
-		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
-			/* As we do not requeue the request over a GPU reset,
-			 * if one does occur we know that the request is
-			 * effectively complete.
-			 */
-			ret = 0;
-			break;
-		}
-
-		if (i915_gem_request_completed(req, false)) {
+		if (i915_gem_request_completed(req, true) ||
+		    req->reset_counter != i915_reset_counter(&req->i915->gpu_error)) {
 			ret = 0;
 			break;
 		}
@@ -1305,36 +1269,19 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			break;
 		}
 
-		if (timeout && time_after_eq(jiffies, timeout_expire)) {
-			ret = -ETIME;
-			break;
-		}
-
-		i915_queue_hangcheck(dev_priv);
-
-		timer.function = NULL;
-		if (timeout || missed_irq(dev_priv, ring)) {
-			unsigned long expire;
-
-			setup_timer_on_stack(&timer, fake_irq, (unsigned long)current);
-			expire = missed_irq(dev_priv, ring) ? jiffies + 1 : timeout_expire;
-			mod_timer(&timer, expire);
-		}
-
-		io_schedule();
-
-		if (timer.function) {
-			del_singleshot_timer_sync(&timer);
-			destroy_timer_on_stack(&timer);
-		}
+		if (timeout) {
+			timeout_remain = io_schedule_timeout(timeout_remain);
+			if (timeout_remain == 0) {
+				ret = -ETIME;
+				break;
+			}
+		} else
+			io_schedule();
 	}
-	if (!irq_test_in_progress)
-		ring->irq_put(ring);
-
-	finish_wait(&ring->irq_queue, &wait);
-
+	finish_wait(&req->wait, &wait);
 out:
 	now = ktime_get_raw_ns();
+	intel_breadcrumbs_remove_waiter(req);
 	trace_i915_gem_request_wait_end(req);
 
 	if (timeout) {
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 06ca4082735b..d1df405b905c 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -453,10 +453,11 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 				   dev_priv->ring[i].name,
 				   error->ring[i].num_requests);
 			for (j = 0; j < error->ring[i].num_requests; j++) {
-				err_printf(m, "  seqno 0x%08x, emitted %ld, tail 0x%08x\n",
+				err_printf(m, "  seqno 0x%08x, emitted %ld, tail 0x%08x, waiters? %d\n",
 					   error->ring[i].requests[j].seqno,
 					   error->ring[i].requests[j].jiffies,
-					   error->ring[i].requests[j].tail);
+					   error->ring[i].requests[j].tail,
+					   error->ring[i].requests[j].waiters);
 			}
 		}
 
@@ -900,7 +901,7 @@ static void i915_record_ring_state(struct drm_device *dev,
 		ering->instdone = I915_READ(GEN2_INSTDONE);
 	}
 
-	ering->waiting = waitqueue_active(&ring->irq_queue);
+	ering->waiting = READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first);
 	ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
 	ering->seqno = ring->get_seqno(ring, false);
 	ering->acthd = intel_ring_get_active_head(ring);
@@ -1105,6 +1106,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 			erq->seqno = request->seqno;
 			erq->jiffies = request->emitted_jiffies;
 			erq->tail = request->postfix;
+			erq->waiters = request->irq_count;
 		}
 	}
 }
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 28c0329f8281..0c2390476bdb 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -996,12 +996,11 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 
 static void notify_ring(struct intel_engine_cs *ring)
 {
-	if (!intel_ring_initialized(ring))
+	if (ring->i915 == NULL)
 		return;
 
 	trace_i915_gem_request_notify(ring);
-
-	wake_up_all(&ring->irq_queue);
+	wake_up_process(ring->i915->breadcrumbs.task);
 }
 
 static void vlv_c0_read(struct drm_i915_private *dev_priv,
@@ -2399,9 +2398,6 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
 static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 			       bool reset_completed)
 {
-	struct intel_engine_cs *ring;
-	int i;
-
 	/*
 	 * Notify all waiters for GPU completion events that reset state has
 	 * been changed, and that they need to restart their wait after
@@ -2410,8 +2406,7 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 	 */
 
 	/* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
-	for_each_ring(ring, dev_priv, i)
-		wake_up_all(&ring->irq_queue);
+	wake_up_process(dev_priv->breadcrumbs.task);
 
 	/* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */
 	wake_up_all(&dev_priv->pending_flip_queue);
@@ -2986,16 +2981,16 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 			if (ring_idle(ring, seqno)) {
 				ring->hangcheck.action = HANGCHECK_IDLE;
 
-				if (waitqueue_active(&ring->irq_queue)) {
+				if (READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first)) {
 					/* Issue a wake-up to catch stuck h/w. */
 					if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
-						if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
+						if (!test_bit(ring->id, &dev_priv->gpu_error.test_irq_rings))
 							DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
 								  ring->name);
 						else
 							DRM_INFO("Fake missed irq on %s\n",
 								 ring->name);
-						wake_up_all(&ring->irq_queue);
+						wake_up_process(dev_priv->breadcrumbs.task);
 					}
 					/* Safeguard against driver failure */
 					ring->hangcheck.score += BUSY;
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
new file mode 100644
index 000000000000..0a18405c8689
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -0,0 +1,253 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/kthread.h>
+
+#include "i915_drv.h"
+
+static bool __irq_enable(struct intel_engine_cs *engine)
+{
+	if (test_bit(engine->id, &engine->i915->gpu_error.test_irq_rings))
+		return false;
+
+	if (!intel_irqs_enabled(engine->i915))
+		return false;
+
+	return engine->irq_get(engine);
+}
+
+static struct drm_i915_gem_request *to_request(struct rb_node *rb)
+{
+	if (rb == NULL)
+		return NULL;
+
+	return rb_entry(rb, struct drm_i915_gem_request, irq_node);
+}
+
+/*
+ * intel_breadcrumbs_irq() acts as the bottom-half for the user interrupt,
+ * which we use as the breadcrumb after every request. In order to scale
+ * to many concurrent waiters, we delegate the task of reading the coherent
+ * seqno to this bottom-half. (Otherwise every waiter is woken after each
+ * interrupt and they all attempt to perform the heavyweight coherent
+ * seqno check.) Individual clients register with the bottom-half when
+ * waiting on a request, and are then woken when the bottom-half notices
+ * their seqno is complete. This incurs an extra context switch from the
+ * interrupt to the client in the uncontended case.
+ */
+static int intel_breadcrumbs_irq(void *data)
+{
+	struct drm_i915_private *i915 = data;
+	struct intel_breadcrumbs *b = &i915->breadcrumbs;
+	unsigned irq_get = 0;
+	unsigned irq_enabled = 0;
+	int i;
+
+	while (!kthread_should_stop()) {
+		unsigned reset_counter = i915_reset_counter(&i915->gpu_error);
+		unsigned long timeout_jiffies;
+		bool idling = false;
+
+		/* On every tick, walk the seqno-ordered list of requests
+		 * and for each retired request wakeup its clients. If we
+		 * find an unfinished request, go back to sleep.
+		 */
+		set_current_state(TASK_INTERRUPTIBLE);
+
+		/* Note carefully that we do not hold a reference for the
+		 * requests on this list. Therefore we only inspect them
+		 * whilst holding the spinlock to ensure that they are not
+		 * freed in the meantime, and the client must remove the
+		 * request from the list if it is interrupted (before it
+		 * itself releases its reference).
+		 */
+		spin_lock(&b->lock);
+		for (i = 0; i < I915_NUM_RINGS; i++) {
+			struct intel_engine_cs *engine = &i915->ring[i];
+			struct intel_breadcrumbs_engine *be = &b->engine[i];
+			struct drm_i915_gem_request *request = be->first;
+
+			if (request == NULL) {
+				if ((irq_get & (1 << i))) {
+					if (irq_enabled & (1 << i)) {
+						engine->irq_put(engine);
+						irq_enabled &= ~(1 << i);
+					}
+					intel_runtime_pm_put(i915);
+					irq_get &= ~(1 << i);
+				}
+				continue;
+			}
+
+			if ((irq_get & (1 << i)) == 0) {
+				intel_runtime_pm_get(i915);
+				irq_enabled |= __irq_enable(engine) << i;
+				irq_get |= 1 << i;
+			}
+
+			do {
+				struct rb_node *next;
+
+				if (request->reset_counter == reset_counter &&
+				    !i915_gem_request_completed(request, false))
+					break;
+
+				next = rb_next(&request->irq_node);
+				rb_erase(&request->irq_node, &be->requests);
+				RB_CLEAR_NODE(&request->irq_node);
+
+				wake_up_all(&request->wait);
+
+				request = to_request(next);
+			} while (request);
+			be->first = request;
+			idling |= request == NULL;
+
+			/* Make sure the hangcheck timer is armed in case
+			 * the GPU hangs and we are never woken up.
+			 */
+			if (request)
+				i915_queue_hangcheck(i915);
+		}
+		spin_unlock(&b->lock);
+
+		/* If we don't have interrupts available (either we have
+		 * detected the hardware is missing interrupts or the
+		 * interrupt delivery was disabled by the user), wake up
+		 * every jiffie to check for request completion.
+		 *
+		 * If we have processed all requests for one engine, we
+		 * also wish to wake up in a jiffie to turn off the
+		 * breadcrumb interrupt for that engine. We delay
+		 * switching off the interrupt in order to allow another
+		 * waiter to start without incurring additional latency
+		 * enabling the interrupt.
+		 */
+		timeout_jiffies = MAX_SCHEDULE_TIMEOUT;
+		if (idling || i915->gpu_error.missed_irq_rings & irq_enabled)
+			timeout_jiffies = 1;
+
+		/* Unlike the individual clients, we do not want this
+		 * background thread to contribute to the system load,
+		 * i.e. we do not want to use io_schedule() here.
+		 */
+		schedule_timeout(timeout_jiffies);
+	}
+
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		if ((irq_get & (1 << i))) {
+			if (irq_enabled & (1 << i)) {
+				struct intel_engine_cs *engine = &i915->ring[i];
+				engine->irq_put(engine);
+			}
+			intel_runtime_pm_put(i915);
+		}
+	}
+
+	return 0;
+}
+
+bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request)
+{
+	struct intel_breadcrumbs *b = &request->i915->breadcrumbs;
+	bool first = false;
+
+	spin_lock(&b->lock);
+	if (request->irq_count++ == 0) {
+		struct intel_breadcrumbs_engine *be =
+			&b->engine[request->ring->id];
+		struct rb_node **p, *parent;
+
+		if (be->first == NULL)
+			wake_up_process(b->task);
+
+		init_waitqueue_head(&request->wait);
+
+		first = true;
+		parent = NULL;
+		p = &be->requests.rb_node;
+		while (*p) {
+			struct drm_i915_gem_request *__req;
+
+			parent = *p;
+			__req = rb_entry(parent, typeof(*__req), irq_node);
+
+			if (i915_seqno_passed(request->seqno, __req->seqno)) {
+				p = &parent->rb_right;
+				first = false;
+			} else
+				p = &parent->rb_left;
+		}
+		if (first)
+			be->first = request;
+
+		rb_link_node(&request->irq_node, parent, p);
+		rb_insert_color(&request->irq_node, &be->requests);
+	}
+	spin_unlock(&b->lock);
+
+	return first;
+}
+
+void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request)
+{
+	struct intel_breadcrumbs *b = &request->i915->breadcrumbs;
+
+	spin_lock(&b->lock);
+	if (--request->irq_count == 0 && !RB_EMPTY_NODE(&request->irq_node)) {
+		struct intel_breadcrumbs_engine *be =
+			&b->engine[request->ring->id];
+		if (be->first == request)
+			be->first = to_request(rb_next(&request->irq_node));
+		rb_erase(&request->irq_node, &be->requests);
+	}
+	spin_unlock(&b->lock);
+}
+
+int intel_breadcrumbs_init(struct drm_i915_private *i915)
+{
+	struct intel_breadcrumbs *b = &i915->breadcrumbs;
+	struct task_struct *task;
+
+	spin_lock_init(&b->lock);
+
+	task = kthread_run(intel_breadcrumbs_irq, i915, "irq/i915");
+	if (IS_ERR(task))
+		return PTR_ERR(task);
+
+	b->task = task;
+
+	return 0;
+}
+
+void intel_breadcrumbs_fini(struct drm_i915_private *i915)
+{
+	struct intel_breadcrumbs *b = &i915->breadcrumbs;
+
+	if (b->task == NULL)
+		return;
+
+	kthread_stop(b->task);
+	b->task = NULL;
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b40ffc3607e7..604c081b0a0f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1922,10 +1922,10 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 	ring->buffer = NULL;
 
 	ring->dev = dev;
+	ring->i915 = to_i915(dev);
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 	i915_gem_batch_pool_init(dev, &ring->batch_pool);
-	init_waitqueue_head(&ring->irq_queue);
 
 	INIT_LIST_HEAD(&ring->buffers);
 	INIT_LIST_HEAD(&ring->execlist_queue);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 511efe556d73..5a5a7195dd54 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2157,6 +2157,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	WARN_ON(ring->buffer);
 
 	ring->dev = dev;
+	ring->i915 = to_i915(dev);
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 	INIT_LIST_HEAD(&ring->execlist_queue);
@@ -2164,8 +2165,6 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	i915_gem_batch_pool_init(dev, &ring->batch_pool);
 	memset(ring->semaphore.sync_seqno, 0, sizeof(ring->semaphore.sync_seqno));
 
-	init_waitqueue_head(&ring->irq_queue);
-
 	ringbuf = intel_engine_create_ringbuffer(ring, 32 * PAGE_SIZE);
 	if (IS_ERR(ringbuf))
 		return PTR_ERR(ringbuf);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 5d1eb206151d..49b5bded2767 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -157,6 +157,7 @@ struct  intel_engine_cs {
 #define LAST_USER_RING (VECS + 1)
 	u32		mmio_base;
 	struct		drm_device *dev;
+	struct drm_i915_private *i915;
 	struct intel_ringbuffer *buffer;
 	struct list_head buffers;
 
@@ -303,8 +304,6 @@ struct  intel_engine_cs {
 
 	bool gpu_caches_dirty;
 
-	wait_queue_head_t irq_queue;
-
 	struct intel_context *default_context;
 	struct intel_context *last_context;
 
-- 
2.6.2


* [PATCH 09/15] drm/i915: Separate out the seqno-barrier from engine->get_seqno
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (7 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-11-29  8:48 ` [PATCH 10/15] drm/i915: Remove the lazy_coherency parameter from request-completed? Chris Wilson
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

In order to simplify the next couple of patches, extract the
lazy_coherency optimisation out of the engine->get_seqno() vfunc into
its own callback.
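
As a minimal sketch of the resulting split (standalone C with invented
scaffolding; only the two callback names mirror the patch), a coherent
read now composes the optional barrier with the plain cached read:

#include <stdio.h>

struct engine {
	void (*seqno_barrier)(struct engine *engine); /* optional coherency op */
	unsigned int (*get_seqno)(struct engine *engine); /* plain cached read */
	unsigned int hws_seqno;	/* stands in for the status page dword */
};

static unsigned int read_hws(struct engine *engine)
{
	return engine->hws_seqno;
}

/* Callers wanting coherence pay for the barrier explicitly. */
static unsigned int coherent_seqno(struct engine *engine)
{
	if (engine->seqno_barrier)
		engine->seqno_barrier(engine);
	return engine->get_seqno(engine);
}

int main(void)
{
	struct engine e = { .get_seqno = read_hws, .hws_seqno = 42 };

	printf("seqno=%u\n", coherent_seqno(&e));
	return 0;
}

Platforms without a coherency quirk simply leave seqno_barrier NULL and
the fast path is a single function call.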

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c     | 10 ++++-----
 drivers/gpu/drm/i915/i915_drv.h         | 12 ++++++----
 drivers/gpu/drm/i915/i915_gpu_error.c   |  2 +-
 drivers/gpu/drm/i915/i915_irq.c         |  4 ++--
 drivers/gpu/drm/i915/i915_trace.h       |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c        | 39 +++++++++++++--------------------
 drivers/gpu/drm/i915/intel_ringbuffer.c | 34 ++++++++++++++--------------
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 ++--
 8 files changed, 51 insertions(+), 56 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 8458447ddc17..1e8aa897673a 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -600,7 +600,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data)
 					   ring->name,
 					   i915_gem_request_get_seqno(work->flip_queued_req),
 					   dev_priv->next_seqno,
-					   ring->get_seqno(ring, true),
+					   ring->get_seqno(ring),
 					   i915_gem_request_completed(work->flip_queued_req, true));
 			} else
 				seq_printf(m, "Flip not associated with any ring\n");
@@ -730,10 +730,8 @@ static int i915_gem_request_info(struct seq_file *m, void *data)
 static void i915_ring_seqno_info(struct seq_file *m,
 				 struct intel_engine_cs *ring)
 {
-	if (ring->get_seqno) {
-		seq_printf(m, "Current sequence (%s): %x\n",
-			   ring->name, ring->get_seqno(ring, false));
-	}
+	seq_printf(m, "Current sequence (%s): %x\n",
+		   ring->name, ring->get_seqno(ring));
 }
 
 static int i915_gem_seqno_info(struct seq_file *m, void *data)
@@ -1342,7 +1340,7 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	intel_runtime_pm_get(dev_priv);
 
 	for_each_ring(ring, dev_priv, i) {
-		seqno[i] = ring->get_seqno(ring, false);
+		seqno[i] = ring->get_seqno(ring);
 		acthd[i] = intel_ring_get_active_head(ring);
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 782d08ab6264..a9c1785ac08c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2988,15 +2988,19 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
 static inline bool i915_gem_request_started(struct drm_i915_gem_request *req,
 					   bool lazy_coherency)
 {
-	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
-	return i915_seqno_passed(seqno, req->previous_seqno);
+	if (!lazy_coherency && req->ring->seqno_barrier)
+		req->ring->seqno_barrier(req->ring);
+	return i915_seqno_passed(req->ring->get_seqno(req->ring),
+				 req->previous_seqno);
 }
 
 static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
 					      bool lazy_coherency)
 {
-	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
-	return i915_seqno_passed(seqno, req->seqno);
+	if (!lazy_coherency && req->ring->seqno_barrier)
+		req->ring->seqno_barrier(req->ring);
+	return i915_seqno_passed(req->ring->get_seqno(req->ring),
+				 req->seqno);
 }
 
 int __must_check i915_gem_get_seqno(struct drm_device *dev, u32 *seqno);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index d1df405b905c..7a427240c813 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -903,8 +903,8 @@ static void i915_record_ring_state(struct drm_device *dev,
 
 	ering->waiting = READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first);
 	ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
-	ering->seqno = ring->get_seqno(ring, false);
 	ering->acthd = intel_ring_get_active_head(ring);
+	ering->seqno = ring->get_seqno(ring);
 	ering->start = I915_READ_START(ring);
 	ering->head = I915_READ_HEAD(ring);
 	ering->tail = I915_READ_TAIL(ring);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 0c2390476bdb..43078e09e1f0 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2871,7 +2871,7 @@ static int semaphore_passed(struct intel_engine_cs *ring)
 	if (signaller->hangcheck.deadlock >= I915_NUM_RINGS)
 		return -1;
 
-	if (i915_seqno_passed(signaller->get_seqno(signaller, false), seqno))
+	if (i915_seqno_passed(signaller->get_seqno(signaller), seqno))
 		return 1;
 
 	/* cursory check for an unkickable deadlock */
@@ -2974,8 +2974,8 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 
 		semaphore_clear_deadlocks(dev_priv);
 
-		seqno = ring->get_seqno(ring, false);
 		acthd = intel_ring_get_active_head(ring);
+		seqno = ring->get_seqno(ring);
 
 		if (ring->hangcheck.seqno == seqno) {
 			if (ring_idle(ring, seqno)) {
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index 52b2d409945d..cfb5f78a6e84 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -573,7 +573,7 @@ TRACE_EVENT(i915_gem_request_notify,
 	    TP_fast_assign(
 			   __entry->dev = ring->dev->primary->index;
 			   __entry->ring = ring->id;
-			   __entry->seqno = ring->get_seqno(ring, false);
+			   __entry->seqno = ring->get_seqno(ring);
 			   ),
 
 	    TP_printk("dev=%u, ring=%u, seqno=%u",
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 604c081b0a0f..fef97dc8e02d 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1755,7 +1755,7 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request,
 	return 0;
 }
 
-static u32 gen8_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+static u32 gen8_get_seqno(struct intel_engine_cs *ring)
 {
 	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
 }
@@ -1765,9 +1765,8 @@ static void gen8_set_seqno(struct intel_engine_cs *ring, u32 seqno)
 	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
 }
 
-static u32 bxt_a_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+static void bxt_seqno_barrier(struct intel_engine_cs *ring)
 {
-
 	/*
 	 * On BXT A steppings there is a HW coherency issue whereby the
 	 * MI_STORE_DATA_IMM storing the completed request's seqno
@@ -1778,11 +1777,7 @@ static u32 bxt_a_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
 	 * bxt_a_set_seqno(), where we also do a clflush after the write. So
 	 * this clflush in practice becomes an invalidate operation.
 	 */
-
-	if (!lazy_coherency)
-		intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
-
-	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
+	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
 }
 
 static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno)
@@ -1977,12 +1972,11 @@ static int logical_render_ring_init(struct drm_device *dev)
 		ring->init_hw = gen8_init_render_ring;
 	ring->init_context = gen8_init_rcs_context;
 	ring->cleanup = intel_fini_pipe_control;
+	ring->get_seqno = gen8_get_seqno;
+	ring->set_seqno = gen8_set_seqno;
 	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
-		ring->get_seqno = bxt_a_get_seqno;
+		ring->seqno_barrier = bxt_seqno_barrier;
 		ring->set_seqno = bxt_a_set_seqno;
-	} else {
-		ring->get_seqno = gen8_get_seqno;
-		ring->set_seqno = gen8_set_seqno;
 	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush_render;
@@ -2029,12 +2023,11 @@ static int logical_bsd_ring_init(struct drm_device *dev)
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 
 	ring->init_hw = gen8_init_common_ring;
+	ring->get_seqno = gen8_get_seqno;
+	ring->set_seqno = gen8_set_seqno;
 	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
-		ring->get_seqno = bxt_a_get_seqno;
+		ring->seqno_barrier = bxt_seqno_barrier;
 		ring->set_seqno = bxt_a_set_seqno;
-	} else {
-		ring->get_seqno = gen8_get_seqno;
-		ring->set_seqno = gen8_set_seqno;
 	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush;
@@ -2084,12 +2077,11 @@ static int logical_blt_ring_init(struct drm_device *dev)
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 
 	ring->init_hw = gen8_init_common_ring;
+	ring->get_seqno = gen8_get_seqno;
+	ring->set_seqno = gen8_set_seqno;
 	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
-		ring->get_seqno = bxt_a_get_seqno;
+		ring->seqno_barrier = bxt_seqno_barrier;
 		ring->set_seqno = bxt_a_set_seqno;
-	} else {
-		ring->get_seqno = gen8_get_seqno;
-		ring->set_seqno = gen8_set_seqno;
 	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush;
@@ -2114,12 +2106,11 @@ static int logical_vebox_ring_init(struct drm_device *dev)
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 
 	ring->init_hw = gen8_init_common_ring;
+	ring->get_seqno = gen8_get_seqno;
+	ring->set_seqno = gen8_set_seqno;
 	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
-		ring->get_seqno = bxt_a_get_seqno;
+		ring->seqno_barrier = bxt_seqno_barrier;
 		ring->set_seqno = bxt_a_set_seqno;
-	} else {
-		ring->get_seqno = gen8_get_seqno;
-		ring->set_seqno = gen8_set_seqno;
 	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 5a5a7195dd54..98de72177d12 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1501,22 +1501,18 @@ pc_render_add_request(struct drm_i915_gem_request *req)
 	return 0;
 }
 
-static u32
-gen6_ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+static void
+gen6_seqno_barrier(struct intel_engine_cs *ring)
 {
 	/* Workaround to force correct ordering between irq and seqno writes on
 	 * ivb (and maybe also on snb) by reading from a CS register (like
 	 * ACTHD) before reading the status page. */
-	if (!lazy_coherency) {
-		struct drm_i915_private *dev_priv = ring->dev->dev_private;
-		POSTING_READ(RING_ACTHD(ring->mmio_base));
-	}
-
-	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
+	struct drm_i915_private *dev_priv = ring->i915;
+	POSTING_READ(RING_ACTHD(ring->mmio_base));
 }
 
 static u32
-ring_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+ring_get_seqno(struct intel_engine_cs *ring)
 {
 	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
 }
@@ -1528,7 +1524,7 @@ ring_set_seqno(struct intel_engine_cs *ring, u32 seqno)
 }
 
 static u32
-pc_render_get_seqno(struct intel_engine_cs *ring, bool lazy_coherency)
+pc_render_get_seqno(struct intel_engine_cs *ring)
 {
 	return ring->scratch.cpu_page[0];
 }
@@ -2703,7 +2699,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_get = gen8_ring_get_irq;
 		ring->irq_put = gen8_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
-		ring->get_seqno = gen6_ring_get_seqno;
+		ring->seqno_barrier = gen6_seqno_barrier;
+		ring->get_seqno = ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
 			WARN_ON(!dev_priv->semaphore_obj);
@@ -2720,7 +2717,8 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_get = gen6_ring_get_irq;
 		ring->irq_put = gen6_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
-		ring->get_seqno = gen6_ring_get_seqno;
+		ring->seqno_barrier = gen6_seqno_barrier;
+		ring->get_seqno = ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
 			ring->semaphore.sync_to = gen6_ring_sync;
@@ -2834,7 +2832,8 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 			ring->write_tail = gen6_bsd_ring_write_tail;
 		ring->flush = gen6_bsd_ring_flush;
 		ring->add_request = gen6_add_request;
-		ring->get_seqno = gen6_ring_get_seqno;
+		ring->seqno_barrier = gen6_seqno_barrier;
+		ring->get_seqno = ring_get_seqno;
 		ring->set_seqno = ring_set_seqno;
 		if (INTEL_INFO(dev)->gen >= 8) {
 			ring->irq_enable_mask =
@@ -2906,7 +2905,8 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 	ring->mmio_base = GEN8_BSD2_RING_BASE;
 	ring->flush = gen6_bsd_ring_flush;
 	ring->add_request = gen6_add_request;
-	ring->get_seqno = gen6_ring_get_seqno;
+	ring->seqno_barrier = gen6_seqno_barrier;
+	ring->get_seqno = ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
@@ -2936,7 +2936,8 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
-	ring->get_seqno = gen6_ring_get_seqno;
+	ring->seqno_barrier = gen6_seqno_barrier;
+	ring->get_seqno = ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
 		ring->irq_enable_mask =
@@ -2993,7 +2994,8 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 	ring->write_tail = ring_write_tail;
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
-	ring->get_seqno = gen6_ring_get_seqno;
+	ring->seqno_barrier = gen6_seqno_barrier;
+	ring->get_seqno = ring_get_seqno;
 	ring->set_seqno = ring_set_seqno;
 
 	if (INTEL_INFO(dev)->gen >= 8) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 49b5bded2767..af66119ecca9 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -193,8 +193,8 @@ struct  intel_engine_cs {
 	 * seen value is good enough. Note that the seqno will always be
 	 * monotonic, even if not coherent.
 	 */
-	u32		(*get_seqno)(struct intel_engine_cs *ring,
-				     bool lazy_coherency);
+	void		(*seqno_barrier)(struct intel_engine_cs *ring);
+	u32		(*get_seqno)(struct intel_engine_cs *ring);
 	void		(*set_seqno)(struct intel_engine_cs *ring,
 				     u32 seqno);
 	int		(*dispatch_execbuffer)(struct drm_i915_gem_request *req,
-- 
2.6.2


* [PATCH 10/15] drm/i915: Remove the lazy_coherency parameter from request-completed?
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (8 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 09/15] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-11-29  8:48 ` [PATCH 11/15] drm/i915: Use HWS for seqno tracking everywhere Chris Wilson
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

Now that we have split out the seqno-barrier from the
engine->get_seqno() callback itself, we can move the users of the
seqno-barrier to the required callsites, simplifying the common code and
making the workaround handling much more explicit.
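
The shape of a converted callsite, as a hedged standalone sketch
(seqno_barrier()/get_seqno() below are stand-ins, not the driver
functions): the caller pays for coherence only where staleness matters,
once before spinning and once before giving up.

#include <stdbool.h>
#include <stdio.h>

static unsigned int hw_seqno = 2;	/* pretend the GPU wrote this */

static void seqno_barrier(void)
{
	/* stands in for a posting read / clflush on real hardware */
}

static unsigned int get_seqno(void)
{
	return hw_seqno;	/* cheap cached read */
}

static bool spin_for_seqno(unsigned int target, int spins)
{
	seqno_barrier();			/* coherent check up front */
	while (spins--)
		if (get_seqno() >= target)
			return true;		/* cached reads in the loop */
	seqno_barrier();			/* one final coherent check */
	return get_seqno() >= target;
}

int main(void)
{
	printf("completed: %s\n", spin_for_seqno(1, 10) ? "yes" : "no");
	return 0;
}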

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |  2 +-
 drivers/gpu/drm/i915/i915_drv.h          | 10 ++--------
 drivers/gpu/drm/i915/i915_gem.c          | 30 ++++++++++++++++++------------
 drivers/gpu/drm/i915/intel_breadcrumbs.c |  7 ++++++-
 drivers/gpu/drm/i915/intel_display.c     |  2 +-
 drivers/gpu/drm/i915/intel_pm.c          |  4 ++--
 6 files changed, 30 insertions(+), 25 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 1e8aa897673a..99a79a10ce75 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -601,7 +601,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data)
 					   i915_gem_request_get_seqno(work->flip_queued_req),
 					   dev_priv->next_seqno,
 					   ring->get_seqno(ring),
-					   i915_gem_request_completed(work->flip_queued_req, true));
+					   i915_gem_request_completed(work->flip_queued_req));
 			} else
 				seq_printf(m, "Flip not associated with any ring\n");
 			seq_printf(m, "Flip queued on frame %d, (was ready on frame %d), now %d\n",
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index a9c1785ac08c..8eb0871cd7af 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2985,20 +2985,14 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
 	return (int32_t)(seq1 - seq2) >= 0;
 }
 
-static inline bool i915_gem_request_started(struct drm_i915_gem_request *req,
-					   bool lazy_coherency)
+static inline bool i915_gem_request_started(struct drm_i915_gem_request *req)
 {
-	if (!lazy_coherency && req->ring->seqno_barrier)
-		req->ring->seqno_barrier(req->ring);
 	return i915_seqno_passed(req->ring->get_seqno(req->ring),
 				 req->previous_seqno);
 }
 
-static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
-					      bool lazy_coherency)
+static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req)
 {
-	if (!lazy_coherency && req->ring->seqno_barrier)
-		req->ring->seqno_barrier(req->ring);
 	return i915_seqno_passed(req->ring->get_seqno(req->ring),
 				 req->seqno);
 }
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 969592fb0969..0cfdd971e8d7 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1173,13 +1173,16 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 	 * takes to sleep on a request, on the order of a microsecond.
 	 */
 
+	if (req->ring->seqno_barrier)
+		req->ring->seqno_barrier(req->ring);
+
 	/* Only spin if we know the GPU is processing this request */
-	if (!i915_gem_request_started(req, false))
+	if (!i915_gem_request_started(req))
 		return -EAGAIN;
 
 	timeout = local_clock_us(&cpu) + 10;
-	while (!need_resched()) {
-		if (i915_gem_request_completed(req, true))
+	do {
+		if (i915_gem_request_completed(req))
 			return 0;
 
 		if (signal_pending_state(state, current))
@@ -1189,9 +1192,12 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 			break;
 
 		cpu_relax_lowlatency();
-	}
+	} while (!need_resched());
+
+	if (req->ring->seqno_barrier)
+		req->ring->seqno_barrier(req->ring);
 
-	if (i915_gem_request_completed(req, false))
+	if (i915_gem_request_completed(req))
 		return 0;
 
 	return -EAGAIN;
@@ -1227,7 +1233,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	if (list_empty(&req->list))
 		return 0;
 
-	if (i915_gem_request_completed(req, true))
+	if (i915_gem_request_completed(req))
 		return 0;
 
 	timeout_remain = 0;
@@ -1258,7 +1264,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	for (;;) {
 		prepare_to_wait(&req->wait, &wait, state);
 
-		if (i915_gem_request_completed(req, true) ||
+		if (i915_gem_request_completed(req) ||
 		    req->reset_counter != i915_reset_counter(&req->i915->gpu_error)) {
 			ret = 0;
 			break;
@@ -2695,7 +2701,7 @@ i915_gem_find_active_request(struct intel_engine_cs *ring)
 	struct drm_i915_gem_request *request;
 
 	list_for_each_entry(request, &ring->request_list, list) {
-		if (i915_gem_request_completed(request, false))
+		if (i915_gem_request_completed(request))
 			continue;
 
 		return request;
@@ -2836,7 +2842,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 					   struct drm_i915_gem_request,
 					   list);
 
-		if (!i915_gem_request_completed(request, true))
+		if (!i915_gem_request_completed(request))
 			break;
 
 		i915_gem_request_retire(request);
@@ -2860,7 +2866,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 	}
 
 	if (unlikely(ring->trace_irq_req &&
-		     i915_gem_request_completed(ring->trace_irq_req, true))) {
+		     i915_gem_request_completed(ring->trace_irq_req))) {
 		ring->irq_put(ring);
 		i915_gem_request_assign(&ring->trace_irq_req, NULL);
 	}
@@ -2966,7 +2972,7 @@ i915_gem_object_flush_active(struct drm_i915_gem_object *obj)
 		if (list_empty(&req->list))
 			goto retire;
 
-		if (i915_gem_request_completed(req, true)) {
+		if (i915_gem_request_completed(req)) {
 			__i915_gem_request_retire__upto(req);
 retire:
 			i915_gem_object_retire__read(obj, i);
@@ -3087,7 +3093,7 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 	if (to == from)
 		return 0;
 
-	if (i915_gem_request_completed(from_req, true))
+	if (i915_gem_request_completed(from_req))
 		return 0;
 
 	if (!i915_semaphore_is_enabled(obj->base.dev)) {
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 0a18405c8689..b1b99068442c 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -87,6 +87,7 @@ static int intel_breadcrumbs_irq(void *data)
 			struct intel_engine_cs *engine = &i915->ring[i];
 			struct intel_breadcrumbs_engine *be = &b->engine[i];
 			struct drm_i915_gem_request *request = be->first;
+			u32 seqno;
 
 			if (request == NULL) {
 				if ((irq_get & (1 << i))) {
@@ -106,11 +107,15 @@ static int intel_breadcrumbs_irq(void *data)
 				irq_get |= 1 << i;
 			}
 
+			if (engine->seqno_barrier)
+				engine->seqno_barrier(engine);
+
+			seqno = engine->get_seqno(engine);
 			do {
 				struct rb_node *next;
 
 				if (request->reset_counter == reset_counter &&
-				    !i915_gem_request_completed(request, false))
+				    !i915_seqno_passed(seqno, request->seqno))
 					break;
 
 				next = rb_next(&request->irq_node);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index d77a17bb94d0..d0b98bcff303 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11379,7 +11379,7 @@ static bool __intel_pageflip_stall_check(struct drm_device *dev,
 
 	if (work->flip_ready_vblank == 0) {
 		if (work->flip_queued_req &&
-		    !i915_gem_request_completed(work->flip_queued_req, true))
+		    !i915_gem_request_completed(work->flip_queued_req))
 			return false;
 
 		work->flip_ready_vblank = drm_crtc_vblank_count(crtc);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 96f45d7b3e4b..b080695b6c1b 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -7171,7 +7171,7 @@ static void __intel_rps_boost_work(struct work_struct *work)
 	struct request_boost *boost = container_of(work, struct request_boost, work);
 	struct drm_i915_gem_request *req = boost->req;
 
-	if (!i915_gem_request_completed(req, true))
+	if (!i915_gem_request_completed(req))
 		gen6_rps_boost(to_i915(req->ring->dev), NULL,
 			       req->emitted_jiffies);
 
@@ -7187,7 +7187,7 @@ void intel_queue_rps_boost_for_request(struct drm_device *dev,
 	if (req == NULL || INTEL_INFO(dev)->gen < 6)
 		return;
 
-	if (i915_gem_request_completed(req, true))
+	if (i915_gem_request_completed(req))
 		return;
 
 	boost = kmalloc(sizeof(*boost), GFP_ATOMIC);
-- 
2.6.2


* [PATCH 11/15] drm/i915: Use HWS for seqno tracking everywhere
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (9 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 10/15] drm/i915: Remove the lazy_coherency parameter from request-completed? Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-11-29  8:48 ` [PATCH 12/15] drm/i915: Reduce seqno/irq barrier to a clflush on legacy gen6+ Chris Wilson
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

By using the same address within the HWS for storing the seqno on every
platform, we can remove the platform specific vfuncs and reduce the
get-seqno routine to a single read of a cached memory location.
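
A sketch of what the unified read reduces to (the slot index and all
scaffolding here are illustrative assumptions, not the real
I915_GEM_HWS_INDEX plumbing):

#include <stdint.h>
#include <stdio.h>

#define HWS_INDEX 0x30	/* illustrative slot in the status page */

struct ring {
	volatile uint32_t *status_page;	/* CPU mapping of the HWS */
};

/* With every platform writing its breadcrumb to the same HWS slot,
 * fetching the current seqno is a single load from cacheable memory. */
static uint32_t ring_get_seqno(struct ring *ring)
{
	return ring->status_page[HWS_INDEX];
}

int main(void)
{
	static uint32_t page[1024] = { [HWS_INDEX] = 7 };
	struct ring r = { .status_page = page };

	printf("seqno=%u\n", (unsigned)ring_get_seqno(&r));
	return 0;
}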

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_debugfs.c      |  6 +--
 drivers/gpu/drm/i915/i915_drv.h          |  4 +-
 drivers/gpu/drm/i915/i915_gpu_error.c    |  2 +-
 drivers/gpu/drm/i915/i915_irq.c          |  4 +-
 drivers/gpu/drm/i915/i915_trace.h        |  2 +-
 drivers/gpu/drm/i915/intel_breadcrumbs.c |  2 +-
 drivers/gpu/drm/i915/intel_lrc.c         | 46 ++---------------
 drivers/gpu/drm/i915/intel_ringbuffer.c  | 86 ++++++++------------------------
 drivers/gpu/drm/i915/intel_ringbuffer.h  |  7 +--
 9 files changed, 41 insertions(+), 118 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 99a79a10ce75..1ced3738281c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -600,7 +600,7 @@ static int i915_gem_pageflip_info(struct seq_file *m, void *data)
 					   ring->name,
 					   i915_gem_request_get_seqno(work->flip_queued_req),
 					   dev_priv->next_seqno,
-					   ring->get_seqno(ring),
+					   intel_ring_get_seqno(ring),
 					   i915_gem_request_completed(work->flip_queued_req));
 			} else
 				seq_printf(m, "Flip not associated with any ring\n");
@@ -731,7 +731,7 @@ static void i915_ring_seqno_info(struct seq_file *m,
 				 struct intel_engine_cs *ring)
 {
 	seq_printf(m, "Current sequence (%s): %x\n",
-		   ring->name, ring->get_seqno(ring));
+		   ring->name, intel_ring_get_seqno(ring));
 }
 
 static int i915_gem_seqno_info(struct seq_file *m, void *data)
@@ -1340,7 +1340,7 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	intel_runtime_pm_get(dev_priv);
 
 	for_each_ring(ring, dev_priv, i) {
-		seqno[i] = ring->get_seqno(ring);
+		seqno[i] = intel_ring_get_seqno(ring);
 		acthd[i] = intel_ring_get_active_head(ring);
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 8eb0871cd7af..410939c3f05b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2987,13 +2987,13 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
 
 static inline bool i915_gem_request_started(struct drm_i915_gem_request *req)
 {
-	return i915_seqno_passed(req->ring->get_seqno(req->ring),
+	return i915_seqno_passed(intel_ring_get_seqno(req->ring),
 				 req->previous_seqno);
 }
 
 static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req)
 {
-	return i915_seqno_passed(req->ring->get_seqno(req->ring),
+	return i915_seqno_passed(intel_ring_get_seqno(req->ring),
 				 req->seqno);
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 7a427240c813..b4e4796fac1a 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -904,7 +904,7 @@ static void i915_record_ring_state(struct drm_device *dev,
 	ering->waiting = READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first);
 	ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
 	ering->acthd = intel_ring_get_active_head(ring);
-	ering->seqno = ring->get_seqno(ring);
+	ering->seqno = intel_ring_get_seqno(ring);
 	ering->start = I915_READ_START(ring);
 	ering->head = I915_READ_HEAD(ring);
 	ering->tail = I915_READ_TAIL(ring);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 43078e09e1f0..86f98d8eacc8 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2871,7 +2871,7 @@ static int semaphore_passed(struct intel_engine_cs *ring)
 	if (signaller->hangcheck.deadlock >= I915_NUM_RINGS)
 		return -1;
 
-	if (i915_seqno_passed(signaller->get_seqno(signaller), seqno))
+	if (i915_seqno_passed(intel_ring_get_seqno(signaller), seqno))
 		return 1;
 
 	/* cursory check for an unkickable deadlock */
@@ -2975,7 +2975,7 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 		semaphore_clear_deadlocks(dev_priv);
 
 		acthd = intel_ring_get_active_head(ring);
-		seqno = ring->get_seqno(ring);
+		seqno = intel_ring_get_seqno(ring);
 
 		if (ring->hangcheck.seqno == seqno) {
 			if (ring_idle(ring, seqno)) {
diff --git a/drivers/gpu/drm/i915/i915_trace.h b/drivers/gpu/drm/i915/i915_trace.h
index cfb5f78a6e84..efca75bcace3 100644
--- a/drivers/gpu/drm/i915/i915_trace.h
+++ b/drivers/gpu/drm/i915/i915_trace.h
@@ -573,7 +573,7 @@ TRACE_EVENT(i915_gem_request_notify,
 	    TP_fast_assign(
 			   __entry->dev = ring->dev->primary->index;
 			   __entry->ring = ring->id;
-			   __entry->seqno = ring->get_seqno(ring);
+			   __entry->seqno = intel_ring_get_seqno(ring);
 			   ),
 
 	    TP_printk("dev=%u, ring=%u, seqno=%u",
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index b1b99068442c..4a22c6db39b4 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -110,7 +110,7 @@ static int intel_breadcrumbs_irq(void *data)
 			if (engine->seqno_barrier)
 				engine->seqno_barrier(engine);
 
-			seqno = engine->get_seqno(engine);
+			seqno = intel_ring_get_seqno(engine);
 			do {
 				struct rb_node *next;
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index fef97dc8e02d..c457dd035900 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1755,16 +1755,6 @@ static int gen8_emit_flush_render(struct drm_i915_gem_request *request,
 	return 0;
 }
 
-static u32 gen8_get_seqno(struct intel_engine_cs *ring)
-{
-	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
-}
-
-static void gen8_set_seqno(struct intel_engine_cs *ring, u32 seqno)
-{
-	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
-}
-
 static void bxt_seqno_barrier(struct intel_engine_cs *ring)
 {
 	/*
@@ -1780,14 +1770,6 @@ static void bxt_seqno_barrier(struct intel_engine_cs *ring)
 	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
 }
 
-static void bxt_a_set_seqno(struct intel_engine_cs *ring, u32 seqno)
-{
-	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
-
-	/* See bxt_a_get_seqno() explaining the reason for the clflush. */
-	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
-}
-
 static int gen8_emit_request(struct drm_i915_gem_request *request)
 {
 	struct intel_ringbuffer *ringbuf = request->ringbuf;
@@ -1812,7 +1794,7 @@ static int gen8_emit_request(struct drm_i915_gem_request *request)
 				(ring->status_page.gfx_addr +
 				(I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT)));
 	intel_logical_ring_emit(ringbuf, 0);
-	intel_logical_ring_emit(ringbuf, i915_gem_request_get_seqno(request));
+	intel_logical_ring_emit(ringbuf, request->seqno);
 	intel_logical_ring_emit(ringbuf, MI_USER_INTERRUPT);
 	intel_logical_ring_emit(ringbuf, MI_NOOP);
 	intel_logical_ring_advance_and_submit(request);
@@ -1972,12 +1954,8 @@ static int logical_render_ring_init(struct drm_device *dev)
 		ring->init_hw = gen8_init_render_ring;
 	ring->init_context = gen8_init_rcs_context;
 	ring->cleanup = intel_fini_pipe_control;
-	ring->get_seqno = gen8_get_seqno;
-	ring->set_seqno = gen8_set_seqno;
-	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
+	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1))
 		ring->seqno_barrier = bxt_seqno_barrier;
-		ring->set_seqno = bxt_a_set_seqno;
-	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush_render;
 	ring->irq_get = gen8_logical_ring_get_irq;
@@ -2023,12 +2001,8 @@ static int logical_bsd_ring_init(struct drm_device *dev)
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
 
 	ring->init_hw = gen8_init_common_ring;
-	ring->get_seqno = gen8_get_seqno;
-	ring->set_seqno = gen8_set_seqno;
-	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
+	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1))
 		ring->seqno_barrier = bxt_seqno_barrier;
-		ring->set_seqno = bxt_a_set_seqno;
-	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush;
 	ring->irq_get = gen8_logical_ring_get_irq;
@@ -2052,8 +2026,6 @@ static int logical_bsd2_ring_init(struct drm_device *dev)
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
 
 	ring->init_hw = gen8_init_common_ring;
-	ring->get_seqno = gen8_get_seqno;
-	ring->set_seqno = gen8_set_seqno;
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush;
 	ring->irq_get = gen8_logical_ring_get_irq;
@@ -2077,12 +2049,8 @@ static int logical_blt_ring_init(struct drm_device *dev)
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
 
 	ring->init_hw = gen8_init_common_ring;
-	ring->get_seqno = gen8_get_seqno;
-	ring->set_seqno = gen8_set_seqno;
-	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
+	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1))
 		ring->seqno_barrier = bxt_seqno_barrier;
-		ring->set_seqno = bxt_a_set_seqno;
-	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush;
 	ring->irq_get = gen8_logical_ring_get_irq;
@@ -2106,12 +2074,8 @@ static int logical_vebox_ring_init(struct drm_device *dev)
 		GT_CONTEXT_SWITCH_INTERRUPT << GEN8_VECS_IRQ_SHIFT;
 
 	ring->init_hw = gen8_init_common_ring;
-	ring->get_seqno = gen8_get_seqno;
-	ring->set_seqno = gen8_set_seqno;
-	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1)) {
+	if (IS_BXT_REVID(dev, 0, BXT_REVID_A1))
 		ring->seqno_barrier = bxt_seqno_barrier;
-		ring->set_seqno = bxt_a_set_seqno;
-	}
 	ring->emit_request = gen8_emit_request;
 	ring->emit_flush = gen8_emit_flush;
 	ring->irq_get = gen8_logical_ring_get_irq;
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 98de72177d12..3d59dd555e64 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1232,19 +1232,17 @@ static int gen8_rcs_signal(struct drm_i915_gem_request *signaller_req,
 		return ret;
 
 	for_each_ring(waiter, dev_priv, i) {
-		u32 seqno;
 		u64 gtt_offset = signaller->semaphore.signal_ggtt[i];
 		if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID)
 			continue;
 
-		seqno = i915_gem_request_get_seqno(signaller_req);
 		intel_ring_emit(signaller, GFX_OP_PIPE_CONTROL(6));
 		intel_ring_emit(signaller, PIPE_CONTROL_GLOBAL_GTT_IVB |
 					   PIPE_CONTROL_QW_WRITE |
 					   PIPE_CONTROL_FLUSH_ENABLE);
 		intel_ring_emit(signaller, lower_32_bits(gtt_offset));
 		intel_ring_emit(signaller, upper_32_bits(gtt_offset));
-		intel_ring_emit(signaller, seqno);
+		intel_ring_emit(signaller, signaller_req->seqno);
 		intel_ring_emit(signaller, 0);
 		intel_ring_emit(signaller, MI_SEMAPHORE_SIGNAL |
 					   MI_SEMAPHORE_TARGET(waiter->id));
@@ -1273,18 +1271,16 @@ static int gen8_xcs_signal(struct drm_i915_gem_request *signaller_req,
 		return ret;
 
 	for_each_ring(waiter, dev_priv, i) {
-		u32 seqno;
 		u64 gtt_offset = signaller->semaphore.signal_ggtt[i];
 		if (gtt_offset == MI_SEMAPHORE_SYNC_INVALID)
 			continue;
 
-		seqno = i915_gem_request_get_seqno(signaller_req);
 		intel_ring_emit(signaller, (MI_FLUSH_DW + 1) |
 					   MI_FLUSH_DW_OP_STOREDW);
 		intel_ring_emit(signaller, lower_32_bits(gtt_offset) |
 					   MI_FLUSH_DW_USE_GTT);
 		intel_ring_emit(signaller, upper_32_bits(gtt_offset));
-		intel_ring_emit(signaller, seqno);
+		intel_ring_emit(signaller, signaller_req->seqno);
 		intel_ring_emit(signaller, MI_SEMAPHORE_SIGNAL |
 					   MI_SEMAPHORE_TARGET(waiter->id));
 		intel_ring_emit(signaller, 0);
@@ -1315,11 +1311,9 @@ static int gen6_signal(struct drm_i915_gem_request *signaller_req,
 		i915_reg_t mbox_reg = signaller->semaphore.mbox.signal[i];
 
 		if (i915_mmio_reg_valid(mbox_reg)) {
-			u32 seqno = i915_gem_request_get_seqno(signaller_req);
-
 			intel_ring_emit(signaller, MI_LOAD_REGISTER_IMM(1));
 			intel_ring_emit_reg(signaller, mbox_reg);
-			intel_ring_emit(signaller, seqno);
+			intel_ring_emit(signaller, signaller_req->seqno);
 		}
 	}
 
@@ -1354,7 +1348,7 @@ gen6_add_request(struct drm_i915_gem_request *req)
 
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
-	intel_ring_emit(ring, i915_gem_request_get_seqno(req));
+	intel_ring_emit(ring, req->seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	__intel_ring_advance(ring);
 
@@ -1456,7 +1450,9 @@ static int
 pc_render_add_request(struct drm_i915_gem_request *req)
 {
 	struct intel_engine_cs *ring = req->ring;
-	u32 scratch_addr = ring->scratch.gtt_offset + 2 * CACHELINE_BYTES;
+	u32 addr = req->ring->status_page.gfx_addr +
+		(I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
+	u32 scratch_addr = addr;
 	int ret;
 
 	/* For Ironlake, MI_USER_INTERRUPT was deprecated and apparently
@@ -1471,11 +1467,12 @@ pc_render_add_request(struct drm_i915_gem_request *req)
 	if (ret)
 		return ret;
 
-	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE |
-			PIPE_CONTROL_WRITE_FLUSH |
-			PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE);
-	intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
-	intel_ring_emit(ring, i915_gem_request_get_seqno(req));
+	intel_ring_emit(ring,
+			GFX_OP_PIPE_CONTROL(4) |
+			PIPE_CONTROL_QW_WRITE |
+			PIPE_CONTROL_WRITE_FLUSH);
+	intel_ring_emit(ring, addr | PIPE_CONTROL_GLOBAL_GTT);
+	intel_ring_emit(ring, req->seqno);
 	intel_ring_emit(ring, 0);
 	PIPE_CONTROL_FLUSH(ring, scratch_addr);
 	scratch_addr += 2 * CACHELINE_BYTES; /* write to separate cachelines */
@@ -1489,12 +1486,12 @@ pc_render_add_request(struct drm_i915_gem_request *req)
 	scratch_addr += 2 * CACHELINE_BYTES;
 	PIPE_CONTROL_FLUSH(ring, scratch_addr);
 
-	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4) | PIPE_CONTROL_QW_WRITE |
+	intel_ring_emit(ring, GFX_OP_PIPE_CONTROL(4) |
+			PIPE_CONTROL_QW_WRITE |
 			PIPE_CONTROL_WRITE_FLUSH |
-			PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
 			PIPE_CONTROL_NOTIFY);
-	intel_ring_emit(ring, ring->scratch.gtt_offset | PIPE_CONTROL_GLOBAL_GTT);
-	intel_ring_emit(ring, i915_gem_request_get_seqno(req));
+	intel_ring_emit(ring, addr | PIPE_CONTROL_GLOBAL_GTT);
+	intel_ring_emit(ring, req->seqno);
 	intel_ring_emit(ring, 0);
 	__intel_ring_advance(ring);
 
@@ -1511,30 +1508,6 @@ gen6_seqno_barrier(struct intel_engine_cs *ring)
 	POSTING_READ(RING_ACTHD(ring->mmio_base));
 }
 
-static u32
-ring_get_seqno(struct intel_engine_cs *ring)
-{
-	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
-}
-
-static void
-ring_set_seqno(struct intel_engine_cs *ring, u32 seqno)
-{
-	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
-}
-
-static u32
-pc_render_get_seqno(struct intel_engine_cs *ring)
-{
-	return ring->scratch.cpu_page[0];
-}
-
-static void
-pc_render_set_seqno(struct intel_engine_cs *ring, u32 seqno)
-{
-	ring->scratch.cpu_page[0] = seqno;
-}
-
 static bool
 gen5_ring_get_irq(struct intel_engine_cs *ring)
 {
@@ -1670,7 +1643,7 @@ i9xx_add_request(struct drm_i915_gem_request *req)
 
 	intel_ring_emit(ring, MI_STORE_DWORD_INDEX);
 	intel_ring_emit(ring, I915_GEM_HWS_INDEX << MI_STORE_DWORD_INDEX_SHIFT);
-	intel_ring_emit(ring, i915_gem_request_get_seqno(req));
+	intel_ring_emit(ring, req->seqno);
 	intel_ring_emit(ring, MI_USER_INTERRUPT);
 	__intel_ring_advance(ring);
 
@@ -2462,7 +2435,10 @@ void intel_ring_init_seqno(struct intel_engine_cs *ring, u32 seqno)
 			I915_WRITE(RING_SYNC_2(ring->mmio_base), 0);
 	}
 
-	ring->set_seqno(ring, seqno);
+	intel_write_status_page(ring, I915_GEM_HWS_INDEX, seqno);
+	if (ring->seqno_barrier)
+		ring->seqno_barrier(ring);
+
 	ring->hangcheck.seqno = seqno;
 }
 
@@ -2700,8 +2676,6 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_put = gen8_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
 		ring->seqno_barrier = gen6_seqno_barrier;
-		ring->get_seqno = ring_get_seqno;
-		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
 			WARN_ON(!dev_priv->semaphore_obj);
 			ring->semaphore.sync_to = gen8_ring_sync;
@@ -2718,8 +2692,6 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 		ring->irq_put = gen6_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT;
 		ring->seqno_barrier = gen6_seqno_barrier;
-		ring->get_seqno = ring_get_seqno;
-		ring->set_seqno = ring_set_seqno;
 		if (i915_semaphore_is_enabled(dev)) {
 			ring->semaphore.sync_to = gen6_ring_sync;
 			ring->semaphore.signal = gen6_signal;
@@ -2744,8 +2716,6 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 	} else if (IS_GEN5(dev)) {
 		ring->add_request = pc_render_add_request;
 		ring->flush = gen4_render_ring_flush;
-		ring->get_seqno = pc_render_get_seqno;
-		ring->set_seqno = pc_render_set_seqno;
 		ring->irq_get = gen5_ring_get_irq;
 		ring->irq_put = gen5_ring_put_irq;
 		ring->irq_enable_mask = GT_RENDER_USER_INTERRUPT |
@@ -2756,8 +2726,6 @@ int intel_init_render_ring_buffer(struct drm_device *dev)
 			ring->flush = gen2_render_ring_flush;
 		else
 			ring->flush = gen4_render_ring_flush;
-		ring->get_seqno = ring_get_seqno;
-		ring->set_seqno = ring_set_seqno;
 		if (IS_GEN2(dev)) {
 			ring->irq_get = i8xx_ring_get_irq;
 			ring->irq_put = i8xx_ring_put_irq;
@@ -2833,8 +2801,6 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 		ring->flush = gen6_bsd_ring_flush;
 		ring->add_request = gen6_add_request;
 		ring->seqno_barrier = gen6_seqno_barrier;
-		ring->get_seqno = ring_get_seqno;
-		ring->set_seqno = ring_set_seqno;
 		if (INTEL_INFO(dev)->gen >= 8) {
 			ring->irq_enable_mask =
 				GT_RENDER_USER_INTERRUPT << GEN8_VCS1_IRQ_SHIFT;
@@ -2872,8 +2838,6 @@ int intel_init_bsd_ring_buffer(struct drm_device *dev)
 		ring->mmio_base = BSD_RING_BASE;
 		ring->flush = bsd_ring_flush;
 		ring->add_request = i9xx_add_request;
-		ring->get_seqno = ring_get_seqno;
-		ring->set_seqno = ring_set_seqno;
 		if (IS_GEN5(dev)) {
 			ring->irq_enable_mask = ILK_BSD_USER_INTERRUPT;
 			ring->irq_get = gen5_ring_get_irq;
@@ -2906,8 +2870,6 @@ int intel_init_bsd2_ring_buffer(struct drm_device *dev)
 	ring->flush = gen6_bsd_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->seqno_barrier = gen6_seqno_barrier;
-	ring->get_seqno = ring_get_seqno;
-	ring->set_seqno = ring_set_seqno;
 	ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_VCS2_IRQ_SHIFT;
 	ring->irq_get = gen8_ring_get_irq;
@@ -2937,8 +2899,6 @@ int intel_init_blt_ring_buffer(struct drm_device *dev)
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->seqno_barrier = gen6_seqno_barrier;
-	ring->get_seqno = ring_get_seqno;
-	ring->set_seqno = ring_set_seqno;
 	if (INTEL_INFO(dev)->gen >= 8) {
 		ring->irq_enable_mask =
 			GT_RENDER_USER_INTERRUPT << GEN8_BCS_IRQ_SHIFT;
@@ -2995,8 +2955,6 @@ int intel_init_vebox_ring_buffer(struct drm_device *dev)
 	ring->flush = gen6_ring_flush;
 	ring->add_request = gen6_add_request;
 	ring->seqno_barrier = gen6_seqno_barrier;
-	ring->get_seqno = ring_get_seqno;
-	ring->set_seqno = ring_set_seqno;
 
 	if (INTEL_INFO(dev)->gen >= 8) {
 		ring->irq_enable_mask =
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index af66119ecca9..74b9df21f062 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -194,9 +194,6 @@ struct  intel_engine_cs {
 	 * monotonic, even if not coherent.
 	 */
 	void		(*seqno_barrier)(struct intel_engine_cs *ring);
-	u32		(*get_seqno)(struct intel_engine_cs *ring);
-	void		(*set_seqno)(struct intel_engine_cs *ring,
-				     u32 seqno);
 	int		(*dispatch_execbuffer)(struct drm_i915_gem_request *req,
 					       u64 offset, u32 length,
 					       unsigned dispatch_flags);
@@ -472,6 +469,10 @@ int intel_init_blt_ring_buffer(struct drm_device *dev);
 int intel_init_vebox_ring_buffer(struct drm_device *dev);
 
 u64 intel_ring_get_active_head(struct intel_engine_cs *ring);
+static inline u32 intel_ring_get_seqno(struct intel_engine_cs *ring)
+{
+	return intel_read_status_page(ring, I915_GEM_HWS_INDEX);
+}
 
 int init_workarounds_ring(struct intel_engine_cs *ring);
 
-- 
2.6.2
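
With the get_seqno/set_seqno vfuncs gone, a completion check against
the HWS reduces to the new intel_ring_get_seqno() inline plus the
optional coherency barrier. As a sketch of the resulting caller
pattern, using only the helpers visible in the diff above:

	/* sketch: how a caller now tests completion against the HWS */
	bool complete;

	if (ring->seqno_barrier)
		ring->seqno_barrier(ring);	/* make the HWS read coherent */
	complete = i915_seqno_passed(intel_ring_get_seqno(ring), req->seqno);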


* [PATCH 12/15] drm/i915: Reduce seqno/irq barrier to a clflush on legacy gen6+
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (10 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 11/15] drm/i915: Use HWS for seqno tracking everywhere Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-11-29  8:48 ` [PATCH 13/15] drm/i915: Stop setting wraparound seqno on initialisation Chris Wilson
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
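On gen6+ the status page lives in cacheable memory, so the irq/seqno
barrier only needs to flush the HWS cacheline instead of performing an
uncached MMIO read of ACTHD. The intel_flush_status_page() helper used
below is not defined in this patch; a plausible, purely hypothetical
shape for it:

	/* hypothetical sketch: force the next intel_read_status_page()
	 * to see the latest value by flushing the HWS cacheline */
	static inline void
	intel_flush_status_page(struct intel_engine_cs *ring, int reg)
	{
		drm_clflush_virt_range(&ring->status_page.page_addr[reg],
				       sizeof(u32));
	}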
 drivers/gpu/drm/i915/intel_ringbuffer.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 3d59dd555e64..ccceb43f14ac 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1501,11 +1501,7 @@ pc_render_add_request(struct drm_i915_gem_request *req)
 static void
 gen6_seqno_barrier(struct intel_engine_cs *ring)
 {
-	/* Workaround to force correct ordering between irq and seqno writes on
-	 * ivb (and maybe also on snb) by reading from a CS register (like
-	 * ACTHD) before reading the status page. */
-	struct drm_i915_private *dev_priv = ring->i915;
-	POSTING_READ(RING_ACTHD(ring->mmio_base));
+	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
 }
 
 static bool
-- 
2.6.2


* [PATCH 13/15] drm/i915: Stop setting wraparound seqno on initialisation
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (11 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 12/15] drm/i915: Reduce seqno/irq barrier to a clflush on legacy gen6+ Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-12-01 16:57   ` Dave Gordon
  2015-11-29  8:48 ` [PATCH 14/15] drm/i915: Only query timestamp when measuring elapsed time Chris Wilson
  2015-11-29  8:48 ` [PATCH 15/15] drm/i915: On GPU reset, set the HWS breadcrumb to the last seqno Chris Wilson
  14 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

We have testcases to ensure that seqno wraparound works fine, so we can
forgo forcing everyone to encounter seqno wraparound during early
uptime. seqno wraparound incurs a full GPU stall, so not forcing it
eliminates one source of jitter from the system's early life.

Advancing the global next_seqno after a GPU reset is equally pointless.
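
Starting next_seqno low is safe because seqno comparisons use
wraparound-safe two's-complement arithmetic. A worked example with the
i915_seqno_passed() helper from i915_drv.h:

	/* wraparound-safe comparison, as defined in i915_drv.h */
	static inline bool i915_seqno_passed(uint32_t seq1, uint32_t seq2)
	{
		return (int32_t)(seq1 - seq2) >= 0;
	}

	/* e.g. seq1 == 2 just after the wrap, seq2 == 0xfffffff0 just
	 * before it: (int32_t)(2 - 0xfffffff0) == 18 >= 0, so seq1 is
	 * correctly reported as having passed seq2 */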

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 16 +---------------
 1 file changed, 1 insertion(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0cfdd971e8d7..2c3e36e19cb0 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4774,14 +4774,6 @@ i915_gem_init_hw(struct drm_device *dev)
 		}
 	}
 
-	/*
-	 * Increment the next seqno by 0x100 so we have a visible break
-	 * on re-initialisation
-	 */
-	ret = i915_gem_set_seqno(dev, dev_priv->next_seqno+0x100);
-	if (ret)
-		goto out;
-
 	/* Now it is safe to go back round and do everything else: */
 	for_each_ring(ring, dev_priv, i) {
 		struct drm_i915_gem_request *req;
@@ -4969,13 +4961,7 @@ i915_gem_load(struct drm_device *dev)
 		dev_priv->num_fence_regs =
 				I915_READ(vgtif_reg(avail_rs.fence_num));
 
-	/*
-	 * Set initial sequence number for requests.
-	 * Using this number allows the wraparound to happen early,
-	 * catching any obvious problems.
-	 */
-	dev_priv->next_seqno = ((u32)~0 - 0x1100);
-	dev_priv->last_seqno = ((u32)~0 - 0x1101);
+	dev_priv->next_seqno = 1;
 
 	/* Initialize fence registers to zero */
 	INIT_LIST_HEAD(&dev_priv->mm.fence_list);
-- 
2.6.2


* [PATCH 14/15] drm/i915: Only query timestamp when measuring elapsed time
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (12 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 13/15] drm/i915: Stop setting wraparound seqno on initialisation Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  2015-11-30 10:19   ` Tvrtko Ursulin
  2015-11-29  8:48 ` [PATCH 15/15] drm/i915: On GPU reset, set the HWS breadcrumb to the last seqno Chris Wilson
  14 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

Avoid the two calls to ktime_get_raw_ns() (at best it reads the TSC) as
we only need to compute the elapsed time for a timed wait.
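
The trick, condensed from the diff below: convert the caller's
remaining-time into an absolute deadline on entry, and convert back on
exit, so the untimed wait path never touches the clock at all.

	/* sketch of the pattern used below */
	if (timeout) {
		timeout_remain = nsecs_to_jiffies_timeout(*timeout);
		*timeout += ktime_get_raw_ns();	/* remaining ns -> deadline */
	}
	/* ... wait ... */
	if (timeout) {
		*timeout -= ktime_get_raw_ns();	/* deadline -> remaining ns */
		if (*timeout < 0)
			*timeout = 0;
	}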

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 2c3e36e19cb0..871201713c73 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1227,7 +1227,6 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
 	DEFINE_WAIT(wait);
 	unsigned long timeout_remain;
-	s64 before, now;
 	int ret;
 
 	if (list_empty(&req->list))
@@ -1244,13 +1243,12 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 		if (*timeout == 0)
 			return -ETIME;
 
+		/* Record current time in case interrupted, or wedged */
 		timeout_remain = nsecs_to_jiffies_timeout(*timeout);
+		*timeout += ktime_get_raw_ns();
 	}
 
-	/* Record current time in case interrupted by signal, or wedged */
 	trace_i915_gem_request_wait_begin(req);
-	before = ktime_get_raw_ns();
-
 	if (INTEL_INFO(req->i915)->gen >= 6)
 		gen6_rps_boost(req->i915, rps, req->emitted_jiffies);
 
@@ -1286,14 +1284,13 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 	}
 	finish_wait(&req->wait, &wait);
 out:
-	now = ktime_get_raw_ns();
 	intel_breadcrumbs_remove_waiter(req);
 	trace_i915_gem_request_wait_end(req);
 
 	if (timeout) {
-		s64 tres = *timeout - (now - before);
-
-		*timeout = tres < 0 ? 0 : tres;
+		*timeout -= ktime_get_raw_ns();
+		if (*timeout < 0)
+			*timeout = 0;
 
 		/*
 		 * Apparently ktime isn't accurate enough and occasionally has a
-- 
2.6.2


* [PATCH 15/15] drm/i915: On GPU reset, set the HWS breadcrumb to the last seqno
  2015-11-29  8:47 i915_wait_request scaling Chris Wilson
                   ` (13 preceding siblings ...)
  2015-11-29  8:48 ` [PATCH 14/15] drm/i915: Only query timestamp when measuring elapsed time Chris Wilson
@ 2015-11-29  8:48 ` Chris Wilson
  14 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-29  8:48 UTC (permalink / raw)
  To: intel-gfx

After the GPU reset, when we discard all of the incomplete requests, mark
the GPU as having advanced to the last_submitted_seqno (as having
completed the requests and being ready for fresh work). The impact of
this is negligible, as all the requests will already be considered
completed by this point; it just brings the HWS into line with the
expectations of external viewers.
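
With the set_seqno() vfunc removed earlier in the series,
intel_ring_init_seqno() now writes the breadcrumb straight into the
status page, so the post-reset call boils down to roughly:

	/* sketch: report all discarded requests as complete in the HWS */
	intel_write_status_page(ring, I915_GEM_HWS_INDEX,
				ring->last_submitted_seqno);
	if (ring->seqno_barrier)
		ring->seqno_barrier(ring);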

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 871201713c73..f82ee12c9492 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2793,6 +2793,8 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
 		buffer->last_retired_head = buffer->tail;
 		intel_ring_update_space(buffer);
 	}
+
+	intel_ring_init_seqno(ring, ring->last_submitted_seqno);
 }
 
 void i915_gem_reset(struct drm_device *dev)
-- 
2.6.2


* Re: [PATCH 01/15] drm/i915: Break busywaiting for requests on pending signals
  2015-11-29  8:47 ` [PATCH 01/15] drm/i915: Break busywaiting for requests on pending signals Chris Wilson
@ 2015-11-30 10:01   ` Tvrtko Ursulin
  0 siblings, 0 replies; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 10:01 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx
  Cc: Jens Axboe, Daniel Vetter, Eero Tamminen, stable, Rantala, Valtteri


On 29/11/15 08:47, Chris Wilson wrote:
> The busywait in __i915_spin_request() does not respect pending signals
> and so may consume the entire timeslice for the task instead of
> returning to userspace to handle the signal.
>
> In the worst case this could cause a delay in signal processing of 20ms,
> which would be a noticeable jitter in cursor tracking. If a higher
> resolution signal were being used, for example to provide fair allocation
> of a server's timeslices between clients, we could expect to detect some
> unfairness between clients (i.e. some windows not updating as fast as
> others). This issue was noticed when inspecting a report of poor
> interactivity resulting from excessively high __i915_spin_request usage.
>
> Fixes regression from
> commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2]
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Apr 7 16:20:41 2015 +0100
>
>       drm/i915: Optimistically spin for the request completion
>
> v2: Try to assess the impact of the bug
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Jens Axboe <axboe@kernel.dk>
> Cc; "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
> Cc: stable@kernel.vger.org
> ---
>   drivers/gpu/drm/i915/i915_gem.c | 13 ++++++++-----
>   1 file changed, 8 insertions(+), 5 deletions(-)

One more time,

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 33adc8f8ab20..87fc34f5899f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1146,7 +1146,7 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
>   	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
>   }
>
> -static int __i915_spin_request(struct drm_i915_gem_request *req)
> +static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   {
>   	unsigned long timeout;
>
> @@ -1158,6 +1158,9 @@ static int __i915_spin_request(struct drm_i915_gem_request *req)
>   		if (i915_gem_request_completed(req, true))
>   			return 0;
>
> +		if (signal_pending_state(state, current))
> +			break;
> +
>   		if (time_after_eq(jiffies, timeout))
>   			break;
>
> @@ -1197,6 +1200,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   	struct drm_i915_private *dev_priv = dev->dev_private;
>   	const bool irq_test_in_progress =
>   		ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
> +	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
>   	DEFINE_WAIT(wait);
>   	unsigned long timeout_expire;
>   	s64 before, now;
> @@ -1221,7 +1225,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   	before = ktime_get_raw_ns();
>
>   	/* Optimistic spin for the next jiffie before touching IRQs */
> -	ret = __i915_spin_request(req);
> +	ret = __i915_spin_request(req, state);
>   	if (ret == 0)
>   		goto out;
>
> @@ -1233,8 +1237,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   	for (;;) {
>   		struct timer_list timer;
>
> -		prepare_to_wait(&ring->irq_queue, &wait,
> -				interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE);
> +		prepare_to_wait(&ring->irq_queue, &wait, state);
>
>   		/* We need to check whether any gpu reset happened in between
>   		 * the caller grabbing the seqno and now ... */
> @@ -1252,7 +1255,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   			break;
>   		}
>
> -		if (interruptible && signal_pending(current)) {
> +		if (signal_pending_state(state, current)) {
>   			ret = -ERESTARTSYS;
>   			break;
>   		}
>
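
For reference, the helper doing the heavy lifting here is
signal_pending_state(), which only reports a pending signal when the
sleep state can actually be interrupted by it. Its approximate shape in
the scheduler headers, quoted from memory, so treat it as a sketch:

	static inline int signal_pending_state(long state, struct task_struct *p)
	{
		if (!(state & (TASK_INTERRUPTIBLE | TASK_WAKEKILL)))
			return 0;
		if (!signal_pending(p))
			return 0;
		return (state & TASK_INTERRUPTIBLE) || __fatal_signal_pending(p);
	}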

* Re: [PATCH 02/15] drm/i915: Limit the busy wait on requests to 10us not 10ms!
  2015-11-29  8:48 ` [PATCH 02/15] drm/i915: Limit the busy wait on requests to 10us not 10ms! Chris Wilson
@ 2015-11-30 10:02   ` Tvrtko Ursulin
  2015-11-30 10:08     ` Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 10:02 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx
  Cc: Daniel Vetter, Eero Tamminen, stable, Rantala, Valtteri



On 29/11/15 08:48, Chris Wilson wrote:
> When waiting for high frequency requests, the finite amount of time
> required to set up the irq and wait upon it limits the response rate. By
> busywaiting on the request completion for a short while we can service
> the high frequency waits as quick as possible. However, if it is a slow
> request, we want to sleep as quickly as possible. The tradeoff between
> waiting and sleeping is roughly the time it takes to sleep on a request,
> on the order of a microsecond. Based on measurements of synchronous
> workloads from across big core and little atom, I have set the limit for
> busywaiting as 10 microseconds. In most of the synchronous cases, we can
> reduce the limit down to as little as 2 microseconds, but that leaves
> quite a few test cases regressing by factors of 3 and more.
>
> The code currently uses the jiffie clock, but that is far too coarse (on
> the order of 10 milliseconds) and results in poor interactivity as the
> CPU ends up being hogged by slow requests. To get microsecond resolution
> we need to use a high resolution timer. The cheapest of which is polling
> local_clock(), but that is only valid on the same CPU. If we switch CPUs
> because the task was preempted, we can also use that as an indicator that
> the system is too busy to waste cycles on spinning and we should sleep
> instead.
>
> __i915_spin_request was introduced in
> commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2]
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Tue Apr 7 16:20:41 2015 +0100
>
>       drm/i915: Optimistically spin for the request completion
>
> v2: Drop full u64 for unsigned long - the timer is 32bit wraparound safe,
> so we can use native register sizes on smaller architectures. Mention
> the approximate microseconds units for elapsed time and add some extra
> comments describing the reason for busywaiting.
>
> v3: Raise the limit to 10us
>
> Reported-by: Jens Axboe <axboe@kernel.dk>
> Link: https://lkml.org/lkml/2015/11/12/621
> Cc; "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
> Cc: stable@kernel.vger.org
> ---
>   drivers/gpu/drm/i915/i915_gem.c | 47 +++++++++++++++++++++++++++++++++++++++--
>   1 file changed, 45 insertions(+), 2 deletions(-)

Again,

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 87fc34f5899f..bad112abb16b 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1146,14 +1146,57 @@ static bool missed_irq(struct drm_i915_private *dev_priv,
>   	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
>   }
>
> +static unsigned long local_clock_us(unsigned *cpu)
> +{
> +	unsigned long t;
> +
> +	/* Cheaply and approximately convert from nanoseconds to microseconds.
> +	 * The result and subsequent calculations are also defined in the same
> +	 * approximate microseconds units. The principal source of timing
> +	 * error here is from the simple truncation.
> +	 *
> +	 * Note that local_clock() is only defined wrt to the current CPU;
> +	 * the comparisons are no longer valid if we switch CPUs. Instead of
> +	 * blocking preemption for the entire busywait, we can detect the CPU
> +	 * switch and use that as indicator of system load and a reason to
> +	 * stop busywaiting, see busywait_stop().
> +	 */
> +	*cpu = get_cpu();
> +	t = local_clock() >> 10;
> +	put_cpu();
> +
> +	return t;
> +}
> +
> +static bool busywait_stop(unsigned long timeout, unsigned cpu)
> +{
> +	unsigned this_cpu;
> +
> +	if (time_after(local_clock_us(&this_cpu), timeout))
> +		return true;
> +
> +	return this_cpu != cpu;
> +}
> +
>   static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   {
>   	unsigned long timeout;
> +	unsigned cpu;
> +
> +	/* When waiting for high frequency requests, e.g. during synchronous
> +	 * rendering split between the CPU and GPU, the finite amount of time
> +	 * required to set up the irq and wait upon it limits the response
> +	 * rate. By busywaiting on the request completion for a short while we
> +	 * can service the high frequency waits as quick as possible. However,
> +	 * if it is a slow request, we want to sleep as quickly as possible.
> +	 * The tradeoff between waiting and sleeping is roughly the time it
> +	 * takes to sleep on a request, on the order of a microsecond.
> +	 */
>
>   	if (i915_gem_request_get_ring(req)->irq_refcount)
>   		return -EBUSY;
>
> -	timeout = jiffies + 1;
> +	timeout = local_clock_us(&cpu) + 10;
>   	while (!need_resched()) {
>   		if (i915_gem_request_completed(req, true))
>   			return 0;
> @@ -1161,7 +1204,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   		if (signal_pending_state(state, current))
>   			break;
>
> -		if (time_after_eq(jiffies, timeout))
> +		if (busywait_stop(timeout, cpu))
>   			break;
>
>   		cpu_relax_lowlatency();
>

* Re: [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request
  2015-11-29  8:48 ` [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request Chris Wilson
@ 2015-11-30 10:06   ` Tvrtko Ursulin
  2015-12-01 15:47     ` Dave Gordon
  0 siblings, 1 reply; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 10:06 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx
  Cc: Daniel Vetter, Eero Tamminen, stable, Rantala, Valtteri


On 29/11/15 08:48, Chris Wilson wrote:
> Limit busywaiting only to the request currently being processed by the
> GPU. If the request is not currently being processed by the GPU, there
> is a very low likelihood of it being completed within the 2 microsecond
> spin timeout and so we will just be wasting CPU cycles.
>
> v2: Check for logical inversion when rebasing - we were incorrectly
> checking for this request being active, and so were busywaiting even
> when the GPU was not yet processing the request of interest.
>
> v3: Try another colour for the seqno names.
> v4: Another colour for the function names.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
> Cc: stable@kernel.vger.org
> ---
>   drivers/gpu/drm/i915/i915_drv.h | 27 +++++++++++++++++++--------
>   drivers/gpu/drm/i915/i915_gem.c |  8 +++++++-
>   2 files changed, 26 insertions(+), 9 deletions(-)

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index bc865e234df2..d07041c1729d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2182,8 +2182,17 @@ struct drm_i915_gem_request {
>   	struct drm_i915_private *i915;
>   	struct intel_engine_cs *ring;
>
> -	/** GEM sequence number associated with this request. */
> -	uint32_t seqno;
> +	 /** GEM sequence number associated with the previous request,
> +	  * when the HWS breadcrumb is equal to this the GPU is processing
> +	  * this request.
> +	  */
> +	u32 previous_seqno;
> +
> +	 /** GEM sequence number associated with this request,
> +	  * when the HWS breadcrumb is equal or greater than this the GPU
> +	  * has finished processing this request.
> +	  */
> +	u32 seqno;
>
>   	/** Position in the ringbuffer of the start of the request */
>   	u32 head;
> @@ -2945,15 +2954,17 @@ i915_seqno_passed(uint32_t seq1, uint32_t seq2)
>   	return (int32_t)(seq1 - seq2) >= 0;
>   }
>
> +static inline bool i915_gem_request_started(struct drm_i915_gem_request *req,
> +					   bool lazy_coherency)
> +{
> +	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
> +	return i915_seqno_passed(seqno, req->previous_seqno);
> +}
> +
>   static inline bool i915_gem_request_completed(struct drm_i915_gem_request *req,
>   					      bool lazy_coherency)
>   {
> -	u32 seqno;
> -
> -	BUG_ON(req == NULL);
> -
> -	seqno = req->ring->get_seqno(req->ring, lazy_coherency);
> -
> +	u32 seqno = req->ring->get_seqno(req->ring, lazy_coherency);
>   	return i915_seqno_passed(seqno, req->seqno);
>   }
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index bad112abb16b..ada461e02718 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1193,9 +1193,13 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   	 * takes to sleep on a request, on the order of a microsecond.
>   	 */
>
> -	if (i915_gem_request_get_ring(req)->irq_refcount)
> +	if (req->ring->irq_refcount)
>   		return -EBUSY;
>
> +	/* Only spin if we know the GPU is processing this request */
> +	if (!i915_gem_request_started(req, false))
> +		return -EAGAIN;
> +
>   	timeout = local_clock_us(&cpu) + 10;
>   	while (!need_resched()) {
>   		if (i915_gem_request_completed(req, true))
> @@ -1209,6 +1213,7 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>
>   		cpu_relax_lowlatency();
>   	}
> +
>   	if (i915_gem_request_completed(req, false))
>   		return 0;
>
> @@ -2592,6 +2597,7 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>   	request->batch_obj = obj;
>
>   	request->emitted_jiffies = jiffies;
> +	request->previous_seqno = ring->last_submitted_seqno;
>   	ring->last_submitted_seqno = request->seqno;
>   	list_add_tail(&request->list, &ring->request_list);
>
>

* Re: [PATCH 02/15] drm/i915: Limit the busy wait on requests to 10us not 10ms!
  2015-11-30 10:02   ` Tvrtko Ursulin
@ 2015-11-30 10:08     ` Chris Wilson
  0 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 10:08 UTC (permalink / raw)
  To: Tvrtko Ursulin
  Cc: Daniel Vetter, intel-gfx, stable, Rantala, Valtteri, Eero Tamminen

On Mon, Nov 30, 2015 at 10:02:30AM +0000, Tvrtko Ursulin wrote:
> 
> 
> On 29/11/15 08:48, Chris Wilson wrote:
> >When waiting for high frequency requests, the finite amount of time
> >required to set up the irq and wait upon it limits the response rate. By
> >busywaiting on the request completion for a short while we can service
> >the high frequency waits as quick as possible. However, if it is a slow
> >request, we want to sleep as quickly as possible. The tradeoff between
> >waiting and sleeping is roughly the time it takes to sleep on a request,
> >on the order of a microsecond. Based on measurements of synchronous
> >workloads from across big core and little atom, I have set the limit for
> >busywaiting as 10 microseconds. In most of the synchronous cases, we can
> >reduce the limit down to as little as 2 microseconds, but that leaves
> >quite a few test cases regressing by factors of 3 and more.
> >
> >The code currently uses the jiffie clock, but that is far too coarse (on
> >the order of 10 milliseconds) and results in poor interactivity as the
> >CPU ends up being hogged by slow requests. To get microsecond resolution
> >we need to use a high resolution timer. The cheapest of which is polling
> >local_clock(), but that is only valid on the same CPU. If we switch CPUs
> >because the task was preempted, we can also use that as an indicator that
> >the system is too busy to waste cycles on spinning and we should sleep
> >instead.
> >
> >__i915_spin_request was introduced in
> >commit 2def4ad99befa25775dd2f714fdd4d92faec6e34 [v4.2]
> >Author: Chris Wilson <chris@chris-wilson.co.uk>
> >Date:   Tue Apr 7 16:20:41 2015 +0100
> >
> >      drm/i915: Optimistically spin for the request completion
> >
> >v2: Drop full u64 for unsigned long - the timer is 32bit wraparound safe,
> >so we can use native register sizes on smaller architectures. Mention
> >the approximate microseconds units for elapsed time and add some extra
> >comments describing the reason for busywaiting.
> >
> >v3: Raise the limit to 10us
> >
> >Reported-by: Jens Axboe <axboe@kernel.dk>
> >Link: https://lkml.org/lkml/2015/11/12/621
> >Cc; "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> >Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> >Cc: Eero Tamminen <eero.t.tamminen@intel.com>
> >Cc: "Rantala, Valtteri" <valtteri.rantala@intel.com>
> >Cc: stable@kernel.vger.org
> >---
> >  drivers/gpu/drm/i915/i915_gem.c | 47 +++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 45 insertions(+), 2 deletions(-)
> 
> Again,
> 
> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Sorry, I was just including these patches here so that anyone who wanted
to look at the wait-request patches only had one series to pull :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request
  2015-11-29  8:48 ` [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request Chris Wilson
@ 2015-11-30 10:14   ` Tvrtko Ursulin
  2015-11-30 10:19     ` Chris Wilson
  2015-11-30 10:22   ` Chris Wilson
  1 sibling, 1 reply; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 10:14 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx; +Cc: Lionel Landwerlin, Daniel Vetter


On 29/11/15 08:48, Chris Wilson wrote:
> We have relied upon the sole caller (wait_ioctl) validating the timeout
> argument. However, when waiting for multiple requests I forgot to ensure
> that the timeout was still positive on the later requests. This is more
> simply done inside __i915_wait_request.

As discussed on IRC, please mention that the extra jiffie happens because 
nsecs_to_jiffies_timeout adds it. Otherwise it is not immediately clear 
why it would wait an extra one, since __i915_wait_request already has 
explicit code to ensure the timeout does not go negative.
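
For context, the conversion helper pads its result to guarantee a
minimum wait. Its rough shape in i915_drv.h, quoted from memory, so
treat it as approximate:

	/* round up and add a jiffie so short timeouts never expire early */
	static inline unsigned long nsecs_to_jiffies_timeout(const u64 n)
	{
		return min_t(u64, MAX_JIFFY_OFFSET, nsecs_to_jiffies64(n) + 1);
	}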

With that clarified,

Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>

Regards,

Tvrtko

> Fixes a minor regression introduced in
>
> commit b47161858ba13c9c7e03333132230d66e008dd55
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Mon Apr 27 13:41:17 2015 +0100
>
>      drm/i915: Implement inter-engine read-read optimisations
>
> where we may end up waiting for an extra jiffie for each active ring
> after consuming all of the user's timeout.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> ---
>   drivers/gpu/drm/i915/i915_gem.c | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index f33c35c6130f..65d101b47d8e 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1251,8 +1251,16 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   	if (i915_gem_request_completed(req, true))
>   		return 0;
>
> -	timeout_expire = timeout ?
> -		jiffies + nsecs_to_jiffies_timeout((u64)*timeout) : 0;
> +	timeout_expire = 0;
> +	if (timeout) {
> +		if (WARN_ON(*timeout < 0))
> +			return -EINVAL;
> +
> +		if (*timeout == 0)
> +			return -ETIME;
> +
> +		timeout_expire = jiffies + nsecs_to_jiffies_timeout(*timeout);
> +	}
>
>   	if (INTEL_INFO(dev_priv)->gen >= 6)
>   		gen6_rps_boost(dev_priv, rps, req->emitted_jiffies);
>

* Re: [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request
  2015-11-30 10:14   ` Tvrtko Ursulin
@ 2015-11-30 10:19     ` Chris Wilson
  2015-11-30 10:27       ` Tvrtko Ursulin
  0 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 10:19 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: Lionel Landwerlin, Daniel Vetter, intel-gfx

On Mon, Nov 30, 2015 at 10:14:04AM +0000, Tvrtko Ursulin wrote:
> 
> On 29/11/15 08:48, Chris Wilson wrote:
> >We have relied upon the sole caller (wait_ioctl) validating the timeout
> >argument. However, when waiting for multiple requests I forgot to ensure
> >that the timeout was still positive on the later requests. This is more
> >simply done inside __i915_wait_request.
> 
> As discussed on IRC, please mention that the extra jiffie happens
> because nsecs_to_jiffies_timeout adds it. Otherwise it is not
> immediately clear why it would wait an extra one, since
> __i915_wait_request already has explicit code to ensure the timeout
> does not go negative.

Sorry, I was under the impression that everyone knew the history of our
*to_jiffies_timeout function variants.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 14/15] drm/i915: Only query timestamp when measuring elapsed time
  2015-11-29  8:48 ` [PATCH 14/15] drm/i915: Only query timestamp when measuring elapsed time Chris Wilson
@ 2015-11-30 10:19   ` Tvrtko Ursulin
  2015-11-30 14:31     ` Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 10:19 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


On 29/11/15 08:48, Chris Wilson wrote:
> Avoid the two calls to ktime_get_raw_ns() (at best it reads the TSC) as
> we only need to compute the elapsed time for a timed wait.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>   drivers/gpu/drm/i915/i915_gem.c | 13 +++++--------
>   1 file changed, 5 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 2c3e36e19cb0..871201713c73 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1227,7 +1227,6 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
>   	DEFINE_WAIT(wait);
>   	unsigned long timeout_remain;
> -	s64 before, now;
>   	int ret;
>
>   	if (list_empty(&req->list))
> @@ -1244,13 +1243,12 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   		if (*timeout == 0)
>   			return -ETIME;
>
> +		/* Record current time in case interrupted, or wedged */
>   		timeout_remain = nsecs_to_jiffies_timeout(*timeout);
> +		*timeout += ktime_get_raw_ns();

Don't really like this one: the way you use the passed-in pointer to
store the intermediate local state.

It works, but it just feels too hacky.
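
One alternative shape that would keep the caller's value untouched
until the end, purely as a sketch of what I mean (hypothetical, not
what the patch does):

	/* hypothetical: keep the deadline in a local instead */
	s64 deadline = 0;

	if (timeout)
		deadline = ktime_get_raw_ns() + *timeout;
	/* ... wait ... */
	if (timeout) {
		s64 remain = deadline - ktime_get_raw_ns();
		*timeout = remain < 0 ? 0 : remain;
	}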

Regards,

Tvrtko


>   	}
>
> -	/* Record current time in case interrupted by signal, or wedged */
>   	trace_i915_gem_request_wait_begin(req);
> -	before = ktime_get_raw_ns();
> -
>   	if (INTEL_INFO(req->i915)->gen >= 6)
>   		gen6_rps_boost(req->i915, rps, req->emitted_jiffies);
>
> @@ -1286,14 +1284,13 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   	}
>   	finish_wait(&req->wait, &wait);
>   out:
> -	now = ktime_get_raw_ns();
>   	intel_breadcrumbs_remove_waiter(req);
>   	trace_i915_gem_request_wait_end(req);
>
>   	if (timeout) {
> -		s64 tres = *timeout - (now - before);
> -
> -		*timeout = tres < 0 ? 0 : tres;
> +		*timeout -= ktime_get_raw_ns();
> +		if (*timeout < 0)
> +			*timeout = 0;
>
>   		/*
>   		 * Apparently ktime isn't accurate enough and occasionally has a
>

* Re: [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request
  2015-11-29  8:48 ` [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request Chris Wilson
  2015-11-30 10:14   ` Tvrtko Ursulin
@ 2015-11-30 10:22   ` Chris Wilson
  2015-11-30 10:28     ` Tvrtko Ursulin
  1 sibling, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 10:22 UTC (permalink / raw)
  To: intel-gfx; +Cc: Lionel Landwerlin, Daniel Vetter

On Sun, Nov 29, 2015 at 08:48:05AM +0000, Chris Wilson wrote:
> We have relied upon the sole caller (wait_ioctl) validating the timeout
> argument. However, when waiting for multiple requests I forgot to ensure
> that the timeout was still positive on the later requests. This is more
> simply done inside __i915_wait_request.
> 
> Fixes a minor regression introduced in
> 
> commit b47161858ba13c9c7e03333132230d66e008dd55
> Author: Chris Wilson <chris@chris-wilson.co.uk>
> Date:   Mon Apr 27 13:41:17 2015 +0100
> 
>     drm/i915: Implement inter-engine read-read optimisations
> 
> where we may end up waiting for an extra jiffie for each active ring
> after consuming all of the user's timeout.

where we may end up waiting for an extra jiffie (added by
nsecs_to_jiffies_timeout to guarantee a minimum duration) for each active
ring after consuming all of the user's timeout
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request
  2015-11-30 10:19     ` Chris Wilson
@ 2015-11-30 10:27       ` Tvrtko Ursulin
  0 siblings, 0 replies; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 10:27 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Gong, Zhipeng, Rogozhkin, Dmitry V,
	Lionel Landwerlin, Tvrtko Ursulin, Daniel Vetter


On 30/11/15 10:19, Chris Wilson wrote:
> On Mon, Nov 30, 2015 at 10:14:04AM +0000, Tvrtko Ursulin wrote:
>>
>> On 29/11/15 08:48, Chris Wilson wrote:
>>> We have relied upon the sole caller (wait_ioctl) validating the timeout
>>> argument. However, when waiting for multiple requests I forgot to ensure
>>> that the timeout was still positive on the later requests. This is more
>>> simply done inside __i915_wait_request.
>>
>> As discussed on IRC please mention that the extra jiffie happens
>> because nsecs_to_jiffies_timeout adds it. Otherwise it is not
>> immediately clear why would it wait an extra one since
>> __i915_wait_request has explicit code to ensure timeout does not go
>> negative already.
>
> Sorry, I was under the impression that everyone knew the history of our
> *to_jiffies_timeout function variants.

I don't know if I am the only one who didn't, but when I ask for or 
suggest more specific comments, or commit message additions, during 
review, it is always genuinely for things which I think would have 
helped me with the review. Which means I also think they would help 
anyone else getting up to speed with the code base.

Regards,

Tvrtko

* Re: [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request
  2015-11-30 10:22   ` Chris Wilson
@ 2015-11-30 10:28     ` Tvrtko Ursulin
  0 siblings, 0 replies; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 10:28 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Gong, Zhipeng, Rogozhkin, Dmitry V,
	Lionel Landwerlin, Tvrtko Ursulin, Daniel Vetter



On 30/11/15 10:22, Chris Wilson wrote:
> On Sun, Nov 29, 2015 at 08:48:05AM +0000, Chris Wilson wrote:
>> We have relied upon the sole caller (wait_ioctl) validating the timeout
>> argument. However, when waiting for multiple requests I forgot to ensure
>> that the timeout was still positive on the later requests. This is more
>> simply done inside __i915_wait_request.
>>
>> Fixes a minor regression introduced in
>>
>> commit b47161858ba13c9c7e03333132230d66e008dd55
>> Author: Chris Wilson <chris@chris-wilson.co.uk>
>> Date:   Mon Apr 27 13:41:17 2015 +0100
>>
>>      drm/i915: Implement inter-engine read-read optimisations
>>
>> where we may end up waiting for an extra jiffie for each active ring
>> after consuming all of the user's timeout.
>
> where we may end up waiting for an extra jiffie (added by
> nsecs_to_jiffies_timeout to guarantee a minimum duration) for each active
> ring after consuming all of the user's timeout

Sounds good! R-b given previously can be applied. :)

Regards,

Tvrtko


* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-29  8:48 ` [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd Chris Wilson
@ 2015-11-30 10:53   ` Chris Wilson
  2015-11-30 12:09     ` Tvrtko Ursulin
  2015-11-30 12:05   ` Tvrtko Ursulin
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 10:53 UTC (permalink / raw)
  To: intel-gfx

On Sun, Nov 29, 2015 at 08:48:06AM +0000, Chris Wilson wrote:
> +	/* Optimistic spin for the next jiffie before touching IRQs */
> +	if (intel_breadcrumbs_add_waiter(req)) {
> +		ret = __i915_spin_request(req, state);
> +		if (ret == 0)
> +			goto out;

There are a couple of interesting side-effects here. As we now start
up the irq in parallel and keep it running for longer, irq/i915 now
consumes a lot of CPU time (like 50-100%!) for synchronous batches, but
doesn't seem to interfere with latency (as spin_request is still nicely
running and catching the request completion). That should also still
work nicely on single cpu machines as the irq enabling should not
preempt us.  The second interesting side-effect is that the synchronous
loads that regressed with a 2us spin-request timeout are now happy again
at 2us. Also with the active-request and the can-spin check from
add_waiter, running a low fps game with a compositor is not burning the
CPU with any spinning.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-29  8:48 ` [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd Chris Wilson
  2015-11-30 10:53   ` Chris Wilson
@ 2015-11-30 12:05   ` Tvrtko Ursulin
  2015-11-30 12:30     ` Chris Wilson
  2015-11-30 14:34   ` [PATCH v4] " Chris Wilson
  2015-11-30 15:45   ` [PATCH] drm/i915: Convert trace-irq to the breadcrumb waiter Chris Wilson
  3 siblings, 1 reply; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 12:05 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


Hi,

On 29/11/15 08:48, Chris Wilson wrote:
> One particularly stressful scenario consists of many independent tasks
> all competing for GPU time and waiting upon the results (e.g. realtime
> transcoding of many, many streams). One bottleneck in particular is that
> each client waits on its own results, but every client is woken up after
> every batchbuffer - hence the thunder of hooves as then every client must
> do its heavyweight dance to read a coherent seqno to see if it is the
> lucky one. Alternatively, we can have one kthread responsible for waking
> after an interrupt, checking the seqno and only waking up the waiting
> clients who are complete. The disadvantage is that in the uncontended
> scenario (i.e. only one waiter) we incur an extra context switch in the
> wakeup path - though that should be mitigated somewhat by the busy-wait
> we do first before sleeping.
>
> v2: Convert from a kworker per engine into a dedicated kthread for the
> bottom-half.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile            |   1 +
>   drivers/gpu/drm/i915/i915_dma.c          |   9 +-
>   drivers/gpu/drm/i915/i915_drv.h          |  35 ++++-
>   drivers/gpu/drm/i915/i915_gem.c          |  99 +++---------
>   drivers/gpu/drm/i915/i915_gpu_error.c    |   8 +-
>   drivers/gpu/drm/i915/i915_irq.c          |  17 +--
>   drivers/gpu/drm/i915/intel_breadcrumbs.c | 253 +++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
>   drivers/gpu/drm/i915/intel_ringbuffer.c  |   3 +-
>   drivers/gpu/drm/i915/intel_ringbuffer.h  |   3 +-
>   10 files changed, 333 insertions(+), 97 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/intel_breadcrumbs.c
>
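
Before diving into the diff, the design in one picture: instead of
every waiter sleeping on ring->irq_queue, a single bottom-half task
fields the user interrupt, performs the one coherent seqno read, and
wakes only the waiters whose requests have completed. A very rough
conceptual sketch, with hypothetical helper names, simplified from the
real intel_breadcrumbs.c below:

	/* hypothetical sketch of the bottom-half loop, not the real code */
	while (!kthread_should_stop()) {
		struct drm_i915_gem_request *req;
		u32 seqno;

		wait_for_user_interrupt();		/* hypothetical helper */

		seqno = intel_ring_get_seqno(ring);	/* one coherent read */
		while ((req = first_waiter(ring)) &&	/* oldest in the tree */
		       i915_seqno_passed(seqno, req->seqno)) {
			erase_waiter(ring, req);	/* drop from the rb-tree */
			wake_up_all(&req->wait);	/* wake its clients */
		}
	}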
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 0851de07bd13..d3b9d3618719 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
>   	  i915_gem_userptr.o \
>   	  i915_gpu_error.o \
>   	  i915_trace_points.o \
> +	  intel_breadcrumbs.o \
>   	  intel_lrc.o \
>   	  intel_mocs.o \
>   	  intel_ringbuffer.o \
> diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
> index a81c76603544..93c6762d87b4 100644
> --- a/drivers/gpu/drm/i915/i915_dma.c
> +++ b/drivers/gpu/drm/i915/i915_dma.c
> @@ -388,12 +388,16 @@ static int i915_load_modeset_init(struct drm_device *dev)
>   	if (ret)
>   		goto cleanup_vga_client;
>
> +	ret = intel_breadcrumbs_init(dev_priv);
> +	if (ret)
> +		goto cleanup_vga_switcheroo;
> +
>   	/* Initialise stolen first so that we may reserve preallocated
>   	 * objects for the BIOS to KMS transition.
>   	 */
>   	ret = i915_gem_init_stolen(dev);
>   	if (ret)
> -		goto cleanup_vga_switcheroo;
> +		goto cleanup_breadcrumbs;
>
>   	intel_power_domains_init_hw(dev_priv, false);
>
> @@ -454,6 +458,8 @@ cleanup_irq:
>   	drm_irq_uninstall(dev);
>   cleanup_gem_stolen:
>   	i915_gem_cleanup_stolen(dev);
> +cleanup_breadcrumbs:
> +	intel_breadcrumbs_fini(dev_priv);
>   cleanup_vga_switcheroo:
>   	vga_switcheroo_unregister_client(dev->pdev);
>   cleanup_vga_client:
> @@ -1193,6 +1199,7 @@ int i915_driver_unload(struct drm_device *dev)
>   	mutex_unlock(&dev->struct_mutex);
>   	intel_fbc_cleanup_cfb(dev_priv);
>   	i915_gem_cleanup_stolen(dev);
> +	intel_breadcrumbs_fini(dev_priv);
>
>   	intel_csr_ucode_fini(dev_priv);
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index e1980212ba37..782d08ab6264 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -564,6 +564,7 @@ struct drm_i915_error_state {
>   			long jiffies;
>   			u32 seqno;
>   			u32 tail;
> +			u32 waiters;
>   		} *requests;
>
>   		struct {
> @@ -1383,7 +1384,7 @@ struct i915_gpu_error {
>   #define I915_STOP_RING_ALLOW_WARN      (1 << 30)
>
>   	/* For missed irq/seqno simulation. */
> -	unsigned int test_irq_rings;
> +	unsigned long test_irq_rings;
>   };
>
>   enum modeset_restore {
> @@ -1695,6 +1696,27 @@ struct drm_i915_private {
>
>   	struct intel_uncore uncore;
>
> +	/* Rather than have every client wait upon all user interrupts,
> +	 * with the herd waking after every interrupt and each doing the
> +	 * heavyweight seqno dance, we delegate the task to a bottom-half
> +	 * for the user interrupt (one bh handling all engines). Now,
> +	 * just one task is woken up and does the coherent seqno read and
> +	 * then wakes up all the waiters registered with the bottom-half
> +	 * that are complete. This avoids the thundering herd problem
> +	 * with lots of concurrent waiters, at the expense of an extra
> +	 * context switch between the interrupt and the client. That should
> +	 * be mitigated somewhat by the busyspin in the client before we go to
> +	 * sleep waiting for an interrupt from the bottom-half.
> +	 */
> +	struct intel_breadcrumbs {
> +		spinlock_t lock; /* protects the per-engine request list */

s/list/tree/

> +		struct task_struct *task; /* bh for all user interrupts */
> +		struct intel_breadcrumbs_engine {
> +			struct rb_root requests; /* sorted by retirement */
> +			struct drm_i915_gem_request *first;

/* First request in the tree to be completed next by the hw. */

?

> +		} engine[I915_NUM_RINGS];
> +	} breadcrumbs;
> +
>   	struct i915_virtual_gpu vgpu;
>
>   	struct intel_guc guc;
> @@ -2228,6 +2250,11 @@ struct drm_i915_gem_request {
>   	/** global list entry for this request */
>   	struct list_head list;
>
> +	/** List of clients waiting for completion of this request */
> +	wait_queue_head_t wait;
> +	struct rb_node irq_node;
> +	unsigned irq_count;

To me irq prefix does not fit here that well.

req_tree_node?

waiter_count, waiters?

> +
>   	struct drm_i915_file_private *file_priv;
>   	/** file_priv list entry for this request */
>   	struct list_head client_list;
> @@ -2800,6 +2827,12 @@ ibx_disable_display_interrupt(struct drm_i915_private *dev_priv, uint32_t bits)
>   }
>
>
> +/* intel_breadcrumbs.c -- user interrupt bottom-half for waiters */
> +int intel_breadcrumbs_init(struct drm_i915_private *i915);
> +bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request);
> +void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request);
> +void intel_breadcrumbs_fini(struct drm_i915_private *i915);
> +
>   /* i915_gem.c */
>   int i915_gem_create_ioctl(struct drm_device *dev, void *data,
>   			  struct drm_file *file_priv);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 65d101b47d8e..969592fb0969 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1126,17 +1126,6 @@ i915_gem_check_wedge(unsigned reset_counter,
>   	return 0;
>   }
>
> -static void fake_irq(unsigned long data)
> -{
> -	wake_up_process((struct task_struct *)data);
> -}
> -
> -static bool missed_irq(struct drm_i915_private *dev_priv,
> -		       struct intel_engine_cs *ring)
> -{
> -	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
> -}
> -
>   static unsigned long local_clock_us(unsigned *cpu)
>   {
>   	unsigned long t;
> @@ -1184,9 +1173,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   	 * takes to sleep on a request, on the order of a microsecond.
>   	 */
>
> -	if (req->ring->irq_refcount)
> -		return -EBUSY;
> -
>   	/* Only spin if we know the GPU is processing this request */
>   	if (!i915_gem_request_started(req, false))
>   		return -EAGAIN;
> @@ -1232,26 +1218,19 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   			s64 *timeout,
>   			struct intel_rps_client *rps)
>   {
> -	struct intel_engine_cs *ring = i915_gem_request_get_ring(req);
> -	struct drm_device *dev = ring->dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	const bool irq_test_in_progress =
> -		ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
>   	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
>   	DEFINE_WAIT(wait);
> -	unsigned long timeout_expire;
> +	unsigned long timeout_remain;
>   	s64 before, now;
>   	int ret;
>
> -	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
> -
>   	if (list_empty(&req->list))
>   		return 0;
>
>   	if (i915_gem_request_completed(req, true))
>   		return 0;
>
> -	timeout_expire = 0;
> +	timeout_remain = 0;
>   	if (timeout) {
>   		if (WARN_ON(*timeout < 0))
>   			return -EINVAL;
> @@ -1259,43 +1238,28 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   		if (*timeout == 0)
>   			return -ETIME;
>
> -		timeout_expire = jiffies + nsecs_to_jiffies_timeout(*timeout);
> +		timeout_remain = nsecs_to_jiffies_timeout(*timeout);
>   	}
>
> -	if (INTEL_INFO(dev_priv)->gen >= 6)
> -		gen6_rps_boost(dev_priv, rps, req->emitted_jiffies);
>
>   	/* Record current time in case interrupted by signal, or wedged */
>   	trace_i915_gem_request_wait_begin(req);
>   	before = ktime_get_raw_ns();
>
> -	/* Optimistic spin for the next jiffie before touching IRQs */
> -	ret = __i915_spin_request(req, state);
> -	if (ret == 0)
> -		goto out;
> +	if (INTEL_INFO(req->i915)->gen >= 6)
> +		gen6_rps_boost(req->i915, rps, req->emitted_jiffies);
>
> -	if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) {
> -		ret = -ENODEV;
> -		goto out;
> +	/* Optimistic spin for the next jiffie before touching IRQs */
> +	if (intel_breadcrumbs_add_waiter(req)) {
> +		ret = __i915_spin_request(req, state);
> +		if (ret == 0)
> +			goto out;
>   	}
>
>   	for (;;) {
> -		struct timer_list timer;
> -
> -		prepare_to_wait(&ring->irq_queue, &wait, state);
> +		prepare_to_wait(&req->wait, &wait, state);
>
> -		/* We need to check whether any gpu reset happened in between
> -		 * the caller grabbing the seqno and now ... */
> -		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
> -			/* As we do not requeue the request over a GPU reset,
> -			 * if one does occur we know that the request is
> -			 * effectively complete.
> -			 */
> -			ret = 0;
> -			break;
> -		}
> -
> -		if (i915_gem_request_completed(req, false)) {
> +		if (i915_gem_request_completed(req, true) ||

Why is it OK to change the lazy coherency mode here?

> +		    req->reset_counter != i915_reset_counter(&req->i915->gpu_error)) {

It looks like it would be valuable to preserve the comment about no 
re-queuing etc on GPU reset.

>   			ret = 0;
>   			break;
>   		}
> @@ -1305,36 +1269,19 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   			break;
>   		}
>
> -		if (timeout && time_after_eq(jiffies, timeout_expire)) {
> -			ret = -ETIME;
> -			break;
> -		}
> -
> -		i915_queue_hangcheck(dev_priv);
> -
> -		timer.function = NULL;
> -		if (timeout || missed_irq(dev_priv, ring)) {
> -			unsigned long expire;
> -
> -			setup_timer_on_stack(&timer, fake_irq, (unsigned long)current);
> -			expire = missed_irq(dev_priv, ring) ? jiffies + 1 : timeout_expire;
> -			mod_timer(&timer, expire);
> -		}
> -
> -		io_schedule();
> -
> -		if (timer.function) {
> -			del_singleshot_timer_sync(&timer);
> -			destroy_timer_on_stack(&timer);
> -		}
> +		if (timeout) {
> +			timeout_remain = io_schedule_timeout(timeout_remain);
> +			if (timeout_remain == 0) {
> +				ret = -ETIME;
> +				break;
> +			}
> +		} else
> +			io_schedule();

Could set timeout_remain to that MAX value when timeout == NULL and have 
a single call site to io_schedule_timeout and less indentation. It 
doesn't matter hugely, maybe it would be a bit easier to read.
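
For illustration, a minimal sketch of that suggestion, reusing the names
from the patch (an untested rewrite, not part of the series; the error
checks on *timeout are elided):

	timeout_remain = MAX_SCHEDULE_TIMEOUT;
	if (timeout)
		timeout_remain = nsecs_to_jiffies_timeout(*timeout);

	for (;;) {
		prepare_to_wait(&req->wait, &wait, state);

		/* completion / signal / reset checks as before ... */

		/* Single call site: io_schedule() is equivalent to
		 * io_schedule_timeout(MAX_SCHEDULE_TIMEOUT), and that
		 * never returns 0, so the no-timeout case cannot hit
		 * the -ETIME path below.
		 */
		timeout_remain = io_schedule_timeout(timeout_remain);
		if (timeout_remain == 0) {
			ret = -ETIME;
			break;
		}
	}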

>   	}
> -	if (!irq_test_in_progress)
> -		ring->irq_put(ring);
> -
> -	finish_wait(&ring->irq_queue, &wait);
> -
> +	finish_wait(&req->wait, &wait);
>   out:
>   	now = ktime_get_raw_ns();
> +	intel_breadcrumbs_remove_waiter(req);
>   	trace_i915_gem_request_wait_end(req);
>
>   	if (timeout) {
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 06ca4082735b..d1df405b905c 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -453,10 +453,11 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
>   				   dev_priv->ring[i].name,
>   				   error->ring[i].num_requests);
>   			for (j = 0; j < error->ring[i].num_requests; j++) {
> -				err_printf(m, "  seqno 0x%08x, emitted %ld, tail 0x%08x\n",
> +				err_printf(m, "  seqno 0x%08x, emitted %ld, tail 0x%08x, waiters? %d\n",
>   					   error->ring[i].requests[j].seqno,
>   					   error->ring[i].requests[j].jiffies,
> -					   error->ring[i].requests[j].tail);
> +					   error->ring[i].requests[j].tail,
> +					   error->ring[i].requests[j].waiters);
>   			}
>   		}
>
> @@ -900,7 +901,7 @@ static void i915_record_ring_state(struct drm_device *dev,
>   		ering->instdone = I915_READ(GEN2_INSTDONE);
>   	}
>
> -	ering->waiting = waitqueue_active(&ring->irq_queue);
> +	ering->waiting = READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first);

What does the READ_ONCE do here?

>   	ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
>   	ering->seqno = ring->get_seqno(ring, false);
>   	ering->acthd = intel_ring_get_active_head(ring);
> @@ -1105,6 +1106,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
>   			erq->seqno = request->seqno;
>   			erq->jiffies = request->emitted_jiffies;
>   			erq->tail = request->postfix;
> +			erq->waiters = request->irq_count;
>   		}
>   	}
>   }
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 28c0329f8281..0c2390476bdb 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -996,12 +996,11 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
>
>   static void notify_ring(struct intel_engine_cs *ring)
>   {
> -	if (!intel_ring_initialized(ring))
> +	if (ring->i915 == NULL)
>   		return;
>
>   	trace_i915_gem_request_notify(ring);
> -
> -	wake_up_all(&ring->irq_queue);
> +	wake_up_process(ring->i915->breadcrumbs.task);

For a later extension:

How would you view, some time in the future, adding a bool return to 
ring->put_irq() which would say to the thread that it failed to disable 
interrupts?

In this case thread would exit and would need to be restarted when the 
next waiter queues up. Possibly extending the idle->put_irq->exit 
durations as well then.
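
A hedged sketch of that floated change (the bool return and the restart
path are hypothetical, not in the posted patch):

	/* hypothetical: irq_put reports whether it managed to mask the
	 * user interrupt for this engine */
	bool (*irq_put)(struct intel_engine_cs *ring);

	/* in the bottom-half loop: give up and exit on failure; the
	 * next intel_breadcrumbs_add_waiter() would then need to
	 * wake_up_process() a freshly restarted task */
	if (!engine->irq_put(engine))
		return 0;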

>   }
>
>   static void vlv_c0_read(struct drm_i915_private *dev_priv,
> @@ -2399,9 +2398,6 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
>   static void i915_error_wake_up(struct drm_i915_private *dev_priv,
>   			       bool reset_completed)
>   {
> -	struct intel_engine_cs *ring;
> -	int i;
> -
>   	/*
>   	 * Notify all waiters for GPU completion events that reset state has
>   	 * been changed, and that they need to restart their wait after
> @@ -2410,8 +2406,7 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
>   	 */
>
>   	/* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
> -	for_each_ring(ring, dev_priv, i)
> -		wake_up_all(&ring->irq_queue);
> +	wake_up_process(dev_priv->breadcrumbs.task);
>
>   	/* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */
>   	wake_up_all(&dev_priv->pending_flip_queue);
> @@ -2986,16 +2981,16 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
>   			if (ring_idle(ring, seqno)) {
>   				ring->hangcheck.action = HANGCHECK_IDLE;
>
> -				if (waitqueue_active(&ring->irq_queue)) {
> +				if (READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first)) {

READ_ONCE is again strange, but other than that I don't know the
hangcheck code well enough to really evaluate it.

Perhaps this also means this line needs a comment describing what
condition/state the check is actually approximating.

>   					/* Issue a wake-up to catch stuck h/w. */
>   					if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
> -						if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
> +						if (!test_bit(ring->id, &dev_priv->gpu_error.test_irq_rings))
>   							DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
>   								  ring->name);
>   						else
>   							DRM_INFO("Fake missed irq on %s\n",
>   								 ring->name);
> -						wake_up_all(&ring->irq_queue);
> +						wake_up_process(dev_priv->breadcrumbs.task);
>   					}
>   					/* Safeguard against driver failure */
>   					ring->hangcheck.score += BUSY;
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> new file mode 100644
> index 000000000000..0a18405c8689
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -0,0 +1,253 @@
> +/*
> + * Copyright © 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include <linux/kthread.h>
> +
> +#include "i915_drv.h"
> +
> +static bool __irq_enable(struct intel_engine_cs *engine)
> +{
> +	if (test_bit(engine->id, &engine->i915->gpu_error.test_irq_rings))
> +		return false;
> +
> +	if (!intel_irqs_enabled(engine->i915))
> +		return false;
> +
> +	return engine->irq_get(engine);
> +}
> +
> +static struct drm_i915_gem_request *to_request(struct rb_node *rb)
> +{
> +	if (rb == NULL)
> +		return NULL;
> +
> +	return rb_entry(rb, struct drm_i915_gem_request, irq_node);
> +}
> +
> +/*
> + * intel_breadcrumbs_irq() acts as the bottom-half for the user interrupt,
> + * which we use as the breadcrumb after every request. In order to scale
> + * to many concurrent waiters, we delegate the task of reading the coherent
> + * seqno to this bottom-half. (Otherwise every waiter is woken after each
> + * interrupt and they all attempt to perform the heavyweight coherent
> + * seqno check.) Individual clients register with the bottom-half when
> + * waiting on a request, and are then woken when the bottom-half notices
> + * their seqno is complete. This incurs an extra context switch from the
> + * interrupt to the client in the uncontended case.
> + */
> +static int intel_breadcrumbs_irq(void *data)
> +{
> +	struct drm_i915_private *i915 = data;
> +	struct intel_breadcrumbs *b = &i915->breadcrumbs;
> +	unsigned irq_get = 0;
> +	unsigned irq_enabled = 0;
> +	int i;
> +
> +	while (!kthread_should_stop()) {
> +		unsigned reset_counter = i915_reset_counter(&i915->gpu_error);
> +		unsigned long timeout_jiffies;
> +		bool idling = false;
> +
> +		/* On every tick, walk the seqno-ordered list of requests
> +		 * and for each retired request wakeup its clients. If we
> +		 * find an unfinished request, go back to sleep.
> +		 */

s/tick/loop/ ?

And if we don't find an unfinished request, still go back to sleep. :)

> +		set_current_state(TASK_INTERRUPTIBLE);
> +
> +		/* Note carefully that we do not hold a reference for the
> +		 * requests on this list. Therefore we only inspect them
> +		 * whilst holding the spinlock to ensure that they are not
> +		 * freed in the meantime, and the client must remove the
> +		 * request from the list if it is interrupted (before it
> +		 * itself releases its reference).
> +		 */
> +		spin_lock(&b->lock);

This lock is more per-engine than global in its nature unless you think
it was more efficient to do fewer lock atomics vs potential overlap of
waiter activity between engines?

Would need a new lock for req->irq_count, per request. So 
req->lock/req->irq_count and be->lock for the tree operations.

Thread would only need to deal with the latter. Feels like it would work,
just not sure if it is worth it.
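
A hedged sketch of that split, with req->lock as the hypothetical
per-request lock and be->lock as the per-engine tree lock:

	spin_lock(&req->lock);
	if (req->irq_count++ == 0) {
		/* only the first waiter touches the per-engine tree */
		spin_lock(&be->lock);
		__breadcrumbs_insert(be, req); /* hypothetical helper */
		spin_unlock(&be->lock);
	}
	spin_unlock(&req->lock);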

> +		for (i = 0; i < I915_NUM_RINGS; i++) {
> +			struct intel_engine_cs *engine = &i915->ring[i];
> +			struct intel_breadcrumbs_engine *be = &b->engine[i];
> +			struct drm_i915_gem_request *request = be->first;
> +
> +			if (request == NULL) {
> +				if ((irq_get & (1 << i))) {
> +					if (irq_enabled & (1 << i)) {
> +						engine->irq_put(engine);
> +						irq_enabled &= ~ (1 << i);
> +					}
> +					intel_runtime_pm_put(i915);
> +					irq_get &= ~(1 << i);
> +				}
> +				continue;
> +			}
> +
> +			if ((irq_get & (1 << i)) == 0) {
> +				intel_runtime_pm_get(i915);
> +				irq_enabled |= __irq_enable(engine) << i;
> +				irq_get |= 1 << i;
> +			}

irq_get and irq_enabled are creating a little bit of a headache for me. 
And that would probably mean a comment is be in order to explain what 
they are for and how they work.

Like this I don't see the need for irq_get to persist across loops?

irq_get looks like "try to get these", irq_enabled is "these ones I 
got". Apart for the weird usage of it to balance the runtime pm gets and 
puts.

A bit confusing as I said.

> +			do {
> +				struct rb_node *next;
> +
> +				if (request->reset_counter == reset_counter &&
> +				    !i915_gem_request_completed(request, false))
> +					break;
> +
> +				next = rb_next(&request->irq_node);
> +				rb_erase(&request->irq_node, &be->requests);
> +				RB_CLEAR_NODE(&request->irq_node);
> +
> +				wake_up_all(&request->wait);
> +
> +				request = to_request(next);
> +			} while (request);
> +			be->first = request;
> +			idling |= request == NULL;
> +
> +			/* Make sure the hangcheck timer is armed in case
> +			 * the GPU hangs and we are never woken up.
> +			 */
> +			if (request)
> +				i915_queue_hangcheck(i915);
> +		}
> +		spin_unlock(&b->lock);
> +
> +		/* If we don't have interrupts available (either we have
> +		 * detected the hardware is missing interrupts or the
> +		 * interrupt delivery was disabled by the user), wake up
> +		 * every jiffie to check for request completion.
> +		 *
> +		 * If we have processed all requests for one engine, we
> +		 * also wish to wake up in a jiffie to turn off the
> +		 * breadcrumb interrupt for that engine. We delay
> +		 * switching off the interrupt in order to allow another
> +		 * waiter to start without incurring additional latency
> +		 * enabling the interrupt.
> +		 */
> +		timeout_jiffies = MAX_SCHEDULE_TIMEOUT;
> +		if (idling || i915->gpu_error.missed_irq_rings & irq_enabled)
> +			timeout_jiffies = 1;
> +
> +		/* Unlike the individual clients, we do not want this
> +		 * background thread to contribute to the system load,
> +		 * i.e. we do not want to use io_schedule() here.
> +		 */
> +		schedule_timeout(timeout_jiffies);
> +	}
> +
> +	for (i = 0; i < I915_NUM_RINGS; i++) {
> +		if ((irq_get & (1 << i))) {
> +			if (irq_enabled & (1 << i)) {
> +				struct intel_engine_cs *engine = &i915->ring[i];
> +				engine->irq_put(engine);
> +			}
> +			intel_runtime_pm_put(i915);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request)
> +{
> +	struct intel_breadcrumbs *b = &request->i915->breadcrumbs;
> +	bool first = false;
> +
> +	spin_lock(&b->lock);
> +	if (request->irq_count++ == 0) {
> +		struct intel_breadcrumbs_engine *be =
> +			&b->engine[request->ring->id];
> +		struct rb_node **p, *parent;
> +
> +		if (be->first == NULL)
> +			wake_up_process(b->task);

With a global b->lock it makes no difference but in the logical sense 
this would usually go last, after the request has been inserted into the 
tree and waiter initialized.

> +
> +		init_waitqueue_head(&request->wait);
> +
> +		first = true;
> +		parent = NULL;
> +		p = &be->requests.rb_node;
> +		while (*p) {
> +			struct drm_i915_gem_request *__req;
> +
> +			parent = *p;
> +			__req = rb_entry(parent, typeof(*__req), irq_node);
> +
> +			if (i915_seqno_passed(request->seqno, __req->seqno)) {
> +				p = &parent->rb_right;
> +				first = false;
> +			} else
> +				p = &parent->rb_left;
> +		}
> +		if (first)
> +			be->first = request;
> +
> +		rb_link_node(&request->irq_node, parent, p);
> +		rb_insert_color(&request->irq_node, &be->requests);
> +	}
> +	spin_unlock(&b->lock);
> +
> +	return first;
> +}
> +
> +void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request)
> +{
> +	struct intel_breadcrumbs *b = &request->i915->breadcrumbs;
> +
> +	spin_lock(&b->lock);
> +	if (--request->irq_count == 0 && !RB_EMPTY_NODE(&request->irq_node)) {
> +		struct intel_breadcrumbs_engine *be =
> +			&b->engine[request->ring->id];
> +		if (be->first == request)
> +			be->first = to_request(rb_next(&request->irq_node));
> +		rb_erase(&request->irq_node, &be->requests);
> +	}
> +	spin_unlock(&b->lock);
> +}
> +
> +int intel_breadcrumbs_init(struct drm_i915_private *i915)
> +{
> +	struct intel_breadcrumbs *b = &i915->breadcrumbs;
> +	struct task_struct *task;
> +
> +	spin_lock_init(&b->lock);
> +
> +	task = kthread_run(intel_breadcrumbs_irq, i915, "irq/i915");
> +	if (IS_ERR(task))
> +		return PTR_ERR(task);
> +
> +	b->task = task;
> +
> +	return 0;
> +}
> +
> +void intel_breadcrumbs_fini(struct drm_i915_private *i915)
> +{
> +	struct intel_breadcrumbs *b = &i915->breadcrumbs;
> +
> +	if (b->task == NULL)
> +		return;
> +
> +	kthread_stop(b->task);
> +	b->task = NULL;
> +}
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index b40ffc3607e7..604c081b0a0f 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1922,10 +1922,10 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
>   	ring->buffer = NULL;
>
>   	ring->dev = dev;
> +	ring->i915 = to_i915(dev);
>   	INIT_LIST_HEAD(&ring->active_list);
>   	INIT_LIST_HEAD(&ring->request_list);
>   	i915_gem_batch_pool_init(dev, &ring->batch_pool);
> -	init_waitqueue_head(&ring->irq_queue);
>
>   	INIT_LIST_HEAD(&ring->buffers);
>   	INIT_LIST_HEAD(&ring->execlist_queue);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 511efe556d73..5a5a7195dd54 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2157,6 +2157,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>   	WARN_ON(ring->buffer);
>
>   	ring->dev = dev;
> +	ring->i915 = to_i915(dev);
>   	INIT_LIST_HEAD(&ring->active_list);
>   	INIT_LIST_HEAD(&ring->request_list);
>   	INIT_LIST_HEAD(&ring->execlist_queue);
> @@ -2164,8 +2165,6 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>   	i915_gem_batch_pool_init(dev, &ring->batch_pool);
>   	memset(ring->semaphore.sync_seqno, 0, sizeof(ring->semaphore.sync_seqno));
>
> -	init_waitqueue_head(&ring->irq_queue);
> -
>   	ringbuf = intel_engine_create_ringbuffer(ring, 32 * PAGE_SIZE);
>   	if (IS_ERR(ringbuf))
>   		return PTR_ERR(ringbuf);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 5d1eb206151d..49b5bded2767 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -157,6 +157,7 @@ struct  intel_engine_cs {
>   #define LAST_USER_RING (VECS + 1)
>   	u32		mmio_base;
>   	struct		drm_device *dev;
> +	struct drm_i915_private *i915;
>   	struct intel_ringbuffer *buffer;
>   	struct list_head buffers;
>
> @@ -303,8 +304,6 @@ struct  intel_engine_cs {
>
>   	bool gpu_caches_dirty;
>
> -	wait_queue_head_t irq_queue;
> -
>   	struct intel_context *default_context;
>   	struct intel_context *last_context;
>
>

Regards,

Tvrtko

* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 10:53   ` Chris Wilson
@ 2015-11-30 12:09     ` Tvrtko Ursulin
  2015-11-30 12:38       ` Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 12:09 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Gong, Zhipeng, Rogozhkin, Dmitry V


Hi,

On 30/11/15 10:53, Chris Wilson wrote:
> On Sun, Nov 29, 2015 at 08:48:06AM +0000, Chris Wilson wrote:
>> +	/* Optimistic spin for the next jiffie before touching IRQs */
>> +	if (intel_breadcrumbs_add_waiter(req)) {
>> +		ret = __i915_spin_request(req, state);
>> +		if (ret == 0)
>> +			goto out;
>
> There are a couple of interesting side-effects here.  As we now start
> up the irq in parallel and keep it running for longer, irq/i915 now
> consumes a lot of CPU time (like 50-100%!) for synchronous batches, but
> doesn't seem to interfere with latency (as spin_request is still nicely
> running and catching the request completion). That should also still
> work nicely on single cpu machines as the irq enabling should not
> preempt us.  The second interesting side-effect is that the synchronous
> loads that regressed with a 2us spin-request timeout are now happy again
> at 2us. Also with the active-request and the can-spin check from
> add_waiter, running a low fps game with a compositor is not burning the
> CPU with any spinning.

Interesting? :) Sounds bad the way you presented it.

Why and where is the thread burning so much CPU? Would per engine req 
tree locks help?

Is this new CPU time or the one which would previously be accounted 
against each waiter, polling in wait request?

Regards,

Tvrtko

* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 12:05   ` Tvrtko Ursulin
@ 2015-11-30 12:30     ` Chris Wilson
  2015-11-30 13:32       ` Tvrtko Ursulin
  0 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 12:30 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Nov 30, 2015 at 12:05:50PM +0000, Tvrtko Ursulin wrote:
> >+	struct intel_breadcrumbs {
> >+		spinlock_t lock; /* protects the per-engine request list */
> 
> s/list/tree/

list. We use the tree as a list!

> >+		struct task_struct *task; /* bh for all user interrupts */
> >+		struct intel_breadcrumbs_engine {
> >+			struct rb_root requests; /* sorted by retirement */
> >+			struct drm_i915_gem_request *first;
> 
> /* First request in the tree to be completed next by the hw. */
> 
> ?

/* first */ ?
 
> >+		} engine[I915_NUM_RINGS];
> >+	} breadcrumbs;
> >+
> >  	struct i915_virtual_gpu vgpu;
> >
> >  	struct intel_guc guc;
> >@@ -2228,6 +2250,11 @@ struct drm_i915_gem_request {
> >  	/** global list entry for this request */
> >  	struct list_head list;
> >
> >+	/** List of clients waiting for completion of this request */
> >+	wait_queue_head_t wait;
> >+	struct rb_node irq_node;
> >+	unsigned irq_count;
> 
> To me irq prefix does not fit here that well.
> 
> req_tree_node?
> 
> waiter_count, waiters?

waiters, breadcrumb_node, waiters_count ?

> >  	for (;;) {
> >-		struct timer_list timer;
> >-
> >-		prepare_to_wait(&ring->irq_queue, &wait, state);
> >+		prepare_to_wait(&req->wait, &wait, state);
> >
> >-		/* We need to check whether any gpu reset happened in between
> >-		 * the caller grabbing the seqno and now ... */
> >-		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
> >-			/* As we do not requeue the request over a GPU reset,
> >-			 * if one does occur we know that the request is
> >-			 * effectively complete.
> >-			 */
> >-			ret = 0;
> >-			break;
> >-		}
> >-
> >-		if (i915_gem_request_completed(req, false)) {
> >+		if (i915_gem_request_completed(req, true) ||
> 
> Why is it OK to change the lazy coherency mode here?

We bang on the coherency after the IRQ. Here, we are woken by the IRQ
waiter so we know that the IRQ/seqno coherency is fine and we can just
check the CPU cache.
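
In other words (a comment-only illustration; the boolean is the existing
lazy_coherency parameter):

	i915_gem_request_completed(req, false); /* force a coherent seqno read */
	i915_gem_request_completed(req, true);  /* lazy: trust the cached seqno;
						 * safe here because the bottom-half
						 * did the coherent read before
						 * waking us */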
 
> >+		    req->reset_counter != i915_reset_counter(&req->i915->gpu_error)) {
> 
> It looks like it would be valuable to preserve the comment about no
> re-queuing etc on GPU reset.

I can drop it from the previous patch :)

> >+		if (timeout) {
> >+			timeout_remain = io_schedule_timeout(timeout_remain);
> >+			if (timeout_remain == 0) {
> >+				ret = -ETIME;
> >+				break;
> >+			}
> >+		} else
> >+			io_schedule();
> 
> Could set timeout_remain to that MAX value when timeout == NULL and
> have a single call site to io_schedule_timeout and less indentation.
> It doesn't matter hugely, maybe it would be a bit easier to read.

Done. io_schedule() calls io_schedule_timeout(), whereas
schedule_timeout() calls schedule()!

> >@@ -900,7 +901,7 @@ static void i915_record_ring_state(struct drm_device *dev,
> >  		ering->instdone = I915_READ(GEN2_INSTDONE);
> >  	}
> >
> >-	ering->waiting = waitqueue_active(&ring->irq_queue);
> >+	ering->waiting = READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first);
> 
> What does the READ_ONCE do here?

This element is protected by the breadcrumbs spinlock, the READ_ONCE is
documentation that we don't care about the lock here.
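
i.e. something like this, with the comment added purely for illustration:

	/* Lockless peek: READ_ONCE() documents that we knowingly race
	 * against b->lock here; a stale answer is acceptable for error
	 * state capture.
	 */
	ering->waiting = READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first);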

> >  static void notify_ring(struct intel_engine_cs *ring)
> >  {
> >-	if (!intel_ring_initialized(ring))
> >+	if (ring->i915 == NULL)
> >  		return;
> >
> >  	trace_i915_gem_request_notify(ring);
> >-
> >-	wake_up_all(&ring->irq_queue);
> >+	wake_up_process(ring->i915->breadcrumbs.task);
> 
> For a later extension:
> 
> How would you view, some time in the future, adding a bool return to
> ring->put_irq() which would say to the thread that it failed to
> disable interrupts?
> 
> In this case thread would exit and would need to be restarted when
> the next waiter queues up. Possibly extending the
> idle->put_irq->exit durations as well then.

Honestly, not keen on the idea that the lowlevel put_irq can fail (makes
cleanup much more fraught). I don't see what improvement you intend
here, maybe clarifying that would help explain what you mean.

> >@@ -2986,16 +2981,16 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
> >  			if (ring_idle(ring, seqno)) {
> >  				ring->hangcheck.action = HANGCHECK_IDLE;
> >
> >-				if (waitqueue_active(&ring->irq_queue)) {
> >+				if (READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first)) {
> 
> READ_ONCE is again strange, but other than that I don't know the
> hangcheck code well enough to really evaluate it.
> 
> Perhaps this also means this line needs a comment describing what
> condition/state the check is actually approximating.

All we are doing is asking if anyone is waiting on the GPU, because the
GPU has finished all of its work. If so, the IRQ handling is suspect
(modulo the pre-existing race condition clearly earmarked by READ_ONCE).

> >+	while (!kthread_should_stop()) {
> >+		unsigned reset_counter = i915_reset_counter(&i915->gpu_error);
> >+		unsigned long timeout_jiffies;
> >+		bool idling = false;
> >+
> >+		/* On every tick, walk the seqno-ordered list of requests
> >+		 * and for each retired request wakeup its clients. If we
> >+		 * find an unfinished request, go back to sleep.
> >+		 */
> 
> s/tick/loop/ ?
s/tick/iteration/ I'm sticking with tick
 
> And if we don't find an unfinished request, still go back to sleep. :)

Ok.
 
> >+		set_current_state(TASK_INTERRUPTIBLE);
> >+
> >+		/* Note carefully that we do not hold a reference for the
> >+		 * requests on this list. Therefore we only inspect them
> >+		 * whilst holding the spinlock to ensure that they are not
> >+		 * freed in the meantime, and the client must remove the
> >+		 * request from the list if it is interrupted (before it
> >+		 * itself releases its reference).
> >+		 */
> >+		spin_lock(&b->lock);
> 
> This lock is more per-engine than global in its nature unless you
> think it was more efficient to do fewer lock atomics vs potential
> overlap of waiter activity between engines?

Indeed. I decided not to care about making it per-engine.
 
> Would need a new lock for req->irq_count, per request. So
> req->lock/req->irq_count and be->lock for the tree operations.

Why? A request is tied to an individual engine, therefore we can use
that breadcrumb_engine.lock for the request.
 
> Thread would only need to deal with the latter. Feels like it would
> work, just not sure if it is worth it.

That was my feeling as well. Not sure if we will have huge contention
that would be solved with per-engine spinlocks, whilst we still have
struct_mutex.

> >+		for (i = 0; i < I915_NUM_RINGS; i++) {
> >+			struct intel_engine_cs *engine = &i915->ring[i];
> >+			struct intel_breadcrumbs_engine *be = &b->engine[i];
> >+			struct drm_i915_gem_request *request = be->first;
> >+
> >+			if (request == NULL) {
> >+				if ((irq_get & (1 << i))) {
> >+					if (irq_enabled & (1 << i)) {
> >+						engine->irq_put(engine);
> >+						irq_enabled &= ~ (1 << i);
> >+					}
> >+					intel_runtime_pm_put(i915);
> >+					irq_get &= ~(1 << i);
> >+				}
> >+				continue;
> >+			}
> >+
> >+			if ((irq_get & (1 << i)) == 0) {
> >+				intel_runtime_pm_get(i915);
> >+				irq_enabled |= __irq_enable(engine) << i;
> >+				irq_get |= 1 << i;
> >+			}
> 
> irq_get and irq_enabled are creating a little bit of a headache for
> me. And that would probably mean a comment is in order to explain
> what they are for and how they work.
> 
> Like this I don't see the need for irq_get to persist across loops?

Because getting the irq is quite slow, we add a jiffie shadow.
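
In other words, a possible comment over that fragment, reflecting the
jiffie-shadow intent (the comment is a suggested reading, not from the
patch):

	if ((irq_get & (1 << i)) == 0) {
		/* First waiter seen on this engine: take the wakelock
		 * and try to unmask the user interrupt. irq_get persists
		 * across iterations, so the interrupt stays enabled for
		 * at least the next tick and a new waiter arriving soon
		 * after avoids paying the slow enable path again.
		 */
		intel_runtime_pm_get(i915);
		irq_enabled |= __irq_enable(engine) << i;
		irq_get |= 1 << i;
	}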
 
> irq_get looks like "try to get these", irq_enabled is "these ones I
> got". Apart for the weird usage of it to balance the runtime pm gets
> and puts.
> 
> A bit confusing as I said.

I don't like the names, but haven't thought of anything better.
 
> >+bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request)
> >+{
> >+	struct intel_breadcrumbs *b = &request->i915->breadcrumbs;
> >+	bool first = false;
> >+
> >+	spin_lock(&b->lock);
> >+	if (request->irq_count++ == 0) {
> >+		struct intel_breadcrumbs_engine *be =
> >+			&b->engine[request->ring->id];
> >+		struct rb_node **p, *parent;
> >+
> >+		if (be->first == NULL)
> >+			wake_up_process(b->task);
> 
> With a global b->lock

or even per engine

> it makes no difference but in the logical
> sense this would usually go last, after the request has been
> inserted into the tree and waiter initialized.

Except that the code is simpler with kicking the task first.

Since we both thought it might make sense with a per-engine lock, let's
go with it. Maybe one day it will pay off.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 12:09     ` Tvrtko Ursulin
@ 2015-11-30 12:38       ` Chris Wilson
  2015-11-30 13:33         ` Tvrtko Ursulin
  0 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 12:38 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Nov 30, 2015 at 12:09:30PM +0000, Tvrtko Ursulin wrote:
> 
> Hi,
> 
> On 30/11/15 10:53, Chris Wilson wrote:
> >On Sun, Nov 29, 2015 at 08:48:06AM +0000, Chris Wilson wrote:
> >>+	/* Optimistic spin for the next jiffie before touching IRQs */
> >>+	if (intel_breadcrumbs_add_waiter(req)) {
> >>+		ret = __i915_spin_request(req, state);
> >>+		if (ret == 0)
> >>+			goto out;
> >
> >There are a couple of interesting side-effects here.  As we now start
> >up the irq in parallel and keep it running for longer, irq/i915 now
> >consumes a lot of CPU time (like 50-100%!) for synchronous batches, but
> >doesn't seem to interfere with latency (as spin_request is still nicely
> >running and catching the request completion). That should also still
> >work nicely on single cpu machines as the irq enabling should not
> >preempt us.  The second interesting side-effect is that the synchronous
> >loads that regressed with a 2us spin-request timeout are now happy again
> >at 2us. Also with the active-request and the can-spin check from
> >add_waiter, running a low fps game with a compositor is not burning the
> >CPU with any spinning.
> 
> Interesting? :) Sounds bad the way you presented it.
> 
> Why and where is the thread burning so much CPU? Would per engine
> req tree locks help?

Just simply being woken up after every batch and checking the seqno is
that expensive. Almost as expensive as the IRQ handler itself! (I expect
top is adding the IRQ handler time to the irq/i915 in this case, perf
says that more time is spent in the IRQ than the bottom-half.) Almost
all waiters will be on the same engine, so I don't expect finer grained
spinlocks to be hugely important.
 
> Is this new CPU time or the one which would previously be accounted
> against each waiter, polling in wait request?

This is CPU time that used to be absorbed in i915_wait_request(),
currently hidden by i915_spin_request(). It is the reason why having
every waiter doing the check after every interrupt is such a big issue
for some workloads.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 12:30     ` Chris Wilson
@ 2015-11-30 13:32       ` Tvrtko Ursulin
  2015-11-30 14:18         ` Chris Wilson
  2015-11-30 14:26         ` Chris Wilson
  0 siblings, 2 replies; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 13:32 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Gong, Zhipeng, Rogozhkin, Dmitry V


On 30/11/15 12:30, Chris Wilson wrote:
> On Mon, Nov 30, 2015 at 12:05:50PM +0000, Tvrtko Ursulin wrote:
>>> +	struct intel_breadcrumbs {
>>> +		spinlock_t lock; /* protects the per-engine request list */
>>
>> s/list/tree/
>
> list. We use the tree as a list!

Maybe but it confuses the reader a bit who goes and looks for some list 
then.

Related, when I was playing with this I was piggy backing on the 
existing per engine request list.

If they are seqno-ordered there, and I think they are, then the waker
thread just traverses it in order until hitting the first not yet
completed one, waking up every req->wait_queue as it goes.

In this case it doesn't matter that the waiters come in in 
indeterministic order so we don't need a (new) tree.

But the downside is the thread needs struct mutex. And that the 
retirement from that list is so lazy..
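
A minimal sketch of that alternative (assuming the per-engine
request_list is seqno-ordered and struct_mutex is held, as noted):

	struct drm_i915_gem_request *req;

	list_for_each_entry(req, &ring->request_list, list) {
		if (!i915_gem_request_completed(req, false))
			break; /* everything beyond this is unfinished too */
		wake_up_all(&req->wait);
	}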

>>> +		struct task_struct *task; /* bh for all user interrupts */
>>> +		struct intel_breadcrumbs_engine {
>>> +			struct rb_root requests; /* sorted by retirement */
>>> +			struct drm_i915_gem_request *first;
>>
>> /* First request in the tree to be completed next by the hw. */
>>
>> ?
>
> /* first */ ?

Funny? :) First what? Why make the reader reverse engineer it?

>>> +		} engine[I915_NUM_RINGS];
>>> +	} breadcrumbs;
>>> +
>>>   	struct i915_virtual_gpu vgpu;
>>>
>>>   	struct intel_guc guc;
>>> @@ -2228,6 +2250,11 @@ struct drm_i915_gem_request {
>>>   	/** global list entry for this request */
>>>   	struct list_head list;
>>>
>>> +	/** List of clients waiting for completion of this request */
>>> +	wait_queue_head_t wait;
>>> +	struct rb_node irq_node;
>>> +	unsigned irq_count;
>>
>> To me irq prefix does not fit here that well.
>>
>> req_tree_node?
>>
>> waiter_count, waiters?
>
> waiters, breadcrumb_node, waiters_count ?

Deal.

>>>   	for (;;) {
>>> -		struct timer_list timer;
>>> -
>>> -		prepare_to_wait(&ring->irq_queue, &wait, state);
>>> +		prepare_to_wait(&req->wait, &wait, state);
>>>
>>> -		/* We need to check whether any gpu reset happened in between
>>> -		 * the caller grabbing the seqno and now ... */
>>> -		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
>>> -			/* As we do not requeue the request over a GPU reset,
>>> -			 * if one does occur we know that the request is
>>> -			 * effectively complete.
>>> -			 */
>>> -			ret = 0;
>>> -			break;
>>> -		}
>>> -
>>> -		if (i915_gem_request_completed(req, false)) {
>>> +		if (i915_gem_request_completed(req, true) ||
>>
>> Why is it OK to change the lazy coherency mode here?
>
> We bang on the coherency after the IRQ. Here, we are woken by the IRQ
> waiter so we know that the IRQ/seqno coherency is fine and we can just
> check the CPU cache.

Makes sense, put it into a comment? There is space now since the function
is much shorter. :)

>>> +		    req->reset_counter != i915_reset_counter(&req->i915->gpu_error)) {
>>
>> It looks like it would be valuable to preserve the comment about no
>> re-queuing etc on GPU reset.
>
> I can drop it from the previous patch :)

No it is useful!

>>> +		if (timeout) {
>>> +			timeout_remain = io_schedule_timeout(timeout_remain);
>>> +			if (timeout_remain == 0) {
>>> +				ret = -ETIME;
>>> +				break;
>>> +			}
>>> +		} else
>>> +			io_schedule();
>>
>> Could set timeout_remain to that MAX value when timeout == NULL and
>> have a single call site to io_schedule_timeout and less indentation.
>> It doesn't matter hugely, maybe it would be a bit easier to read.
>
> Done. io_schedule() calls io_schedule_timeout(), whereas
> schedule_timeout() calls schedule()!

Who disagreed with that? :) Ok!

>>> @@ -900,7 +901,7 @@ static void i915_record_ring_state(struct drm_device *dev,
>>>   		ering->instdone = I915_READ(GEN2_INSTDONE);
>>>   	}
>>>
>>> -	ering->waiting = waitqueue_active(&ring->irq_queue);
>>> +	ering->waiting = READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first);
>>
>> What does the READ_ONCE do here?
>
> This element is protected by the breadcrumbs spinlock, the READ_ONCE is
> documentation that we don't care about the lock here.

Hm I find it confusing but OK.

>>>   static void notify_ring(struct intel_engine_cs *ring)
>>>   {
>>> -	if (!intel_ring_initialized(ring))
>>> +	if (ring->i915 == NULL)
>>>   		return;
>>>
>>>   	trace_i915_gem_request_notify(ring);
>>> -
>>> -	wake_up_all(&ring->irq_queue);
>>> +	wake_up_process(ring->i915->breadcrumbs.task);
>>
>> For a later extension:
>>
>> How would you view, some time in the future, adding a bool return to
>> ring->put_irq() which would say to the thread that it failed to
>> disable interrupts?
>>
>> In this case thread would exit and would need to be restarted when
>> the next waiter queues up. Possibly extending the
>> idle->put_irq->exit durations as well then.
>
> Honestly, not keen on the idea that the lowlevel put_irq can fail (makes
> cleanup much more fraught). I don't see what improvement you intend
> here, maybe clarifying that would help explain what you mean.

Don't know, maybe reflecting the fact it wasn't the last put_irq call so 
let the caller handle it if they want. We can leave this discussion for 
the future.

>>> @@ -2986,16 +2981,16 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
>>>   			if (ring_idle(ring, seqno)) {
>>>   				ring->hangcheck.action = HANGCHECK_IDLE;
>>>
>>> -				if (waitqueue_active(&ring->irq_queue)) {
>>> +				if (READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first)) {
>>
>> READ_ONCE is again strange, but other than that I don't know the
>> hangcheck code well enough to really evaluate it.
>>
>> Perhaps this also means this line needs a comment describing what
>> condition/state the check is actually approximating.
>
> All we are doing is asking if anyone is waiting on the GPU, because the
> GPU has finished all of its work. If so, the IRQ handling is suspect
> (modulo the pre-existing race condition clearly earmarked by READ_ONCE).

Cool, /* Check if someone is waiting on the GPU */ then would make me happy.

>>> +	while (!kthread_should_stop()) {
>>> +		unsigned reset_counter = i915_reset_counter(&i915->gpu_error);
>>> +		unsigned long timeout_jiffies;
>>> +		bool idling = false;
>>> +
>>> +		/* On every tick, walk the seqno-ordered list of requests
>>> +		 * and for each retired request wakeup its clients. If we
>>> +		 * find an unfinished request, go back to sleep.
>>> +		 */
>>
>> s/tick/loop/ ?
> s/tick/iteration/ I'm sticking with tick

Tick makes me think of a scheduler tick, so to me it is the worst of the
three. Iteration sounds really good. Whether that will be a
tick/jiffie/or something is decided a bit lower in the code.

>> And if we don't find an unfinished request, still go back to sleep. :)
>
> Ok.
>
>>> +		set_current_state(TASK_INTERRUPTIBLE);
>>> +
>>> +		/* Note carefully that we do not hold a reference for the
>>> +		 * requests on this list. Therefore we only inspect them
>>> +		 * whilst holding the spinlock to ensure that they are not
>>> +		 * freed in the meantime, and the client must remove the
>>> +		 * request from the list if it is interrupted (before it
>>> +		 * itself releases its reference).
>>> +		 */
>>> +		spin_lock(&b->lock);
>>
>> This lock is more per-engine than global in its nature unless you
>> think it was more efficient to do fewer lock atomics vs potential
>> overlap of waiter activity between engines?
>
> Indeed. I decided not to care about making it per-engine.
>
>> Would need a new lock for req->irq_count, per request. So
>> req->lock/req->irq_count and be->lock for the tree operations.
>
> Why? A request is tied to an individual engine, therefore we can use
> that breadcrumb_engine.lock for the request.

Just so 2nd+ waiters would not touch the per engine lock.

>> Thread would only need to deal with the latter. Feels like it would
>> work, just not sure if it is worth it.
>
> That was my feeling as well. Not sure if we will have huge contention
> that would be solved with per-engine spinlocks, whilst we still have
> struct_mutex.
>
>>> +		for (i = 0; i < I915_NUM_RINGS; i++) {
>>> +			struct intel_engine_cs *engine = &i915->ring[i];
>>> +			struct intel_breadcrumbs_engine *be = &b->engine[i];
>>> +			struct drm_i915_gem_request *request = be->first;
>>> +
>>> +			if (request == NULL) {
>>> +				if ((irq_get & (1 << i))) {
>>> +					if (irq_enabled & (1 << i)) {
>>> +						engine->irq_put(engine);
>>> +						irq_enabled &= ~ (1 << i);
>>> +					}
>>> +					intel_runtime_pm_put(i915);
>>> +					irq_get &= ~(1 << i);
>>> +				}
>>> +				continue;
>>> +			}
>>> +
>>> +			if ((irq_get & (1 << i)) == 0) {
>>> +				intel_runtime_pm_get(i915);
>>> +				irq_enabled |= __irq_enable(engine) << i;
>>> +				irq_get |= 1 << i;
>>> +			}
>>
>> irq_get and irq_enabled are creating a little bit of a headache for
>> me. And that would probably mean a comment is in order to explain
>> what they are for and how they work.
>>
>> Like this I don't see the need for irq_get to persist across loops?
>
> Because getting the irq is quite slow, we add a jiffie shadow.

But it is not a shadow in the sense of the same bitmask from the 
previous iteration. The bits are different depending on special cases in 
__irq_enable.

So it needs some comments I think.

>> irq_get looks like "try to get these", irq_enabled is "these ones I
>> got". Apart for the weird usage of it to balance the runtime pm gets
>> and puts.
>>
>> A bit confusing as I said.
>
> I don't like the names, but haven't thought of anything better.
>
>>> +bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request)
>>> +{
>>> +	struct intel_breadcrumbs *b = &request->i915->breadcrumbs;
>>> +	bool first = false;
>>> +
>>> +	spin_lock(&b->lock);
>>> +	if (request->irq_count++ == 0) {
>>> +		struct intel_breadcrumbs_engine *be =
>>> +			&b->engine[request->ring->id];
>>> +		struct rb_node **p, *parent;
>>> +
>>> +		if (be->first == NULL)
>>> +			wake_up_process(b->task);
>>
>> With a global b->lock
>
> or even per engine
>
>> it makes no difference but in the logical
>> sense this would usually go last, after the request has been
>> inserted into the tree and waiter initialized.
>
> Except that the code is simpler with kicking the task first.
>
> Since we both thought it might make sense with a per-engine lock, let's
> go with it. Maybe one day it will pay off.

Ok.

Regards,

Tvrtko

* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 12:38       ` Chris Wilson
@ 2015-11-30 13:33         ` Tvrtko Ursulin
  2015-11-30 14:30           ` Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-11-30 13:33 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Gong, Zhipeng, Rogozhkin, Dmitry V


On 30/11/15 12:38, Chris Wilson wrote:
> On Mon, Nov 30, 2015 at 12:09:30PM +0000, Tvrtko Ursulin wrote:
>>
>> Hi,
>>
>> On 30/11/15 10:53, Chris Wilson wrote:
>>> On Sun, Nov 29, 2015 at 08:48:06AM +0000, Chris Wilson wrote:
>>>> +	/* Optimistic spin for the next jiffie before touching IRQs */
>>>> +	if (intel_breadcrumbs_add_waiter(req)) {
>>>> +		ret = __i915_spin_request(req, state);
>>>> +		if (ret == 0)
>>>> +			goto out;
>>>
>>> There are a couple of interesting side-effects here.  As we now start
>>> up the irq in parallel and keep it running for longer, irq/i915 now
>>> consumes a lot of CPU time (like 50-100%!) for synchronous batches, but
>>> doesn't seem to interfere with latency (as spin_request is still nicely
>>> running and catching the request completion). That should also still
>>> work nicely on single cpu machines as the irq enabling should not
>>> preempt us.  The second interesting side-effect is that the synchronous
>>> loads that regressed with a 2us spin-request timeout are now happy again
>>> at 2us. Also with the active-request and the can-spin check from
>>> add_waiter, running a low fps game with a compositor is not burning the
>>> CPU with any spinning.
>>
>> Interesting? :) Sounds bad the way you presented it.
>>
>> Why and where is the thread burning so much CPU? Would per engine
>> req tree locks help?
>
> Just simply being woken up after every batch and checking the seqno is
> that expensive. Almost as expensive as the IRQ handler itself! (I expect
> top is adding the IRQ handler time to the irq/i915 in this case, perf
> says that more time is spent in the IRQ than the bottom-half.) Almost
> all waiters will be on the same engine, so I don't expect finer grained
> spinlocks to be hugely important.
>
>> Is this new CPU time or the one which would previously be accounted
>> against each waiter, polling in wait request?
>
> This is CPU time that used to be absorbed in i915_wait_request(),
> currently hidden by i915_spin_request(). It is the reason why having
> every waiter doing the check after every interrupt is such a big issue
> for some workloads.

So overall total CPU time is similar, just split differently?

Regards,

Tvrtko

* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 13:32       ` Tvrtko Ursulin
@ 2015-11-30 14:18         ` Chris Wilson
  2015-12-01 17:06           ` Dave Gordon
  2015-11-30 14:26         ` Chris Wilson
  1 sibling, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 14:18 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Nov 30, 2015 at 01:32:16PM +0000, Tvrtko Ursulin wrote:
> 
> On 30/11/15 12:30, Chris Wilson wrote:
> >On Mon, Nov 30, 2015 at 12:05:50PM +0000, Tvrtko Ursulin wrote:
> >>>+	struct intel_breadcrumbs {
> >>>+		spinlock_t lock; /* protects the per-engine request list */
> >>
> >>s/list/tree/
> >
> >list. We use the tree as a list!
> 
> Maybe but it confuses the reader a bit who goes and looks for some
> list then.
> 
> Related, when I was playing with this I was piggy backing on the
> existing per engine request list.

A separate list with separate locking only used when waiting, that's the
appeal for me.
 
> If they are seqno-ordered there, and I think they are, then the
> waker thread just traverses it in order until hitting the first not
> yet completed one, waking up every req->wait_queue as it goes.
>
> In this case it doesn't matter that the waiters come in in
> indeterministic order so we don't need a (new) tree.
> 
> But the downside is the thread needs struct mutex. And that the
> retirement from that list is so lazy..

And we can't even take struct_mutex!

> 
> >>>+		struct task_struct *task; /* bh for all user interrupts */
> >>>+		struct intel_breadcrumbs_engine {
> >>>+			struct rb_root requests; /* sorted by retirement */
> >>>+			struct drm_i915_gem_request *first;
> >>
> >>/* First request in the tree to be completed next by the hw. */
> >>
> >>?
> >
> >/* first */ ?
> 
> Funny? :) First what? Why make the reader reverse engineer it?

struct drm_i915_gem_request *first_waiter; ?

Please don't make me add /* ^^^ */

> >>For a later extension:
> >>
> >>How would you view, some time in the future, adding a bool return to
> >>ring->put_irq() which would say to the thread that it failed to
> >>disable interrupts?
> >>
> >>In this case thread would exit and would need to be restarted when
> >>the next waiter queues up. Possibly extending the
> >>idle->put_irq->exit durations as well then.
> >
> >Honestly, not keen on the idea that the lowlevel put_irq can fail (makes
> >cleanup much more fraught). I don't see what improvement you intend
> >here, maybe clarifying that would help explain what you mean.
> 
> Don't know, maybe reflecting the fact it wasn't the last put_irq
> call so let the caller handle it if they want. We can leave this
> discussion for the future.

Hmm. The only other use is the trace_irq. It would be nice to eliminate
that and the ring->irq_count.
> 
> >>>@@ -2986,16 +2981,16 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
> >>>  			if (ring_idle(ring, seqno)) {
> >>>  				ring->hangcheck.action = HANGCHECK_IDLE;
> >>>
> >>>-				if (waitqueue_active(&ring->irq_queue)) {
> >>>+				if (READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first)) {
> >>
> >>READ_ONCE is again strange, but other than that I don't know the
> >>hangcheck code well enough to really evaluate it.
> >>
> >>Perhaps this also means this line needs a comment describing what
> >>condition/state the check is actually approximating.
> >
> >All we are doing is asking if anyone is waiting on the GPU, because the
> >GPU has finished all of its work. If so, the IRQ handling is suspect
> >(modulo the pre-existing race condition clearly earmarked by READ_ONCE).
> 
> Cool, /* Check if someone is waiting on the GPU */ then would make me happy.

+static bool any_waiters_on_engine(struct intel_engine_cs *engine)
+{
+       return READ_ONCE(engine->i915->breadcrumbs.engine[engine->id].first_waiter);
+}
+
 static bool any_waiters(struct drm_i915_private *dev_priv)
 {
        struct intel_engine_cs *ring;
        int i;
 
        for_each_ring(ring, dev_priv, i)
-               if (ring->irq_refcount)
+               if (any_waiters_on_engine(ring))
                        return true;
 
        return false;

> >>>+	while (!kthread_should_stop()) {
> >>>+		unsigned reset_counter = i915_reset_counter(&i915->gpu_error);
> >>>+		unsigned long timeout_jiffies;
> >>>+		bool idling = false;
> >>>+
> >>>+		/* On every tick, walk the seqno-ordered list of requests
> >>>+		 * and for each retired request wakeup its clients. If we
> >>>+		 * find an unfinished request, go back to sleep.
> >>>+		 */
> >>
> >>s/tick/loop/ ?
> >s/tick/iteration/ I'm sticking with tick
> 
> Tick makes me think of a scheduler tick, so to me it is the worst of
> the three. Iteration sounds really good. Whether that will be a
> tick/jiffie/or something is decided a bit lower in the code.

Hmm, that was the connection I liked with tick.
 
> >>And if we don't find and unfinished request still go back to sleep. :)
> >
> >Ok.
> >
> >>>+		set_current_state(TASK_INTERRUPTIBLE);
> >>>+
> >>>+		/* Note carefully that we do not hold a reference for the
> >>>+		 * requests on this list. Therefore we only inspect them
> >>>+		 * whilst holding the spinlock to ensure that they are not
> >>>+		 * freed in the meantime, and the client must remove the
> >>>+		 * request from the list if it is interrupted (before it
> >>>+		 * itself releases its reference).
> >>>+		 */
> >>>+		spin_lock(&b->lock);
> >>
> >>This lock is more per-engine than global in its nature unless you
> >>think it was more efficient to do fewer lock atomics vs potential
> >>overlap of waiter activity between engines?
> >
> >Indeed. I decided not to care about making it per-engine.
> >
> >>Would need a new lock for req->irq_count, per request. So
> >>req->lock/req->irq_count and be->lock for the tree operations.
> >
> >Why? A request is tied to an individual engine, therefore we can use
> >that breadcrumb_engine.lock for the request.
> 
> Just so 2nd+ waiters would not touch the per engine lock.

Ah. I am not expecting that much contention and even fewer multiple
waiters for a request.

> >>>+		for (i = 0; i < I915_NUM_RINGS; i++) {
> >>>+			struct intel_engine_cs *engine = &i915->ring[i];
> >>>+			struct intel_breadcrumbs_engine *be = &b->engine[i];
> >>>+			struct drm_i915_gem_request *request = be->first;
> >>>+
> >>>+			if (request == NULL) {
> >>>+				if ((irq_get & (1 << i))) {
> >>>+					if (irq_enabled & (1 << i)) {
> >>>+						engine->irq_put(engine);
> >>>+						irq_enabled &= ~ (1 << i);
> >>>+					}
> >>>+					intel_runtime_pm_put(i915);
> >>>+					irq_get &= ~(1 << i);
> >>>+				}
> >>>+				continue;
> >>>+			}
> >>>+
> >>>+			if ((irq_get & (1 << i)) == 0) {
> >>>+				intel_runtime_pm_get(i915);
> >>>+				irq_enabled |= __irq_enable(engine) << i;
> >>>+				irq_get |= 1 << i;
> >>>+			}
> >>
> >>irq_get and irq_enabled are creating a little bit of a headache for
> >>me. And that probably means a comment is in order to explain
> >>what they are for and how they work.
> >>
> >>Like this I don't see the need for irq_get to persist across loops?
> >
> >Because getting the irq is quite slow, we add a jiffie shadow.
> 
> But it is not a shadow in the sense of the same bitmask from the
> previous iteration. The bits are different depending on special
> cases in __irq_enable.
> 
> So it needs some comments I think.

Didn't you get that from idling and its comments?

First, how about engines_active and irqs_enabled?

...
                        /* When enabling waiting upon an engine, we need
                         * to unmask the user interrupt, and before we
                         * do so we need to hold a wakelock for our
                         * hardware access (and the interrupts thereafter).
                         */
                        if ((engines_active & (1 << i)) == 0) {
...
                        if (request == NULL) {
                                /* No more waiters, mask the user interrupt
                                 * (if we were able to enable it!) and
                                 * release the wakelock.
                                 */
                                if (engines_active & (1 << i)) {


with a bit of work, I can then get the irq and rpm references out of the
spinlock.
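
Roughly the shape I have in mind (just a sketch, reusing the names from
the snippet above; this is the shape v4 takes):

	/* Peek at the waiter list outside the lock; be->lock is then
	 * only held for the rb-tree walk itself.
	 */
	if (READ_ONCE(be->first_waiter) == NULL) {
		if (engines_active & (1 << i)) {
			if (irqs_enabled & (1 << i))
				engine->irq_put(engine);
			intel_runtime_pm_put(i915);
			irqs_enabled &= ~(1 << i);
			engines_active &= ~(1 << i);
		}
		continue;
	}
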
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 13:32       ` Tvrtko Ursulin
  2015-11-30 14:18         ` Chris Wilson
@ 2015-11-30 14:26         ` Chris Wilson
  1 sibling, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 14:26 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Nov 30, 2015 at 01:32:16PM +0000, Tvrtko Ursulin wrote:
> 
> On 30/11/15 12:30, Chris Wilson wrote:
> >On Mon, Nov 30, 2015 at 12:05:50PM +0000, Tvrtko Ursulin wrote:
> >>>+	while (!kthread_should_stop()) {
> >>>+		unsigned reset_counter = i915_reset_counter(&i915->gpu_error);
> >>>+		unsigned long timeout_jiffies;
> >>>+		bool idling = false;
> >>>+
> >>>+		/* On every tick, walk the seqno-ordered list of requests
> >>>+		 * and for each retired request wakeup its clients. If we
> >>>+		 * find an unfinished request, go back to sleep.
> >>>+		 */
> >>
> >>s/tick/loop/ ?
> >s/tick/iteration/ I'm sticking with tick
> 
> Tick makes me think of a scheduler tick, so to me it is the worst of
> the three. Iteration sounds really good. Whether that will be a
> tick/jiffie/or something is definitely a bit lower in the code.

wakeup? Going with "On every wakeup". The duplication of wakeup then
emphasises the extra wakeup we introduce into i915_wait_request.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 13:33         ` Tvrtko Ursulin
@ 2015-11-30 14:30           ` Chris Wilson
  0 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 14:30 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Nov 30, 2015 at 01:33:50PM +0000, Tvrtko Ursulin wrote:
> 
> On 30/11/15 12:38, Chris Wilson wrote:
> >On Mon, Nov 30, 2015 at 12:09:30PM +0000, Tvrtko Ursulin wrote:
> >>
> >>Hi,
> >>
> >>On 30/11/15 10:53, Chris Wilson wrote:
> >>>On Sun, Nov 29, 2015 at 08:48:06AM +0000, Chris Wilson wrote:
> >>>>+	/* Optimistic spin for the next jiffie before touching IRQs */
> >>>>+	if (intel_breadcrumbs_add_waiter(req)) {
> >>>>+		ret = __i915_spin_request(req, state);
> >>>>+		if (ret == 0)
> >>>>+			goto out;
> >>>
> >>>There are a couple of interesting side-effects here.  As we now start
> >>>up the irq in parallel and keep it running for longer, irq/i915 now
> >>>consumes a lot of CPU time (like 50-100%!) for synchronous batches, but
> >>>doesn't seem to interfere with latency (as spin_request is still nicely
> >>>running and catching the request completion). That should also still
> >>>work nicely on single cpu machines as the irq enabling should not
> >>>preempt us.  The second interesting side-effect is that the synchronous
> >>>loads that regressed with a 2us spin-request timeout are now happy again
> >>>at 2us. Also with the active-request and the can-spin check from
> >>>add_waiter, running a low fps game with a compositor is not burning the
> >>>CPU with any spinning.
> >>
> >>Interesting? :) Sounds bad the way you presented it.
> >>
> >>Why and where is the thread burning so much CPU? Would per engine
> >>req tree locks help?
> >
> >Just simply being woken up after every batch and checking the seqno is
> >that expensive. Almost as expensive as the IRQ handler itself! (I expect
> >top is adding the IRQ handler time to the irq/i915 in this case, perf
> >says that more time is spent in the IRQ than the bottom-half.) Almost
> >all waiters will be on the same engine, so I don't expect finer grained
> >spinlocks to be hugely important.
> >
> >>Is this new CPU time or the one which would previously be accounted
> >>against each waiter, polling in wait request?
> >
> >This is CPU time that used to be absorbed in i915_wait_request(),
> >currently hidden by i915_spin_request(). It is the reason why having
> >every waiter doing the check after every interrupt is such a big issue
> >for some workloads.
> 
> So overall total CPU time is similar, just split differently?

With the pre-enable + spin, we increase CPU time. On those synchronous
tests, the runtime is the same but we are now at 100% CPU in the test
and an additional 50-100% in the irq/i915. Afaict, this is only
affecting those igt, my desktop usage doesn't spin as much.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 14/15] drm/i915: Only query timestamp when measuring elapsed time
  2015-11-30 10:19   ` Tvrtko Ursulin
@ 2015-11-30 14:31     ` Chris Wilson
  0 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 14:31 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Nov 30, 2015 at 10:19:40AM +0000, Tvrtko Ursulin wrote:
> 
> On 29/11/15 08:48, Chris Wilson wrote:
> >Avoid the two calls to ktime_get_raw_ns() (at best it reads the TSC) as
> >we only need to compute the elapsed time for a timed wait.
> >
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >---
> >  drivers/gpu/drm/i915/i915_gem.c | 13 +++++--------
> >  1 file changed, 5 insertions(+), 8 deletions(-)
> >
> >diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> >index 2c3e36e19cb0..871201713c73 100644
> >--- a/drivers/gpu/drm/i915/i915_gem.c
> >+++ b/drivers/gpu/drm/i915/i915_gem.c
> >@@ -1227,7 +1227,6 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
> >  	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
> >  	DEFINE_WAIT(wait);
> >  	unsigned long timeout_remain;
> >-	s64 before, now;
> >  	int ret;
> >
> >  	if (list_empty(&req->list))
> >@@ -1244,13 +1243,12 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
> >  		if (*timeout == 0)
> >  			return -ETIME;
> >
> >+		/* Record current time in case interrupted, or wedged */
> >  		timeout_remain = nsecs_to_jiffies_timeout(*timeout);
> >+		*timeout += ktime_get_raw_ns();
> 
> Don't really like this one, how you use the passed-in pointer to
> store the intermediate local state.
> 
> It works etc but just feels too hacky.

Bah, it's already in the CPU cache, so I might as well take advantage of
the space. :)
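
For reference, the whole trick looks roughly like this (the unwind half
is not shown in the hunk above, so take this as a sketch):

	if (timeout) {
		timeout_remain = nsecs_to_jiffies_timeout(*timeout);
		/* *timeout now holds the absolute deadline in ns */
		*timeout += ktime_get_raw_ns();
	}

	/* ... the io_schedule_timeout() wait loop ... */

	if (timeout) {
		/* turn the deadline back into the remaining ns */
		*timeout -= ktime_get_raw_ns();
		if (*timeout < 0)
			*timeout = 0;
	}
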
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* [PATCH v4] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-29  8:48 ` [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd Chris Wilson
  2015-11-30 10:53   ` Chris Wilson
  2015-11-30 12:05   ` Tvrtko Ursulin
@ 2015-11-30 14:34   ` Chris Wilson
  2015-11-30 16:30     ` Chris Wilson
  2015-12-01 18:34     ` Dave Gordon
  2015-11-30 15:45   ` [PATCH] drm/i915: Convert trace-irq to the breadcrumb waiter Chris Wilson
  3 siblings, 2 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 14:34 UTC (permalink / raw)
  To: intel-gfx

One particularly stressful scenario consists of many independent tasks
all competing for GPU time and waiting upon the results (e.g. realtime
transcoding of many, many streams). One bottleneck in particular is that
each client waits on its own results, but every client is woken up after
every batchbuffer - hence the thunder of hooves as every client must then
do its heavyweight dance to read a coherent seqno to see if it is the
lucky one. Alternatively, we can have one kthread responsible for waking
after an interrupt, checking the seqno and only waking up those waiting
clients whose requests are complete. The disadvantage is that in the uncontended
scenario (i.e. only one waiter) we incur an extra context switch in the
wakeup path - though that should be mitigated somewhat by the busy-wait
we do first before sleeping.

v2: Convert from a kworker per engine into a dedicated kthread for the
bottom-half.
v3: Rename request members and tweak comments.
v4: Use a per-engine spinlock in the breadcrumbs bottom-half.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
---
 drivers/gpu/drm/i915/Makefile            |   1 +
 drivers/gpu/drm/i915/i915_dma.c          |   9 +-
 drivers/gpu/drm/i915/i915_drv.h          |  37 ++++-
 drivers/gpu/drm/i915/i915_gem.c          |  93 +++--------
 drivers/gpu/drm/i915/i915_gpu_error.c    |   9 +-
 drivers/gpu/drm/i915/i915_irq.c          |  24 +--
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 270 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c         |   2 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |   3 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   3 +-
 10 files changed, 361 insertions(+), 90 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_breadcrumbs.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 0851de07bd13..d3b9d3618719 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
 	  i915_gem_userptr.o \
 	  i915_gpu_error.o \
 	  i915_trace_points.o \
+	  intel_breadcrumbs.o \
 	  intel_lrc.o \
 	  intel_mocs.o \
 	  intel_ringbuffer.o \
diff --git a/drivers/gpu/drm/i915/i915_dma.c b/drivers/gpu/drm/i915/i915_dma.c
index a81c76603544..93c6762d87b4 100644
--- a/drivers/gpu/drm/i915/i915_dma.c
+++ b/drivers/gpu/drm/i915/i915_dma.c
@@ -388,12 +388,16 @@ static int i915_load_modeset_init(struct drm_device *dev)
 	if (ret)
 		goto cleanup_vga_client;
 
+	ret = intel_breadcrumbs_init(dev_priv);
+	if (ret)
+		goto cleanup_vga_switcheroo;
+
 	/* Initialise stolen first so that we may reserve preallocated
 	 * objects for the BIOS to KMS transition.
 	 */
 	ret = i915_gem_init_stolen(dev);
 	if (ret)
-		goto cleanup_vga_switcheroo;
+		goto cleanup_breadcrumbs;
 
 	intel_power_domains_init_hw(dev_priv, false);
 
@@ -454,6 +458,8 @@ cleanup_irq:
 	drm_irq_uninstall(dev);
 cleanup_gem_stolen:
 	i915_gem_cleanup_stolen(dev);
+cleanup_breadcrumbs:
+	intel_breadcrumbs_fini(dev_priv);
 cleanup_vga_switcheroo:
 	vga_switcheroo_unregister_client(dev->pdev);
 cleanup_vga_client:
@@ -1193,6 +1199,7 @@ int i915_driver_unload(struct drm_device *dev)
 	mutex_unlock(&dev->struct_mutex);
 	intel_fbc_cleanup_cfb(dev_priv);
 	i915_gem_cleanup_stolen(dev);
+	intel_breadcrumbs_fini(dev_priv);
 
 	intel_csr_ucode_fini(dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e1980212ba37..ab8c694a9ba5 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -564,6 +564,7 @@ struct drm_i915_error_state {
 			long jiffies;
 			u32 seqno;
 			u32 tail;
+			u32 waiters;
 		} *requests;
 
 		struct {
@@ -1383,7 +1384,7 @@ struct i915_gpu_error {
 #define I915_STOP_RING_ALLOW_WARN      (1 << 30)
 
 	/* For missed irq/seqno simulation. */
-	unsigned int test_irq_rings;
+	unsigned long test_irq_rings;
 };
 
 enum modeset_restore {
@@ -1695,6 +1696,27 @@ struct drm_i915_private {
 
 	struct intel_uncore uncore;
 
+	/* Rather than have every client wait upon all user interrupts,
+	 * with the herd waking after every interrupt and each doing the
+	 * heavyweight seqno dance, we delegate the task to a bottom-half
+	 * for the user interrupt (one bh handling all engines). Now,
+	 * just one task is woken up and does the coherent seqno read and
+	 * then wakes up all the waiters registered with the bottom-half
+	 * that are complete. This avoids the thundering herd problem
+	 * with lots of concurrent waiters, at the expense of an extra
+	 * context switch between the interrupt and the client. That should
+	 * be mitigated somewhat by the busyspin in the client before we go to
+	 * sleep waiting for an interrupt from the bottom-half.
+	 */
+	struct intel_breadcrumbs {
+		struct task_struct *task; /* bh for all user interrupts */
+		struct intel_breadcrumbs_engine {
+			spinlock_t lock; /* protects the lists of requests */
+			struct rb_root requests; /* sorted by retirement */
+			struct drm_i915_gem_request *first_waiter;
+		} engine[I915_NUM_RINGS];
+	} breadcrumbs;
+
 	struct i915_virtual_gpu vgpu;
 
 	struct intel_guc guc;
@@ -2228,6 +2250,13 @@ struct drm_i915_gem_request {
 	/** global list entry for this request */
 	struct list_head list;
 
+	/** Entry into the breadcrumb list of waiting requests. */
+	struct rb_node breadcrumb_node;
+	/** List of clients waiting for completion of this request. */
+	wait_queue_head_t waiters;
+	/** Count of clients waiting for completion of this request. */
+	unsigned waiters_count;
+
 	struct drm_i915_file_private *file_priv;
 	/** file_priv list entry for this request */
 	struct list_head client_list;
@@ -2800,6 +2829,12 @@ ibx_disable_display_interrupt(struct drm_i915_private *dev_priv, uint32_t bits)
 }
 
 
+/* intel_breadcrumbs.c -- user interrupt bottom-half for waiters */
+int intel_breadcrumbs_init(struct drm_i915_private *i915);
+bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request);
+void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request);
+void intel_breadcrumbs_fini(struct drm_i915_private *i915);
+
 /* i915_gem.c */
 int i915_gem_create_ioctl(struct drm_device *dev, void *data,
 			  struct drm_file *file_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 65d101b47d8e..cec1bc460f57 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1126,17 +1126,6 @@ i915_gem_check_wedge(unsigned reset_counter,
 	return 0;
 }
 
-static void fake_irq(unsigned long data)
-{
-	wake_up_process((struct task_struct *)data);
-}
-
-static bool missed_irq(struct drm_i915_private *dev_priv,
-		       struct intel_engine_cs *ring)
-{
-	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
-}
-
 static unsigned long local_clock_us(unsigned *cpu)
 {
 	unsigned long t;
@@ -1184,9 +1173,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 	 * takes to sleep on a request, on the order of a microsecond.
 	 */
 
-	if (req->ring->irq_refcount)
-		return -EBUSY;
-
 	/* Only spin if we know the GPU is processing this request */
 	if (!i915_gem_request_started(req, false))
 		return -EAGAIN;
@@ -1232,26 +1218,19 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			s64 *timeout,
 			struct intel_rps_client *rps)
 {
-	struct intel_engine_cs *ring = i915_gem_request_get_ring(req);
-	struct drm_device *dev = ring->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const bool irq_test_in_progress =
-		ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
 	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
 	DEFINE_WAIT(wait);
-	unsigned long timeout_expire;
+	unsigned long timeout_remain;
 	s64 before, now;
 	int ret;
 
-	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
-
 	if (list_empty(&req->list))
 		return 0;
 
 	if (i915_gem_request_completed(req, true))
 		return 0;
 
-	timeout_expire = 0;
+	timeout_remain = MAX_SCHEDULE_TIMEOUT;
 	if (timeout) {
 		if (WARN_ON(*timeout < 0))
 			return -EINVAL;
@@ -1259,34 +1238,37 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 		if (*timeout == 0)
 			return -ETIME;
 
-		timeout_expire = jiffies + nsecs_to_jiffies_timeout(*timeout);
+		timeout_remain = nsecs_to_jiffies_timeout(*timeout);
 	}
 
-	if (INTEL_INFO(dev_priv)->gen >= 6)
-		gen6_rps_boost(dev_priv, rps, req->emitted_jiffies);
-
 	/* Record current time in case interrupted by signal, or wedged */
 	trace_i915_gem_request_wait_begin(req);
 	before = ktime_get_raw_ns();
 
-	/* Optimistic spin for the next jiffie before touching IRQs */
-	ret = __i915_spin_request(req, state);
-	if (ret == 0)
-		goto out;
+	if (INTEL_INFO(req->i915)->gen >= 6)
+		gen6_rps_boost(req->i915, rps, req->emitted_jiffies);
 
-	if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) {
-		ret = -ENODEV;
-		goto out;
+	/* Optimistic spin for the next jiffie before touching IRQs */
+	if (intel_breadcrumbs_add_waiter(req)) {
+		ret = __i915_spin_request(req, state);
+		if (ret == 0)
+			goto out;
 	}
 
 	for (;;) {
-		struct timer_list timer;
+		prepare_to_wait(&req->waiters, &wait, state);
 
-		prepare_to_wait(&ring->irq_queue, &wait, state);
+		/* When the bottom-half handler for the user interrupts runs,
+		 * it will perform a coherent read of seqno before waking us.
+		 * When we are then woken by the bottom-half, we can just
+		 * inspect the CPU cache for the seqno.
+		 */
+		if (i915_gem_request_completed(req, true)) {
+			ret = 0;
+			break;
+		}
 
-		/* We need to check whether any gpu reset happened in between
-		 * the caller grabbing the seqno and now ... */
-		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
+		if (unlikely(req->reset_counter != i915_reset_counter(&req->i915->gpu_error))) {
 			/* As we do not requeue the request over a GPU reset,
 			 * if one does occur we know that the request is
 			 * effectively complete.
@@ -1295,46 +1277,21 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			break;
 		}
 
-		if (i915_gem_request_completed(req, false)) {
-			ret = 0;
-			break;
-		}
-
 		if (signal_pending_state(state, current)) {
 			ret = -ERESTARTSYS;
 			break;
 		}
 
-		if (timeout && time_after_eq(jiffies, timeout_expire)) {
+		timeout_remain = io_schedule_timeout(timeout_remain);
+		if (timeout_remain == 0) {
 			ret = -ETIME;
 			break;
 		}
-
-		i915_queue_hangcheck(dev_priv);
-
-		timer.function = NULL;
-		if (timeout || missed_irq(dev_priv, ring)) {
-			unsigned long expire;
-
-			setup_timer_on_stack(&timer, fake_irq, (unsigned long)current);
-			expire = missed_irq(dev_priv, ring) ? jiffies + 1 : timeout_expire;
-			mod_timer(&timer, expire);
-		}
-
-		io_schedule();
-
-		if (timer.function) {
-			del_singleshot_timer_sync(&timer);
-			destroy_timer_on_stack(&timer);
-		}
 	}
-	if (!irq_test_in_progress)
-		ring->irq_put(ring);
-
-	finish_wait(&ring->irq_queue, &wait);
-
+	finish_wait(&req->waiters, &wait);
 out:
 	now = ktime_get_raw_ns();
+	intel_breadcrumbs_remove_waiter(req);
 	trace_i915_gem_request_wait_end(req);
 
 	if (timeout) {
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 06ca4082735b..0e63b6ebddda 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -453,10 +453,11 @@ int i915_error_state_to_str(struct drm_i915_error_state_buf *m,
 				   dev_priv->ring[i].name,
 				   error->ring[i].num_requests);
 			for (j = 0; j < error->ring[i].num_requests; j++) {
-				err_printf(m, "  seqno 0x%08x, emitted %ld, tail 0x%08x\n",
+				err_printf(m, "  seqno 0x%08x, emitted %ld, tail 0x%08x, waiters? %d\n",
 					   error->ring[i].requests[j].seqno,
 					   error->ring[i].requests[j].jiffies,
-					   error->ring[i].requests[j].tail);
+					   error->ring[i].requests[j].tail,
+					   error->ring[i].requests[j].waiters);
 			}
 		}
 
@@ -900,7 +901,8 @@ static void i915_record_ring_state(struct drm_device *dev,
 		ering->instdone = I915_READ(GEN2_INSTDONE);
 	}
 
-	ering->waiting = waitqueue_active(&ring->irq_queue);
+	ering->waiting =
+		READ_ONCE(dev_priv->breadcrumbs.engine[ring->id].first_waiter);
 	ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
 	ering->seqno = ring->get_seqno(ring, false);
 	ering->acthd = intel_ring_get_active_head(ring);
@@ -1105,6 +1107,7 @@ static void i915_gem_record_rings(struct drm_device *dev,
 			erq->seqno = request->seqno;
 			erq->jiffies = request->emitted_jiffies;
 			erq->tail = request->postfix;
+			erq->waiters = request->waiters_count;
 		}
 	}
 }
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 28c0329f8281..182cc9efe9ba 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -996,12 +996,11 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 
 static void notify_ring(struct intel_engine_cs *ring)
 {
-	if (!intel_ring_initialized(ring))
+	if (ring->i915 == NULL)
 		return;
 
 	trace_i915_gem_request_notify(ring);
-
-	wake_up_all(&ring->irq_queue);
+	wake_up_process(ring->i915->breadcrumbs.task);
 }
 
 static void vlv_c0_read(struct drm_i915_private *dev_priv,
@@ -1077,13 +1076,18 @@ static u32 vlv_wa_c0_ei(struct drm_i915_private *dev_priv, u32 pm_iir)
 	return events;
 }
 
+static bool any_waiters_on_engine(struct intel_engine_cs *engine)
+{
+	return READ_ONCE(engine->i915->breadcrumbs.engine[engine->id].first_waiter);
+}
+
 static bool any_waiters(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *ring;
 	int i;
 
 	for_each_ring(ring, dev_priv, i)
-		if (ring->irq_refcount)
+		if (any_waiters_on_engine(ring))
 			return true;
 
 	return false;
@@ -2399,9 +2403,6 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
 static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 			       bool reset_completed)
 {
-	struct intel_engine_cs *ring;
-	int i;
-
 	/*
 	 * Notify all waiters for GPU completion events that reset state has
 	 * been changed, and that they need to restart their wait after
@@ -2410,8 +2411,7 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 	 */
 
 	/* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
-	for_each_ring(ring, dev_priv, i)
-		wake_up_all(&ring->irq_queue);
+	wake_up_process(dev_priv->breadcrumbs.task);
 
 	/* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */
 	wake_up_all(&dev_priv->pending_flip_queue);
@@ -2986,16 +2986,16 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 			if (ring_idle(ring, seqno)) {
 				ring->hangcheck.action = HANGCHECK_IDLE;
 
-				if (waitqueue_active(&ring->irq_queue)) {
+				if (any_waiters_on_engine(ring)) {
 					/* Issue a wake-up to catch stuck h/w. */
 					if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
-						if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
+						if (!test_bit(ring->id, &dev_priv->gpu_error.test_irq_rings))
 							DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
 								  ring->name);
 						else
 							DRM_INFO("Fake missed irq on %s\n",
 								 ring->name);
-						wake_up_all(&ring->irq_queue);
+						wake_up_process(dev_priv->breadcrumbs.task);
 					}
 					/* Safeguard against driver failure */
 					ring->hangcheck.score += BUSY;
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
new file mode 100644
index 000000000000..77af6240573d
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -0,0 +1,270 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/kthread.h>
+
+#include "i915_drv.h"
+
+static bool __irq_enable(struct intel_engine_cs *engine)
+{
+	if (test_bit(engine->id, &engine->i915->gpu_error.test_irq_rings))
+		return false;
+
+	if (!intel_irqs_enabled(engine->i915))
+		return false;
+
+	return engine->irq_get(engine);
+}
+
+static struct drm_i915_gem_request *to_request(struct rb_node *rb)
+{
+	if (rb == NULL)
+		return NULL;
+
+	return rb_entry(rb, struct drm_i915_gem_request, breadcrumb_node);
+}
+
+/*
+ * intel_breadcrumbs_irq() acts as the bottom-half for the user interrupt,
+ * which we use as the breadcrumb after every request. In order to scale
+ * to many concurrent waiters, we delegate the task of reading the coherent
+ * seqno to this bottom-half. (Otherwise every waiter is woken after each
+ * interrupt and they all attempt to perform the heavyweight coherent
+ * seqno check.) Individual clients register with the bottom-half when
+ * waiting on a request, and are then woken when the bottom-half notices
+ * their seqno is complete. This incurs an extra context switch from the
+ * interrupt to the client in the uncontended case.
+ */
+static int intel_breadcrumbs_irq(void *data)
+{
+	struct drm_i915_private *i915 = data;
+	struct intel_breadcrumbs *b = &i915->breadcrumbs;
+	unsigned engines_active = 0;
+	unsigned irqs_enabled = 0;
+	int i;
+
+	while (!kthread_should_stop()) {
+		unsigned reset_counter = i915_reset_counter(&i915->gpu_error);
+		unsigned long timeout_jiffies;
+		bool idling = false;
+
+		/* On every wakeup, walk the seqno-ordered list of requests
+		 * and for each retired request wake up its clients. If we
+		 * find an unfinished request or none, go back to sleep.
+		 */
+		set_current_state(TASK_INTERRUPTIBLE);
+
+		/* Note carefully that we do not hold a reference for the
+		 * requests on this list. Therefore we only inspect them
+		 * whilst holding the spinlock to ensure that they are not
+		 * freed in the meantime, and the client must remove the
+		 * request from the list if it is interrupted (before it
+		 * itself releases its reference).
+		 */
+		for (i = 0; i < I915_NUM_RINGS; i++) {
+			struct intel_engine_cs *engine = &i915->ring[i];
+			struct intel_breadcrumbs_engine *be = &b->engine[i];
+			struct drm_i915_gem_request *request;
+
+			if (READ_ONCE(be->first_waiter) == NULL) {
+				/* No more waiters, mask the user interrupt
+				 * (if we were able to enable it!) and
+				 * release the wakelock.
+				 */
+				if (engines_active & (1 << i)) {
+					if (irqs_enabled & (1 << i)) {
+						engine->irq_put(engine);
+						irqs_enabled &= ~(1 << i);
+					}
+					intel_runtime_pm_put(i915);
+					engines_active &= ~(1 << i);
+				}
+
+				continue;
+			}
+
+			/* When enabling waiting upon an engine, we need
+			 * to unmask the user interrupt, and before we
+			 * do so we need to hold a wakelock for our
+			 * hardware access (and the interrupts thereafter).
+			 */
+			if ((engines_active & (1 << i)) == 0) {
+				intel_runtime_pm_get(i915);
+				irqs_enabled |= __irq_enable(engine) << i;
+				engines_active |= 1 << i;
+			}
+
+			spin_lock(&be->lock);
+			request = be->first_waiter;
+			while (request) {
+				struct rb_node *next;
+
+				if (request->reset_counter == reset_counter &&
+				    !i915_gem_request_completed(request, false))
+					break;
+
+				next = rb_next(&request->breadcrumb_node);
+				rb_erase(&request->breadcrumb_node,
+					 &be->requests);
+				RB_CLEAR_NODE(&request->breadcrumb_node);
+
+				wake_up_all(&request->waiters);
+
+				request = to_request(next);
+			}
+			be->first_waiter = request;
+			spin_unlock(&be->lock);
+
+			/* Make sure the hangcheck timer is armed in case
+			 * the GPU hangs and we are never woken up.
+			 */
+			if (request)
+				i915_queue_hangcheck(i915);
+			else
+				idling = true;
+		}
+
+		/* If we don't have interrupts available (either we have
+		 * detected the hardware is missing interrupts or the
+		 * interrupt delivery was disabled by the user), wake up
+		 * every jiffie to check for request completion.
+		 *
+		 * If we have processed all requests for one engine, we
+		 * also wish to wake up in a jiffie to turn off the
+		 * breadcrumb interrupt for that engine. We delay
+		 * switching off the interrupt in order to allow another
+		 * waiter to start without incurring the additional latency
+		 * of enabling the interrupt.
+		 */
+		timeout_jiffies = MAX_SCHEDULE_TIMEOUT;
+		if (idling || i915->gpu_error.missed_irq_rings & irqs_enabled)
+			timeout_jiffies = 1;
+
+		/* Unlike the individual clients, we do not want this
+		 * background thread to contribute to the system load,
+		 * i.e. we do not want to use io_schedule() here.
+		 */
+		schedule_timeout(timeout_jiffies);
+	}
+
+	for (i = 0; i < I915_NUM_RINGS; i++) {
+		if (engines_active & (1 << i)) {
+			if (irqs_enabled & (1 << i)) {
+				struct intel_engine_cs *engine = &i915->ring[i];
+				engine->irq_put(engine);
+			}
+			intel_runtime_pm_put(i915);
+		}
+	}
+
+	return 0;
+}
+
+bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request)
+{
+	struct intel_breadcrumbs *b = &request->i915->breadcrumbs;
+	struct intel_breadcrumbs_engine *be = &b->engine[request->ring->id];
+	bool first = false;
+
+	spin_lock(&be->lock);
+	if (request->waiters_count++ == 0) {
+		struct rb_node **p, *parent;
+
+		if (be->first_waiter == NULL)
+			wake_up_process(b->task);
+
+		init_waitqueue_head(&request->waiters);
+
+		first = true;
+		parent = NULL;
+		p = &be->requests.rb_node;
+		while (*p) {
+			struct drm_i915_gem_request *__req;
+
+			parent = *p;
+			__req = rb_entry(parent,
+					 typeof(*__req),
+					 breadcrumb_node);
+
+			if (i915_seqno_passed(request->seqno, __req->seqno)) {
+				p = &parent->rb_right;
+				first = false;
+			} else
+				p = &parent->rb_left;
+		}
+		if (first)
+			be->first_waiter = request;
+
+		rb_link_node(&request->breadcrumb_node, parent, p);
+		rb_insert_color(&request->breadcrumb_node, &be->requests);
+	}
+	spin_unlock(&be->lock);
+
+	return first;
+}
+
+void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request)
+{
+	struct intel_breadcrumbs *b = &request->i915->breadcrumbs;
+	struct intel_breadcrumbs_engine *be = &b->engine[request->ring->id];
+
+	spin_lock(&be->lock);
+	if (--request->waiters_count == 0 &&
+	    !RB_EMPTY_NODE(&request->breadcrumb_node)) {
+		if (be->first_waiter == request)
+			be->first_waiter =
+				to_request(rb_next(&request->breadcrumb_node));
+		rb_erase(&request->breadcrumb_node, &be->requests);
+	}
+	spin_unlock(&be->lock);
+}
+
+int intel_breadcrumbs_init(struct drm_i915_private *i915)
+{
+	struct intel_breadcrumbs *b = &i915->breadcrumbs;
+	struct task_struct *task;
+	int i;
+
+	for (i = 0; i < I915_NUM_RINGS; i++)
+		spin_lock_init(&b->engine[i].lock);
+
+	task = kthread_run(intel_breadcrumbs_irq, i915, "irq/i915");
+	if (IS_ERR(task))
+		return PTR_ERR(task);
+
+	b->task = task;
+
+	return 0;
+}
+
+void intel_breadcrumbs_fini(struct drm_i915_private *i915)
+{
+	struct intel_breadcrumbs *b = &i915->breadcrumbs;
+
+	if (b->task == NULL)
+		return;
+
+	kthread_stop(b->task);
+	b->task = NULL;
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index b40ffc3607e7..604c081b0a0f 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1922,10 +1922,10 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 	ring->buffer = NULL;
 
 	ring->dev = dev;
+	ring->i915 = to_i915(dev);
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 	i915_gem_batch_pool_init(dev, &ring->batch_pool);
-	init_waitqueue_head(&ring->irq_queue);
 
 	INIT_LIST_HEAD(&ring->buffers);
 	INIT_LIST_HEAD(&ring->execlist_queue);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 511efe556d73..5a5a7195dd54 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2157,6 +2157,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	WARN_ON(ring->buffer);
 
 	ring->dev = dev;
+	ring->i915 = to_i915(dev);
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 	INIT_LIST_HEAD(&ring->execlist_queue);
@@ -2164,8 +2165,6 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	i915_gem_batch_pool_init(dev, &ring->batch_pool);
 	memset(ring->semaphore.sync_seqno, 0, sizeof(ring->semaphore.sync_seqno));
 
-	init_waitqueue_head(&ring->irq_queue);
-
 	ringbuf = intel_engine_create_ringbuffer(ring, 32 * PAGE_SIZE);
 	if (IS_ERR(ringbuf))
 		return PTR_ERR(ringbuf);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 5d1eb206151d..49b5bded2767 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -157,6 +157,7 @@ struct  intel_engine_cs {
 #define LAST_USER_RING (VECS + 1)
 	u32		mmio_base;
 	struct		drm_device *dev;
+	struct drm_i915_private *i915;
 	struct intel_ringbuffer *buffer;
 	struct list_head buffers;
 
@@ -303,8 +304,6 @@ struct  intel_engine_cs {
 
 	bool gpu_caches_dirty;
 
-	wait_queue_head_t irq_queue;
-
 	struct intel_context *default_context;
 	struct intel_context *last_context;
 
-- 
2.6.2


* [PATCH] drm/i915: Convert trace-irq to the breadcrumb waiter
  2015-11-29  8:48 ` [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd Chris Wilson
                     ` (2 preceding siblings ...)
  2015-11-30 14:34   ` [PATCH v4] " Chris Wilson
@ 2015-11-30 15:45   ` Chris Wilson
  3 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 15:45 UTC (permalink / raw)
  To: intel-gfx

If we convert the tracing from direct use of ring->irq_get() over to the
breadcrumb infrastructure, we are left with a single user of
ring->irq_get() and so can simplify the driver routines (eliminating the
redundant validation and irq refcounting).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.h          |   4 +-
 drivers/gpu/drm/i915/i915_gem.c          |   2 +-
 drivers/gpu/drm/i915/intel_breadcrumbs.c |   3 +-
 drivers/gpu/drm/i915/intel_lrc.c         |  24 ++--
 drivers/gpu/drm/i915/intel_ringbuffer.c  | 210 +++++++++++--------------------
 drivers/gpu/drm/i915/intel_ringbuffer.h  |   3 +-
 6 files changed, 87 insertions(+), 159 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f50aa580e0b5..e4d3abe24950 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -3639,8 +3639,10 @@ wait_remaining_ms_from_jiffies(unsigned long timestamp_jiffies, int to_wait_ms)
 static inline void i915_trace_irq_get(struct intel_engine_cs *ring,
 				      struct drm_i915_gem_request *req)
 {
-	if (ring->trace_irq_req == NULL && ring->irq_get(ring))
+	if (ring->trace_irq_req == NULL) {
 		i915_gem_request_assign(&ring->trace_irq_req, req);
+		intel_breadcrumbs_add_waiter(req);
+	}
 }
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 0afcbef7ff5f..5902bcd87178 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2876,7 +2876,7 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
 
 	if (unlikely(ring->trace_irq_req &&
 		     i915_gem_request_completed(ring->trace_irq_req))) {
-		ring->irq_put(ring);
+		intel_breadcrumbs_remove_waiter(ring->trace_irq_req);
 		i915_gem_request_assign(&ring->trace_irq_req, NULL);
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index 69091f5ae0f5..5f659079ac1c 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -34,7 +34,8 @@ static bool __irq_enable(struct intel_engine_cs *engine)
 	if (!intel_irqs_enabled(engine->i915))
 		return false;
 
-	return engine->irq_get(engine);
+	engine->irq_get(engine);
+	return true;
 }
 
 static struct drm_i915_gem_request *to_request(struct rb_node *rb)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index c457dd035900..8842ea0b53fe 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1620,23 +1620,15 @@ static int gen8_emit_bb_start(struct drm_i915_gem_request *req,
 	return 0;
 }
 
-static bool gen8_logical_ring_get_irq(struct intel_engine_cs *ring)
+static void gen8_logical_ring_get_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
-
-	if (WARN_ON(!intel_irqs_enabled(dev_priv)))
-		return false;
-
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (ring->irq_refcount++ == 0) {
-		I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
-		POSTING_READ(RING_IMR(ring->mmio_base));
-	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
 
-	return true;
+	spin_lock_irq(&dev_priv->irq_lock);
+	I915_WRITE_IMR(ring, ~(ring->irq_enable_mask | ring->irq_keep_mask));
+	POSTING_READ(RING_IMR(ring->mmio_base));
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
 static void gen8_logical_ring_put_irq(struct intel_engine_cs *ring)
@@ -1646,10 +1638,8 @@ static void gen8_logical_ring_put_irq(struct intel_engine_cs *ring)
 	unsigned long flags;
 
 	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (--ring->irq_refcount == 0) {
-		I915_WRITE_IMR(ring, ~ring->irq_keep_mask);
-		POSTING_READ(RING_IMR(ring->mmio_base));
-	}
+	I915_WRITE_IMR(ring, ~ring->irq_keep_mask);
+	POSTING_READ(RING_IMR(ring->mmio_base));
 	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index ccceb43f14ac..2c6f74c004cb 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1504,22 +1504,15 @@ gen6_seqno_barrier(struct intel_engine_cs *ring)
 	intel_flush_status_page(ring, I915_GEM_HWS_INDEX);
 }
 
-static bool
+static void
 gen5_ring_get_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
-
-	if (WARN_ON(!intel_irqs_enabled(dev_priv)))
-		return false;
-
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (ring->irq_refcount++ == 0)
-		gen5_enable_gt_irq(dev_priv, ring->irq_enable_mask);
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
 
-	return true;
+	spin_lock_irq(&dev_priv->irq_lock);
+	gen5_enable_gt_irq(dev_priv, ring->irq_enable_mask);
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
 static void
@@ -1527,33 +1520,23 @@ gen5_ring_put_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (--ring->irq_refcount == 0)
-		gen5_disable_gt_irq(dev_priv, ring->irq_enable_mask);
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
+	spin_lock_irq(&dev_priv->irq_lock);
+	gen5_disable_gt_irq(dev_priv, ring->irq_enable_mask);
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
-static bool
+static void
 i9xx_ring_get_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
-
-	if (!intel_irqs_enabled(dev_priv))
-		return false;
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (ring->irq_refcount++ == 0) {
-		dev_priv->irq_mask &= ~ring->irq_enable_mask;
-		I915_WRITE(IMR, dev_priv->irq_mask);
-		POSTING_READ(IMR);
-	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
-
-	return true;
+	spin_lock_irq(&dev_priv->irq_lock);
+	dev_priv->irq_mask &= ~ring->irq_enable_mask;
+	I915_WRITE(IMR, dev_priv->irq_mask);
+	POSTING_READ(IMR);
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
 static void
@@ -1561,36 +1544,25 @@ i9xx_ring_put_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (--ring->irq_refcount == 0) {
-		dev_priv->irq_mask |= ring->irq_enable_mask;
-		I915_WRITE(IMR, dev_priv->irq_mask);
-		POSTING_READ(IMR);
-	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
+	spin_lock_irq(&dev_priv->irq_lock);
+	dev_priv->irq_mask |= ring->irq_enable_mask;
+	I915_WRITE(IMR, dev_priv->irq_mask);
+	POSTING_READ(IMR);
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
-static bool
+static void
 i8xx_ring_get_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
-
-	if (!intel_irqs_enabled(dev_priv))
-		return false;
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (ring->irq_refcount++ == 0) {
-		dev_priv->irq_mask &= ~ring->irq_enable_mask;
-		I915_WRITE16(IMR, dev_priv->irq_mask);
-		POSTING_READ16(IMR);
-	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
-
-	return true;
+	spin_lock_irq(&dev_priv->irq_lock);
+	dev_priv->irq_mask &= ~ring->irq_enable_mask;
+	I915_WRITE16(IMR, dev_priv->irq_mask);
+	POSTING_READ16(IMR);
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
 static void
@@ -1598,15 +1570,12 @@ i8xx_ring_put_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (--ring->irq_refcount == 0) {
-		dev_priv->irq_mask |= ring->irq_enable_mask;
-		I915_WRITE16(IMR, dev_priv->irq_mask);
-		POSTING_READ16(IMR);
-	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
+	spin_lock_irq(&dev_priv->irq_lock);
+	dev_priv->irq_mask |= ring->irq_enable_mask;
+	I915_WRITE16(IMR, dev_priv->irq_mask);
+	POSTING_READ16(IMR);
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
 static int
@@ -1646,29 +1615,21 @@ i9xx_add_request(struct drm_i915_gem_request *req)
 	return 0;
 }
 
-static bool
+static void
 gen6_ring_get_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
-
-	if (WARN_ON(!intel_irqs_enabled(dev_priv)))
-		return false;
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (ring->irq_refcount++ == 0) {
-		if (HAS_L3_DPF(dev) && ring->id == RCS)
-			I915_WRITE_IMR(ring,
-				       ~(ring->irq_enable_mask |
-					 GT_PARITY_ERROR(dev)));
-		else
-			I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
-		gen5_enable_gt_irq(dev_priv, ring->irq_enable_mask);
-	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
-
-	return true;
+	spin_lock_irq(&dev_priv->irq_lock);
+	if (HAS_L3_DPF(dev) && ring->id == RCS)
+		I915_WRITE_IMR(ring,
+			       ~(ring->irq_enable_mask |
+				 GT_PARITY_ERROR(dev)));
+	else
+		I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
+	gen5_enable_gt_irq(dev_priv, ring->irq_enable_mask);
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
 static void
@@ -1676,37 +1637,26 @@ gen6_ring_put_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (--ring->irq_refcount == 0) {
-		if (HAS_L3_DPF(dev) && ring->id == RCS)
-			I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
-		else
-			I915_WRITE_IMR(ring, ~0);
-		gen5_disable_gt_irq(dev_priv, ring->irq_enable_mask);
-	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
+	spin_lock_irq(&dev_priv->irq_lock);
+	if (HAS_L3_DPF(dev) && ring->id == RCS)
+		I915_WRITE_IMR(ring, ~GT_PARITY_ERROR(dev));
+	else
+		I915_WRITE_IMR(ring, ~0);
+	gen5_disable_gt_irq(dev_priv, ring->irq_enable_mask);
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
-static bool
+static void
 hsw_vebox_get_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
-
-	if (WARN_ON(!intel_irqs_enabled(dev_priv)))
-		return false;
-
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (ring->irq_refcount++ == 0) {
-		I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
-		gen6_enable_pm_irq(dev_priv, ring->irq_enable_mask);
-	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
 
-	return true;
+	spin_lock_irq(&dev_priv->irq_lock);
+	I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
+	gen6_enable_pm_irq(dev_priv, ring->irq_enable_mask);
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
 static void
@@ -1714,40 +1664,29 @@ hsw_vebox_put_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (--ring->irq_refcount == 0) {
-		I915_WRITE_IMR(ring, ~0);
-		gen6_disable_pm_irq(dev_priv, ring->irq_enable_mask);
-	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
+	spin_lock_irq(&dev_priv->irq_lock);
+	I915_WRITE_IMR(ring, ~0);
+	gen6_disable_pm_irq(dev_priv, ring->irq_enable_mask);
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
-static bool
+static void
 gen8_ring_get_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
-
-	if (WARN_ON(!intel_irqs_enabled(dev_priv)))
-		return false;
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (ring->irq_refcount++ == 0) {
-		if (HAS_L3_DPF(dev) && ring->id == RCS) {
-			I915_WRITE_IMR(ring,
-				       ~(ring->irq_enable_mask |
-					 GT_RENDER_L3_PARITY_ERROR_INTERRUPT));
-		} else {
-			I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
-		}
-		POSTING_READ(RING_IMR(ring->mmio_base));
+	spin_lock_irq(&dev_priv->irq_lock);
+	if (HAS_L3_DPF(dev) && ring->id == RCS) {
+		I915_WRITE_IMR(ring,
+			       ~(ring->irq_enable_mask |
+				 GT_RENDER_L3_PARITY_ERROR_INTERRUPT));
+	} else {
+		I915_WRITE_IMR(ring, ~ring->irq_enable_mask);
 	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
-
-	return true;
+	POSTING_READ(RING_IMR(ring->mmio_base));
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
 static void
@@ -1755,19 +1694,16 @@ gen8_ring_put_irq(struct intel_engine_cs *ring)
 {
 	struct drm_device *dev = ring->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned long flags;
 
-	spin_lock_irqsave(&dev_priv->irq_lock, flags);
-	if (--ring->irq_refcount == 0) {
-		if (HAS_L3_DPF(dev) && ring->id == RCS) {
-			I915_WRITE_IMR(ring,
-				       ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
-		} else {
-			I915_WRITE_IMR(ring, ~0);
-		}
-		POSTING_READ(RING_IMR(ring->mmio_base));
+	spin_lock_irq(&dev_priv->irq_lock);
+	if (HAS_L3_DPF(dev) && ring->id == RCS) {
+		I915_WRITE_IMR(ring,
+			       ~GT_RENDER_L3_PARITY_ERROR_INTERRUPT);
+	} else {
+		I915_WRITE_IMR(ring, ~0);
 	}
-	spin_unlock_irqrestore(&dev_priv->irq_lock, flags);
+	POSTING_READ(RING_IMR(ring->mmio_base));
+	spin_unlock_irq(&dev_priv->irq_lock);
 }
 
 static int
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 74b9df21f062..9f4a945e312c 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -171,10 +171,9 @@ struct  intel_engine_cs {
 	struct intel_hw_status_page status_page;
 	struct i915_ctx_workarounds wa_ctx;
 
-	unsigned irq_refcount; /* protected by dev_priv->irq_lock */
 	u32		irq_enable_mask;	/* bitmask to enable ring interrupt */
 	struct drm_i915_gem_request *trace_irq_req;
-	bool __must_check (*irq_get)(struct intel_engine_cs *ring);
+	void		(*irq_get)(struct intel_engine_cs *ring);
 	void		(*irq_put)(struct intel_engine_cs *ring);
 
 	int		(*init_hw)(struct intel_engine_cs *ring);
-- 
2.6.2


* Re: [PATCH v4] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 14:34   ` [PATCH v4] " Chris Wilson
@ 2015-11-30 16:30     ` Chris Wilson
  2015-11-30 16:40       ` Chris Wilson
  2015-12-01 18:34     ` Dave Gordon
  1 sibling, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 16:30 UTC (permalink / raw)
  To: intel-gfx

On Mon, Nov 30, 2015 at 02:34:40PM +0000, Chris Wilson wrote:
> +bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request)
> +{
> +	struct intel_breadcrumbs *b = &request->i915->breadcrumbs;
> +	struct intel_breadcrumbs_engine *be = &b->engine[request->ring->id];
> +	bool first = false;
> +
> +	spin_lock(&be->lock);
> +	if (request->waiters_count++ == 0) {
> +		struct rb_node **p, *parent;
> +
> +		if (be->first_waiter == NULL)
> +			wake_up_process(b->task);

Oh, the irony.

Converting the other side to a lockless READ_ONCE() means that I do know
have to kick after setting be->first_waiter and not rely on the spinlock
serialising the reads.
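
i.e. something along these lines (a sketch, with the rb-tree insertion
elided):

	spin_lock(&be->lock);
	/* ... rb-tree insert; first == we are now the oldest waiter ... */
	if (first)
		be->first_waiter = request;
	spin_unlock(&be->lock);

	/* Only kick the bottom-half once first_waiter is visible to its
	 * lockless READ_ONCE().
	 */
	if (first)
		wake_up_process(b->task);
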
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH v4] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 16:30     ` Chris Wilson
@ 2015-11-30 16:40       ` Chris Wilson
  0 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-11-30 16:40 UTC (permalink / raw)
  To: intel-gfx, Rogozhkin, Dmitry V, Gong, Zhipeng, Tvrtko Ursulin

On Mon, Nov 30, 2015 at 04:30:57PM +0000, Chris Wilson wrote:
> On Mon, Nov 30, 2015 at 02:34:40PM +0000, Chris Wilson wrote:
> > +bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request)
> > +{
> > +	struct intel_breadcrumbs *b = &request->i915->breadcrumbs;
> > +	struct intel_breadcrumbs_engine *be = &b->engine[request->ring->id];
> > +	bool first = false;
> > +
> > +	spin_lock(&be->lock);
> > +	if (request->waiters_count++ == 0) {
> > +		struct rb_node **p, *parent;
> > +
> > +		if (be->first_waiter == NULL)
> > +			wake_up_process(b->task);
> 
> Oh, the irony.
> 
> Converting the other side to a lockless READ_ONCE() means that I do know

And that's the second mispelling of "now" today!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 05/15] drm/i915: Suppress error message when GPU resets are disabled
  2015-11-29  8:48 ` [PATCH 05/15] drm/i915: Suppress error message when GPU resets are disabled Chris Wilson
@ 2015-12-01  8:30   ` Daniel Vetter
  0 siblings, 0 replies; 92+ messages in thread
From: Daniel Vetter @ 2015-12-01  8:30 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sun, Nov 29, 2015 at 08:48:03AM +0000, Chris Wilson wrote:
> If we do not have low-level support for resetting the GPU, or if the user
> has explicitly disabled resetting the device, the failure is expected.
> Since it is an expected failure, we should be using a lower priority
> message than *ERROR*, perhaps NOTICE. In the absence of DRM_NOTICE, just
> emit the expected failure as a DEBUG message.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Works for me too.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> ---
>  drivers/gpu/drm/i915/i915_drv.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index eb893a5e00b1..476bb696dbcb 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -959,7 +959,10 @@ int i915_reset(struct drm_device *dev)
>  		pr_notice("drm/i915: Resetting chip after gpu hang\n");
>  
>  	if (ret) {
> -		DRM_ERROR("Failed to reset chip: %i\n", ret);
> +		if (ret != -ENODEV)
> +			DRM_ERROR("Failed to reset chip: %i\n", ret);
> +		else
> +			DRM_DEBUG_DRIVER("GPU reset disabled\n");
>  		goto error;
>  	}
>  
> -- 
> 2.6.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH 04/15] drm/i915: Cache the reset_counter for the request
  2015-11-29  8:48 ` [PATCH 04/15] drm/i915: Cache the reset_counter for the request Chris Wilson
@ 2015-12-01  8:31   ` Daniel Vetter
  2015-12-01  8:47     ` Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Daniel Vetter @ 2015-12-01  8:31 UTC (permalink / raw)
  To: Chris Wilson; +Cc: intel-gfx

On Sun, Nov 29, 2015 at 08:48:02AM +0000, Chris Wilson wrote:
> Instead of querying the reset counter before every access to the ring,
> query it the first time we touch the ring, and do a final compare when
> submitting the request. For correctness, we need to then sanitize how
> the reset_counter is incremented to prevent broken submission and
> waiting across resets, in the process fixing the persistent -EIO we
> still see today on failed waits.
> 
> v2: Rebase
> v3: Now with added testcase
> v4: Rebase
> 
> Testcase: igt/gem_eio
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

Imo you really can't hide an ABI change in a refactor/optimize patch like
this ... I really think moving check_wedge out of wait_request must be
its own little patch.
-Daniel

> ---
>  drivers/gpu/drm/i915/i915_debugfs.c     |  2 +-
>  drivers/gpu/drm/i915/i915_drv.c         | 32 +++++++-----
>  drivers/gpu/drm/i915/i915_drv.h         | 39 ++++++++++----
>  drivers/gpu/drm/i915/i915_gem.c         | 90 +++++++++++++++------------------
>  drivers/gpu/drm/i915/i915_irq.c         | 21 +-------
>  drivers/gpu/drm/i915/intel_display.c    | 13 ++---
>  drivers/gpu/drm/i915/intel_lrc.c        | 13 +++--
>  drivers/gpu/drm/i915/intel_ringbuffer.c | 14 ++---
>  8 files changed, 111 insertions(+), 113 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index a728ff11e389..8458447ddc17 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4660,7 +4660,7 @@ i915_wedged_get(void *data, u64 *val)
>  	struct drm_device *dev = data;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	*val = atomic_read(&dev_priv->gpu_error.reset_counter);
> +	*val = i915_terminally_wedged(&dev_priv->gpu_error);
>  
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 90faa8e03fca..eb893a5e00b1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -923,23 +923,31 @@ int i915_resume_switcheroo(struct drm_device *dev)
>  int i915_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct i915_gpu_error *error = &dev_priv->gpu_error;
> +	unsigned reset_counter;
>  	bool simulated;
>  	int ret;
>  
>  	intel_reset_gt_powersave(dev);
>  
>  	mutex_lock(&dev->struct_mutex);
> +	atomic_andnot(I915_WEDGED, &error->reset_counter);
> +	reset_counter = atomic_inc_return(&error->reset_counter);
> +	if (WARN_ON(__i915_reset_in_progress(reset_counter))) {
> +		ret = -EIO;
> +		goto error;
> +	}
>  
>  	i915_gem_reset(dev);
>  
> -	simulated = dev_priv->gpu_error.stop_rings != 0;
> +	simulated = error->stop_rings != 0;
>  
>  	ret = intel_gpu_reset(dev);
>  
>  	/* Also reset the gpu hangman. */
>  	if (simulated) {
>  		DRM_INFO("Simulated gpu hang, resetting stop_rings\n");
> -		dev_priv->gpu_error.stop_rings = 0;
> +		error->stop_rings = 0;
>  		if (ret == -ENODEV) {
>  			DRM_INFO("Reset not implemented, but ignoring "
>  				 "error for simulated gpu hangs\n");
> @@ -952,8 +960,7 @@ int i915_reset(struct drm_device *dev)
>  
>  	if (ret) {
>  		DRM_ERROR("Failed to reset chip: %i\n", ret);
> -		mutex_unlock(&dev->struct_mutex);
> -		return ret;
> +		goto error;
>  	}
>  
>  	intel_overlay_reset(dev_priv);
> @@ -972,20 +979,14 @@ int i915_reset(struct drm_device *dev)
>  	 * was running at the time of the reset (i.e. we weren't VT
>  	 * switched away).
>  	 */
> -
> -	/* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */
> -	dev_priv->gpu_error.reload_in_reset = true;
> -
>  	ret = i915_gem_init_hw(dev);
> -
> -	dev_priv->gpu_error.reload_in_reset = false;
> -
> -	mutex_unlock(&dev->struct_mutex);
>  	if (ret) {
>  		DRM_ERROR("Failed hw init on reset %d\n", ret);
> -		return ret;
> +		goto error;
>  	}
>  
> +	mutex_unlock(&dev->struct_mutex);
> +
>  	/*
>  	 * rps/rc6 re-init is necessary to restore state lost after the
>  	 * reset and the re-install of gt irqs. Skip for ironlake per
> @@ -996,6 +997,11 @@ int i915_reset(struct drm_device *dev)
>  		intel_enable_gt_powersave(dev);
>  
>  	return 0;
> +
> +error:
> +	atomic_or(I915_WEDGED, &error->reset_counter);
> +	mutex_unlock(&dev->struct_mutex);
> +	return ret;
>  }
>  
>  static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index d07041c1729d..c24c23d8a0c0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1384,9 +1384,6 @@ struct i915_gpu_error {
>  
>  	/* For missed irq/seqno simulation. */
>  	unsigned int test_irq_rings;
> -
> -	/* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset   */
> -	bool reload_in_reset;
>  };
>  
>  enum modeset_restore {
> @@ -2181,6 +2178,7 @@ struct drm_i915_gem_request {
>  	/** On Which ring this request was generated */
>  	struct drm_i915_private *i915;
>  	struct intel_engine_cs *ring;
> +	unsigned reset_counter;
>  
>  	 /** GEM sequence number associated with the previous request,
>  	  * when the HWS breadcrumb is equal to this the GPU is processing
> @@ -2976,23 +2974,47 @@ i915_gem_find_active_request(struct intel_engine_cs *ring);
>  
>  bool i915_gem_retire_requests(struct drm_device *dev);
>  void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
> -int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
> +int __must_check i915_gem_check_wedge(unsigned reset_counter,
>  				      bool interruptible);
>  
> +static inline u32 i915_reset_counter(struct i915_gpu_error *error)
> +{
> +	return atomic_read(&error->reset_counter);
> +}
> +
> +static inline bool __i915_reset_in_progress(u32 reset)
> +{
> +	return unlikely(reset & I915_RESET_IN_PROGRESS_FLAG);
> +}
> +
> +static inline bool __i915_reset_in_progress_or_wedged(u32 reset)
> +{
> +	return unlikely(reset & (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
> +}
> +
> +static inline bool __i915_terminally_wedged(u32 reset)
> +{
> +	return unlikely(reset & I915_WEDGED);
> +}
> +
>  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
>  {
> -	return unlikely(atomic_read(&error->reset_counter)
> -			& (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
> +	return __i915_reset_in_progress(i915_reset_counter(error));
> +}
> +
> +static inline bool i915_reset_in_progress_or_wedged(struct i915_gpu_error *error)
> +{
> +	return __i915_reset_in_progress_or_wedged(i915_reset_counter(error));
>  }
>  
>  static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
>  {
> -	return atomic_read(&error->reset_counter) & I915_WEDGED;
> +	return __i915_terminally_wedged(i915_reset_counter(error));
>  }
>  
>  static inline u32 i915_reset_count(struct i915_gpu_error *error)
>  {
> -	return ((atomic_read(&error->reset_counter) & ~I915_WEDGED) + 1) / 2;
> +	return ((i915_reset_counter(error) & ~I915_WEDGED) + 1) / 2;
>  }
>  
>  static inline bool i915_stop_ring_allow_ban(struct drm_i915_private *dev_priv)
> @@ -3025,7 +3047,6 @@ void __i915_add_request(struct drm_i915_gem_request *req,
>  #define i915_add_request_no_flush(req) \
>  	__i915_add_request(req, NULL, false)
>  int __i915_wait_request(struct drm_i915_gem_request *req,
> -			unsigned reset_counter,
>  			bool interruptible,
>  			s64 *timeout,
>  			struct intel_rps_client *rps);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index ada461e02718..646d189e23a1 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -85,9 +85,7 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
>  {
>  	int ret;
>  
> -#define EXIT_COND (!i915_reset_in_progress(error) || \
> -		   i915_terminally_wedged(error))
> -	if (EXIT_COND)
> +	if (!i915_reset_in_progress(error))
>  		return 0;
>  
>  	/*
> @@ -96,17 +94,16 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
>  	 * we should simply try to bail out and fail as gracefully as possible.
>  	 */
>  	ret = wait_event_interruptible_timeout(error->reset_queue,
> -					       EXIT_COND,
> +					       !i915_reset_in_progress(error),
>  					       10*HZ);
>  	if (ret == 0) {
>  		DRM_ERROR("Timed out waiting for the gpu reset to complete\n");
>  		return -EIO;
>  	} else if (ret < 0) {
>  		return ret;
> +	} else {
> +		return 0;
>  	}
> -#undef EXIT_COND
> -
> -	return 0;
>  }
>  
>  int i915_mutex_lock_interruptible(struct drm_device *dev)
> @@ -1110,26 +1107,20 @@ put_rpm:
>  }
>  
>  int
> -i915_gem_check_wedge(struct i915_gpu_error *error,
> +i915_gem_check_wedge(unsigned reset_counter,
>  		     bool interruptible)
>  {
> -	if (i915_reset_in_progress(error)) {
> +	if (__i915_reset_in_progress_or_wedged(reset_counter)) {
>  		/* Non-interruptible callers can't handle -EAGAIN, hence return
>  		 * -EIO unconditionally for these. */
>  		if (!interruptible)
>  			return -EIO;
>  
>  		/* Recovery complete, but the reset failed ... */
> -		if (i915_terminally_wedged(error))
> +		if (__i915_terminally_wedged(reset_counter))
>  			return -EIO;
>  
> -		/*
> -		 * Check if GPU Reset is in progress - we need intel_ring_begin
> -		 * to work properly to reinit the hw state while the gpu is
> -		 * still marked as reset-in-progress. Handle this with a flag.
> -		 */
> -		if (!error->reload_in_reset)
> -			return -EAGAIN;
> +		return -EAGAIN;
>  	}
>  
>  	return 0;
> @@ -1223,7 +1214,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>  /**
>   * __i915_wait_request - wait until execution of request has finished
>   * @req: duh!
> - * @reset_counter: reset sequence associated with the given request
>   * @interruptible: do an interruptible wait (normally yes)
>   * @timeout: in - how long to wait (NULL forever); out - how much time remaining
>   *
> @@ -1238,7 +1228,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   * errno with remaining time filled in timeout argument.
>   */
>  int __i915_wait_request(struct drm_i915_gem_request *req,
> -			unsigned reset_counter,
>  			bool interruptible,
>  			s64 *timeout,
>  			struct intel_rps_client *rps)
> @@ -1289,12 +1278,12 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  
>  		/* We need to check whether any gpu reset happened in between
>  		 * the caller grabbing the seqno and now ... */
> -		if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) {
> -			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
> -			 * is truely gone. */
> -			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
> -			if (ret == 0)
> -				ret = -EAGAIN;
> +		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
> +			/* As we do not requeue the request over a GPU reset,
> +			 * if one does occur we know that the request is
> +			 * effectively complete.
> +			 */
> +			ret = 0;
>  			break;
>  		}
>  
> @@ -1462,13 +1451,7 @@ i915_wait_request(struct drm_i915_gem_request *req)
>  
>  	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
>  
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
> -	if (ret)
> -		return ret;
> -
> -	ret = __i915_wait_request(req,
> -				  atomic_read(&dev_priv->gpu_error.reset_counter),
> -				  interruptible, NULL, NULL);
> +	ret = __i915_wait_request(req, interruptible, NULL, NULL);
>  	if (ret)
>  		return ret;
>  
> @@ -1543,7 +1526,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_request *requests[I915_NUM_RINGS];
> -	unsigned reset_counter;
>  	int ret, i, n = 0;
>  
>  	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> @@ -1552,12 +1534,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>  	if (!obj->active)
>  		return 0;
>  
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error, true);
> -	if (ret)
> -		return ret;
> -
> -	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
> -
>  	if (readonly) {
>  		struct drm_i915_gem_request *req;
>  
> @@ -1579,9 +1555,9 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>  	}
>  
>  	mutex_unlock(&dev->struct_mutex);
> +	ret = 0;
>  	for (i = 0; ret == 0 && i < n; i++)
> -		ret = __i915_wait_request(requests[i], reset_counter, true,
> -					  NULL, rps);
> +		ret = __i915_wait_request(requests[i], true, NULL, rps);
>  	mutex_lock(&dev->struct_mutex);
>  
>  	for (i = 0; i < n; i++) {
> @@ -2685,6 +2661,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  			   struct drm_i915_gem_request **req_out)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(ring->dev);
> +	unsigned reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>  	struct drm_i915_gem_request *req;
>  	int ret;
>  
> @@ -2693,6 +2670,10 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  
>  	*req_out = NULL;
>  
> +	ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
> +	if (ret)
> +		return ret;
> +
>  	req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
>  	if (req == NULL)
>  		return -ENOMEM;
> @@ -2704,6 +2685,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  	kref_init(&req->ref);
>  	req->i915 = dev_priv;
>  	req->ring = ring;
> +	req->reset_counter = reset_counter;
>  	req->ctx  = ctx;
>  	i915_gem_context_reference(req->ctx);
>  
> @@ -3064,11 +3046,9 @@ retire:
>  int
>  i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_wait *args = data;
>  	struct drm_i915_gem_object *obj;
>  	struct drm_i915_gem_request *req[I915_NUM_RINGS];
> -	unsigned reset_counter;
>  	int i, n = 0;
>  	int ret;
>  
> @@ -3102,7 +3082,6 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  	}
>  
>  	drm_gem_object_unreference(&obj->base);
> -	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
>  
>  	for (i = 0; i < I915_NUM_RINGS; i++) {
>  		if (obj->last_read_req[i] == NULL)
> @@ -3115,11 +3094,23 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  
>  	for (i = 0; i < n; i++) {
>  		if (ret == 0)
> -			ret = __i915_wait_request(req[i], reset_counter, true,
> +			ret = __i915_wait_request(req[i], true,
>  						  args->timeout_ns > 0 ? &args->timeout_ns : NULL,
>  						  file->driver_priv);
> +
> +		/* If the GPU hung before this, report it. Ideally we would
> +		 * only report it if this request could not be completed.
> +		 * Currently we do not mark the guilty party and we abort all
> +		 * requests on reset, so just mark all as -EIO.
> +		 */
> +		if (ret == 0 &&
> +		    req[i]->reset_counter != i915_reset_counter(&req[i]->i915->gpu_error))
> +			ret = -EIO;
> +
>  		i915_gem_request_unreference__unlocked(req[i]);
>  	}
> +
> +
>  	return ret;
>  
>  out:
> @@ -3147,7 +3138,6 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>  	if (!i915_semaphore_is_enabled(obj->base.dev)) {
>  		struct drm_i915_private *i915 = to_i915(obj->base.dev);
>  		ret = __i915_wait_request(from_req,
> -					  atomic_read(&i915->gpu_error.reset_counter),
>  					  i915->mm.interruptible,
>  					  NULL,
>  					  &i915->rps.semaphores);
> @@ -4076,14 +4066,15 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
>  	unsigned long recent_enough = jiffies - DRM_I915_THROTTLE_JIFFIES;
>  	struct drm_i915_gem_request *request, *target = NULL;
> -	unsigned reset_counter;
>  	int ret;
>  
>  	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
>  	if (ret)
>  		return ret;
>  
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
> +	/* ABI: return -EIO if wedged */
> +	ret = i915_gem_check_wedge(i915_reset_counter(&dev_priv->gpu_error),
> +				   false);
>  	if (ret)
>  		return ret;
>  
> @@ -4101,7 +4092,6 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  
>  		target = request;
>  	}
> -	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
>  	if (target)
>  		i915_gem_request_reference(target);
>  	spin_unlock(&file_priv->mm.lock);
> @@ -4109,7 +4099,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  	if (target == NULL)
>  		return 0;
>  
> -	ret = __i915_wait_request(target, reset_counter, true, NULL, NULL);
> +	ret = __i915_wait_request(target, true, NULL, NULL);
>  	if (ret == 0)
>  		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
>  
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index e88d692583a5..3b2a8fbe0392 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2434,7 +2434,6 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
>  static void i915_reset_and_wakeup(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(dev);
> -	struct i915_gpu_error *error = &dev_priv->gpu_error;
>  	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
>  	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
>  	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
> @@ -2452,7 +2451,7 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
>  	 * the reset in-progress bit is only ever set by code outside of this
>  	 * work we don't need to worry about any other races.
>  	 */
> -	if (i915_reset_in_progress(error) && !i915_terminally_wedged(error)) {
> +	if (i915_reset_in_progress(&dev_priv->gpu_error)) {
>  		DRM_DEBUG_DRIVER("resetting chip\n");
>  		kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE,
>  				   reset_event);
> @@ -2480,25 +2479,9 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
>  
>  		intel_runtime_pm_put(dev_priv);
>  
> -		if (ret == 0) {
> -			/*
> -			 * After all the gem state is reset, increment the reset
> -			 * counter and wake up everyone waiting for the reset to
> -			 * complete.
> -			 *
> -			 * Since unlock operations are a one-sided barrier only,
> -			 * we need to insert a barrier here to order any seqno
> -			 * updates before
> -			 * the counter increment.
> -			 */
> -			smp_mb__before_atomic();
> -			atomic_inc(&dev_priv->gpu_error.reset_counter);
> -
> +		if (ret == 0)
>  			kobject_uevent_env(&dev->primary->kdev->kobj,
>  					   KOBJ_CHANGE, reset_done_event);
> -		} else {
> -			atomic_or(I915_WEDGED, &error->reset_counter);
> -		}
>  
>  		/*
>  		 * Note: The wake_up also serves as a memory barrier so that
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 0743337ac9e8..d77a17bb94d0 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -3289,8 +3289,7 @@ static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
>  	bool pending;
>  
> -	if (i915_reset_in_progress(&dev_priv->gpu_error) ||
> -	    intel_crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter))
> +	if (intel_crtc->reset_counter != i915_reset_counter(&dev_priv->gpu_error))
>  		return false;
>  
>  	spin_lock_irq(&dev->event_lock);
> @@ -10879,8 +10878,7 @@ static bool page_flip_finished(struct intel_crtc *crtc)
>  	struct drm_device *dev = crtc->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	if (i915_reset_in_progress(&dev_priv->gpu_error) ||
> -	    crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter))
> +	if (crtc->reset_counter != i915_reset_counter(&dev_priv->gpu_error))
>  		return true;
>  
>  	/*
> @@ -11322,7 +11320,6 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
>  
>  	if (mmio_flip->req) {
>  		WARN_ON(__i915_wait_request(mmio_flip->req,
> -					    mmio_flip->crtc->reset_counter,
>  					    false, NULL,
>  					    &mmio_flip->i915->rps.mmioflips));
>  		i915_gem_request_unreference__unlocked(mmio_flip->req);
> @@ -13312,9 +13309,6 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
>  
>  	ret = drm_atomic_helper_prepare_planes(dev, state);
>  	if (!ret && !async && !i915_reset_in_progress(&dev_priv->gpu_error)) {
> -		u32 reset_counter;
> -
> -		reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
>  		mutex_unlock(&dev->struct_mutex);
>  
>  		for_each_plane_in_state(state, plane, plane_state, i) {
> @@ -13325,8 +13319,7 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
>  				continue;
>  
>  			ret = __i915_wait_request(intel_plane_state->wait_req,
> -						  reset_counter, true,
> -						  NULL, NULL);
> +						  true, NULL, NULL);
>  
>  			/* Swallow -EIO errors to allow updates during hw lockup. */
>  			if (ret == -EIO)
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 4ebafab53f30..b40ffc3607e7 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -711,6 +711,14 @@ static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
>  	if (ret)
>  		return ret;
>  
> +	/* If the request was completed due to a GPU hang, we want to
> +	 * error out before we continue to emit more commands to the GPU.
> +	 */
> +	ret = i915_gem_check_wedge(i915_reset_counter(&req->i915->gpu_error),
> +				   req->i915->mm.interruptible);
> +	if (ret)
> +		return ret;
> +
>  	ringbuf->space = space;
>  	return 0;
>  }
> @@ -825,11 +833,6 @@ int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
>  	WARN_ON(req == NULL);
>  	dev_priv = req->ring->dev->dev_private;
>  
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
> -				   dev_priv->mm.interruptible);
> -	if (ret)
> -		return ret;
> -
>  	ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
>  	if (ret)
>  		return ret;
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 57d78f264b53..511efe556d73 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2254,6 +2254,14 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
>  	if (ret)
>  		return ret;
>  
> +	/* If the request was completed due to a GPU hang, we want to
> +	 * error out before we continue to emit more commands to the GPU.
> +	 */
> +	ret = i915_gem_check_wedge(i915_reset_counter(&to_i915(ring->dev)->gpu_error),
> +				   to_i915(ring->dev)->mm.interruptible);
> +	if (ret)
> +		return ret;
> +
>  	ringbuf->space = space;
>  	return 0;
>  }
> @@ -2286,7 +2294,6 @@ int intel_ring_idle(struct intel_engine_cs *ring)
>  
>  	/* Make sure we do not trigger any retires */
>  	return __i915_wait_request(req,
> -				   atomic_read(&to_i915(ring->dev)->gpu_error.reset_counter),
>  				   to_i915(ring->dev)->mm.interruptible,
>  				   NULL, NULL);
>  }
> @@ -2417,11 +2424,6 @@ int intel_ring_begin(struct drm_i915_gem_request *req,
>  	ring = req->ring;
>  	dev_priv = ring->dev->dev_private;
>  
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
> -				   dev_priv->mm.interruptible);
> -	if (ret)
> -		return ret;
> -
>  	ret = __intel_ring_prepare(ring, num_dwords * sizeof(uint32_t));
>  	if (ret)
>  		return ret;
> -- 
> 2.6.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


* Re: [PATCH 04/15] drm/i915: Cache the reset_counter for the request
  2015-12-01  8:31   ` Daniel Vetter
@ 2015-12-01  8:47     ` Chris Wilson
  2015-12-01  9:15       ` Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-12-01  8:47 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Tue, Dec 01, 2015 at 09:31:31AM +0100, Daniel Vetter wrote:
> On Sun, Nov 29, 2015 at 08:48:02AM +0000, Chris Wilson wrote:
> > Instead of querying the reset counter before every access to the ring,
> > query it the first time we touch the ring, and do a final compare when
> > submitting the request. For correctness, we need to then sanitize how
> > the reset_counter is incremented to prevent broken submission and
> > waiting across resets, in the process fixing the persistent -EIO we
> > still see today on failed waits.
> > 
> > v2: Rebase
> > v3: Now with added testcase
> > v4: Rebase
> > 
> > Testcase: igt/gem_eio
> > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> Imo you really can't hide an ABI change in a refactor/optimize patch like
> this ... I really think moving check_wedge out of wait_request must be
> its own little patch.

Urm, that is the essential part of the patch in this series.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* Re: [PATCH 04/15] drm/i915: Cache the reset_counter for the request
  2015-12-01  8:47     ` Chris Wilson
@ 2015-12-01  9:15       ` Chris Wilson
  2015-12-01 11:05         ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-12-01  9:15 UTC (permalink / raw)
  To: Daniel Vetter, intel-gfx

On Tue, Dec 01, 2015 at 08:47:44AM +0000, Chris Wilson wrote:
> On Tue, Dec 01, 2015 at 09:31:31AM +0100, Daniel Vetter wrote:
> > On Sun, Nov 29, 2015 at 08:48:02AM +0000, Chris Wilson wrote:
> > > Instead of querying the reset counter before every access to the ring,
> > > query it the first time we touch the ring, and do a final compare when
> > > submitting the request. For correctness, we need to then sanitize how
> > > the reset_counter is incremented to prevent broken submission and
> > > waiting across resets, in the process fixing the persistent -EIO we
> > > still see today on failed waits.
> > > 
> > > v2: Rebase
> > > v3: Now with added testcase
> > > v4: Rebase
> > > 
> > > Testcase: igt/gem_eio
> > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > Imo you really can't hide an ABI change in a refactor/optimize patch like
> > this ... I really think moving check_wedge out of wait_request must be
> > its own little patch.
> 
> Urm, that is the essential part of the patch in this series.

Ok, no it's not. The key is having the request->reset_counter. I was
thinking that the wakeup logic had to be the same in both - which is
true, but that doesn't restrict how the wakeup is then propagated from
i915_wait_request.

Getting the wakeup robust, though, is the meat of the patch.
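
To spell that out with a sketch (illustrative, not lifted verbatim from
the series): the wedge check moves to request construction, and the
wait path is left with a plain epoch comparison, so the wakeup handling
itself stays common:

	/* i915_gem_request_alloc(): refuse new requests on a dead GPU */
	ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
	if (ret)
		return ret;

	/* __i915_wait_request(): a changed epoch means the wait is over */
	if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error))
		break;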
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre


* [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper
  2015-12-01  9:15       ` Chris Wilson
@ 2015-12-01 11:05         ` Chris Wilson
  2015-12-01 11:05           ` [PATCH 2/3] drm/i915: Store the reset counter when constructing a request Chris Wilson
                             ` (2 more replies)
  0 siblings, 3 replies; 92+ messages in thread
From: Chris Wilson @ 2015-12-01 11:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: daniel.vetter

This is just a little bit of syntactic sugar to hide the atomic_read()s
throughout the code to retrieve the current reset_counter. It also
provides the other utility functions to check the reset state on the
already read reset_counter, so that we can read it once and do multiple
tests rather than risk the value changing between tests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
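For illustration, a minimal sketch of the intended usage pattern (not
taken from the diff below): snapshot the counter once, then run every
test against that snapshot, so the individual tests cannot disagree if
a reset lands in between:

	u32 reset = i915_reset_counter(&dev_priv->gpu_error);

	/* both tests see the same snapshot of the reset state */
	if (__i915_terminally_wedged(reset))
		return -EIO;
	if (__i915_reset_in_progress(reset))
		return -EAGAIN;
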
 drivers/gpu/drm/i915/i915_debugfs.c     |  2 +-
 drivers/gpu/drm/i915/i915_drv.h         | 32 ++++++++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_gem.c         | 12 ++++++------
 drivers/gpu/drm/i915/intel_display.c    | 16 ++++++++++------
 drivers/gpu/drm/i915/intel_ringbuffer.c |  2 +-
 5 files changed, 46 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index bfd57fb597dc..864b3ae9419c 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4656,7 +4656,7 @@ i915_wedged_get(void *data, u64 *val)
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	*val = atomic_read(&dev_priv->gpu_error.reset_counter);
+	*val = i915_reset_counter(&dev_priv->gpu_error);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d07041c1729d..2a4cfa06c28d 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2979,20 +2979,44 @@ void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
 int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
 				      bool interruptible);
 
+static inline u32 i915_reset_counter(struct i915_gpu_error *error)
+{
+	return atomic_read(&error->reset_counter);
+}
+
+static inline bool __i915_reset_in_progress(u32 reset)
+{
+	return unlikely(reset & I915_RESET_IN_PROGRESS_FLAG);
+}
+
+static inline bool __i915_reset_in_progress_or_wedged(u32 reset)
+{
+	return unlikely(reset & (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
+}
+
+static inline bool __i915_terminally_wedged(u32 reset)
+{
+	return unlikely(reset & I915_WEDGED);
+}
+
 static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
 {
-	return unlikely(atomic_read(&error->reset_counter)
-			& (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
+	return __i915_reset_in_progress(i915_reset_counter(error));
+}
+
+static inline bool i915_reset_in_progress_or_wedged(struct i915_gpu_error *error)
+{
+	return __i915_reset_in_progress_or_wedged(i915_reset_counter(error));
 }
 
 static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
 {
-	return atomic_read(&error->reset_counter) & I915_WEDGED;
+	return __i915_terminally_wedged(i915_reset_counter(error));
 }
 
 static inline u32 i915_reset_count(struct i915_gpu_error *error)
 {
-	return ((atomic_read(&error->reset_counter) & ~I915_WEDGED) + 1) / 2;
+	return ((i915_reset_counter(error) & ~I915_WEDGED) + 1) / 2;
 }
 
 static inline bool i915_stop_ring_allow_ban(struct drm_i915_private *dev_priv)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index cccfe5bfe5bf..bff245de8ade 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1297,7 +1297,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 
 		/* We need to check whether any gpu reset happened in between
 		 * the caller grabbing the seqno and now ... */
-		if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) {
+		if (reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
 			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
 			 * is truely gone. */
 			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
@@ -1475,7 +1475,7 @@ i915_wait_request(struct drm_i915_gem_request *req)
 		return ret;
 
 	ret = __i915_wait_request(req,
-				  atomic_read(&dev_priv->gpu_error.reset_counter),
+				  i915_reset_counter(&dev_priv->gpu_error),
 				  interruptible, NULL, NULL);
 	if (ret)
 		return ret;
@@ -1564,7 +1564,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
-	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 
 	if (readonly) {
 		struct drm_i915_gem_request *req;
@@ -3110,7 +3110,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	}
 
 	drm_gem_object_unreference(&obj->base);
-	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 
 	for (i = 0; i < I915_NUM_RINGS; i++) {
 		if (obj->last_read_req[i] == NULL)
@@ -3155,7 +3155,7 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 	if (!i915_semaphore_is_enabled(obj->base.dev)) {
 		struct drm_i915_private *i915 = to_i915(obj->base.dev);
 		ret = __i915_wait_request(from_req,
-					  atomic_read(&i915->gpu_error.reset_counter),
+					  i915_reset_counter(&i915->gpu_error),
 					  i915->mm.interruptible,
 					  NULL,
 					  &i915->rps.semaphores);
@@ -4109,7 +4109,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 
 		target = request;
 	}
-	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 	if (target)
 		i915_gem_request_reference(target);
 	spin_unlock(&file_priv->mm.lock);
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4ae490dfe2c4..511ead08ccd8 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3287,10 +3287,12 @@ static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
 	struct drm_device *dev = crtc->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	unsigned reset_counter;
 	bool pending;
 
-	if (i915_reset_in_progress(&dev_priv->gpu_error) ||
-	    intel_crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter))
+	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
+	if (intel_crtc->reset_counter != reset_counter ||
+	    __i915_reset_in_progress(reset_counter))
 		return false;
 
 	spin_lock_irq(&dev->event_lock);
@@ -10865,9 +10867,11 @@ static bool page_flip_finished(struct intel_crtc *crtc)
 {
 	struct drm_device *dev = crtc->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	unsigned reset_counter;
 
-	if (i915_reset_in_progress(&dev_priv->gpu_error) ||
-	    crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter))
+	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
+	if (crtc->reset_counter != reset_counter ||
+	    __i915_reset_in_progress(reset_counter))
 		return true;
 
 	/*
@@ -11511,7 +11515,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
 		goto cleanup;
 
 	atomic_inc(&intel_crtc->unpin_work_count);
-	intel_crtc->reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+	intel_crtc->reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 
 	if (INTEL_INFO(dev)->gen >= 5 || IS_G4X(dev))
 		work->flip_count = I915_READ(PIPE_FLIPCOUNT_G4X(pipe)) + 1;
@@ -13303,7 +13307,7 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 	if (!ret && !async && !i915_reset_in_progress(&dev_priv->gpu_error)) {
 		u32 reset_counter;
 
-		reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
+		reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 		mutex_unlock(&dev->struct_mutex);
 
 		for_each_plane_in_state(state, plane, plane_state, i) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 57d78f264b53..8970267d27bb 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2286,7 +2286,7 @@ int intel_ring_idle(struct intel_engine_cs *ring)
 
 	/* Make sure we do not trigger any retires */
 	return __i915_wait_request(req,
-				   atomic_read(&to_i915(ring->dev)->gpu_error.reset_counter),
+				   i915_reset_counter(&to_i915(ring->dev)->gpu_error),
 				   to_i915(ring->dev)->mm.interruptible,
 				   NULL, NULL);
 }
-- 
2.6.2



* [PATCH 2/3] drm/i915: Store the reset counter when constructing a request
  2015-12-01 11:05         ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
@ 2015-12-01 11:05           ` Chris Wilson
  2015-12-03  8:59             ` Daniel Vetter
  2015-12-01 11:05           ` [PATCH 3/3] drm/i915: Prevent leaking of -EIO from i915_wait_request() Chris Wilson
  2015-12-03  8:57           ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Daniel Vetter
  2 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-12-01 11:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: daniel.vetter

As the request is only valid during the same global reset epoch, we can
record the current reset_counter when constructing the request and reuse
it when waiting upon that request in future. This removes a very hairy
atomic check serialised by the struct_mutex at the time of waiting and
allows us to transfer those waits to a central dispatcher for all
waiters and all requests.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
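A sketch of the change in caller shape (illustrative only, not part of
the diff): previously every waiter had to sample the reset epoch,
typically under struct_mutex, and pass it in alongside the request; now
the value recorded at construction travels with the request itself:

	/* before: the caller snapshots the reset epoch */
	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
	ret = __i915_wait_request(req, reset_counter, true, NULL, NULL);

	/* after: the request carries its own epoch */
	ret = __i915_wait_request(req, true, NULL, NULL);
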
 drivers/gpu/drm/i915/i915_drv.h         |  2 +-
 drivers/gpu/drm/i915/i915_gem.c         | 40 +++++++++++----------------------
 drivers/gpu/drm/i915/intel_display.c    |  7 +-----
 drivers/gpu/drm/i915/intel_lrc.c        |  7 ------
 drivers/gpu/drm/i915/intel_ringbuffer.c |  6 -----
 5 files changed, 15 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2a4cfa06c28d..ee7677343056 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2181,6 +2181,7 @@ struct drm_i915_gem_request {
 	/** On Which ring this request was generated */
 	struct drm_i915_private *i915;
 	struct intel_engine_cs *ring;
+	unsigned reset_counter;
 
 	 /** GEM sequence number associated with the previous request,
 	  * when the HWS breadcrumb is equal to this the GPU is processing
@@ -3049,7 +3050,6 @@ void __i915_add_request(struct drm_i915_gem_request *req,
 #define i915_add_request_no_flush(req) \
 	__i915_add_request(req, NULL, false)
 int __i915_wait_request(struct drm_i915_gem_request *req,
-			unsigned reset_counter,
 			bool interruptible,
 			s64 *timeout,
 			struct intel_rps_client *rps);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index bff245de8ade..68d4bab93fd8 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1223,7 +1223,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 /**
  * __i915_wait_request - wait until execution of request has finished
  * @req: duh!
- * @reset_counter: reset sequence associated with the given request
  * @interruptible: do an interruptible wait (normally yes)
  * @timeout: in - how long to wait (NULL forever); out - how much time remaining
  *
@@ -1238,7 +1237,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
  * errno with remaining time filled in timeout argument.
  */
 int __i915_wait_request(struct drm_i915_gem_request *req,
-			unsigned reset_counter,
 			bool interruptible,
 			s64 *timeout,
 			struct intel_rps_client *rps)
@@ -1297,7 +1295,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 
 		/* We need to check whether any gpu reset happened in between
 		 * the caller grabbing the seqno and now ... */
-		if (reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
+		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
 			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
 			 * is truely gone. */
 			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
@@ -1470,13 +1468,7 @@ i915_wait_request(struct drm_i915_gem_request *req)
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
-	if (ret)
-		return ret;
-
-	ret = __i915_wait_request(req,
-				  i915_reset_counter(&dev_priv->gpu_error),
-				  interruptible, NULL, NULL);
+	ret = __i915_wait_request(req, interruptible, NULL, NULL);
 	if (ret)
 		return ret;
 
@@ -1551,7 +1543,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	struct drm_device *dev = obj->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_request *requests[I915_NUM_RINGS];
-	unsigned reset_counter;
 	int ret, i, n = 0;
 
 	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
@@ -1560,12 +1551,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	if (!obj->active)
 		return 0;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, true);
-	if (ret)
-		return ret;
-
-	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
-
 	if (readonly) {
 		struct drm_i915_gem_request *req;
 
@@ -1587,9 +1572,9 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
 	}
 
 	mutex_unlock(&dev->struct_mutex);
+	ret = 0;
 	for (i = 0; ret == 0 && i < n; i++)
-		ret = __i915_wait_request(requests[i], reset_counter, true,
-					  NULL, rps);
+		ret = __i915_wait_request(requests[i], true, NULL, rps);
 	mutex_lock(&dev->struct_mutex);
 
 	for (i = 0; i < n; i++) {
@@ -2693,6 +2678,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 			   struct drm_i915_gem_request **req_out)
 {
 	struct drm_i915_private *dev_priv = to_i915(ring->dev);
+	unsigned reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 	struct drm_i915_gem_request *req;
 	int ret;
 
@@ -2701,6 +2687,11 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 
 	*req_out = NULL;
 
+	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
+				   dev_priv->mm.interruptible);
+	if (ret)
+		return ret;
+
 	req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
 	if (req == NULL)
 		return -ENOMEM;
@@ -2712,6 +2703,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 	kref_init(&req->ref);
 	req->i915 = dev_priv;
 	req->ring = ring;
+	req->reset_counter = reset_counter;
 	req->ctx  = ctx;
 	i915_gem_context_reference(req->ctx);
 
@@ -3072,11 +3064,9 @@ retire:
 int
 i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 {
-	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct drm_i915_gem_wait *args = data;
 	struct drm_i915_gem_object *obj;
 	struct drm_i915_gem_request *req[I915_NUM_RINGS];
-	unsigned reset_counter;
 	int i, n = 0;
 	int ret;
 
@@ -3110,7 +3100,6 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 	}
 
 	drm_gem_object_unreference(&obj->base);
-	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 
 	for (i = 0; i < I915_NUM_RINGS; i++) {
 		if (obj->last_read_req[i] == NULL)
@@ -3123,7 +3112,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 
 	for (i = 0; i < n; i++) {
 		if (ret == 0)
-			ret = __i915_wait_request(req[i], reset_counter, true,
+			ret = __i915_wait_request(req[i], true,
 						  args->timeout_ns > 0 ? &args->timeout_ns : NULL,
 						  file->driver_priv);
 		i915_gem_request_unreference__unlocked(req[i]);
@@ -3155,7 +3144,6 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
 	if (!i915_semaphore_is_enabled(obj->base.dev)) {
 		struct drm_i915_private *i915 = to_i915(obj->base.dev);
 		ret = __i915_wait_request(from_req,
-					  i915_reset_counter(&i915->gpu_error),
 					  i915->mm.interruptible,
 					  NULL,
 					  &i915->rps.semaphores);
@@ -4084,7 +4072,6 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	unsigned long recent_enough = jiffies - DRM_I915_THROTTLE_JIFFIES;
 	struct drm_i915_gem_request *request, *target = NULL;
-	unsigned reset_counter;
 	int ret;
 
 	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
@@ -4109,7 +4096,6 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 
 		target = request;
 	}
-	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 	if (target)
 		i915_gem_request_reference(target);
 	spin_unlock(&file_priv->mm.lock);
@@ -4117,7 +4103,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (target == NULL)
 		return 0;
 
-	ret = __i915_wait_request(target, reset_counter, true, NULL, NULL);
+	ret = __i915_wait_request(target, true, NULL, NULL);
 	if (ret == 0)
 		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
 
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 511ead08ccd8..4447e73b54db 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -11313,7 +11313,6 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
 
 	if (mmio_flip->req) {
 		WARN_ON(__i915_wait_request(mmio_flip->req,
-					    mmio_flip->crtc->reset_counter,
 					    false, NULL,
 					    &mmio_flip->i915->rps.mmioflips));
 		i915_gem_request_unreference__unlocked(mmio_flip->req);
@@ -13305,9 +13304,6 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 
 	ret = drm_atomic_helper_prepare_planes(dev, state);
 	if (!ret && !async && !i915_reset_in_progress(&dev_priv->gpu_error)) {
-		u32 reset_counter;
-
-		reset_counter = i915_reset_counter(&dev_priv->gpu_error);
 		mutex_unlock(&dev->struct_mutex);
 
 		for_each_plane_in_state(state, plane, plane_state, i) {
@@ -13318,8 +13314,7 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 				continue;
 
 			ret = __i915_wait_request(intel_plane_state->wait_req,
-						  reset_counter, true,
-						  NULL, NULL);
+						  true, NULL, NULL);
 
 			/* Swallow -EIO errors to allow updates during hw lockup. */
 			if (ret == -EIO)
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 4ebafab53f30..fe71c768dbc4 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -819,16 +819,9 @@ static int logical_ring_prepare(struct drm_i915_gem_request *req, int bytes)
  */
 int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
 {
-	struct drm_i915_private *dev_priv;
 	int ret;
 
 	WARN_ON(req == NULL);
-	dev_priv = req->ring->dev->dev_private;
-
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
-				   dev_priv->mm.interruptible);
-	if (ret)
-		return ret;
 
 	ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
 	if (ret)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 8970267d27bb..913526c0264f 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2286,7 +2286,6 @@ int intel_ring_idle(struct intel_engine_cs *ring)
 
 	/* Make sure we do not trigger any retires */
 	return __i915_wait_request(req,
-				   i915_reset_counter(&to_i915(ring->dev)->gpu_error),
 				   to_i915(ring->dev)->mm.interruptible,
 				   NULL, NULL);
 }
@@ -2417,11 +2416,6 @@ int intel_ring_begin(struct drm_i915_gem_request *req,
 	ring = req->ring;
 	dev_priv = ring->dev->dev_private;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
-				   dev_priv->mm.interruptible);
-	if (ret)
-		return ret;
-
 	ret = __intel_ring_prepare(ring, num_dwords * sizeof(uint32_t));
 	if (ret)
 		return ret;
-- 
2.6.2



* [PATCH 3/3] drm/i915: Prevent leaking of -EIO from i915_wait_request()
  2015-12-01 11:05         ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
  2015-12-01 11:05           ` [PATCH 2/3] drm/i915: Store the reset counter when constructing a request Chris Wilson
@ 2015-12-01 11:05           ` Chris Wilson
  2015-12-03  9:14             ` Daniel Vetter
  2015-12-03  8:57           ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Daniel Vetter
  2 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-12-01 11:05 UTC (permalink / raw)
  To: intel-gfx; +Cc: daniel.vetter

Reporting -EIO from i915_wait_request() has proven very problematic
over the years, with numerous hard-to-reproduce bugs cropping up in the
corner case where a reset occurs and the code wasn't expecting such
an error.

If we reset the GPU, or have detected a hang and wish to reset the
GPU, the request is forcibly completed and the wait broken. Currently,
we report either -EAGAIN or -EIO in order for the caller to retreat and
restart the wait (if appropriate) after dropping and then reacquiring
the struct_mutex (essential to allow the GPU reset to proceed). However,
if we take the view that the request is complete (no further work will
be done on it by the GPU because it is dead and soon to be reset), then
we can proceed with the task at hand and then drop the struct_mutex,
allowing the reset to occur. This transfers the burden of checking
whether it is safe to proceed to the caller, and in all but one
instance it is safe to do so - completely eliminating the source of
all spurious -EIO.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
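To illustrate the new contract (a sketch, not part of the diff): a wait
broken by a reset now reports success, and a caller that genuinely
needs to know about the hang re-checks the epoch itself:

	ret = __i915_wait_request(req, true, NULL, NULL);
	if (ret == 0 &&
	    req->reset_counter != i915_reset_counter(&req->i915->gpu_error))
		ret = -EIO; /* opt back in to reporting the hang */
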
 drivers/gpu/drm/i915/i915_debugfs.c     |  2 +-
 drivers/gpu/drm/i915/i915_drv.c         | 32 +++++++++++++----------
 drivers/gpu/drm/i915/i915_drv.h         |  5 +---
 drivers/gpu/drm/i915/i915_gem.c         | 45 +++++++++++++--------------------
 drivers/gpu/drm/i915/i915_irq.c         | 21 ++-------------
 drivers/gpu/drm/i915/intel_display.c    | 32 +++++++----------------
 drivers/gpu/drm/i915/intel_lrc.c        |  8 ++++++
 drivers/gpu/drm/i915/intel_ringbuffer.c |  8 ++++++
 8 files changed, 65 insertions(+), 88 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 864b3ae9419c..0583eda642d7 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -4656,7 +4656,7 @@ i915_wedged_get(void *data, u64 *val)
 	struct drm_device *dev = data;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 
-	*val = i915_reset_counter(&dev_priv->gpu_error);
+	*val = i915_terminally_wedged(&dev_priv->gpu_error);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 90faa8e03fca..eb893a5e00b1 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -923,23 +923,31 @@ int i915_resume_switcheroo(struct drm_device *dev)
 int i915_reset(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct i915_gpu_error *error = &dev_priv->gpu_error;
+	unsigned reset_counter;
 	bool simulated;
 	int ret;
 
 	intel_reset_gt_powersave(dev);
 
 	mutex_lock(&dev->struct_mutex);
+	atomic_andnot(I915_WEDGED, &error->reset_counter);
+	reset_counter = atomic_inc_return(&error->reset_counter);
+	if (WARN_ON(__i915_reset_in_progress(reset_counter))) {
+		ret = -EIO;
+		goto error;
+	}
 
 	i915_gem_reset(dev);
 
-	simulated = dev_priv->gpu_error.stop_rings != 0;
+	simulated = error->stop_rings != 0;
 
 	ret = intel_gpu_reset(dev);
 
 	/* Also reset the gpu hangman. */
 	if (simulated) {
 		DRM_INFO("Simulated gpu hang, resetting stop_rings\n");
-		dev_priv->gpu_error.stop_rings = 0;
+		error->stop_rings = 0;
 		if (ret == -ENODEV) {
 			DRM_INFO("Reset not implemented, but ignoring "
 				 "error for simulated gpu hangs\n");
@@ -952,8 +960,7 @@ int i915_reset(struct drm_device *dev)
 
 	if (ret) {
 		DRM_ERROR("Failed to reset chip: %i\n", ret);
-		mutex_unlock(&dev->struct_mutex);
-		return ret;
+		goto error;
 	}
 
 	intel_overlay_reset(dev_priv);
@@ -972,20 +979,14 @@ int i915_reset(struct drm_device *dev)
 	 * was running at the time of the reset (i.e. we weren't VT
 	 * switched away).
 	 */
-
-	/* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */
-	dev_priv->gpu_error.reload_in_reset = true;
-
 	ret = i915_gem_init_hw(dev);
-
-	dev_priv->gpu_error.reload_in_reset = false;
-
-	mutex_unlock(&dev->struct_mutex);
 	if (ret) {
 		DRM_ERROR("Failed hw init on reset %d\n", ret);
-		return ret;
+		goto error;
 	}
 
+	mutex_unlock(&dev->struct_mutex);
+
 	/*
 	 * rps/rc6 re-init is necessary to restore state lost after the
 	 * reset and the re-install of gt irqs. Skip for ironlake per
@@ -996,6 +997,11 @@ int i915_reset(struct drm_device *dev)
 		intel_enable_gt_powersave(dev);
 
 	return 0;
+
+error:
+	atomic_or(I915_WEDGED, &error->reset_counter);
+	mutex_unlock(&dev->struct_mutex);
+	return ret;
 }
 
 static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index ee7677343056..c24c23d8a0c0 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1384,9 +1384,6 @@ struct i915_gpu_error {
 
 	/* For missed irq/seqno simulation. */
 	unsigned int test_irq_rings;
-
-	/* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset   */
-	bool reload_in_reset;
 };
 
 enum modeset_restore {
@@ -2977,7 +2974,7 @@ i915_gem_find_active_request(struct intel_engine_cs *ring);
 
 bool i915_gem_retire_requests(struct drm_device *dev);
 void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
-int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
+int __must_check i915_gem_check_wedge(unsigned reset_counter,
 				      bool interruptible);
 
 static inline u32 i915_reset_counter(struct i915_gpu_error *error)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 68d4bab93fd8..ce4b89f28d38 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -85,9 +85,7 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
 {
 	int ret;
 
-#define EXIT_COND (!i915_reset_in_progress(error) || \
-		   i915_terminally_wedged(error))
-	if (EXIT_COND)
+	if (!i915_reset_in_progress(error))
 		return 0;
 
 	/*
@@ -96,17 +94,16 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
 	 * we should simply try to bail out and fail as gracefully as possible.
 	 */
 	ret = wait_event_interruptible_timeout(error->reset_queue,
-					       EXIT_COND,
+					       !i915_reset_in_progress(error),
 					       10*HZ);
 	if (ret == 0) {
 		DRM_ERROR("Timed out waiting for the gpu reset to complete\n");
 		return -EIO;
 	} else if (ret < 0) {
 		return ret;
+	} else {
+		return 0;
 	}
-#undef EXIT_COND
-
-	return 0;
 }
 
 int i915_mutex_lock_interruptible(struct drm_device *dev)
@@ -1110,26 +1107,19 @@ put_rpm:
 }
 
 int
-i915_gem_check_wedge(struct i915_gpu_error *error,
-		     bool interruptible)
+i915_gem_check_wedge(unsigned reset_counter, bool interruptible)
 {
-	if (i915_reset_in_progress(error)) {
+	if (__i915_reset_in_progress_or_wedged(reset_counter)) {
 		/* Non-interruptible callers can't handle -EAGAIN, hence return
 		 * -EIO unconditionally for these. */
 		if (!interruptible)
 			return -EIO;
 
 		/* Recovery complete, but the reset failed ... */
-		if (i915_terminally_wedged(error))
+		if (__i915_terminally_wedged(reset_counter))
 			return -EIO;
 
-		/*
-		 * Check if GPU Reset is in progress - we need intel_ring_begin
-		 * to work properly to reinit the hw state while the gpu is
-		 * still marked as reset-in-progress. Handle this with a flag.
-		 */
-		if (!error->reload_in_reset)
-			return -EAGAIN;
+		return -EAGAIN;
 	}
 
 	return 0;
@@ -1296,11 +1286,11 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 		/* We need to check whether any gpu reset happened in between
 		 * the caller grabbing the seqno and now ... */
 		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
-			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
-			 * is truely gone. */
-			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
-			if (ret == 0)
-				ret = -EAGAIN;
+			/* As we do not requeue the request over a GPU reset,
+			 * if one does occur we know that the request is
+			 * effectively complete.
+			 */
+			ret = 0;
 			break;
 		}
 
@@ -2687,8 +2677,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
 
 	*req_out = NULL;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
-				   dev_priv->mm.interruptible);
+	ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
 	if (ret)
 		return ret;
 
@@ -4078,9 +4067,9 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
 	if (ret)
 		return ret;
 
-	ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
-	if (ret)
-		return ret;
+	/* ABI: return -EIO if already wedged */
+	if (i915_terminally_wedged(&dev_priv->gpu_error))
+		return -EIO;
 
 	spin_lock(&file_priv->mm.lock);
 	list_for_each_entry(request, &file_priv->mm.request_list, client_list) {
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index e88d692583a5..3b2a8fbe0392 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -2434,7 +2434,6 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 static void i915_reset_and_wakeup(struct drm_device *dev)
 {
 	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct i915_gpu_error *error = &dev_priv->gpu_error;
 	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
 	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
 	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
@@ -2452,7 +2451,7 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 	 * the reset in-progress bit is only ever set by code outside of this
 	 * work we don't need to worry about any other races.
 	 */
-	if (i915_reset_in_progress(error) && !i915_terminally_wedged(error)) {
+	if (i915_reset_in_progress(&dev_priv->gpu_error)) {
 		DRM_DEBUG_DRIVER("resetting chip\n");
 		kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE,
 				   reset_event);
@@ -2480,25 +2479,9 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
 
 		intel_runtime_pm_put(dev_priv);
 
-		if (ret == 0) {
-			/*
-			 * After all the gem state is reset, increment the reset
-			 * counter and wake up everyone waiting for the reset to
-			 * complete.
-			 *
-			 * Since unlock operations are a one-sided barrier only,
-			 * we need to insert a barrier here to order any seqno
-			 * updates before
-			 * the counter increment.
-			 */
-			smp_mb__before_atomic();
-			atomic_inc(&dev_priv->gpu_error.reset_counter);
-
+		if (ret == 0)
 			kobject_uevent_env(&dev->primary->kdev->kobj,
 					   KOBJ_CHANGE, reset_done_event);
-		} else {
-			atomic_or(I915_WEDGED, &error->reset_counter);
-		}
 
 		/*
 		 * Note: The wake_up also serves as a memory barrier so that
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 4447e73b54db..73c61b94f7fd 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -3287,12 +3287,9 @@ static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
 	struct drm_device *dev = crtc->dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	unsigned reset_counter;
 	bool pending;
 
-	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
-	if (intel_crtc->reset_counter != reset_counter ||
-	    __i915_reset_in_progress(reset_counter))
+	if (intel_crtc->reset_counter != i915_reset_counter(&dev_priv->gpu_error))
 		return false;
 
 	spin_lock_irq(&dev->event_lock);
@@ -10867,11 +10864,8 @@ static bool page_flip_finished(struct intel_crtc *crtc)
 {
 	struct drm_device *dev = crtc->base.dev;
 	struct drm_i915_private *dev_priv = dev->dev_private;
-	unsigned reset_counter;
 
-	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
-	if (crtc->reset_counter != reset_counter ||
-	    __i915_reset_in_progress(reset_counter))
+	if (crtc->reset_counter != i915_reset_counter(&dev_priv->gpu_error))
 		return true;
 
 	/*
@@ -13303,9 +13297,9 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 		return ret;
 
 	ret = drm_atomic_helper_prepare_planes(dev, state);
-	if (!ret && !async && !i915_reset_in_progress(&dev_priv->gpu_error)) {
-		mutex_unlock(&dev->struct_mutex);
+	mutex_unlock(&dev->struct_mutex);
 
+	if (!ret && !async) {
 		for_each_plane_in_state(state, plane, plane_state, i) {
 			struct intel_plane_state *intel_plane_state =
 				to_intel_plane_state(plane_state);
@@ -13315,23 +13309,15 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
 
 			ret = __i915_wait_request(intel_plane_state->wait_req,
 						  true, NULL, NULL);
-
-			/* Swallow -EIO errors to allow updates during hw lockup. */
-			if (ret == -EIO)
-				ret = 0;
-
-			if (ret)
+			if (ret) {
+				mutex_lock(&dev->struct_mutex);
+				drm_atomic_helper_cleanup_planes(dev, state);
+				mutex_unlock(&dev->struct_mutex);
 				break;
+			}
 		}
-
-		if (!ret)
-			return 0;
-
-		mutex_lock(&dev->struct_mutex);
-		drm_atomic_helper_cleanup_planes(dev, state);
 	}
 
-	mutex_unlock(&dev->struct_mutex);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index fe71c768dbc4..08b982a6ae81 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -711,6 +711,14 @@ static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
 	if (ret)
 		return ret;
 
+	/* If the request was completed due to a GPU hang, we want to
+	 * error out before we continue to emit more commands to the GPU.
+	 */
+	ret = i915_gem_check_wedge(i915_reset_counter(&req->i915->gpu_error),
+				   req->i915->mm.interruptible);
+	if (ret)
+		return ret;
+
 	ringbuf->space = space;
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 913526c0264f..511efe556d73 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2254,6 +2254,14 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
 	if (ret)
 		return ret;
 
+	/* If the request was completed due to a GPU hang, we want to
+	 * error out before we continue to emit more commands to the GPU.
+	 */
+	ret = i915_gem_check_wedge(i915_reset_counter(&to_i915(ring->dev)->gpu_error),
+				   to_i915(ring->dev)->mm.interruptible);
+	if (ret)
+		return ret;
+
 	ringbuf->space = space;
 	return 0;
 }
-- 
2.6.2


* Re: [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request
  2015-11-30 10:06   ` Tvrtko Ursulin
@ 2015-12-01 15:47     ` Dave Gordon
  2015-12-01 15:58       ` Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Dave Gordon @ 2015-12-01 15:47 UTC (permalink / raw)
  To: Tvrtko Ursulin, Chris Wilson, intel-gfx
  Cc: Daniel Vetter, Eero Tamminen, Rantala, Valtteri, stable

On 30/11/15 10:06, Tvrtko Ursulin wrote:
>
> On 29/11/15 08:48, Chris Wilson wrote:
>> Limit busywaiting only to the request currently being processed by the
>> GPU. If the request is not currently being processed by the GPU, there
>> is a very low likelihood of it being completed within the 2 microsecond
>> spin timeout and so we will just be wasting CPU cycles.
>>
>> v2: Check for logical inversion when rebasing - we were incorrectly
>> checking for this request being active, and instead busywaiting for
>> when the GPU was not yet processing the request of interest.
>>
>> v3: Try another colour for the seqno names.
>> v4: Another colour for the function names.

Adding a field in the request to track the sequence number of the 
previous request isn't ideal when considering the scheduler and 
preemption. But we've got a separate batch-in-progress sequence number 
in the hardware status page (also for use by TDR to check which batch is 
currently running), so you could use that. Then the check is simply

     /* seqno of the batch the GPU is executing right now, read from
      * the batch-in-progress slot in the hardware status page
      */
     curr = intel_read_status_page(ring, I915_BATCH_ACTIVE_SEQNO);
     if (curr != req->seqno)
         return -EAGAIN;    /* not the active request - skip the busywait */

.Dave.

* Re: [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request
  2015-12-01 15:47     ` Dave Gordon
@ 2015-12-01 15:58       ` Chris Wilson
  2015-12-01 16:44         ` Dave Gordon
  0 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-12-01 15:58 UTC (permalink / raw)
  To: Dave Gordon
  Cc: Daniel Vetter, intel-gfx, stable, Rantala, Valtteri, Eero Tamminen

On Tue, Dec 01, 2015 at 03:47:34PM +0000, Dave Gordon wrote:
> On 30/11/15 10:06, Tvrtko Ursulin wrote:
> >
> >On 29/11/15 08:48, Chris Wilson wrote:
> >>Limit busywaiting only to the request currently being processed by the
> >>GPU. If the request is not currently being processed by the GPU, there
> >>is a very low likelihood of it being completed within the 2 microsecond
> >>spin timeout and so we will just be wasting CPU cycles.
> >>
> >>v2: Check for logical inversion when rebasing - we were incorrectly
> >>checking for this request being active, and instead busywaiting for
> >>when the GPU was not yet processing the request of interest.
> >>
> >>v3: Try another colour for the seqno names.
> >>v4: Another colour for the function names.
> 
> Adding a field in the request to track the sequence number of the
> previous request isn't ideal when considering the scheduler and
> preemption. But we've got a separate batch-in-progress sequence
> number in the hardware status page (also for use by TDR to check
> which batch is currently running), so you could use that. Then the
> check is simply

As demonstrated TDR doesn't need it, and the GPU scheduler has to fix up
the seqno anyway to maintain retirement order of the request list.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request
  2015-12-01 15:58       ` Chris Wilson
@ 2015-12-01 16:44         ` Dave Gordon
  2015-12-03  8:52           ` Daniel Vetter
  0 siblings, 1 reply; 92+ messages in thread
From: Dave Gordon @ 2015-12-01 16:44 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, intel-gfx, Daniel Vetter,
	Eero Tamminen, stable, Rantala, Valtteri

On 01/12/15 15:58, Chris Wilson wrote:
> On Tue, Dec 01, 2015 at 03:47:34PM +0000, Dave Gordon wrote:
>> On 30/11/15 10:06, Tvrtko Ursulin wrote:
>>>
>>> On 29/11/15 08:48, Chris Wilson wrote:
>>>> Limit busywaiting only to the request currently being processed by the
>>>> GPU. If the request is not currently being processed by the GPU, there
>>>> is a very low likelihood of it being completed within the 2 microsecond
>>>> spin timeout and so we will just be wasting CPU cycles.
>>>>
>>>> v2: Check for logical inversion when rebasing - we were incorrectly
>>>> checking for this request being active, and instead busywaiting for
>>>> when the GPU was not yet processing the request of interest.
>>>>
>>>> v3: Try another colour for the seqno names.
>>>> v4: Another colour for the function names.
>>
>> Adding a field in the request to track the sequence number of the
>> previous request isn't ideal when considering the scheduler and
>> preemption. But we've got a separate batch-in-progress sequence
>> number in the hardware status page (also for use by TDR to check
>> which batch is currently running), so you could use that. Then the
>> check is simply
>
> As demonstrated TDR doesn't need it, and the GPU scheduler has to fix up
> the seqno anyway to maintain retirement order of the request list.
> -Chris

If the request list is in retirement order, then the previous-seqno of a 
request is simply the seqno of the previous request in the request list. 
If there isn't a previous request, then the request under consideration 
is at the head of the list and must be the currently-active request 
anyway (of course, the converse does not hold, due to lazy retirement 
and such).
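
Something like this, say (a rough sketch only, not a tested patch; the
helper name is invented, but the list fields are the ones the driver
already has):

     /* Sketch: the per-engine request list is kept in retirement
      * order, so the previous seqno is just the seqno of the
      * preceding entry; being at the head of the list means this
      * is already the currently-active request.
      */
     static u32 previous_seqno(struct drm_i915_gem_request *req)
     {
             if (req->list.prev == &req->ring->request_list)
                     return 0; /* head of list: already active */

             return list_prev_entry(req, list)->seqno;
     }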

The scheduler will reassign seqnos as batches are submitted to the 
backend submission mechanism (ringbuffer, execlist, or GuC), so they 
will start out in order. But that doesn't guarantee in-order completion 
once we add preemption, so we will in any case be tracking both the 
start and finish of each batch.

The (Android) TDR previously worked on by Tomas Elf did make use of the 
batch-started seqno; at least some of the enhancements in that version 
of TDR are expected to be ported to upstream Linux soon.

BTW I would find it easier to test & review these patches if they were 
in a different order; specifically, I'd prefer to see the infrastructure 
updates (patches 9-12: lazy coherency, get rid of get_seqno(), etc) 
before the features that they help simplify (spin-wait optimisation, 
thundering herd suppression, etc).

.Dave.

* Re: [PATCH 13/15] drm/i915: Stop setting wraparound seqno on initialisation
  2015-11-29  8:48 ` [PATCH 13/15] drm/i915: Stop setting wraparound seqno on initialisation Chris Wilson
@ 2015-12-01 16:57   ` Dave Gordon
  2015-12-04  9:36     ` Daniel Vetter
  0 siblings, 1 reply; 92+ messages in thread
From: Dave Gordon @ 2015-12-01 16:57 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 29/11/15 08:48, Chris Wilson wrote:
> We have testcases to ensure that seqno wraparound works fine, so we can
> forgo forcing everyone to encounter seqno wraparound during early
> uptime. seqno wraparound incurs a full GPU stall, so not forcing it
> will eliminate one source of jitter from the early life of the system.
>
> Advancing the global next_seqno after a GPU reset is equally pointless.
>
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

The real bug is that seqno wraparound forces GPU idle, can we fix that 
instead?

.Dave.


* Re: [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 14:18         ` Chris Wilson
@ 2015-12-01 17:06           ` Dave Gordon
  0 siblings, 0 replies; 92+ messages in thread
From: Dave Gordon @ 2015-12-01 17:06 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, intel-gfx, Gong, Zhipeng,
	Rogozhkin, Dmitry V

On 30/11/15 14:18, Chris Wilson wrote:
> On Mon, Nov 30, 2015 at 01:32:16PM +0000, Tvrtko Ursulin wrote:
>>
>> On 30/11/15 12:30, Chris Wilson wrote:
>>> On Mon, Nov 30, 2015 at 12:05:50PM +0000, Tvrtko Ursulin wrote:
>>>>> +	struct intel_breadcrumbs {
>>>>> +		spinlock_t lock; /* protects the per-engine request list */
>>>>
>>>> s/list/tree/
>>>
>>> list. We use the tree as a list!
>>
>> Maybe but it confuses the reader a bit who goes and looks for some
>> list then.
>>
>> Related, when I was playing with this I was piggy-backing on the
>> existing per-engine request list.
>
> A separate list with separate locking only used when waiting, that's the
> appeal for me.

The patchset recently posted by John Harrison
"[RFC 00/12] Convert requests to use struct fence"
provides exactly that. No struct_mutex involved,
and only the relevant waiters get woken up.

.Dave.

* Re: [PATCH v4] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-30 14:34   ` [PATCH v4] " Chris Wilson
  2015-11-30 16:30     ` Chris Wilson
@ 2015-12-01 18:34     ` Dave Gordon
  2015-12-03 16:22       ` [PATCH v7] " Chris Wilson
  1 sibling, 1 reply; 92+ messages in thread
From: Dave Gordon @ 2015-12-01 18:34 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx

On 30/11/15 14:34, Chris Wilson wrote:
> One particularly stressful scenario consists of many independent tasks
> all competing for GPU time and waiting upon the results (e.g. realtime
> transcoding of many, many streams). One bottleneck in particular is that
> each client waits on its own results, but every client is woken up after
> every batchbuffer - hence the thunder of hooves as then every client must
> do its heavyweight dance to read a coherent seqno to see if it is the
> lucky one. Alternatively, we can have one kthread responsible for waking
> after an interrupt, checking the seqno and only waking up the waiting
> clients who are complete. The disadvantage is that in the uncontended
> scenario (i.e. only one waiter) we incur an extra context switch in the
> wakeup path - though that should be mitigated somewhat by the busy-wait
> we do first before sleeping.
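
In outline, the single-waker scheme is something like the below
(illustrative pseudo-C with invented names, not the actual patch;
i915_seqno_passed(), wake_up_process() and the ->get_seqno() vfunc
are real, the rest is made up):

	/* Illustrative only: one kthread owns the user interrupt;
	 * waiters queue in seqno order and sleep, and after each
	 * interrupt only those whose seqno has passed are woken.
	 */
	for (;;) {
		u32 seqno;

		wait_for_user_interrupt(ring);	/* invented helper */
		seqno = ring->get_seqno(ring, false);

		spin_lock(&b->lock);
		while ((w = first_waiter(b)) &&
		       i915_seqno_passed(seqno, w->seqno)) {
			remove_waiter(b, w);	/* invented helper */
			wake_up_process(w->task);
		}
		spin_unlock(&b->lock);
	}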

This discussion reminds me of an approach we took in [another OS], 
where the interrupt handler always just woke the first waiter; that 
thread, if the wakeup wasn't of interest to itself, then did the extra 
work to figure out which other thread /should/ be woken. That both 
minimised latency for the single-waiter scenario and avoided wake_all() 
from interrupt code in the multiple-waiter case. Oh, and IIRC we had a 
yield_to() in there so that the spuriously-woken first waiter went back 
to waiting and the correctly-woken thread immediately got to take over 
the CPU :)

I don't know how practical that would be inside Linux though ...
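
Roughly, the waiter side of that scheme would have looked like this
(pseudo-code, all names invented apart from yield_to(), which is a
real Linux scheduler primitive):

	for (;;) {
		sleep_on_queue(q, me);	/* irq handler does wake_up_first(q) */
		if (request_complete(me->req))
			break;

		/* Spurious wake: find who should have been woken and
		 * hand the CPU straight over before sleeping again.
		 */
		w = first_completed_waiter(q);
		if (w)
			yield_to(w->task, true);
	}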

.Dave.


* Re: [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request
  2015-12-01 16:44         ` Dave Gordon
@ 2015-12-03  8:52           ` Daniel Vetter
  0 siblings, 0 replies; 92+ messages in thread
From: Daniel Vetter @ 2015-12-03  8:52 UTC (permalink / raw)
  To: Dave Gordon
  Cc: Daniel Vetter, intel-gfx, stable, Rantala, Valtteri, Eero Tamminen

On Tue, Dec 01, 2015 at 04:44:53PM +0000, Dave Gordon wrote:
> On 01/12/15 15:58, Chris Wilson wrote:
> >On Tue, Dec 01, 2015 at 03:47:34PM +0000, Dave Gordon wrote:
> >>On 30/11/15 10:06, Tvrtko Ursulin wrote:
> >>>
> >>>On 29/11/15 08:48, Chris Wilson wrote:
> >>>>Limit busywaiting only to the request currently being processed by the
> >>>>GPU. If the request is not currently being processed by the GPU, there
> >>>>is a very low likelihood of it being completed within the 2 microsecond
> >>>>spin timeout and so we will just be wasting CPU cycles.
> >>>>
> >>>>v2: Check for logical inversion when rebasing - we were incorrectly
> >>>>checking for this request being active, and instead busywaiting for
> >>>>when the GPU was not yet processing the request of interest.
> >>>>
> >>>>v3: Try another colour for the seqno names.
> >>>>v4: Another colour for the function names.
> >>
> >>Adding a field in the request to track the sequence number of the
> >>previous request isn't ideal when considering the scheduler and
> >>preemption. But we've got a separate batch-in-progress sequence
> >>number in the hardware status page (also for use by TDR to check
> >>which batch is currently running), so you could use that. Then the
> >>check is simply
> >
> >As demonstrated TDR doesn't need it, and the GPU scheduler has to fix up
> >the seqno anyway to maintain retirement order of the request list.
> >-Chris
> 
> If the request list is in retirement order, then the previous-seqno of a
> request is simply the seqno of the previous request in the request list. If
> there isn't a previous request, then the request under consideration is at
> the head of the list and must be the currently-active request anyway (of
> course, the converse does not hold, due to lazy retirement and such).
> 
> The scheduler will reassign seqnos as batches are submitted to the backend
> submission mechanism (ringbuffer, execlist, or GuC), so they will start out
> in order. But that doesn't guarantee in-order completion once we add
> preemption, so we will in any case be tracking both the start and finish of
> each batch.

If the scheduler stops reassigning seqnos and instead hands out seqnos
per-ctx all these problems go away. And a metric pile more problems too
...
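
In its simplest hypothetical form, per-context numbering would be:

	/* Hypothetical: each context carries its own monotonic
	 * timeline, so a request is named by (ctx, seqno) and the
	 * scheduler never has to renumber requests to keep a single
	 * global order.
	 */
	req->seqno = ++req->ctx->timeline_seqno;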

Cheers, Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper
  2015-12-01 11:05         ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
  2015-12-01 11:05           ` [PATCH 2/3] drm/i915: Store the reset counter when constructing a request Chris Wilson
  2015-12-01 11:05           ` [PATCH 3/3] drm/i915: Prevent leaking of -EIO from i915_wait_request() Chris Wilson
@ 2015-12-03  8:57           ` Daniel Vetter
  2015-12-03  9:02             ` Chris Wilson
  2 siblings, 1 reply; 92+ messages in thread
From: Daniel Vetter @ 2015-12-03  8:57 UTC (permalink / raw)
  To: Chris Wilson; +Cc: daniel.vetter, intel-gfx

On Tue, Dec 01, 2015 at 11:05:33AM +0000, Chris Wilson wrote:
> This is just a little bit of syntactic sugar to hide the atomic_reads()
> throughout the code to retrieve the current reset_counter. It also
> provides the other utility functions to check the reset state on the
> already read reset_counter, so that we can read it once and do multiple
> tests rather than risk the value changing between tests.

This patch also changes the meaning of reset_in_progress to not include
WEDGED afaict. I agree with that change, but it needs to be mentioned in
the commit message.

Also with that change there's some cleanup potential since a bunch of
callers that explicitly checked for reset_in_progress &&
!terminally_wedged now can drop the 2nd part of the condition. That
simplification is why I've done this change in my patch.
-Daniel


> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c     |  2 +-
>  drivers/gpu/drm/i915/i915_drv.h         | 32 ++++++++++++++++++++++++++++----
>  drivers/gpu/drm/i915/i915_gem.c         | 12 ++++++------
>  drivers/gpu/drm/i915/intel_display.c    | 16 ++++++++++------
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  2 +-
>  5 files changed, 46 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index bfd57fb597dc..864b3ae9419c 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4656,7 +4656,7 @@ i915_wedged_get(void *data, u64 *val)
>  	struct drm_device *dev = data;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	*val = atomic_read(&dev_priv->gpu_error.reset_counter);
> +	*val = i915_reset_counter(&dev_priv->gpu_error);
>  
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index d07041c1729d..2a4cfa06c28d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2979,20 +2979,44 @@ void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
>  int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
>  				      bool interruptible);
>  
> +static inline u32 i915_reset_counter(struct i915_gpu_error *error)
> +{
> +	return atomic_read(&error->reset_counter);
> +}
> +
> +static inline bool __i915_reset_in_progress(u32 reset)
> +{
> +	return unlikely(reset & I915_RESET_IN_PROGRESS_FLAG);
> +}
> +
> +static inline bool __i915_reset_in_progress_or_wedged(u32 reset)
> +{
> +	return unlikely(reset & (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
> +}
> +
> +static inline bool __i915_terminally_wedged(u32 reset)
> +{
> +	return unlikely(reset & I915_WEDGED);
> +}
> +
>  static inline bool i915_reset_in_progress(struct i915_gpu_error *error)
>  {
> -	return unlikely(atomic_read(&error->reset_counter)
> -			& (I915_RESET_IN_PROGRESS_FLAG | I915_WEDGED));
> +	return __i915_reset_in_progress(i915_reset_counter(error));
> +}
> +
> +static inline bool i915_reset_in_progress_or_wedged(struct i915_gpu_error *error)
> +{
> +	return __i915_reset_in_progress_or_wedged(i915_reset_counter(error));
>  }
>  
>  static inline bool i915_terminally_wedged(struct i915_gpu_error *error)
>  {
> -	return atomic_read(&error->reset_counter) & I915_WEDGED;
> +	return __i915_terminally_wedged(i915_reset_counter(error));
>  }
>  
>  static inline u32 i915_reset_count(struct i915_gpu_error *error)
>  {
> -	return ((atomic_read(&error->reset_counter) & ~I915_WEDGED) + 1) / 2;
> +	return ((i915_reset_counter(error) & ~I915_WEDGED) + 1) / 2;
>  }
>  
>  static inline bool i915_stop_ring_allow_ban(struct drm_i915_private *dev_priv)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index cccfe5bfe5bf..bff245de8ade 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1297,7 +1297,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  
>  		/* We need to check whether any gpu reset happened in between
>  		 * the caller grabbing the seqno and now ... */
> -		if (reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter)) {
> +		if (reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
>  			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
>  			 * is truely gone. */
>  			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
> @@ -1475,7 +1475,7 @@ i915_wait_request(struct drm_i915_gem_request *req)
>  		return ret;
>  
>  	ret = __i915_wait_request(req,
> -				  atomic_read(&dev_priv->gpu_error.reset_counter),
> +				  i915_reset_counter(&dev_priv->gpu_error),
>  				  interruptible, NULL, NULL);
>  	if (ret)
>  		return ret;
> @@ -1564,7 +1564,7 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>  	if (ret)
>  		return ret;
>  
> -	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
> +	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>  
>  	if (readonly) {
>  		struct drm_i915_gem_request *req;
> @@ -3110,7 +3110,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  	}
>  
>  	drm_gem_object_unreference(&obj->base);
> -	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
> +	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>  
>  	for (i = 0; i < I915_NUM_RINGS; i++) {
>  		if (obj->last_read_req[i] == NULL)
> @@ -3155,7 +3155,7 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>  	if (!i915_semaphore_is_enabled(obj->base.dev)) {
>  		struct drm_i915_private *i915 = to_i915(obj->base.dev);
>  		ret = __i915_wait_request(from_req,
> -					  atomic_read(&i915->gpu_error.reset_counter),
> +					  i915_reset_counter(&i915->gpu_error),
>  					  i915->mm.interruptible,
>  					  NULL,
>  					  &i915->rps.semaphores);
> @@ -4109,7 +4109,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  
>  		target = request;
>  	}
> -	reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
> +	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>  	if (target)
>  		i915_gem_request_reference(target);
>  	spin_unlock(&file_priv->mm.lock);
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 4ae490dfe2c4..511ead08ccd8 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -3287,10 +3287,12 @@ static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
>  	struct drm_device *dev = crtc->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> +	unsigned reset_counter;
>  	bool pending;
>  
> -	if (i915_reset_in_progress(&dev_priv->gpu_error) ||
> -	    intel_crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter))
> +	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
> +	if (intel_crtc->reset_counter != reset_counter ||
> +	    __i915_reset_in_progress(reset_counter))
>  		return false;
>  
>  	spin_lock_irq(&dev->event_lock);
> @@ -10865,9 +10867,11 @@ static bool page_flip_finished(struct intel_crtc *crtc)
>  {
>  	struct drm_device *dev = crtc->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	unsigned reset_counter;
>  
> -	if (i915_reset_in_progress(&dev_priv->gpu_error) ||
> -	    crtc->reset_counter != atomic_read(&dev_priv->gpu_error.reset_counter))
> +	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
> +	if (crtc->reset_counter != reset_counter ||
> +	    __i915_reset_in_progress(reset_counter))
>  		return true;
>  
>  	/*
> @@ -11511,7 +11515,7 @@ static int intel_crtc_page_flip(struct drm_crtc *crtc,
>  		goto cleanup;
>  
>  	atomic_inc(&intel_crtc->unpin_work_count);
> -	intel_crtc->reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
> +	intel_crtc->reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>  
>  	if (INTEL_INFO(dev)->gen >= 5 || IS_G4X(dev))
>  		work->flip_count = I915_READ(PIPE_FLIPCOUNT_G4X(pipe)) + 1;
> @@ -13303,7 +13307,7 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
>  	if (!ret && !async && !i915_reset_in_progress(&dev_priv->gpu_error)) {
>  		u32 reset_counter;
>  
> -		reset_counter = atomic_read(&dev_priv->gpu_error.reset_counter);
> +		reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>  		mutex_unlock(&dev->struct_mutex);
>  
>  		for_each_plane_in_state(state, plane, plane_state, i) {
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 57d78f264b53..8970267d27bb 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2286,7 +2286,7 @@ int intel_ring_idle(struct intel_engine_cs *ring)
>  
>  	/* Make sure we do not trigger any retires */
>  	return __i915_wait_request(req,
> -				   atomic_read(&to_i915(ring->dev)->gpu_error.reset_counter),
> +				   i915_reset_counter(&to_i915(ring->dev)->gpu_error),
>  				   to_i915(ring->dev)->mm.interruptible,
>  				   NULL, NULL);
>  }
> -- 
> 2.6.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 2/3] drm/i915: Store the reset counter when constructing a request
  2015-12-01 11:05           ` [PATCH 2/3] drm/i915: Store the reset counter when constructing a request Chris Wilson
@ 2015-12-03  8:59             ` Daniel Vetter
  0 siblings, 0 replies; 92+ messages in thread
From: Daniel Vetter @ 2015-12-03  8:59 UTC (permalink / raw)
  To: Chris Wilson; +Cc: daniel.vetter, intel-gfx

On Tue, Dec 01, 2015 at 11:05:34AM +0000, Chris Wilson wrote:
> As the request is only valid during the same global reset epoch, we can
> record the current reset_counter when constructing the request and reuse
> it when waiting upon that request in future. This removes a very hairy
> atomic check serialised by the struct_mutex at the time of waiting and
> allows us to transfer those waits to a central dispatcher for all
> waiters and all requests.

Please add

"This also allows us to clean up the check_wedge handling a bit, since we
don't have to be super-careful any more with running into a zombie gpu.
There are now only two places we check for a wedged gpu: request_alloc (to
catch command submission) and in the inner loop of wait_request (no point
bailing out if we don't have to wait anyway)."

With that added Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_drv.h         |  2 +-
>  drivers/gpu/drm/i915/i915_gem.c         | 40 +++++++++++----------------------
>  drivers/gpu/drm/i915/intel_display.c    |  7 +-----
>  drivers/gpu/drm/i915/intel_lrc.c        |  7 ------
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  6 -----
>  5 files changed, 15 insertions(+), 47 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2a4cfa06c28d..ee7677343056 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2181,6 +2181,7 @@ struct drm_i915_gem_request {
>  	/** On Which ring this request was generated */
>  	struct drm_i915_private *i915;
>  	struct intel_engine_cs *ring;
> +	unsigned reset_counter;
>  
>  	 /** GEM sequence number associated with the previous request,
>  	  * when the HWS breadcrumb is equal to this the GPU is processing
> @@ -3049,7 +3050,6 @@ void __i915_add_request(struct drm_i915_gem_request *req,
>  #define i915_add_request_no_flush(req) \
>  	__i915_add_request(req, NULL, false)
>  int __i915_wait_request(struct drm_i915_gem_request *req,
> -			unsigned reset_counter,
>  			bool interruptible,
>  			s64 *timeout,
>  			struct intel_rps_client *rps);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index bff245de8ade..68d4bab93fd8 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1223,7 +1223,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>  /**
>   * __i915_wait_request - wait until execution of request has finished
>   * @req: duh!
> - * @reset_counter: reset sequence associated with the given request
>   * @interruptible: do an interruptible wait (normally yes)
>   * @timeout: in - how long to wait (NULL forever); out - how much time remaining
>   *
> @@ -1238,7 +1237,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   * errno with remaining time filled in timeout argument.
>   */
>  int __i915_wait_request(struct drm_i915_gem_request *req,
> -			unsigned reset_counter,
>  			bool interruptible,
>  			s64 *timeout,
>  			struct intel_rps_client *rps)
> @@ -1297,7 +1295,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  
>  		/* We need to check whether any gpu reset happened in between
>  		 * the caller grabbing the seqno and now ... */
> -		if (reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
> +		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
>  			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
>  			 * is truely gone. */
>  			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
> @@ -1470,13 +1468,7 @@ i915_wait_request(struct drm_i915_gem_request *req)
>  
>  	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
>  
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
> -	if (ret)
> -		return ret;
> -
> -	ret = __i915_wait_request(req,
> -				  i915_reset_counter(&dev_priv->gpu_error),
> -				  interruptible, NULL, NULL);
> +	ret = __i915_wait_request(req, interruptible, NULL, NULL);
>  	if (ret)
>  		return ret;
>  
> @@ -1551,7 +1543,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>  	struct drm_device *dev = obj->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_request *requests[I915_NUM_RINGS];
> -	unsigned reset_counter;
>  	int ret, i, n = 0;
>  
>  	BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> @@ -1560,12 +1551,6 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>  	if (!obj->active)
>  		return 0;
>  
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error, true);
> -	if (ret)
> -		return ret;
> -
> -	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
> -
>  	if (readonly) {
>  		struct drm_i915_gem_request *req;
>  
> @@ -1587,9 +1572,9 @@ i915_gem_object_wait_rendering__nonblocking(struct drm_i915_gem_object *obj,
>  	}
>  
>  	mutex_unlock(&dev->struct_mutex);
> +	ret = 0;
>  	for (i = 0; ret == 0 && i < n; i++)
> -		ret = __i915_wait_request(requests[i], reset_counter, true,
> -					  NULL, rps);
> +		ret = __i915_wait_request(requests[i], true, NULL, rps);
>  	mutex_lock(&dev->struct_mutex);
>  
>  	for (i = 0; i < n; i++) {
> @@ -2693,6 +2678,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  			   struct drm_i915_gem_request **req_out)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(ring->dev);
> +	unsigned reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>  	struct drm_i915_gem_request *req;
>  	int ret;
>  
> @@ -2701,6 +2687,11 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  
>  	*req_out = NULL;
>  
> +	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
> +				   dev_priv->mm.interruptible);
> +	if (ret)
> +		return ret;
> +
>  	req = kmem_cache_zalloc(dev_priv->requests, GFP_KERNEL);
>  	if (req == NULL)
>  		return -ENOMEM;
> @@ -2712,6 +2703,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  	kref_init(&req->ref);
>  	req->i915 = dev_priv;
>  	req->ring = ring;
> +	req->reset_counter = reset_counter;
>  	req->ctx  = ctx;
>  	i915_gem_context_reference(req->ctx);
>  
> @@ -3072,11 +3064,9 @@ retire:
>  int
>  i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  {
> -	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_gem_wait *args = data;
>  	struct drm_i915_gem_object *obj;
>  	struct drm_i915_gem_request *req[I915_NUM_RINGS];
> -	unsigned reset_counter;
>  	int i, n = 0;
>  	int ret;
>  
> @@ -3110,7 +3100,6 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  	}
>  
>  	drm_gem_object_unreference(&obj->base);
> -	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>  
>  	for (i = 0; i < I915_NUM_RINGS; i++) {
>  		if (obj->last_read_req[i] == NULL)
> @@ -3123,7 +3112,7 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
>  
>  	for (i = 0; i < n; i++) {
>  		if (ret == 0)
> -			ret = __i915_wait_request(req[i], reset_counter, true,
> +			ret = __i915_wait_request(req[i], true,
>  						  args->timeout_ns > 0 ? &args->timeout_ns : NULL,
>  						  file->driver_priv);
>  		i915_gem_request_unreference__unlocked(req[i]);
> @@ -3155,7 +3144,6 @@ __i915_gem_object_sync(struct drm_i915_gem_object *obj,
>  	if (!i915_semaphore_is_enabled(obj->base.dev)) {
>  		struct drm_i915_private *i915 = to_i915(obj->base.dev);
>  		ret = __i915_wait_request(from_req,
> -					  i915_reset_counter(&i915->gpu_error),
>  					  i915->mm.interruptible,
>  					  NULL,
>  					  &i915->rps.semaphores);
> @@ -4084,7 +4072,6 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
>  	unsigned long recent_enough = jiffies - DRM_I915_THROTTLE_JIFFIES;
>  	struct drm_i915_gem_request *request, *target = NULL;
> -	unsigned reset_counter;
>  	int ret;
>  
>  	ret = i915_gem_wait_for_error(&dev_priv->gpu_error);
> @@ -4109,7 +4096,6 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  
>  		target = request;
>  	}
> -	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>  	if (target)
>  		i915_gem_request_reference(target);
>  	spin_unlock(&file_priv->mm.lock);
> @@ -4117,7 +4103,7 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  	if (target == NULL)
>  		return 0;
>  
> -	ret = __i915_wait_request(target, reset_counter, true, NULL, NULL);
> +	ret = __i915_wait_request(target, true, NULL, NULL);
>  	if (ret == 0)
>  		queue_delayed_work(dev_priv->wq, &dev_priv->mm.retire_work, 0);
>  
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 511ead08ccd8..4447e73b54db 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -11313,7 +11313,6 @@ static void intel_mmio_flip_work_func(struct work_struct *work)
>  
>  	if (mmio_flip->req) {
>  		WARN_ON(__i915_wait_request(mmio_flip->req,
> -					    mmio_flip->crtc->reset_counter,
>  					    false, NULL,
>  					    &mmio_flip->i915->rps.mmioflips));
>  		i915_gem_request_unreference__unlocked(mmio_flip->req);
> @@ -13305,9 +13304,6 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
>  
>  	ret = drm_atomic_helper_prepare_planes(dev, state);
>  	if (!ret && !async && !i915_reset_in_progress(&dev_priv->gpu_error)) {
> -		u32 reset_counter;
> -
> -		reset_counter = i915_reset_counter(&dev_priv->gpu_error);
>  		mutex_unlock(&dev->struct_mutex);
>  
>  		for_each_plane_in_state(state, plane, plane_state, i) {
> @@ -13318,8 +13314,7 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
>  				continue;
>  
>  			ret = __i915_wait_request(intel_plane_state->wait_req,
> -						  reset_counter, true,
> -						  NULL, NULL);
> +						  true, NULL, NULL);
>  
>  			/* Swallow -EIO errors to allow updates during hw lockup. */
>  			if (ret == -EIO)
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 4ebafab53f30..fe71c768dbc4 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -819,16 +819,9 @@ static int logical_ring_prepare(struct drm_i915_gem_request *req, int bytes)
>   */
>  int intel_logical_ring_begin(struct drm_i915_gem_request *req, int num_dwords)
>  {
> -	struct drm_i915_private *dev_priv;
>  	int ret;
>  
>  	WARN_ON(req == NULL);
> -	dev_priv = req->ring->dev->dev_private;
> -
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
> -				   dev_priv->mm.interruptible);
> -	if (ret)
> -		return ret;
>  
>  	ret = logical_ring_prepare(req, num_dwords * sizeof(uint32_t));
>  	if (ret)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 8970267d27bb..913526c0264f 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2286,7 +2286,6 @@ int intel_ring_idle(struct intel_engine_cs *ring)
>  
>  	/* Make sure we do not trigger any retires */
>  	return __i915_wait_request(req,
> -				   i915_reset_counter(&to_i915(ring->dev)->gpu_error),
>  				   to_i915(ring->dev)->mm.interruptible,
>  				   NULL, NULL);
>  }
> @@ -2417,11 +2416,6 @@ int intel_ring_begin(struct drm_i915_gem_request *req,
>  	ring = req->ring;
>  	dev_priv = ring->dev->dev_private;
>  
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
> -				   dev_priv->mm.interruptible);
> -	if (ret)
> -		return ret;
> -
>  	ret = __intel_ring_prepare(ring, num_dwords * sizeof(uint32_t));
>  	if (ret)
>  		return ret;
> -- 
> 2.6.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

* Re: [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper
  2015-12-03  8:57           ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Daniel Vetter
@ 2015-12-03  9:02             ` Chris Wilson
  2015-12-03  9:20               ` Daniel Vetter
  0 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-12-03  9:02 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: daniel.vetter, intel-gfx

On Thu, Dec 03, 2015 at 09:57:01AM +0100, Daniel Vetter wrote:
> On Tue, Dec 01, 2015 at 11:05:33AM +0000, Chris Wilson wrote:
> > This is just a little bit of syntactic sugar to hide the atomic_reads()
> > throughout the code to retrieve the current reset_counter. It also
> > provides the other utility functions to check the reset state on the
> > already read reset_counter, so that we can read it once and do multiple
> > tests rather than risk the value changing between tests.
> 
> This patch also changes the meaning of reset_in_progress to not include
> WEDGED afaict. I agree with that change, but it needs to be mentioned in
> the commit message.
> 
> Also with that change there's some cleanup potential since a bunch of
> callers that explicitly checked for reset_in_progress &&
> !terminally_wedged now can drop the 2nd part of the condition. That
> simplification is why I've done this change in my patch.

I was just splitting out the header change and simplest of seds from the
complete patch. Changing the reset/recovery interfaces I left till last.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

* Re: [PATCH 3/3] drm/i915: Prevent leaking of -EIO from i915_wait_request()
  2015-12-01 11:05           ` [PATCH 3/3] drm/i915: Prevent leaking of -EIO from i915_wait_request() Chris Wilson
@ 2015-12-03  9:14             ` Daniel Vetter
  2015-12-03  9:41               ` Chris Wilson
  2015-12-11  9:02               ` Chris Wilson
  0 siblings, 2 replies; 92+ messages in thread
From: Daniel Vetter @ 2015-12-03  9:14 UTC (permalink / raw)
  To: Chris Wilson; +Cc: daniel.vetter, intel-gfx

On Tue, Dec 01, 2015 at 11:05:35AM +0000, Chris Wilson wrote:
> Reporting -EIO from i915_wait_request() has proven very problematic
> over the years, with numerous hard-to-reproduce bugs cropping up in the
> corner case of where a reset occurs and the code wasn't expecting such
> an error.
> 
> If we reset the GPU or have detected a hang and wish to reset the
> GPU, the request is forcibly complete and the wait broken. Currently, we
> report either -EAGAIN or -EIO in order for the caller to retreat and
> restart the wait (if appropriate) after dropping and then reacquiring
> the struct_mutex (essential to allow the GPU reset to proceed). However,
> if we take the view that the request is complete (no further work will
> be done on it by the GPU because it is dead and soon to be reset), then
> we can proceed with the task at hand and then drop the struct_mutex
> allowing the reset to occur. This transfers the burden of checking
> whether it is safe to proceed to the caller, which in all but one
> instance it is safe - completely eliminating the source of all spurious
> -EIO.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c     |  2 +-
>  drivers/gpu/drm/i915/i915_drv.c         | 32 +++++++++++++----------
>  drivers/gpu/drm/i915/i915_drv.h         |  5 +---
>  drivers/gpu/drm/i915/i915_gem.c         | 45 +++++++++++++--------------------
>  drivers/gpu/drm/i915/i915_irq.c         | 21 ++-------------
>  drivers/gpu/drm/i915/intel_display.c    | 32 +++++++----------------
>  drivers/gpu/drm/i915/intel_lrc.c        |  8 ++++++
>  drivers/gpu/drm/i915/intel_ringbuffer.c |  8 ++++++
>  8 files changed, 65 insertions(+), 88 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 864b3ae9419c..0583eda642d7 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -4656,7 +4656,7 @@ i915_wedged_get(void *data, u64 *val)
>  	struct drm_device *dev = data;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  
> -	*val = i915_reset_counter(&dev_priv->gpu_error);
> +	*val = i915_terminally_wedged(&dev_priv->gpu_error);
>  
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 90faa8e03fca..eb893a5e00b1 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -923,23 +923,31 @@ int i915_resume_switcheroo(struct drm_device *dev)
>  int i915_reset(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> +	struct i915_gpu_error *error = &dev_priv->gpu_error;
> +	unsigned reset_counter;
>  	bool simulated;
>  	int ret;
>  
>  	intel_reset_gt_powersave(dev);
>  
>  	mutex_lock(&dev->struct_mutex);
> +	atomic_andnot(I915_WEDGED, &error->reset_counter);
> +	reset_counter = atomic_inc_return(&error->reset_counter);
> +	if (WARN_ON(__i915_reset_in_progress(reset_counter))) {
> +		ret = -EIO;
> +		goto error;
> +	}
>  
>  	i915_gem_reset(dev);
>  
> -	simulated = dev_priv->gpu_error.stop_rings != 0;
> +	simulated = error->stop_rings != 0;
>  
>  	ret = intel_gpu_reset(dev);
>  
>  	/* Also reset the gpu hangman. */
>  	if (simulated) {
>  		DRM_INFO("Simulated gpu hang, resetting stop_rings\n");
> -		dev_priv->gpu_error.stop_rings = 0;
> +		error->stop_rings = 0;
>  		if (ret == -ENODEV) {
>  			DRM_INFO("Reset not implemented, but ignoring "
>  				 "error for simulated gpu hangs\n");
> @@ -952,8 +960,7 @@ int i915_reset(struct drm_device *dev)
>  
>  	if (ret) {
>  		DRM_ERROR("Failed to reset chip: %i\n", ret);
> -		mutex_unlock(&dev->struct_mutex);
> -		return ret;
> +		goto error;
>  	}
>  
>  	intel_overlay_reset(dev_priv);
> @@ -972,20 +979,14 @@ int i915_reset(struct drm_device *dev)
>  	 * was running at the time of the reset (i.e. we weren't VT
>  	 * switched away).
>  	 */
> -
> -	/* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset */
> -	dev_priv->gpu_error.reload_in_reset = true;
> -
>  	ret = i915_gem_init_hw(dev);
> -
> -	dev_priv->gpu_error.reload_in_reset = false;
> -
> -	mutex_unlock(&dev->struct_mutex);
>  	if (ret) {
>  		DRM_ERROR("Failed hw init on reset %d\n", ret);
> -		return ret;
> +		goto error;
>  	}
>  
> +	mutex_unlock(&dev->struct_mutex);
> +
>  	/*
>  	 * rps/rc6 re-init is necessary to restore state lost after the
>  	 * reset and the re-install of gt irqs. Skip for ironlake per
> @@ -996,6 +997,11 @@ int i915_reset(struct drm_device *dev)
>  		intel_enable_gt_powersave(dev);
>  
>  	return 0;
> +
> +error:
> +	atomic_or(I915_WEDGED, &error->reset_counter);
> +	mutex_unlock(&dev->struct_mutex);
> +	return ret;

I've warmed up to your approach to removing reload_in_reset, but I still
maintain that it should be a separate patch and not mixed in with the -EIO
ABI change.

>  }
>  
>  static int i915_pci_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index ee7677343056..c24c23d8a0c0 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1384,9 +1384,6 @@ struct i915_gpu_error {
>  
>  	/* For missed irq/seqno simulation. */
>  	unsigned int test_irq_rings;
> -
> -	/* Used to prevent gem_check_wedged returning -EAGAIN during gpu reset   */
> -	bool reload_in_reset;
>  };
>  
>  enum modeset_restore {
> @@ -2977,7 +2974,7 @@ i915_gem_find_active_request(struct intel_engine_cs *ring);
>  
>  bool i915_gem_retire_requests(struct drm_device *dev);
>  void i915_gem_retire_requests_ring(struct intel_engine_cs *ring);
> -int __must_check i915_gem_check_wedge(struct i915_gpu_error *error,
> +int __must_check i915_gem_check_wedge(unsigned reset_counter,
>  				      bool interruptible);
>  
>  static inline u32 i915_reset_counter(struct i915_gpu_error *error)
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 68d4bab93fd8..ce4b89f28d38 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -85,9 +85,7 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
>  {
>  	int ret;
>  
> -#define EXIT_COND (!i915_reset_in_progress(error) || \
> -		   i915_terminally_wedged(error))
> -	if (EXIT_COND)
> +	if (!i915_reset_in_progress(error))

This is one of the cases I think should be moved to patch 1. Iirc there's
1 more somewhere where we really don't need to check for terminally
wedged.

>  		return 0;
>  
>  	/*
> @@ -96,17 +94,16 @@ i915_gem_wait_for_error(struct i915_gpu_error *error)
>  	 * we should simply try to bail out and fail as gracefully as possible.
>  	 */
>  	ret = wait_event_interruptible_timeout(error->reset_queue,
> -					       EXIT_COND,
> +					       !i915_reset_in_progress(error),
>  					       10*HZ);
>  	if (ret == 0) {
>  		DRM_ERROR("Timed out waiting for the gpu reset to complete\n");
>  		return -EIO;
>  	} else if (ret < 0) {
>  		return ret;
> +	} else {
> +		return 0;
>  	}
> -#undef EXIT_COND
> -
> -	return 0;
>  }
>  
>  int i915_mutex_lock_interruptible(struct drm_device *dev)
> @@ -1110,26 +1107,19 @@ put_rpm:
>  }
>  
>  int
> -i915_gem_check_wedge(struct i915_gpu_error *error,
> -		     bool interruptible)
> +i915_gem_check_wedge(unsigned reset_counter, bool interruptible)
>  {
> -	if (i915_reset_in_progress(error)) {
> +	if (__i915_reset_in_progress_or_wedged(reset_counter)) {
>  		/* Non-interruptible callers can't handle -EAGAIN, hence return
>  		 * -EIO unconditionally for these. */
>  		if (!interruptible)
>  			return -EIO;
>  
>  		/* Recovery complete, but the reset failed ... */
> -		if (i915_terminally_wedged(error))
> +		if (__i915_terminally_wedged(reset_counter))
>  			return -EIO;
>  
> -		/*
> -		 * Check if GPU Reset is in progress - we need intel_ring_begin
> -		 * to work properly to reinit the hw state while the gpu is
> -		 * still marked as reset-in-progress. Handle this with a flag.
> -		 */
> -		if (!error->reload_in_reset)
> -			return -EAGAIN;
> +		return -EAGAIN;
>  	}
>  
>  	return 0;
> @@ -1296,11 +1286,11 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>  		/* We need to check whether any gpu reset happened in between
>  		 * the caller grabbing the seqno and now ... */
>  		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
> -			/* ... but upgrade the -EAGAIN to an -EIO if the gpu
> -			 * is truely gone. */
> -			ret = i915_gem_check_wedge(&dev_priv->gpu_error, interruptible);
> -			if (ret == 0)
> -				ret = -EAGAIN;
> +			/* As we do not requeue the request over a GPU reset,
> +			 * if one does occur we know that the request is
> +			 * effectively complete.
> +			 */
> +			ret = 0;
>  			break;
>  		}
>  
> @@ -2687,8 +2677,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>  
>  	*req_out = NULL;
>  
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error,
> -				   dev_priv->mm.interruptible);
> +	ret = i915_gem_check_wedge(reset_counter, dev_priv->mm.interruptible);
>  	if (ret)
>  		return ret;
>  
> @@ -4078,9 +4067,9 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
>  	if (ret)
>  		return ret;
>  
> -	ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
> -	if (ret)
> -		return ret;
> +	/* ABI: return -EIO if already wedged */
> +	if (i915_terminally_wedged(&dev_priv->gpu_error))

Commit message says there's only one case, but I count two:
- throttle here
- waiting for ring space in case the gpu died meanwhile

Please amend the commit message with that.

> +		return -EIO;
>  
>  	spin_lock(&file_priv->mm.lock);
>  	list_for_each_entry(request, &file_priv->mm.request_list, client_list) {
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index e88d692583a5..3b2a8fbe0392 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -2434,7 +2434,6 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
>  static void i915_reset_and_wakeup(struct drm_device *dev)
>  {
>  	struct drm_i915_private *dev_priv = to_i915(dev);
> -	struct i915_gpu_error *error = &dev_priv->gpu_error;
>  	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
>  	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
>  	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
> @@ -2452,7 +2451,7 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
>  	 * the reset in-progress bit is only ever set by code outside of this
>  	 * work we don't need to worry about any other races.
>  	 */
> -	if (i915_reset_in_progress(error) && !i915_terminally_wedged(error)) {
> +	if (i915_reset_in_progress(&dev_priv->gpu_error)) {

Imo we should take this path here iff wedged == true in i915_handle_error.
Otherwise control flow just becomes inconsistent I think. Rechecking the
reset counter just seems fragile.

>  		DRM_DEBUG_DRIVER("resetting chip\n");
>  		kobject_uevent_env(&dev->primary->kdev->kobj, KOBJ_CHANGE,
>  				   reset_event);
> @@ -2480,25 +2479,9 @@ static void i915_reset_and_wakeup(struct drm_device *dev)
>  
>  		intel_runtime_pm_put(dev_priv);
>  
> -		if (ret == 0) {
> -			/*
> -			 * After all the gem state is reset, increment the reset
> -			 * counter and wake up everyone waiting for the reset to
> -			 * complete.
> -			 *
> -			 * Since unlock operations are a one-sided barrier only,
> -			 * we need to insert a barrier here to order any seqno
> -			 * updates before
> -			 * the counter increment.
> -			 */
> -			smp_mb__before_atomic();
> -			atomic_inc(&dev_priv->gpu_error.reset_counter);
> -
> +		if (ret == 0)
>  			kobject_uevent_env(&dev->primary->kdev->kobj,
>  					   KOBJ_CHANGE, reset_done_event);
> -		} else {
> -			atomic_or(I915_WEDGED, &error->reset_counter);
> -		}
>  
>  		/*
>  		 * Note: The wake_up also serves as a memory barrier so that
> diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> index 4447e73b54db..73c61b94f7fd 100644
> --- a/drivers/gpu/drm/i915/intel_display.c
> +++ b/drivers/gpu/drm/i915/intel_display.c
> @@ -3287,12 +3287,9 @@ static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
>  	struct drm_device *dev = crtc->dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
> -	unsigned reset_counter;
>  	bool pending;
>  
> -	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
> -	if (intel_crtc->reset_counter != reset_counter ||
> -	    __i915_reset_in_progress(reset_counter))
> +	if (intel_crtc->reset_counter != i915_reset_counter(&dev_priv->gpu_error))
>  		return false;
>  
>  	spin_lock_irq(&dev->event_lock);
> @@ -10867,11 +10864,8 @@ static bool page_flip_finished(struct intel_crtc *crtc)
>  {
>  	struct drm_device *dev = crtc->base.dev;
>  	struct drm_i915_private *dev_priv = dev->dev_private;
> -	unsigned reset_counter;
>  
> -	reset_counter = i915_reset_counter(&dev_priv->gpu_error);
> -	if (crtc->reset_counter != reset_counter ||
> -	    __i915_reset_in_progress(reset_counter))
> +	if (crtc->reset_counter != i915_reset_counter(&dev_priv->gpu_error))
>  		return true;
>  
>  	/*
> @@ -13303,9 +13297,9 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
>  		return ret;
>  
>  	ret = drm_atomic_helper_prepare_planes(dev, state);
> -	if (!ret && !async && !i915_reset_in_progress(&dev_priv->gpu_error)) {
> -		mutex_unlock(&dev->struct_mutex);
> +	mutex_unlock(&dev->struct_mutex);
>  
> +	if (!ret && !async) {
>  		for_each_plane_in_state(state, plane, plane_state, i) {
>  			struct intel_plane_state *intel_plane_state =
>  				to_intel_plane_state(plane_state);
> @@ -13315,23 +13309,15 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
>  
>  			ret = __i915_wait_request(intel_plane_state->wait_req,
>  						  true, NULL, NULL);
> -
> -			/* Swallow -EIO errors to allow updates during hw lockup. */
> -			if (ret == -EIO)
> -				ret = 0;
> -
> -			if (ret)
> +			if (ret) {
> +				mutex_lock(&dev->struct_mutex);
> +				drm_atomic_helper_cleanup_planes(dev, state);
> +				mutex_unlock(&dev->struct_mutex);
>  				break;
> +			}
>  		}
> -
> -		if (!ret)
> -			return 0;
> -
> -		mutex_lock(&dev->struct_mutex);
> -		drm_atomic_helper_cleanup_planes(dev, state);
>  	}
>  
> -	mutex_unlock(&dev->struct_mutex);

Sneaking in lockless waits! Separate patch please.

>  	return ret;
>  }
>  
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index fe71c768dbc4..08b982a6ae81 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -711,6 +711,14 @@ static int logical_ring_wait_for_space(struct drm_i915_gem_request *req,
>  	if (ret)
>  		return ret;
>  
> +	/* If the request was completed due to a GPU hang, we want to
> +	 * error out before we continue to emit more commands to the GPU.
> +	 */
> +	ret = i915_gem_check_wedge(i915_reset_counter(&req->i915->gpu_error),
> +				   req->i915->mm.interruptible);
> +	if (ret)
> +		return ret;

Do we really need to recheck here? We check when creating the request, and
at most we can transition to wedged. But without dropping struct_mutex the
reset handler can't sneak in and declare the gpu terminal. And even if
that changes, worst case we'll restart the entire ordeal and then notice at
the start of execbuf that it's really dead and return -EIO to userspace.

I think we can drop these two parts, or did I miss something?

Besides the nitpicks and the 2 patches that should be split out, lgtm.
-Daniel

> +
>  	ringbuf->space = space;
>  	return 0;
>  }
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 913526c0264f..511efe556d73 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2254,6 +2254,14 @@ static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
>  	if (ret)
>  		return ret;
>  
> +	/* If the request was completed due to a GPU hang, we want to
> +	 * error out before we continue to emit more commands to the GPU.
> +	 */
> +	ret = i915_gem_check_wedge(i915_reset_counter(&to_i915(ring->dev)->gpu_error),
> +				   to_i915(ring->dev)->mm.interruptible);
> +	if (ret)
> +		return ret;
> +
>  	ringbuf->space = space;
>  	return 0;
>  }
> -- 
> 2.6.2
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper
  2015-12-03  9:02             ` Chris Wilson
@ 2015-12-03  9:20               ` Daniel Vetter
  0 siblings, 0 replies; 92+ messages in thread
From: Daniel Vetter @ 2015-12-03  9:20 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, intel-gfx, daniel.vetter

On Thu, Dec 03, 2015 at 09:02:16AM +0000, Chris Wilson wrote:
> On Thu, Dec 03, 2015 at 09:57:01AM +0100, Daniel Vetter wrote:
> > On Tue, Dec 01, 2015 at 11:05:33AM +0000, Chris Wilson wrote:
> > > This is just a little bit of syntactic sugar to hide the atomic_reads()
> > > throughout the code to retrieve the current reset_counter. It also
> > > provides the other utility functions to check the reset state on the
> > > already read reset_counter, so that we can read it once and do multiple
> > > tests rather than risk the value changing between tests.
> > 
> > This patch also changes the meaning of reset_in_progress to not include
> > WEDGED afaict. I agree with that change, but it needs to be mentioned in
> > the commit message.
> > 
> > Also with that change there's some cleanup potential since a bunch of
> > callers that explicitly checked for reset_in_progress &&
> > !terminally_wedged now can drop the 2nd part of the condition. That
> > simplification is why I've done this change in my patch.
> 
> I was just splitting out the header change and simplest of seds from the
> complete patch. Changing the reset/recovery interfaces I left till last.

I've found at least one hunk I was looking for in patch 3. The other is in
reset_and_wakeup, where I think the control flow should be changed
differently. See my "[PATCH] drm/i915: Fix up -EIO ABI per igt/gem_eio"
for what I think should change around the terminally_wedged() calls if you
drop the terminal state from reset_in_progress().
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH 3/3] drm/i915: Prevent leaking of -EIO from i915_wait_request()
  2015-12-03  9:14             ` Daniel Vetter
@ 2015-12-03  9:41               ` Chris Wilson
  2015-12-11  9:02               ` Chris Wilson
  1 sibling, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-12-03  9:41 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: daniel.vetter, intel-gfx

On Thu, Dec 03, 2015 at 10:14:54AM +0100, Daniel Vetter wrote:
> > @@ -4078,9 +4067,9 @@ i915_gem_ring_throttle(struct drm_device *dev, struct drm_file *file)
> >  	if (ret)
> >  		return ret;
> >  
> > -	ret = i915_gem_check_wedge(&dev_priv->gpu_error, false);
> > -	if (ret)
> > -		return ret;
> > +	/* ABI: return -EIO if already wedged */
> > +	if (i915_terminally_wedged(&dev_priv->gpu_error))
> 
> Commit message says there's only one case, but I count two:
> - throttle here
> - waiting for ring space in case the gpu died meanwhile

That wasn't intended for ABI preservation. (If you look at the execbuffer
reporting -EIO, that comes from request_alloc.) I hit a bug after waiting
for ring space if I didn't abort when the hang was detected: we walked off
the end of the ring. That shouldn't be possible; I think I papered over a
bug rather than diagnosing it correctly. :|
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 92+ messages in thread

* [PATCH v7] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-12-01 18:34     ` Dave Gordon
@ 2015-12-03 16:22       ` Chris Wilson
  2015-12-07 15:08         ` Tvrtko Ursulin
  0 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-12-03 16:22 UTC (permalink / raw)
  To: intel-gfx

One particularly stressful scenario consists of many independent tasks
all competing for GPU time and waiting upon the results (e.g. realtime
transcoding of many, many streams). One bottleneck in particular is that
each client waits on its own results, but every client is woken up after
every batchbuffer - hence the thunder of hooves as then every client must
do its heavyweight dance to read a coherent seqno to see if it is the
lucky one.

Ideally, we only want one client to wake up after the interrupt and
check its request for completion. Since the requests must retire in
order, we can select the first client on the oldest request to be woken.
Once that client has completed his wait, we can then wake up the
next client and so on.

v2: Convert from a kworker per engine into a dedicated kthread for the
bottom-half.
v3: Rename request members and tweak comments.
v4: Use a per-engine spinlock in the breadcrumbs bottom-half.
v5: Fix race in locklessly checking waiter status and kicking the task on
adding a new waiter.
v6: Fix deciding when to force the timer to hide missing interrupts.
v7: Move the bottom-half from the kthread to the first client process.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Dave Gordon <david.s.gordon@intel.com>
---
 drivers/gpu/drm/i915/Makefile            |   1 +
 drivers/gpu/drm/i915/i915_debugfs.c      |   2 +-
 drivers/gpu/drm/i915/i915_drv.h          |   3 +-
 drivers/gpu/drm/i915/i915_gem.c          | 108 ++++++-----------
 drivers/gpu/drm/i915/i915_gpu_error.c    |   2 +-
 drivers/gpu/drm/i915/i915_irq.c          |  17 +--
 drivers/gpu/drm/i915/intel_breadcrumbs.c | 197 +++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c         |   5 +-
 drivers/gpu/drm/i915/intel_ringbuffer.c  |   5 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h  |  47 +++++++-
 10 files changed, 297 insertions(+), 90 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/intel_breadcrumbs.c

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 0851de07bd13..d3b9d3618719 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
 	  i915_gem_userptr.o \
 	  i915_gpu_error.o \
 	  i915_trace_points.o \
+	  intel_breadcrumbs.o \
 	  intel_lrc.o \
 	  intel_mocs.o \
 	  intel_ringbuffer.o \
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index 0583eda642d7..68fbe4d1f91d 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -2323,7 +2323,7 @@ static int count_irq_waiters(struct drm_i915_private *i915)
 	int i;
 
 	for_each_ring(ring, i915, i)
-		count += ring->irq_refcount;
+		count += intel_engine_has_waiter(ring);
 
 	return count;
 }
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 2acf2cf0b836..30d470934874 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1383,7 +1383,7 @@ struct i915_gpu_error {
 #define I915_STOP_RING_ALLOW_WARN      (1 << 30)
 
 	/* For missed irq/seqno simulation. */
-	unsigned int test_irq_rings;
+	unsigned long test_irq_rings;
 };
 
 enum modeset_restore {
@@ -2800,7 +2800,6 @@ ibx_disable_display_interrupt(struct drm_i915_private *dev_priv, uint32_t bits)
 	ibx_display_interrupt_update(dev_priv, bits, 0);
 }
 
-
 /* i915_gem.c */
 int i915_gem_create_ioctl(struct drm_device *dev, void *data,
 			  struct drm_file *file_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 6e051402e340..8dcbac584e84 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1125,17 +1125,6 @@ i915_gem_check_wedge(unsigned reset_counter, bool interruptible)
 	return 0;
 }
 
-static void fake_irq(unsigned long data)
-{
-	wake_up_process((struct task_struct *)data);
-}
-
-static bool missed_irq(struct drm_i915_private *dev_priv,
-		       struct intel_engine_cs *ring)
-{
-	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
-}
-
 static unsigned long local_clock_us(unsigned *cpu)
 {
 	unsigned long t;
@@ -1183,9 +1172,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 	 * takes to sleep on a request, on the order of a microsecond.
 	 */
 
-	if (req->ring->irq_refcount)
-		return -EBUSY;
-
 	/* Only spin if we know the GPU is processing this request */
 	if (!i915_gem_request_started(req, false))
 		return -EAGAIN;
@@ -1204,9 +1190,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
 		cpu_relax_lowlatency();
 	}
 
-	if (i915_gem_request_completed(req, false))
-		return 0;
-
 	return -EAGAIN;
 }
 
@@ -1231,26 +1214,19 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			s64 *timeout,
 			struct intel_rps_client *rps)
 {
-	struct intel_engine_cs *ring = i915_gem_request_get_ring(req);
-	struct drm_device *dev = ring->dev;
-	struct drm_i915_private *dev_priv = dev->dev_private;
-	const bool irq_test_in_progress =
-		ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
 	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
-	DEFINE_WAIT(wait);
-	unsigned long timeout_expire;
+	struct intel_breadcrumb wait;
+	unsigned long timeout_remain;
 	s64 before, now;
 	int ret;
 
-	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
-
 	if (list_empty(&req->list))
 		return 0;
 
 	if (i915_gem_request_completed(req, true))
 		return 0;
 
-	timeout_expire = 0;
+	timeout_remain = MAX_SCHEDULE_TIMEOUT;
 	if (timeout) {
 		if (WARN_ON(*timeout < 0))
 			return -EINVAL;
@@ -1258,34 +1234,43 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 		if (*timeout == 0)
 			return -ETIME;
 
-		timeout_expire = jiffies + nsecs_to_jiffies_timeout(*timeout);
+		timeout_remain = nsecs_to_jiffies_timeout(*timeout);
 	}
 
-	if (INTEL_INFO(dev_priv)->gen >= 6)
-		gen6_rps_boost(dev_priv, rps, req->emitted_jiffies);
-
 	/* Record current time in case interrupted by signal, or wedged */
 	trace_i915_gem_request_wait_begin(req);
 	before = ktime_get_raw_ns();
 
-	/* Optimistic spin for the next jiffie before touching IRQs */
-	ret = __i915_spin_request(req, state);
-	if (ret == 0)
-		goto out;
+	if (INTEL_INFO(req->i915)->gen >= 6)
+		gen6_rps_boost(req->i915, rps, req->emitted_jiffies);
 
-	if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) {
-		ret = -ENODEV;
-		goto out;
+	/* Optimistic spin for the next jiffie before touching IRQs */
+	if (intel_breadcrumbs_add_waiter(req, &wait)) {
+		ret = __i915_spin_request(req, state);
+		if (ret == 0)
+			goto out;
 	}
 
 	for (;;) {
-		struct timer_list timer;
-
-		prepare_to_wait(&ring->irq_queue, &wait, state);
+		set_task_state(wait.task, state);
+
+		/* Ensure our read of the seqno is coherent so that we
+		 * do not "miss an interrupt" (i.e. if this is the last
+		 * request and the seqno write from the GPU is not visible
+		 * by the time the interrupt fires, we will see that the
+		 * request is incomplete and go back to sleep awaiting
+		 * another interrupt that will never come.)
+		 *
+		 * Strictly, we only need to do this once after an interrupt,
+		 * but it is easier and safer to do it every time the waiter
+		 * is woken.
+		 */
+		if (i915_gem_request_completed(req, false)) {
+			ret = 0;
+			break;
+		}
 
-		/* We need to check whether any gpu reset happened in between
-		 * the caller grabbing the seqno and now ... */
-		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
+		if (unlikely(req->reset_counter != i915_reset_counter(&req->i915->gpu_error))) {
 			/* As we do not requeue the request over a GPU reset,
 			 * if one does occur we know that the request is
 			 * effectively complete.
@@ -1294,46 +1279,21 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
 			break;
 		}
 
-		if (i915_gem_request_completed(req, false)) {
-			ret = 0;
-			break;
-		}
-
-		if (signal_pending_state(state, current)) {
+		if (signal_pending_state(state, wait.task)) {
 			ret = -ERESTARTSYS;
 			break;
 		}
 
-		if (timeout && time_after_eq(jiffies, timeout_expire)) {
+		timeout_remain = io_schedule_timeout(timeout_remain);
+		if (timeout_remain == 0) {
 			ret = -ETIME;
 			break;
 		}
-
-		i915_queue_hangcheck(dev_priv);
-
-		timer.function = NULL;
-		if (timeout || missed_irq(dev_priv, ring)) {
-			unsigned long expire;
-
-			setup_timer_on_stack(&timer, fake_irq, (unsigned long)current);
-			expire = missed_irq(dev_priv, ring) ? jiffies + 1 : timeout_expire;
-			mod_timer(&timer, expire);
-		}
-
-		io_schedule();
-
-		if (timer.function) {
-			del_singleshot_timer_sync(&timer);
-			destroy_timer_on_stack(&timer);
-		}
 	}
-	if (!irq_test_in_progress)
-		ring->irq_put(ring);
-
-	finish_wait(&ring->irq_queue, &wait);
-
+	__set_task_state(wait.task, TASK_RUNNING);
 out:
 	now = ktime_get_raw_ns();
+	intel_breadcrumbs_remove_waiter(req, &wait);
 	trace_i915_gem_request_wait_end(req);
 
 	if (timeout) {
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index 06ca4082735b..f805d117f3d1 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -900,7 +900,7 @@ static void i915_record_ring_state(struct drm_device *dev,
 		ering->instdone = I915_READ(GEN2_INSTDONE);
 	}
 
-	ering->waiting = waitqueue_active(&ring->irq_queue);
+	ering->waiting = intel_engine_has_waiter(ring);
 	ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
 	ering->seqno = ring->get_seqno(ring, false);
 	ering->acthd = intel_ring_get_active_head(ring);
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 28c0329f8281..57b113132fd7 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -996,12 +996,11 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
 
 static void notify_ring(struct intel_engine_cs *ring)
 {
-	if (!intel_ring_initialized(ring))
+	if (ring->i915 == NULL)
 		return;
 
 	trace_i915_gem_request_notify(ring);
-
-	wake_up_all(&ring->irq_queue);
+	intel_engine_wakeup(ring);
 }
 
 static void vlv_c0_read(struct drm_i915_private *dev_priv,
@@ -1083,7 +1082,7 @@ static bool any_waiters(struct drm_i915_private *dev_priv)
 	int i;
 
 	for_each_ring(ring, dev_priv, i)
-		if (ring->irq_refcount)
+		if (intel_engine_has_waiter(ring))
 			return true;
 
 	return false;
@@ -2411,7 +2410,7 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
 
 	/* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
 	for_each_ring(ring, dev_priv, i)
-		wake_up_all(&ring->irq_queue);
+		intel_engine_wakeup(ring);
 
 	/* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */
 	wake_up_all(&dev_priv->pending_flip_queue);
@@ -2986,16 +2985,18 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
 			if (ring_idle(ring, seqno)) {
 				ring->hangcheck.action = HANGCHECK_IDLE;
 
-				if (waitqueue_active(&ring->irq_queue)) {
+				if (intel_engine_has_waiter(ring)) {
 					/* Issue a wake-up to catch stuck h/w. */
 					if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
-						if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
+						if (!test_bit(ring->id, &dev_priv->gpu_error.test_irq_rings))
 							DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
 								  ring->name);
 						else
 							DRM_INFO("Fake missed irq on %s\n",
 								 ring->name);
-						wake_up_all(&ring->irq_queue);
+						mod_delayed_work(system_highpri_wq,
+								 &ring->breadcrumbs.irq,
+								 0);
 					}
 					/* Safeguard against driver failure */
 					ring->hangcheck.score += BUSY;
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
new file mode 100644
index 000000000000..17d51269b774
--- /dev/null
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -0,0 +1,197 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include <linux/kthread.h>
+
+#include "i915_drv.h"
+
+static bool irq_get(struct intel_engine_cs *engine)
+{
+	if (test_bit(engine->id, &engine->i915->gpu_error.test_irq_rings))
+		return false;
+
+	if (!intel_irqs_enabled(engine->i915))
+		return false;
+
+	return engine->irq_get(engine);
+}
+
+static void intel_breadcrumbs_irq(struct work_struct *work)
+{
+	struct intel_breadcrumbs *b =
+		container_of(work, struct intel_breadcrumbs, irq.work);
+	struct intel_engine_cs *engine =
+		container_of(b, struct intel_engine_cs, breadcrumbs);
+
+	/* We unmask the user-interrupt in the background so that the
+	 * first-waiter can do a little busywaiting before sleeping and
+	 * not incur additional latency from setting up the IRQ.
+	 *
+	 * The worker also persists in case we cannot enable interrupts,
+	 * or if we have previously seen seqno/interrupt incoherency
+	 * ("missed interrupt" syndrome). Here the worker will wake up
+	 * every jiffie in order to kick the original waiter to do the
+	 * coherent seqno check. (Not as efficient as having a timer
+	 * source kick the waiter directly, but this should only have to
+	 * be used in rare cases where we try to work around some
+	 * hardware/driver failures.)
+	 */
+
+	spin_lock(&b->lock);
+	if (b->first_waiter) {
+		if (!b->rpm_wakelock) {
+			intel_runtime_pm_get(engine->i915);
+			b->rpm_wakelock = true;
+		}
+
+		if (!b->irq_enabled) {
+			b->irq_enabled = irq_get(engine);
+			/* If we were delayed, the original waiter may have
+			 * gone to sleep already and we may have missed the
+			 * interrupt. So after enabling the interrupt,
+			 * kick the waiter.
+			 */
+			wake_up_process(b->first_waiter);
+		}
+
+		/* No interrupts? Kick the waiter every jiffie! */
+		if (!b->irq_enabled ||
+		    test_bit(engine->id,
+			     &engine->i915->gpu_error.missed_irq_rings)) {
+			wake_up_process(b->first_waiter);
+			queue_delayed_work(system_highpri_wq, &b->irq, 1);
+		}
+	} else {
+		if (b->irq_enabled) {
+			engine->irq_put(engine);
+			b->irq_enabled = false;
+		}
+		if (b->rpm_wakelock) {
+			intel_runtime_pm_put(engine->i915);
+			b->rpm_wakelock = false;
+		}
+	}
+	spin_unlock(&b->lock);
+}
+
+static inline struct intel_breadcrumb *to_wait(struct rb_node *node)
+{
+	if (node == NULL)
+		return NULL;
+
+	return container_of(node, struct intel_breadcrumb, node);
+}
+
+bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request,
+				  struct intel_breadcrumb *wait)
+{
+	struct intel_breadcrumbs *b = &request->ring->breadcrumbs;
+	struct rb_node **p, *parent;
+	bool first = false;
+
+	wait->task = current;
+	wait->seqno = request->seqno;
+
+	spin_lock(&b->lock);
+
+	/* First? Unmask the user interrupt */
+	if (b->first_waiter == NULL)
+		queue_delayed_work(system_highpri_wq, &b->irq, 0);
+
+	/* Insert the request into the retirement-ordered list
+	 * of waiters by walking the rbtree. If we are the oldest
+	 * seqno in the tree (the first to be retired), then
+	 * set ourselves as the bottom-half.
+	 */
+	first = true;
+	parent = NULL;
+	p = &b->requests.rb_node;
+	while (*p) {
+		parent = *p;
+		if (i915_seqno_passed(wait->seqno, to_wait(parent)->seqno)) {
+			p = &parent->rb_right;
+			first = false;
+		} else
+			p = &parent->rb_left;
+	}
+	rb_link_node(&wait->node, parent, p);
+	rb_insert_color(&wait->node, &b->requests);
+
+	if (first)
+		b->first_waiter = wait->task;
+
+	spin_unlock(&b->lock);
+
+	return first;
+}
+
+void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request,
+				     struct intel_breadcrumb *wait)
+{
+	struct intel_breadcrumbs *b = &request->ring->breadcrumbs;
+
+	spin_lock(&b->lock);
+
+	if (b->first_waiter == wait->task) {
+		struct intel_breadcrumb *next;
+		struct task_struct *task;
+
+		/* We are the current bottom-half. Find the next candidate,
+		 * the first waiter in the queue on the remaining oldest
+		 * request. As multiple seqnos may complete in the time it
+		 * takes us to wake up and find the next waiter, we have to
+		 * wake up that waiter for it to perform its own coherent
+		 * completion check.
+		 */
+		task = NULL;
+		next = to_wait(rb_next(&wait->node));
+		if (next)
+			task = next->task;
+
+		b->first_waiter = task;
+		if (task)
+			wake_up_process(task);
+		else
+			/* Disable the user interrupts */
+			mod_delayed_work(system_highpri_wq, &b->irq, 1);
+	}
+
+	rb_erase(&wait->node, &b->requests);
+	spin_unlock(&b->lock);
+}
+
+void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine)
+{
+	struct intel_breadcrumbs *b = &engine->breadcrumbs;
+
+	spin_lock_init(&b->lock);
+	INIT_DELAYED_WORK(&b->irq, intel_breadcrumbs_irq);
+}
+
+void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine)
+{
+	struct intel_breadcrumbs *b = &engine->breadcrumbs;
+
+	flush_delayed_work(&b->irq);
+}
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 08b982a6ae81..ce5119e2bcef 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1904,6 +1904,8 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *ring)
 	i915_cmd_parser_fini_ring(ring);
 	i915_gem_batch_pool_fini(&ring->batch_pool);
 
+	intel_engine_fini_breadcrumbs(ring);
+
 	if (ring->status_page.obj) {
 		kunmap(sg_page(ring->status_page.obj->pages->sgl));
 		ring->status_page.obj = NULL;
@@ -1920,10 +1922,11 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
 	ring->buffer = NULL;
 
 	ring->dev = dev;
+	ring->i915 = to_i915(dev);
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 	i915_gem_batch_pool_init(dev, &ring->batch_pool);
-	init_waitqueue_head(&ring->irq_queue);
+	intel_engine_init_breadcrumbs(ring);
 
 	INIT_LIST_HEAD(&ring->buffers);
 	INIT_LIST_HEAD(&ring->execlist_queue);
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 511efe556d73..8a0f02e73a4e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -2157,6 +2157,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	WARN_ON(ring->buffer);
 
 	ring->dev = dev;
+	ring->i915 = to_i915(dev);
 	INIT_LIST_HEAD(&ring->active_list);
 	INIT_LIST_HEAD(&ring->request_list);
 	INIT_LIST_HEAD(&ring->execlist_queue);
@@ -2164,7 +2165,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
 	i915_gem_batch_pool_init(dev, &ring->batch_pool);
 	memset(ring->semaphore.sync_seqno, 0, sizeof(ring->semaphore.sync_seqno));
 
-	init_waitqueue_head(&ring->irq_queue);
+	intel_engine_init_breadcrumbs(ring);
 
 	ringbuf = intel_engine_create_ringbuffer(ring, 32 * PAGE_SIZE);
 	if (IS_ERR(ringbuf))
@@ -2225,6 +2226,8 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs *ring)
 
 	i915_cmd_parser_fini_ring(ring);
 	i915_gem_batch_pool_fini(&ring->batch_pool);
+
+	intel_engine_fini_breadcrumbs(ring);
 }
 
 static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 5d1eb206151d..7dde9310df1b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -157,9 +157,31 @@ struct  intel_engine_cs {
 #define LAST_USER_RING (VECS + 1)
 	u32		mmio_base;
 	struct		drm_device *dev;
+	struct drm_i915_private *i915;
 	struct intel_ringbuffer *buffer;
 	struct list_head buffers;
 
+	/* Rather than have every client wait upon all user interrupts,
+	 * with the herd waking after every interrupt and each doing the
+	 * heavyweight seqno dance, we delegate the task to the first client.
+	 * After the interrupt, we wake up one client, who does the heavyweight
+	 * coherent seqno read and either goes back to sleep (if incomplete),
+	 * or wakes up the next client in the queue and so forth.
+	 *
+	 * With respect to walking the entire list of waiters in a single
+	 * dedicated bottom-half, we reduce the latency of the first waiter
+	 * by avoiding a context switch, but incur additional coherent seqno
+	 * reads when following the chain of request breadcrumbs.
+	 */
+	struct intel_breadcrumbs {
+		spinlock_t lock; /* protects the lists of requests */
+		struct rb_root requests; /* sorted by retirement */
+		struct task_struct *first_waiter; /* bh for user interrupts */
+		struct delayed_work irq; /* used to enable or fake interrupts */
+		bool irq_enabled : 1;
+		bool rpm_wakelock : 1;
+	} breadcrumbs;
+
 	/*
 	 * A pool of objects to use as shadow copies of client batch buffers
 	 * when the command parser is enabled. Prevents the client from
@@ -303,8 +325,6 @@ struct  intel_engine_cs {
 
 	bool gpu_caches_dirty;
 
-	wait_queue_head_t irq_queue;
-
 	struct intel_context *default_context;
 	struct intel_context *last_context;
 
@@ -506,4 +526,27 @@ void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
 /* Legacy ringbuffer specific portion of reservation code: */
 int intel_ring_reserve_space(struct drm_i915_gem_request *request);
 
+/* intel_breadcrumbs.c -- user interrupt bottom-half for waiters */
+struct intel_breadcrumb {
+	struct rb_node node;
+	struct task_struct *task;
+	u32 seqno;
+};
+void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine);
+bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request,
+				  struct intel_breadcrumb *wait);
+void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request,
+				     struct intel_breadcrumb *wait);
+static inline bool intel_engine_has_waiter(struct intel_engine_cs *engine)
+{
+	return READ_ONCE(engine->breadcrumbs.first_waiter);
+}
+static inline void intel_engine_wakeup(struct intel_engine_cs *engine)
+{
+	struct task_struct *task = READ_ONCE(engine->breadcrumbs.first_waiter);
+	if (task)
+		wake_up_process(task);
+}
+void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine);
+
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
2.6.2

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* Re: [PATCH 13/15] drm/i915: Stop setting wraparound seqno on initialisation
  2015-12-01 16:57   ` Dave Gordon
@ 2015-12-04  9:36     ` Daniel Vetter
  2015-12-04  9:51       ` Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Daniel Vetter @ 2015-12-04  9:36 UTC (permalink / raw)
  To: Dave Gordon; +Cc: intel-gfx

On Tue, Dec 01, 2015 at 04:57:11PM +0000, Dave Gordon wrote:
> On 29/11/15 08:48, Chris Wilson wrote:
> >We have testcases to ensure that seqno wraparound works fine, so we can
> >forgo forcing everyone to encounter seqno wraparound during early
> >uptime. seqno wraparound incurs a full GPU stall, so not forcing it
> >will eliminate one source of jitter from the early system.
> >
> >Advancing the global next_seqno after a GPU reset is equally pointless.
> >
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> 
> The real bug is that seqno wraparound forces the GPU to idle; can we fix
> that instead?

hw semaphores will fall over if we don't, but I guess we can forgo the
idle if there are no hw semaphores enabled. And yeah, I'd like to keep this
little trick to exercise wraparound more often too.
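
Roughly (untested sketch, assuming i915_semaphore_is_enabled() is the
right gate and this lives in the seqno-wrap path):

	/* Only quiesce the GPU before wrapping if hw semaphores are in
	 * use, since only the semaphore logic trips over the wraparound.
	 */
	if (i915_semaphore_is_enabled(dev)) {
		ret = i915_gpu_idle(dev);
		if (ret)
			return ret;
	}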
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH 13/15] drm/i915: Stop setting wraparound seqno on initialisation
  2015-12-04  9:36     ` Daniel Vetter
@ 2015-12-04  9:51       ` Chris Wilson
  0 siblings, 0 replies; 92+ messages in thread
From: Chris Wilson @ 2015-12-04  9:51 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: intel-gfx

On Fri, Dec 04, 2015 at 10:36:45AM +0100, Daniel Vetter wrote:
> On Tue, Dec 01, 2015 at 04:57:11PM +0000, Dave Gordon wrote:
> > On 29/11/15 08:48, Chris Wilson wrote:
> > >We have testcases to ensure that seqno wraparound works fine, so we can
> > >forgo forcing everyone to encounter seqno wraparound during early
> > >uptime. seqno wraparound incurs a full GPU stall, so not forcing it
> > >will eliminate one source of jitter from the early system.
> > >
> > >Advancing the global next_seqno after a GPU reset is equally pointless.
> > >
> > >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> > 
> > The real bug is that seqno wraparound forces the GPU to idle; can we fix
> > that instead?
> 
> hw semaphores will fall over if we don't, but I guess we can forgo the
> idle if there are no hw semaphores enabled. And yeah, I'd like to keep this
> little trick to exercise wraparound more often too.

But we have a test case that you can run as often as you like, even as
part of CI setup, to make the wraparound happen at a pseudo-random
location during testing.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH v7] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-12-03 16:22       ` [PATCH v7] " Chris Wilson
@ 2015-12-07 15:08         ` Tvrtko Ursulin
  2015-12-08 10:44           ` Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-12-07 15:08 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx


Hi,

On 03/12/15 16:22, Chris Wilson wrote:
> One particularly stressful scenario consists of many independent tasks
> all competing for GPU time and waiting upon the results (e.g. realtime
> transcoding of many, many streams). One bottleneck in particular is that
> each client waits on its own results, but every client is woken up after
> every batchbuffer - hence the thunder of hooves as then every client must
> do its heavyweight dance to read a coherent seqno to see if it is the
> lucky one.
>
> Ideally, we only want one client to wake up after the interrupt and
> check its request for completion. Since the requests must retire in
> order, we can select the first client on the oldest request to be woken.
> Once that client has completed his wait, we can then wake up the
> next client and so on.
>
> v2: Convert from a kworker per engine into a dedicated kthread for the
> bottom-half.
> v3: Rename request members and tweak comments.
> v4: Use a per-engine spinlock in the breadcrumbs bottom-half.
> v5: Fix race in locklessly checking waiter status and kicking the task on
> adding a new waiter.
> v6: Fix deciding when to force the timer to hide missing interrupts.
> v7: Move the bottom-half from the kthread to the first client process.

I like the idea a lot, so I'll make some comments even before you post v9
or later. :)

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> Cc: Dave Gordon <david.s.gordon@intel.com>
> ---
>   drivers/gpu/drm/i915/Makefile            |   1 +
>   drivers/gpu/drm/i915/i915_debugfs.c      |   2 +-
>   drivers/gpu/drm/i915/i915_drv.h          |   3 +-
>   drivers/gpu/drm/i915/i915_gem.c          | 108 ++++++-----------
>   drivers/gpu/drm/i915/i915_gpu_error.c    |   2 +-
>   drivers/gpu/drm/i915/i915_irq.c          |  17 +--
>   drivers/gpu/drm/i915/intel_breadcrumbs.c | 197 +++++++++++++++++++++++++++++++
>   drivers/gpu/drm/i915/intel_lrc.c         |   5 +-
>   drivers/gpu/drm/i915/intel_ringbuffer.c  |   5 +-
>   drivers/gpu/drm/i915/intel_ringbuffer.h  |  47 +++++++-
>   10 files changed, 297 insertions(+), 90 deletions(-)
>   create mode 100644 drivers/gpu/drm/i915/intel_breadcrumbs.c
>
> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> index 0851de07bd13..d3b9d3618719 100644
> --- a/drivers/gpu/drm/i915/Makefile
> +++ b/drivers/gpu/drm/i915/Makefile
> @@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
>   	  i915_gem_userptr.o \
>   	  i915_gpu_error.o \
>   	  i915_trace_points.o \
> +	  intel_breadcrumbs.o \
>   	  intel_lrc.o \
>   	  intel_mocs.o \
>   	  intel_ringbuffer.o \
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 0583eda642d7..68fbe4d1f91d 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2323,7 +2323,7 @@ static int count_irq_waiters(struct drm_i915_private *i915)
>   	int i;
>
>   	for_each_ring(ring, i915, i)
> -		count += ring->irq_refcount;
> +		count += intel_engine_has_waiter(ring);

Do we no longer care how many? That's okay in debugfs, but what about error capture?
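
If the count is still interesting for debugfs, a sketch of a helper that
walks the new rbtree instead (hypothetical name, takes b->lock):

	static int intel_engine_count_waiters(struct intel_engine_cs *engine)
	{
		struct intel_breadcrumbs *b = &engine->breadcrumbs;
		struct rb_node *node;
		int count = 0;

		spin_lock(&b->lock);
		for (node = rb_first(&b->requests); node; node = rb_next(node))
			count++;
		spin_unlock(&b->lock);

		return count;
	}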

>   	return count;
>   }
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 2acf2cf0b836..30d470934874 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -1383,7 +1383,7 @@ struct i915_gpu_error {
>   #define I915_STOP_RING_ALLOW_WARN      (1 << 30)
>
>   	/* For missed irq/seqno simulation. */
> -	unsigned int test_irq_rings;
> +	unsigned long test_irq_rings;
>   };
>
>   enum modeset_restore {
> @@ -2800,7 +2800,6 @@ ibx_disable_display_interrupt(struct drm_i915_private *dev_priv, uint32_t bits)
>   	ibx_display_interrupt_update(dev_priv, bits, 0);
>   }
>
> -
>   /* i915_gem.c */
>   int i915_gem_create_ioctl(struct drm_device *dev, void *data,
>   			  struct drm_file *file_priv);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 6e051402e340..8dcbac584e84 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1125,17 +1125,6 @@ i915_gem_check_wedge(unsigned reset_counter, bool interruptible)
>   	return 0;
>   }
>
> -static void fake_irq(unsigned long data)
> -{
> -	wake_up_process((struct task_struct *)data);
> -}
> -
> -static bool missed_irq(struct drm_i915_private *dev_priv,
> -		       struct intel_engine_cs *ring)
> -{
> -	return test_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings);
> -}
> -
>   static unsigned long local_clock_us(unsigned *cpu)
>   {
>   	unsigned long t;
> @@ -1183,9 +1172,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   	 * takes to sleep on a request, on the order of a microsecond.
>   	 */
>
> -	if (req->ring->irq_refcount)
> -		return -EBUSY;
> -
>   	/* Only spin if we know the GPU is processing this request */
>   	if (!i915_gem_request_started(req, false))
>   		return -EAGAIN;
> @@ -1204,9 +1190,6 @@ static int __i915_spin_request(struct drm_i915_gem_request *req, int state)
>   		cpu_relax_lowlatency();
>   	}
>
> -	if (i915_gem_request_completed(req, false))
> -		return 0;
> -
>   	return -EAGAIN;
>   }
>
> @@ -1231,26 +1214,19 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   			s64 *timeout,
>   			struct intel_rps_client *rps)
>   {
> -	struct intel_engine_cs *ring = i915_gem_request_get_ring(req);
> -	struct drm_device *dev = ring->dev;
> -	struct drm_i915_private *dev_priv = dev->dev_private;
> -	const bool irq_test_in_progress =
> -		ACCESS_ONCE(dev_priv->gpu_error.test_irq_rings) & intel_ring_flag(ring);
>   	int state = interruptible ? TASK_INTERRUPTIBLE : TASK_UNINTERRUPTIBLE;
> -	DEFINE_WAIT(wait);
> -	unsigned long timeout_expire;
> +	struct intel_breadcrumb wait;
> +	unsigned long timeout_remain;
>   	s64 before, now;
>   	int ret;
>
> -	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
> -
>   	if (list_empty(&req->list))
>   		return 0;
>
>   	if (i915_gem_request_completed(req, true))
>   		return 0;
>
> -	timeout_expire = 0;
> +	timeout_remain = MAX_SCHEDULE_TIMEOUT;
>   	if (timeout) {
>   		if (WARN_ON(*timeout < 0))
>   			return -EINVAL;
> @@ -1258,34 +1234,43 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   		if (*timeout == 0)
>   			return -ETIME;
>
> -		timeout_expire = jiffies + nsecs_to_jiffies_timeout(*timeout);
> +		timeout_remain = nsecs_to_jiffies_timeout(*timeout);
>   	}
>
> -	if (INTEL_INFO(dev_priv)->gen >= 6)
> -		gen6_rps_boost(dev_priv, rps, req->emitted_jiffies);
> -
>   	/* Record current time in case interrupted by signal, or wedged */
>   	trace_i915_gem_request_wait_begin(req);
>   	before = ktime_get_raw_ns();
>
> -	/* Optimistic spin for the next jiffie before touching IRQs */
> -	ret = __i915_spin_request(req, state);
> -	if (ret == 0)
> -		goto out;
> +	if (INTEL_INFO(req->i915)->gen >= 6)
> +		gen6_rps_boost(req->i915, rps, req->emitted_jiffies);
>
> -	if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) {
> -		ret = -ENODEV;
> -		goto out;
> +	/* Optimistic spin for the next jiffie before touching IRQs */

Exactly the opposite! :)
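
i.e. the comment could now read something like:

	/* Optimistic spin while the worker unmasks the interrupt in the
	 * background; with luck we complete before the IRQ is even needed.
	 */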

> +	if (intel_breadcrumbs_add_waiter(req, &wait)) {
> +		ret = __i915_spin_request(req, state);
> +		if (ret == 0)
> +			goto out;
>   	}
>
>   	for (;;) {
> -		struct timer_list timer;
> -
> -		prepare_to_wait(&ring->irq_queue, &wait, state);
> +		set_task_state(wait.task, state);
> +
> +		/* Ensure our read of the seqno is coherent so that we
> +		 * do not "miss an interrupt" (i.e. if this is the last
> +		 * request and the seqno write from the GPU is not visible
> +		 * by the time the interrupt fires, we will see that the
> +		 * request is incomplete and go back to sleep awaiting
> +		 * another interrupt that will never come.)
> +		 *
> +		 * Strictly, we only need to do this once after an interrupt,
> +		 * but it is easier and safer to do it every time the waiter
> +		 * is woken.
> +		 */
> +		if (i915_gem_request_completed(req, false)) {
> +			ret = 0;
> +			break;
> +		}
>
> -		/* We need to check whether any gpu reset happened in between
> -		 * the caller grabbing the seqno and now ... */
> -		if (req->reset_counter != i915_reset_counter(&dev_priv->gpu_error)) {
> +		if (unlikely(req->reset_counter != i915_reset_counter(&req->i915->gpu_error))) {
>   			/* As we do not requeue the request over a GPU reset,
>   			 * if one does occur we know that the request is
>   			 * effectively complete.
> @@ -1294,46 +1279,21 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>   			break;
>   		}
>
> -		if (i915_gem_request_completed(req, false)) {
> -			ret = 0;
> -			break;
> -		}
> -
> -		if (signal_pending_state(state, current)) {
> +		if (signal_pending_state(state, wait.task)) {
>   			ret = -ERESTARTSYS;
>   			break;
>   		}
>
> -		if (timeout && time_after_eq(jiffies, timeout_expire)) {
> +		timeout_remain = io_schedule_timeout(timeout_remain);
> +		if (timeout_remain == 0) {
>   			ret = -ETIME;
>   			break;
>   		}
> -
> -		i915_queue_hangcheck(dev_priv);
> -
> -		timer.function = NULL;
> -		if (timeout || missed_irq(dev_priv, ring)) {
> -			unsigned long expire;
> -
> -			setup_timer_on_stack(&timer, fake_irq, (unsigned long)current);
> -			expire = missed_irq(dev_priv, ring) ? jiffies + 1 : timeout_expire;
> -			mod_timer(&timer, expire);
> -		}
> -
> -		io_schedule();
> -
> -		if (timer.function) {
> -			del_singleshot_timer_sync(&timer);
> -			destroy_timer_on_stack(&timer);
> -		}
>   	}
> -	if (!irq_test_in_progress)
> -		ring->irq_put(ring);
> -
> -	finish_wait(&ring->irq_queue, &wait);
> -
> +	__set_task_state(wait.task, TASK_RUNNING);
>   out:
>   	now = ktime_get_raw_ns();
> +	intel_breadcrumbs_remove_waiter(req, &wait);
>   	trace_i915_gem_request_wait_end(req);
>
>   	if (timeout) {
> diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
> index 06ca4082735b..f805d117f3d1 100644
> --- a/drivers/gpu/drm/i915/i915_gpu_error.c
> +++ b/drivers/gpu/drm/i915/i915_gpu_error.c
> @@ -900,7 +900,7 @@ static void i915_record_ring_state(struct drm_device *dev,
>   		ering->instdone = I915_READ(GEN2_INSTDONE);
>   	}
>
> -	ering->waiting = waitqueue_active(&ring->irq_queue);
> +	ering->waiting = intel_engine_has_waiter(ring);
>   	ering->instpm = I915_READ(RING_INSTPM(ring->mmio_base));
>   	ering->seqno = ring->get_seqno(ring, false);
>   	ering->acthd = intel_ring_get_active_head(ring);
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 28c0329f8281..57b113132fd7 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -996,12 +996,11 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
>
>   static void notify_ring(struct intel_engine_cs *ring)
>   {
> -	if (!intel_ring_initialized(ring))
> +	if (ring->i915 == NULL)

Glanced at something about this in some other thread; maybe best to keep it
consolidated in one patch?

>   		return;
>
>   	trace_i915_gem_request_notify(ring);
> -
> -	wake_up_all(&ring->irq_queue);
> +	intel_engine_wakeup(ring);
>   }
>
>   static void vlv_c0_read(struct drm_i915_private *dev_priv,
> @@ -1083,7 +1082,7 @@ static bool any_waiters(struct drm_i915_private *dev_priv)
>   	int i;
>
>   	for_each_ring(ring, dev_priv, i)
> -		if (ring->irq_refcount)
> +		if (intel_engine_has_waiter(ring))
>   			return true;
>
>   	return false;
> @@ -2411,7 +2410,7 @@ static void i915_error_wake_up(struct drm_i915_private *dev_priv,
>
>   	/* Wake up __wait_seqno, potentially holding dev->struct_mutex. */
>   	for_each_ring(ring, dev_priv, i)
> -		wake_up_all(&ring->irq_queue);
> +		intel_engine_wakeup(ring);
>
>   	/* Wake up intel_crtc_wait_for_pending_flips, holding crtc->mutex. */
>   	wake_up_all(&dev_priv->pending_flip_queue);
> @@ -2986,16 +2985,18 @@ static void i915_hangcheck_elapsed(struct work_struct *work)
>   			if (ring_idle(ring, seqno)) {
>   				ring->hangcheck.action = HANGCHECK_IDLE;
>
> -				if (waitqueue_active(&ring->irq_queue)) {
> +				if (intel_engine_has_waiter(ring)) {
>   					/* Issue a wake-up to catch stuck h/w. */
>   					if (!test_and_set_bit(ring->id, &dev_priv->gpu_error.missed_irq_rings)) {
> -						if (!(dev_priv->gpu_error.test_irq_rings & intel_ring_flag(ring)))
> +						if (!test_bit(ring->id, &dev_priv->gpu_error.test_irq_rings))
>   							DRM_ERROR("Hangcheck timer elapsed... %s idle\n",
>   								  ring->name);
>   						else
>   							DRM_INFO("Fake missed irq on %s\n",
>   								 ring->name);
> -						wake_up_all(&ring->irq_queue);
> +						mod_delayed_work(system_highpri_wq,
> +								 &ring->breadcrumbs.irq,
> +								 0);
>   					}
>   					/* Safeguard against driver failure */
>   					ring->hangcheck.score += BUSY;
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> new file mode 100644
> index 000000000000..17d51269b774
> --- /dev/null
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -0,0 +1,197 @@
> +/*
> + * Copyright © 2015 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + *
> + */
> +
> +#include <linux/kthread.h>

Not any more.

> +
> +#include "i915_drv.h"
> +
> +static bool irq_get(struct intel_engine_cs *engine)
> +{
> +	if (test_bit(engine->id, &engine->i915->gpu_error.test_irq_rings))
> +		return false;

I don't understand why knowledge of this debugfs interface needs to be
sprinkled around. I am not even 100% sure what the debugfs interface is
for, but it looks to be about software-masking user interrupts on a ring.
Why not just mask the interrupt delivery based on this mask at the source?
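
e.g. something along these lines at the irq entry point instead (sketch,
untested):

	static void notify_ring(struct intel_engine_cs *ring)
	{
		if (ring->i915 == NULL)
			return;

		/* Swallow the user interrupt here when this ring is under
		 * missed-irq testing, rather than rechecking the mask at
		 * every irq_get().
		 */
		if (test_bit(ring->id, &ring->i915->gpu_error.test_irq_rings))
			return;

		trace_i915_gem_request_notify(ring);
		intel_engine_wakeup(ring);
	}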

> +
> +	if (!intel_irqs_enabled(engine->i915))
> +		return false;
> +
> +	return engine->irq_get(engine);
> +}
> +
> +static void intel_breadcrumbs_irq(struct work_struct *work)
> +{
> +	struct intel_breadcrumbs *b =
> +		container_of(work, struct intel_breadcrumbs, irq.work);
> +	struct intel_engine_cs *engine =
> +		container_of(b, struct intel_engine_cs, breadcrumbs);
> +
> +	/* We unmask the user-interrupt in the background so that the
> +	 * first-waiter can do a little busywaiting before sleeping and
> +	 * not incur additional latency from setting up the IRQ.
> +	 *
> +	 * The worker also persists in case we cannot enable interrupts,
> +	 * or if we have previously seen seqno/interrupt incoherency
> +	 * ("missed interrupt" syndrome). Here the worker will wake up
> +	 * every jiffie in order to kick the original waiter to do the
> +	 * coherent seqno check. (Not as efficient as having a timer
> +	 * source kick the waiter directly, but this should only have to
> +	 * be used in rare cases where we try to work around some
> +	 * hardware/driver failures.)
> +	 */
> +
> +	spin_lock(&b->lock);
> +	if (b->first_waiter) {
> +		if (!b->rpm_wakelock) {
> +			intel_runtime_pm_get(engine->i915);
> +			b->rpm_wakelock = true;
> +		}
> +
> +		if (!b->irq_enabled) {
> +			b->irq_enabled = irq_get(engine);
> +			/* If we were delayed, the original waiter may have
> +			 * gone to sleep already and we may have missed the
> +			 * interrupt. So after enabling the interrupt,
> +			 * kick the waiter.
> +			 */
> +			wake_up_process(b->first_waiter);
> +		}
> +
> +		/* No interrupts? Kick the waiter every jiffie! */
> +		if (!b->irq_enabled ||
> +		    test_bit(engine->id,
> +			     &engine->i915->gpu_error.missed_irq_rings)) {
> +			wake_up_process(b->first_waiter);
> +			queue_delayed_work(system_highpri_wq, &b->irq, 1);
> +		}
> +	} else {
> +		if (b->irq_enabled) {
> +			engine->irq_put(engine);
> +			b->irq_enabled = false;
> +		}
> +		if (b->rpm_wakelock) {
> +			intel_runtime_pm_put(engine->i915);
> +			b->rpm_wakelock = false;
> +		}
> +	}
> +	spin_unlock(&b->lock);
> +}
> +
> +static inline struct intel_breadcrumb *to_wait(struct rb_node *node)
> +{
> +	if (node == NULL)
> +		return NULL;
> +
> +	return container_of(node, struct intel_breadcrumb, node);
> +}
> +
> +bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request,
> +				  struct intel_breadcrumb *wait)
> +{
> +	struct intel_breadcrumbs *b = &request->ring->breadcrumbs;
> +	struct rb_node **p, *parent;
> +	bool first = false;
> +
> +	wait->task = current;
> +	wait->seqno = request->seqno;
> +
> +	spin_lock(&b->lock);
> +
> +	/* First? Unmask the user interrupt */
> +	if (b->first_waiter == NULL)
> +		queue_delayed_work(system_highpri_wq, &b->irq, 0);

Who enables interrupts if the worker gets scheduled and completes before
setting b->first_waiter below?
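
If that is a real race, one option might be to set first_waiter before
kicking the worker, all under the same lock (rough sketch):

	spin_lock(&b->lock);
	/* ... rbtree insertion as below ... */
	if (first) {
		b->first_waiter = wait->task;
		/* Only now unmask the user interrupt, so the worker always
		 * observes a non-NULL first_waiter.
		 */
		queue_delayed_work(system_highpri_wq, &b->irq, 0);
	}
	spin_unlock(&b->lock);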

> +
> +	/* Insert the request into the retirement-ordered list
> +	 * of waiters by walking the rbtree. If we are the oldest
> +	 * seqno in the tree (the first to be retired), then
> +	 * set ourselves as the bottom-half.
> +	 */
> +	first = true;
> +	parent = NULL;
> +	p = &b->requests.rb_node;
> +	while (*p) {
> +		parent = *p;
> +		if (i915_seqno_passed(wait->seqno, to_wait(parent)->seqno)) {
> +			p = &parent->rb_right;
> +			first = false;
> +		} else
> +			p = &parent->rb_left;
> +	}
> +	rb_link_node(&wait->node, parent, p);
> +	rb_insert_color(&wait->node, &b->requests);

WARN_ON(!first && !b->first_waiter) maybe?

> +	if (first)
> +		b->first_waiter = wait->task;
> +
> +	spin_unlock(&b->lock);
> +
> +	return first;
> +}
> +
> +void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request,
> +				     struct intel_breadcrumb *wait)
> +{
> +	struct intel_breadcrumbs *b = &request->ring->breadcrumbs;
> +
> +	spin_lock(&b->lock);
> +
> +	if (b->first_waiter == wait->task) {
> +		struct intel_breadcrumb *next;
> +		struct task_struct *task;
> +
> +		/* We are the current bottom-half. Find the next candidate,
> +		 * the first waiter in the queue on the remaining oldest
> +		 * request. As multiple seqnos may complete in the time it
> +		 * takes us to wake up and find the next waiter, we have to
> +		 * wake up that waiter for it to perform its own coherent
> +		 * completion check.
> +		 */

Equally, why wouldn't we wake up all waiters whose requests have already
completed?

It would be a cheap check here, and it would avoid the cascading latency
that grows as each task wakes up only the next one.

Even more benefit for multiple waiters on the same request.
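
Something like this in the removal path, perhaps (untested sketch, reusing
the engine's get_seqno vfunc):

	struct intel_engine_cs *engine = request->ring;
	u32 seqno = engine->get_seqno(engine, false);
	struct intel_breadcrumb *next = to_wait(rb_next(&wait->node));

	/* Wake every waiter whose seqno has already passed; none of them
	 * needs to wait for the others to run first.
	 */
	while (next && i915_seqno_passed(seqno, next->seqno)) {
		wake_up_process(next->task);
		next = to_wait(rb_next(&next->node));
	}
	b->first_waiter = next ? next->task : NULL;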

> +		task = NULL;
> +		next = to_wait(rb_next(&wait->node));
> +		if (next)
> +			task = next->task;
> +
> +		b->first_waiter = task;
> +		if (task)
> +			wake_up_process(task);
> +		else
> +			/* Disable the user interrupts */
> +			mod_delayed_work(system_highpri_wq, &b->irq, 1);
> +	}
> +
> +	rb_erase(&wait->node, &b->requests);
> +	spin_unlock(&b->lock);
> +}
> +
> +void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine)
> +{
> +	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> +
> +	spin_lock_init(&b->lock);
> +	INIT_DELAYED_WORK(&b->irq, intel_breadcrumbs_irq);
> +}
> +
> +void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine)
> +{
> +	struct intel_breadcrumbs *b = &engine->breadcrumbs;
> +
> +	flush_delayed_work(&b->irq);
> +}
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 08b982a6ae81..ce5119e2bcef 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1904,6 +1904,8 @@ void intel_logical_ring_cleanup(struct intel_engine_cs *ring)
>   	i915_cmd_parser_fini_ring(ring);
>   	i915_gem_batch_pool_fini(&ring->batch_pool);
>
> +	intel_engine_fini_breadcrumbs(ring);
> +
>   	if (ring->status_page.obj) {
>   		kunmap(sg_page(ring->status_page.obj->pages->sgl));
>   		ring->status_page.obj = NULL;
> @@ -1920,10 +1922,11 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
>   	ring->buffer = NULL;
>
>   	ring->dev = dev;
> +	ring->i915 = to_i915(dev);
>   	INIT_LIST_HEAD(&ring->active_list);
>   	INIT_LIST_HEAD(&ring->request_list);
>   	i915_gem_batch_pool_init(dev, &ring->batch_pool);
> -	init_waitqueue_head(&ring->irq_queue);
> +	intel_engine_init_breadcrumbs(ring);
>
>   	INIT_LIST_HEAD(&ring->buffers);
>   	INIT_LIST_HEAD(&ring->execlist_queue);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index 511efe556d73..8a0f02e73a4e 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2157,6 +2157,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>   	WARN_ON(ring->buffer);
>
>   	ring->dev = dev;
> +	ring->i915 = to_i915(dev);
>   	INIT_LIST_HEAD(&ring->active_list);
>   	INIT_LIST_HEAD(&ring->request_list);
>   	INIT_LIST_HEAD(&ring->execlist_queue);
> @@ -2164,7 +2165,7 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>   	i915_gem_batch_pool_init(dev, &ring->batch_pool);
>   	memset(ring->semaphore.sync_seqno, 0, sizeof(ring->semaphore.sync_seqno));
>
> -	init_waitqueue_head(&ring->irq_queue);
> +	intel_engine_init_breadcrumbs(ring);
>
>   	ringbuf = intel_engine_create_ringbuffer(ring, 32 * PAGE_SIZE);
>   	if (IS_ERR(ringbuf))
> @@ -2225,6 +2226,8 @@ void intel_cleanup_ring_buffer(struct intel_engine_cs *ring)
>
>   	i915_cmd_parser_fini_ring(ring);
>   	i915_gem_batch_pool_fini(&ring->batch_pool);
> +
> +	intel_engine_fini_breadcrumbs(ring);
>   }
>
>   static int ring_wait_for_space(struct intel_engine_cs *ring, int n)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 5d1eb206151d..7dde9310df1b 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -157,9 +157,31 @@ struct  intel_engine_cs {
>   #define LAST_USER_RING (VECS + 1)
>   	u32		mmio_base;
>   	struct		drm_device *dev;
> +	struct drm_i915_private *i915;
>   	struct intel_ringbuffer *buffer;
>   	struct list_head buffers;
>
> +	/* Rather than have every client wait upon all user interrupts,
> +	 * with the herd waking after every interrupt and each doing the
> +	 * heavyweight seqno dance, we delegate the task to the first client.
> +	 * After the interrupt, we wake up one client, who does the heavyweight
> +	 * coherent seqno read and either goes back to sleep (if incomplete),
> +	 * or wakes up the next client in the queue and so forth.
> +	 *
> +	 * With respect to walking the entire list of waiters in a single

s/With respect to/In contrast to/ or "Compared to"?

> +	 * dedicated bottom-half, we reduce the latency of the first waiter
> +	 * by avoiding a context switch, but incur additional coherent seqno
> +	 * reads when following the chain of request breadcrumbs.
> +	 */
> +	struct intel_breadcrumbs {
> +		spinlock_t lock; /* protects the lists of requests */
> +		struct rb_root requests; /* sorted by retirement */
> +		struct task_struct *first_waiter; /* bh for user interrupts */
> +		struct delayed_work irq; /* used to enable or fake interrupts */
> +		bool irq_enabled : 1;
> +		bool rpm_wakelock : 1;

Are bitfields worth it? There aren't that many engines so maybe it loses 
more in terms of instructions generated.

> +	} breadcrumbs;
> +
>   	/*
>   	 * A pool of objects to use as shadow copies of client batch buffers
>   	 * when the command parser is enabled. Prevents the client from
> @@ -303,8 +325,6 @@ struct  intel_engine_cs {
>
>   	bool gpu_caches_dirty;
>
> -	wait_queue_head_t irq_queue;
> -
>   	struct intel_context *default_context;
>   	struct intel_context *last_context;
>
> @@ -506,4 +526,27 @@ void intel_ring_reserved_space_end(struct intel_ringbuffer *ringbuf);
>   /* Legacy ringbuffer specific portion of reservation code: */
>   int intel_ring_reserve_space(struct drm_i915_gem_request *request);
>
> +/* intel_breadcrumbs.c -- user interrupt bottom-half for waiters */
> +struct intel_breadcrumb {
> +	struct rb_node node;
> +	struct task_struct *task;
> +	u32 seqno;
> +};
> +void intel_engine_init_breadcrumbs(struct intel_engine_cs *engine);
> +bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request,
> +				  struct intel_breadcrumb *wait);
> +void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request,
> +				     struct intel_breadcrumb *wait);
> +static inline bool intel_engine_has_waiter(struct intel_engine_cs *engine)
> +{
> +	return READ_ONCE(engine->breadcrumbs.first_waiter);
> +}
> +static inline void intel_engine_wakeup(struct intel_engine_cs *engine)
> +{
> +	struct task_struct *task = READ_ONCE(engine->breadcrumbs.first_waiter);
> +	if (task)
> +		wake_up_process(task);

What guarantees task is a valid task at this point?

I know it is so extremely unlikely, but it can theoretically sample the 
first_waiter which then exists and disappears before the wake_up_process 
call.

> +}
> +void intel_engine_fini_breadcrumbs(struct intel_engine_cs *engine);
> +
>   #endif /* _INTEL_RINGBUFFER_H_ */
>

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH v7] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-12-07 15:08         ` Tvrtko Ursulin
@ 2015-12-08 10:44           ` Chris Wilson
  2015-12-08 14:03             ` Tvrtko Ursulin
  0 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-12-08 10:44 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Mon, Dec 07, 2015 at 03:08:28PM +0000, Tvrtko Ursulin wrote:
> 
> Hi,
> 
> On 03/12/15 16:22, Chris Wilson wrote:
> >One particularly stressful scenario consists of many independent tasks
> >all competing for GPU time and waiting upon the results (e.g. realtime
> >transcoding of many, many streams). One bottleneck in particular is that
> >each client waits on its own results, but every client is woken up after
> >every batchbuffer - hence the thunder of hooves as then every client must
> >do its heavyweight dance to read a coherent seqno to see if it is the
> >lucky one.
> >
> >Ideally, we only want one client to wake up after the interrupt and
> >check its request for completion. Since the requests must retire in
> >order, we can select the first client on the oldest request to be woken.
> >Once that client has completed his wait, we can then wake up the
> >next client and so on.
> >
> >v2: Convert from a kworker per engine into a dedicated kthread for the
> >bottom-half.
> >v3: Rename request members and tweak comments.
> >v4: Use a per-engine spinlock in the breadcrumbs bottom-half.
> >v5: Fix race in locklessly checking waiter status and kicking the task on
> >adding a new waiter.
> >v6: Fix deciding when to force the timer to hide missing interrupts.
> >v7: Move the bottom-half from the kthread to the first client process.
> 
> I like the idea a lot so I'll make some comments even before you
> post v9 or later. :)
> 
> >Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> >Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
> >Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
> >Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
> >Cc: Dave Gordon <david.s.gordon@intel.com>
> >---
> >  drivers/gpu/drm/i915/Makefile            |   1 +
> >  drivers/gpu/drm/i915/i915_debugfs.c      |   2 +-
> >  drivers/gpu/drm/i915/i915_drv.h          |   3 +-
> >  drivers/gpu/drm/i915/i915_gem.c          | 108 ++++++-----------
> >  drivers/gpu/drm/i915/i915_gpu_error.c    |   2 +-
> >  drivers/gpu/drm/i915/i915_irq.c          |  17 +--
> >  drivers/gpu/drm/i915/intel_breadcrumbs.c | 197 +++++++++++++++++++++++++++++++
> >  drivers/gpu/drm/i915/intel_lrc.c         |   5 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.c  |   5 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.h  |  47 +++++++-
> >  10 files changed, 297 insertions(+), 90 deletions(-)
> >  create mode 100644 drivers/gpu/drm/i915/intel_breadcrumbs.c
> >
> >diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
> >index 0851de07bd13..d3b9d3618719 100644
> >--- a/drivers/gpu/drm/i915/Makefile
> >+++ b/drivers/gpu/drm/i915/Makefile
> >@@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
> >  	  i915_gem_userptr.o \
> >  	  i915_gpu_error.o \
> >  	  i915_trace_points.o \
> >+	  intel_breadcrumbs.o \
> >  	  intel_lrc.o \
> >  	  intel_mocs.o \
> >  	  intel_ringbuffer.o \
> >diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> >index 0583eda642d7..68fbe4d1f91d 100644
> >--- a/drivers/gpu/drm/i915/i915_debugfs.c
> >+++ b/drivers/gpu/drm/i915/i915_debugfs.c
> >@@ -2323,7 +2323,7 @@ static int count_irq_waiters(struct drm_i915_private *i915)
> >  	int i;
> >
> >  	for_each_ring(ring, i915, i)
> >-		count += ring->irq_refcount;
> >+		count += intel_engine_has_waiter(ring);
> 
> Don't care how many any longer? Okay in debugfs, but also in error capture?

It was nice that we were able to mark the requests with waiters for
free. We could add that, but it is no longer a side-effect of this
patch. Here I was tempted to do an intel_engine_count_breadcrumbs(), but
that doesn't match how the information is used by RPS so adding it to
debugfs for RPS is misleading.

> >-	/* Optimistic spin for the next jiffie before touching IRQs */
> >-	ret = __i915_spin_request(req, state);
> >-	if (ret == 0)
> >-		goto out;
> >+	if (INTEL_INFO(req->i915)->gen >= 6)
> >+		gen6_rps_boost(req->i915, rps, req->emitted_jiffies);
> >
> >-	if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) {
> >-		ret = -ENODEV;
> >-		goto out;
> >+	/* Optimistic spin for the next jiffie before touching IRQs */
> 
> Exactly the opposite! :)

Bah. It's close, it certainly succinctly captures the intent if not the
implementation! s/jiffie/~jiffie/!

> >@@ -996,12 +996,11 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
> >
> >  static void notify_ring(struct intel_engine_cs *ring)
> >  {
> >-	if (!intel_ring_initialized(ring))
> >+	if (ring->i915 == NULL)
> 
> Glanced something about this in some other thread, maybe best to
> keep it consolidated in one patch?

This can die now. It was justifiable when I was using ring->i915
directly here, but now it is just (mostly) pointless churn.

> >+static bool irq_get(struct intel_engine_cs *engine)
> >+{
> >+	if (test_bit(engine->id, &engine->i915->gpu_error.test_irq_rings))
> >+		return false;
> 
> I don't understand why knowledge of this debugfs interface needs to
> be sprinkled around?

It's not though. This is the only use - the interface is to inject
faults into the waiter that ensure we cannot wake up after an
interrupt, and so force the hangcheck to intervene. Are you thinking of
its counterpart missed_irq, which we use to track when we need to force
the fake interrupt?

> I mean, I am not even 100% sure what the
> debugfs interface is for. But it looks to be about software masking
> user interrupts on a ring? Why not just mask the interrupt delivery
> based on this mask at the source?

It is to simulate a broken wait, as opposed to masking interrupts.

> >+bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request,
> >+				  struct intel_breadcrumb *wait)
> >+{
> >+	struct intel_breadcrumbs *b = &request->ring->breadcrumbs;
> >+	struct rb_node **p, *parent;
> >+	bool first = false;
> >+
> >+	wait->task = current;
> >+	wait->seqno = request->seqno;
> >+
> >+	spin_lock(&b->lock);
> >+
> >+	/* First? Unmask the user interrupt */
> >+	if (b->first_waiter == NULL)
> >+		queue_delayed_work(system_highpri_wq, &b->irq, 0);
> 
> Who enables interrupts if the worker gets scheduled and completes
> before setting b->first_waiter below?

In this version, it is serialised by the spinlock. In the next version,
the ordering is explicit (no more delayed worker).
 
> >+	/* Insert the request into the retirement ordered list
> >+	 * of waiters by walking the rbtree. If we are the oldest
> >+	 * seqno in the tree (the first to be retired), then
> >+	 * set ourselves as the bottom-half.
> >+	 */
> >+	first = true;
> >+	parent = NULL;
> >+	p = &b->requests.rb_node;
> >+	while (*p) {
> >+		parent = *p;
> >+		if (i915_seqno_passed(wait->seqno, to_wait(parent)->seqno)) {
> >+			p = &parent->rb_right;
> >+			first = false;
> >+		} else
> >+			p = &parent->rb_left;
> >+	}
> >+	rb_link_node(&wait->node, parent, p);
> >+	rb_insert_color(&wait->node, &b->requests);
> 
> WARN_ON(!first && !b->first_waiter) maybe?

We will get a GPF on b->first_waiter soon enough :)

> 
> >+	if (first)
> >+		b->first_waiter = wait->task;

BUG_ON(b->first_waiter == NULL);

would be clearer I guess.

> >+
> >+	spin_unlock(&b->lock);
> >+
> >+	return first;
> >+}
> >+
> >+void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request,
> >+				     struct intel_breadcrumb *wait)
> >+{
> >+	struct intel_breadcrumbs *b = &request->ring->breadcrumbs;
> >+
> >+	spin_lock(&b->lock);
> >+
> >+	if (b->first_waiter == wait->task) {
> >+		struct intel_breadcrumb *next;
> >+		struct task_struct *task;
> >+
> >+		/* We are the current bottom-half. Find the next candidate,
> >+		 * the first waiter in the queue on the remaining oldest
> >+		 * request. As multiple seqnos may complete in the time it
> >+		 * takes us to wake up and find the next waiter, we have to
> >+		 * wake up that waiter for it to perform its own coherent
> >+		 * completion check.
> >+		 */
> 
> Equally, why wouldn't we wake up all waiters for which the requests
> have been completed?

Because we no longer track the requests to be completed, having moved to
a chain of waiting processes instead of a chain of requests. I could
insert a waitqueue into the intel_breadcrumb and that would indeed
necessitate locking in the irq handler (and irq locks everywhere :(.
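
For illustration, the rough shape being weighed up here (a hypothetical
sketch, not something from the patch) would be:

	struct intel_breadcrumb {
		struct rb_node node;
		wait_queue_head_t wq;	/* every waiter on this seqno */
		u32 seqno;
	};

with the irq handler walking the tree and doing wake_up_all() on each
completed wq - which is what would force the tree lock to become
irq-safe.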
 
> Would be a cheap check here and it would save a cascading growing
> latency as one task wakes up the next one.

Well, it can't be here since we may remove_waiter after a signal
(incomplete wait). So this part has to walk the chain of processes. Ugh,
and have to move the waitqueue from one waiter to the next...

> Even more benefit for multiple waiters on the same request.

Yes. That's what I liked about the previous version with the independent
waitqueue. However, latency of a single waiter is a compelling argument.

My head is still full of cold, maybe embedding a waitqueue into the
breadcrumb is an improvement - I am not sure yet.

> >+	/* Rather than have every client wait upon all user interrupts,
> >+	 * with the herd waking after every interrupt and each doing the
> >+	 * heavyweight seqno dance, we delegate the task to the first client.
> >+	 * After the interrupt, we wake up one client, who does the heavyweight
> >+	 * coherent seqno read and either goes back to sleep (if incomplete),
> >+	 * or wakes up the next client in the queue and so forth.
> >+	 *
> >+	 * With respect to walking the entire list of waiters in a single
> 
> s/With respect to/In contrast to/ or "Compared to"?

"Compared to", thanks.
 
> >+	 * dedicated bottom-half, we reduce the latency of the first waiter
> >+	 * by avoiding a context switch, but incur additional coherent seqno
> >+	 * reads when following the chain of request breadcrumbs.
> >+	 */
> >+	struct intel_breadcrumbs {
> >+		spinlock_t lock; /* protects the lists of requests */
> >+		struct rb_root requests; /* sorted by retirement */
> >+		struct task_struct *first_waiter; /* bh for user interrupts */
> >+		struct delayed_work irq; /* used to enable or fake interrupts */
> >+		bool irq_enabled : 1;
> >+		bool rpm_wakelock : 1;
> 
> Are bitfields worth it? There aren't that many engines so maybe it
> loses more in terms of instructions generated.

I'm always optimistic that gcc gets its act together. Not that I don't
have patches converting from bitfields to flags when gcc hurts. Here,
the potential extra instruction is going to dwarfed by the irq spinlocks
and worse along these paths.

> >+static inline void intel_engine_wakeup(struct intel_engine_cs *engine)
> >+{
> >+	struct task_struct *task = READ_ONCE(engine->breadcrumbs.first_waiter);
> >+	if (task)
> >+		wake_up_process(task);
> 
> What guarantees task is a valid task at this point?

Not an awful lot! Indirectly, I can point to the fact that the task struct
cannot be freed whilst we are in an irq (thanks to rcu), but other than
that it is simply that there is a much longer path between clearing the
first_waiter and freeing the task_struct than doing a wake_up_process on
the running task.
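
(Purely as an illustrative sketch of how the sampling could be hardened,
assuming nothing beyond task_struct frees being RCU-deferred:

	static inline void intel_engine_wakeup(struct intel_engine_cs *engine)
	{
		struct task_struct *task;

		/* The RCU read lock keeps the sampled pointer valid even
		 * if the waiter clears first_waiter and exits meanwhile.
		 */
		rcu_read_lock();
		task = READ_ONCE(engine->breadcrumbs.first_waiter);
		if (task)
			wake_up_process(task);
		rcu_read_unlock();
	}

Not what the patch does today, just the obvious strengthening.)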
 
> I know it is so extremely unlikely, but it can theoretically sample
> the first_waiter which then exists and disappears before the
> wake_up_process call.

I can leave a comment: "I told you so -- Tvrtko" :)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH v7] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-12-08 10:44           ` Chris Wilson
@ 2015-12-08 14:03             ` Tvrtko Ursulin
  2015-12-08 14:33               ` Chris Wilson
  0 siblings, 1 reply; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-12-08 14:03 UTC (permalink / raw)
  To: Chris Wilson, intel-gfx, Rogozhkin, Dmitry V, Gong, Zhipeng, Dave Gordon


On 08/12/15 10:44, Chris Wilson wrote:
> On Mon, Dec 07, 2015 at 03:08:28PM +0000, Tvrtko Ursulin wrote:
>>
>> Hi,
>>
>> On 03/12/15 16:22, Chris Wilson wrote:
>>> One particularly stressful scenario consists of many independent tasks
>>> all competing for GPU time and waiting upon the results (e.g. realtime
>>> transcoding of many, many streams). One bottleneck in particular is that
>>> each client waits on its own results, but every client is woken up after
>>> every batchbuffer - hence the thunder of hooves as then every client must
>>> do its heavyweight dance to read a coherent seqno to see if it is the
>>> lucky one.
>>>
>>> Ideally, we only want one client to wake up after the interrupt and
>>> check its request for completion. Since the requests must retire in
>>> order, we can select the first client on the oldest request to be woken.
>>> Once that client has completed his wait, we can then wake up the
>>> next client and so on.
>>>
>>> v2: Convert from a kworker per engine into a dedicated kthread for the
>>> bottom-half.
>>> v3: Rename request members and tweak comments.
>>> v4: Use a per-engine spinlock in the breadcrumbs bottom-half.
>>> v5: Fix race in locklessly checking waiter status and kicking the task on
>>> adding a new waiter.
>>> v6: Fix deciding when to force the timer to hide missing interrupts.
>>> v7: Move the bottom-half from the kthread to the first client process.
>>
>> I like the idea a lot so I'll make some comments even before you
>> post v9 or later. :)
>>
>>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
>>> Cc: "Rogozhkin, Dmitry V" <dmitry.v.rogozhkin@intel.com>
>>> Cc: "Gong, Zhipeng" <zhipeng.gong@intel.com>
>>> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
>>> Cc: Dave Gordon <david.s.gordon@intel.com>
>>> ---
>>>   drivers/gpu/drm/i915/Makefile            |   1 +
>>>   drivers/gpu/drm/i915/i915_debugfs.c      |   2 +-
>>>   drivers/gpu/drm/i915/i915_drv.h          |   3 +-
>>>   drivers/gpu/drm/i915/i915_gem.c          | 108 ++++++-----------
>>>   drivers/gpu/drm/i915/i915_gpu_error.c    |   2 +-
>>>   drivers/gpu/drm/i915/i915_irq.c          |  17 +--
>>>   drivers/gpu/drm/i915/intel_breadcrumbs.c | 197 +++++++++++++++++++++++++++++++
>>>   drivers/gpu/drm/i915/intel_lrc.c         |   5 +-
>>>   drivers/gpu/drm/i915/intel_ringbuffer.c  |   5 +-
>>>   drivers/gpu/drm/i915/intel_ringbuffer.h  |  47 +++++++-
>>>   10 files changed, 297 insertions(+), 90 deletions(-)
>>>   create mode 100644 drivers/gpu/drm/i915/intel_breadcrumbs.c
>>>
>>> diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
>>> index 0851de07bd13..d3b9d3618719 100644
>>> --- a/drivers/gpu/drm/i915/Makefile
>>> +++ b/drivers/gpu/drm/i915/Makefile
>>> @@ -35,6 +35,7 @@ i915-y += i915_cmd_parser.o \
>>>   	  i915_gem_userptr.o \
>>>   	  i915_gpu_error.o \
>>>   	  i915_trace_points.o \
>>> +	  intel_breadcrumbs.o \
>>>   	  intel_lrc.o \
>>>   	  intel_mocs.o \
>>>   	  intel_ringbuffer.o \
>>> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
>>> index 0583eda642d7..68fbe4d1f91d 100644
>>> --- a/drivers/gpu/drm/i915/i915_debugfs.c
>>> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
>>> @@ -2323,7 +2323,7 @@ static int count_irq_waiters(struct drm_i915_private *i915)
>>>   	int i;
>>>
>>>   	for_each_ring(ring, i915, i)
>>> -		count += ring->irq_refcount;
>>> +		count += intel_engine_has_waiter(ring);
>>
>> Don't care how many any longer? Okay in debugfs, but also in error capture?
>
> It was nice that we were able to mark the requests with waiters for
> free. We could add that, but it is no longer a side-effect of this
> patch. Here I was tempted to do an intel_engine_count_breadcrumbs(), but
> that doesn't match how the information is used by RPS so adding it to
> debugfs for RPS is misleading.
>
>>> -	/* Optimistic spin for the next jiffie before touching IRQs */
>>> -	ret = __i915_spin_request(req, state);
>>> -	if (ret == 0)
>>> -		goto out;
>>> +	if (INTEL_INFO(req->i915)->gen >= 6)
>>> +		gen6_rps_boost(req->i915, rps, req->emitted_jiffies);
>>>
>>> -	if (!irq_test_in_progress && WARN_ON(!ring->irq_get(ring))) {
>>> -		ret = -ENODEV;
>>> -		goto out;
>>> +	/* Optimistic spin for the next jiffie before touching IRQs */
>>
>> Exactly the opposite! :)
>
> Bah. It's close, it certainly succinctly captures the intent if not the
> implementation! s/jiffie/~jiffie/!
>
>>> @@ -996,12 +996,11 @@ static void ironlake_rps_change_irq_handler(struct drm_device *dev)
>>>
>>>   static void notify_ring(struct intel_engine_cs *ring)
>>>   {
>>> -	if (!intel_ring_initialized(ring))
>>> +	if (ring->i915 == NULL)
>>
>> Glanced something about this in some other thread, maybe best to
>> keep it consolidated in one patch?
>
> This can die now. It was justifiable when I was using ring->i915
> directly here, but now it is just (mostly) pointless churn.
>
>>> +static bool irq_get(struct intel_engine_cs *engine)
>>> +{
>>> +	if (test_bit(engine->id, &engine->i915->gpu_error.test_irq_rings))
>>> +		return false;
>>
>> I don't understand why knowledge of this debugfs interface needs to
>> be sprinkled around?
>
> It's not though. This is the only use - the interface is to inject
> faults into the waiter that ensure we cannot wake up after an
> interrupt, and so force the hangcheck to intervene. Are you thinking of
> its counterpart missed_irq, which we use to track when we need to force
> the fake interrupt?
>
>> I mean, I am not even 100% sure what the
>> debugfs interface is for. But it looks to be about software masking
>> user interrupts on a ring? Why not just mask the interrupt delivery
>> based on this mask at the source?
>
> It is to simulate a broken wait, as opposed to masking interrupts.

I don't see the difference between a broken wait and masking interrupts?

It is masking interrupts before and after this patch. A broken wait could 
have been accomplished by doing the test from the wait path when 
request_completed is called.

Anyway, I don't get it.

>>> +bool intel_breadcrumbs_add_waiter(struct drm_i915_gem_request *request,
>>> +				  struct intel_breadcrumb *wait)
>>> +{
>>> +	struct intel_breadcrumbs *b = &request->ring->breadcrumbs;
>>> +	struct rb_node **p, *parent;
>>> +	bool first = false;
>>> +
>>> +	wait->task = current;
>>> +	wait->seqno = request->seqno;
>>> +
>>> +	spin_lock(&b->lock);
>>> +
>>> +	/* First? Unmask the user interrupt */
>>> +	if (b->first_waiter == NULL)
>>> +		queue_delayed_work(system_highpri_wq, &b->irq, 0);
>>
>> Who enables interrupts if the worker gets scheduled and completes
>> before setting b->first_waiter below?
>
> In this version, it is serialised by the spinlock. In the next version,
> the ordering is explicit (no more delayed worker).
>
>>> +	/* Insert the request into the retirement ordered list
>>> +	 * of waiters by walking the rbtree. If we are the oldest
>>> +	 * seqno in the tree (the first to be retired), then
>>> +	 * set ourselves as the bottom-half.
>>> +	 */
>>> +	first = true;
>>> +	parent = NULL;
>>> +	p = &b->requests.rb_node;
>>> +	while (*p) {
>>> +		parent = *p;
>>> +		if (i915_seqno_passed(wait->seqno, to_wait(parent)->seqno)) {
>>> +			p = &parent->rb_right;
>>> +			first = false;
>>> +		} else
>>> +			p = &parent->rb_left;
>>> +	}
>>> +	rb_link_node(&wait->node, parent, p);
>>> +	rb_insert_color(&wait->node, &b->requests);
>>
>> WARN_ON(!first && !b->first_waiter) maybe?
>
> We will get a GPF on b->first_waiter soon enough :)
>
>>
>>> +	if (first)
>>> +		b->first_waiter = wait->task;
>
> BUG_ON(b->first_waiter == NULL);
>
> would be clearer I guess.
>
>>> +
>>> +	spin_unlock(&b->lock);
>>> +
>>> +	return first;
>>> +}
>>> +
>>> +void intel_breadcrumbs_remove_waiter(struct drm_i915_gem_request *request,
>>> +				     struct intel_breadcrumb *wait)
>>> +{
>>> +	struct intel_breadcrumbs *b = &request->ring->breadcrumbs;
>>> +
>>> +	spin_lock(&b->lock);
>>> +
>>> +	if (b->first_waiter == wait->task) {
>>> +		struct intel_breadcrumb *next;
>>> +		struct task_struct *task;
>>> +
>>> +		/* We are the current bottom-half. Find the next candidate,
>>> +		 * the first waiter in the queue on the remaining oldest
>>> +		 * request. As multiple seqnos may complete in the time it
>>> +		 * takes us to wake up and find the next waiter, we have to
>>> +		 * wake up that waiter for it to perform its own coherent
>>> +		 * completion check.
>>> +		 */
>>
>> Equally, why wouldn't we wake up all waiters for which the requests
>> have been completed?
>
> Because we no longer track the requests to be completed, having moved to
> a chain of waiting processes instead of a chain of requests. I could
> insert a waitqueue into the intel_breadcrumb and that would indeed
> necessitate locking in the irq handler (and irq locks everywhere :(.

You have a tree of seqnos, each with a wait->task, and the current seqno 
on the engine can be queried. So I don't see where the problem is?

>> Would be a cheap check here and it would save a cascading growing
>> latency as one task wakes up the next one.
>
> Well, it can't be here since we may remove_waiter after a signal
> (incomplete wait). So this part has to walk the chain of processes. Ugh,
> and have to move the waitqueue from one waiter to the next...

Ok on interrupted waiters it makes no sense, but on normal waiter 
removal it would just mean comparing engine->get_seqno() vs the first 
waiter seqno and waking up all until the uncompleted one is found?

>> Even more benefit for multiple waiters on the same request.
>
> Yes. That's what I liked about the previous version with the independent
> waitqueue. However, latency of a single waiter is a compelling argument.
>
> My head is still full of cold, maybe embedding a waitqueue into the
> breadcrumb is an improvement - I am not sure yet.

Leave it for some other day then.

>>> +	/* Rather than have every client wait upon all user interrupts,
>>> +	 * with the herd waking after every interrupt and each doing the
>>> +	 * heavyweight seqno dance, we delegate the task to the first client.
>>> +	 * After the interrupt, we wake up one client, who does the heavyweight
>>> +	 * coherent seqno read and either goes back to sleep (if incomplete),
>>> +	 * or wakes up the next client in the queue and so forth.
>>> +	 *
>>> +	 * With respect to walking the entire list of waiters in a single
>>
>> s/With respect to/In contrast to/ or "Compared to"?
>
> "Compared to", thanks.
>
>>> +	 * dedicated bottom-half, we reduce the latency of the first waiter
>>> +	 * by avoiding a context switch, but incur additional coherent seqno
>>> +	 * reads when following the chain of request breadcrumbs.
>>> +	 */
>>> +	struct intel_breadcrumbs {
>>> +		spinlock_t lock; /* protects the lists of requests */
>>> +		struct rb_root requests; /* sorted by retirement */
>>> +		struct task_struct *first_waiter; /* bh for user interrupts */
>>> +		struct delayed_work irq; /* used to enable or fake interrupts */
>>> +		bool irq_enabled : 1;
>>> +		bool rpm_wakelock : 1;
>>
>> Are bitfields worth it? There aren't that many engines so maybe it
>> loses more in terms of instructions generated.
>
> I'm always optimistic that gcc gets its act together. Not that I don't
> have patches converting from bitfields to flags when gcc hurts. Here,
> the potential extra instruction is going to dwarfed by the irq spinlocks
> and worse along these paths.
>
>>> +static inline void intel_engine_wakeup(struct intel_engine_cs *engine)
>>> +{
>>> +	struct task_struct *task = READ_ONCE(engine->breadcrumbs.first_waiter);
>>> +	if (task)
>>> +		wake_up_process(task);
>>
>> What guarantees task is a valid task at this point?
>
> Not an awful lot! Indirectly, I can point to the fact that the task struct
> cannot be freed whilst we are in an irq (thanks to rcu), but other than
> that it is simply that there is a much longer path between clearing the
> first_waiter and freeing the task_struct than doing a wake_up_process on
> the running task.
>
>> I know it is so extremely unlikely, but it can theoretically sample
>> the first_waiter which then exists and disappears before the
>> wake_up_process call.
>
> I can leave a comment: "I told you so -- Tvrtko" :)

Hm, it possibly needs a synchronize_irq before remove_waiter exits then?

Regards,

Tvrtko

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH v7] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-12-08 14:03             ` Tvrtko Ursulin
@ 2015-12-08 14:33               ` Chris Wilson
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
  0 siblings, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-12-08 14:33 UTC (permalink / raw)
  To: Tvrtko Ursulin; +Cc: intel-gfx

On Tue, Dec 08, 2015 at 02:03:39PM +0000, Tvrtko Ursulin wrote:
> On 08/12/15 10:44, Chris Wilson wrote:
> >On Mon, Dec 07, 2015 at 03:08:28PM +0000, Tvrtko Ursulin wrote:
> >>Equally, why wouldn't we wake up all waiters for which the requests
> >>have been completed?
> >
> >Because we no longer track the requests to be completed, having moved to
> >a chain of waiting processes instead of a chain of requests. I could
> >insert a waitqueue into the intel_breadcrumb and that would indeed
> >necessitate locking in the irq handler (and irq locks everywhere :(.
> 
> You have a tree of seqnos, each with a wait->task, and the current seqno
> on the engine can be queried. So I don't see where the problem is?

The "problem" is that every process will do its own post-irq seqno check
after being woken up, then grab the common spinlock to remove itself
from the tree.

We could avoid that by using RB_NODE_EMPTY shortcircuiting I suppose.
 
> >>Would be a cheap check here and it would save a cascading growing
> >>latency as one task wakes up the next one.
> >
> >Well, it can't be here since we may remove_waiter after a signal
> >(incomplete wait). So this part has to walk the chain of processes. Ugh,
> >and have to move the waitqueue from one waiter to the next...
> 
> Ok on interrupted waiters it makes no sense, but on normal waiter
> removal it would just mean comparing engine->get_seqno() vs the
> first waiter seqno and waking up all until the uncompleted one is
> found?

It's a tradeoff whether it can be written more neatly than:

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index be76086..474636f 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -1279,6 +1279,9 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
                        break;
                }
 
+               if (RB_EMPTY_NODE(&wait.node))
+                       break;
+
                set_task_state(wait.task, state);
 
                /* Before we do the heavier coherent read of the seqno,
@@ -1312,7 +1315,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
                        break;
        }
 out:
-       intel_engine_remove_breadcrumb(req->ring, &wait);
+       intel_engine_remove_breadcrumb(req->ring, &wait, ret);
        __set_task_state(wait.task, TASK_RUNNING);
        trace_i915_gem_request_wait_end(req);
 
diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
index ae3ee3c..421c214 100644
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -169,10 +169,14 @@ void intel_engine_enable_fake_irq(struct intel_engine_cs *engine)
 }
 
 void intel_engine_remove_breadcrumb(struct intel_engine_cs *engine,
-                                   struct intel_breadcrumb *wait)
+                                   struct intel_breadcrumb *wait,
+                                   int ret)
 {
        struct intel_breadcrumbs *b = &engine->breadcrumbs;
 
+       if (RB_EMPTY_NODE(&wait->node))
+               return;
+
        spin_lock(&b->lock);
 
        if (b->first_waiter == wait->task) {
@@ -187,6 +191,18 @@ void intel_engine_remove_breadcrumb(struct intel_engine_cs *engine,
                 * completion check.
                 */
                next = rb_next(&wait->node);
+               if (ret == 0) {
+                       u32 seqno = intel_ring_get_seqno(engine);
+                       while (next &&
+                              i915_seqno_passed(seqno,
+                                                to_crumb(next)->seqno)) {
+                               struct rb_node *n = rb_next(next);
+                               wake_up_process(to_crumb(next)->task);
+                               rb_erase(next, &b->requests);
+                               RB_CLEAR_NODE(next);
+                               next = n;
+                       }
+               }
                task = next ? to_crumb(next)->task : NULL;
 
                b->first_waiter = task;

My biggest complaint is that we are mixing request-complete and direct
evaluation of i915_seqno_passed now. (I expect that we can tidy the code
up somewhat.)
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply related	[flat|nested] 92+ messages in thread

* Re: [PATCH v7] drm/i915: Slaughter the thundering i915_wait_request herd
  2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
                                     ` (12 preceding siblings ...)
  2015-11-23 11:38                   ` [RFC 00/12] Convert requests to use struct fence John Harrison
@ 2015-12-08 14:53                   ` Dave Gordon
  13 siblings, 0 replies; 92+ messages in thread
From: Dave Gordon @ 2015-12-08 14:53 UTC (permalink / raw)
  To: Chris Wilson, Tvrtko Ursulin, intel-gfx, Rogozhkin, Dmitry V,
	Gong, Zhipeng, Harrison, John C

On 08/12/15 14:33, Chris Wilson wrote:
> On Tue, Dec 08, 2015 at 02:03:39PM +0000, Tvrtko Ursulin wrote:
>> On 08/12/15 10:44, Chris Wilson wrote:
>>> On Mon, Dec 07, 2015 at 03:08:28PM +0000, Tvrtko Ursulin wrote:
>>>> Equally, why wouldn't we wake up all waiters for which the requests
>>>> have been completed?
>>>
>>> Because we no longer track the requests to be completed, having moved to
>>> a chain of waiting processes instead of a chain of requests. I could
>>> insert a waitqueue into the intel_breadcrumb and that would indeed
>>> necessitate locking in the irq handler (and irq locks everywhere :(.
>>
>> You have a tree of seqnos, each with a wait->task, and the current seqno
>> on the engine can be queried. So I don't see where the problem is?
>
> The "problem" is that every process will do its own post-irq seqno check
> after being woken up, then grab the common spinlock to remove itself
> from the tree.
>
> We could avoid that by using RB_NODE_EMPTY shortcircuiting I suppose.
>
>>>> Would be a cheap check here and it would save a cascading growing
>>>> latency as one task wakes up the next one.
>>>
>>> Well, it can't be here since we may remove_waiter after a signal
>>> (incomplete wait). So this part has to walk the chain of processes. Ugh,
>>> and have to move the waitqueue from one waiter to the next...
>>
>> Ok on interrupted waiters it makes no sense, but on normal waiter
>> removal it would just mean comparing engine->get_seqno() vs the
>> first waiter seqno and waking up all until the uncompleted one is
>> found?
>
> It's a tradeoff whether it can be written more neatly than:
>
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index be76086..474636f 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1279,6 +1279,9 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>                          break;
>                  }
>
> +               if (RB_EMPTY_NODE(&wait.node))
> +                       break;
> +
>                  set_task_state(wait.task, state);
>
>                  /* Before we do the heavier coherent read of the seqno,
> @@ -1312,7 +1315,7 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>                          break;
>          }
>   out:
> -       intel_engine_remove_breadcrumb(req->ring, &wait);
> +       intel_engine_remove_breadcrumb(req->ring, &wait, ret);
>          __set_task_state(wait.task, TASK_RUNNING);
>          trace_i915_gem_request_wait_end(req);
>
> diff --git a/drivers/gpu/drm/i915/intel_breadcrumbs.c b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> index ae3ee3c..421c214 100644
> --- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
> +++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
> @@ -169,10 +169,14 @@ void intel_engine_enable_fake_irq(struct intel_engine_cs *engine)
>   }
>
>   void intel_engine_remove_breadcrumb(struct intel_engine_cs *engine,
> -                                   struct intel_breadcrumb *wait)
> +                                   struct intel_breadcrumb *wait,
> +                                   int ret)
>   {
>          struct intel_breadcrumbs *b = &engine->breadcrumbs;
>
> +       if (RB_EMPTY_NODE(&wait->node))
> +               return;
> +
>          spin_lock(&b->lock);
>
>          if (b->first_waiter == wait->task) {
> @@ -187,6 +191,18 @@ void intel_engine_remove_breadcrumb(struct intel_engine_cs *engine,
>                   * completion check.
>                   */
>                  next = rb_next(&wait->node);
> +               if (ret == 0) {
> +                       u32 seqno = intel_ring_get_seqno(engine);
> +                       while (next &&
> +                              i915_seqno_passed(seqno,
> +                                                to_crumb(next)->seqno)) {
> +                               struct rb_node *n = rb_next(next);
> +                               wake_up_process(to_crumb(next)->task);
> +                               rb_erase(next, &b->requests);
> +                               RB_CLEAR_NODE(next);
> +                               next = n;
> +                       }
> +               }
>                  task = next ? to_crumb(next)->task : NULL;
>
>                  b->first_waiter = task;
>
> My biggest complaint is that we are mixing request-complete and direct
> evaluation of i915_seqno_passed now. (I expect that we can tidy the code
> up somewhat.)
> -Chris

So how is this going to work in the new "struct fence" regime? (See John 
Harrison's RFC of 23rd November).

The fence structure contains the per-request completion indicator, so 
once a fence has been signalled the owner only needs to check that, and 
doesn't need to reevaluate any seqno comparisons after waking up.
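
As a sketch (assuming only that the request embeds a struct fence, as in
the RFC), the post-wakeup check would collapse to:

	/* Cached completion state; no coherent seqno read required */
	if (fence_is_signaled(&req->fence))
		return 0;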

Does that help?

.Dave.

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [PATCH 3/3] drm/i915: Prevent leaking of -EIO from i915_wait_request()
  2015-12-03  9:14             ` Daniel Vetter
  2015-12-03  9:41               ` Chris Wilson
@ 2015-12-11  9:02               ` Chris Wilson
  2015-12-11 16:46                 ` Daniel Vetter
  1 sibling, 1 reply; 92+ messages in thread
From: Chris Wilson @ 2015-12-11  9:02 UTC (permalink / raw)
  To: Daniel Vetter; +Cc: daniel.vetter, intel-gfx

On Thu, Dec 03, 2015 at 10:14:54AM +0100, Daniel Vetter wrote:
> On Tue, Dec 01, 2015 at 11:05:35AM +0000, Chris Wilson wrote:
> > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > index 4447e73b54db..73c61b94f7fd 100644
> > --- a/drivers/gpu/drm/i915/intel_display.c
> > +++ b/drivers/gpu/drm/i915/intel_display.c
> > @@ -13315,23 +13309,15 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
> >  
> >  			ret = __i915_wait_request(intel_plane_state->wait_req,
> >  						  true, NULL, NULL);
> > -
> > -			/* Swallow -EIO errors to allow updates during hw lockup. */
> > -			if (ret == -EIO)
> > -				ret = 0;
> > -
> > -			if (ret)
> > +			if (ret) {
> > +				mutex_lock(&dev->struct_mutex);
> > +				drm_atomic_helper_cleanup_planes(dev, state);
> > +				mutex_unlock(&dev->struct_mutex);
> >  				break;
> > +			}
> >  		}
> > -
> > -		if (!ret)
> > -			return 0;
> > -
> > -		mutex_lock(&dev->struct_mutex);
> > -		drm_atomic_helper_cleanup_planes(dev, state);
> >  	}
> >  
> > -	mutex_unlock(&dev->struct_mutex);
> 
> Sneaking in lockless waits! Separate patch please.

No, it is just badly written code. The wait is already lockless but the
lock is dropped and retaken around the error paths in such a manner that
you cannot see this at a glance.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 92+ messages in thread

* Re: [RFC 08/12] drm/i915: Interrupt driven fences
  2015-11-23 11:34                   ` [RFC 08/12] drm/i915: Interrupt driven fences John.C.Harrison
@ 2015-12-11 12:17                     ` Tvrtko Ursulin
  0 siblings, 0 replies; 92+ messages in thread
From: Tvrtko Ursulin @ 2015-12-11 12:17 UTC (permalink / raw)
  To: John.C.Harrison, Intel-GFX


Hi,

Some random comments, mostly from the point of view of solving the 
thundering herd problem.

On 23/11/15 11:34, John.C.Harrison@Intel.com wrote:
> From: John Harrison <John.C.Harrison@Intel.com>
>
> The intended usage model for struct fence is that the signalled status
> should be set on demand rather than polled. That is, there should not
> be a need for a 'signaled' function to be called every time the status
> is queried. Instead, 'something' should be done to enable a signal
> callback from the hardware which will update the state directly. In
> the case of requests, this is the seqno update interrupt. The idea is
> that this callback will only be enabled on demand when something
> actually tries to wait on the fence.
>
> This change removes the polling test and replaces it with the callback
> scheme. Each fence is added to a 'please poke me' list at the start of
> i915_add_request(). The interrupt handler then scans through the 'poke
> me' list when a new seqno pops out and signals any matching
> fence/request. The fence is then removed from the list so the entire
> request stack does not need to be scanned every time. Note that the
> fence is added to the list before the commands to generate the seqno
> interrupt are added to the ring. Thus the sequence is guaranteed to be
> race free if the interrupt is already enabled.
>
> Note that the interrupt is only enabled on demand (i.e. when
> __wait_request() is called). Thus there is still a potential race when
> enabling the interrupt as the request may already have completed.
> However, this is simply solved by calling the interrupt processing
> code immediately after enabling the interrupt and thereby checking for
> already completed requests.
>
> Lastly, the ring clean up code has the possibility to cancel
> outstanding requests (e.g. because TDR has reset the ring). These
> requests will never get signalled and so must be removed from the
> signal list manually. This is done by setting a 'cancelled' flag and
> then calling the regular notify/retire code path rather than
> attempting to duplicate the list manipulation and clean up code in
> multiple places. This also avoids any race condition where the
> cancellation request might occur after/during the completion interrupt
> actually arriving.
>
> v2: Updated to take advantage of the request unreference no longer
> requiring the mutex lock.
>
> v3: Move the signal list processing around to prevent unsubmitted
> requests being added to the list. This was occurring on Android
> because the native sync implementation calls the
> fence->enable_signalling API immediately on fence creation.
>
> Updated after review comments by Tvrtko Ursulin. Renamed list nodes to
> 'link' instead of 'list'. Added support for returning an error code on
> a cancelled fence. Update list processing to be more efficient/safer
> with respect to spinlocks.
>
> For: VIZ-5190
> Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
> ---
>   drivers/gpu/drm/i915/i915_drv.h         |  10 ++
>   drivers/gpu/drm/i915/i915_gem.c         | 187 ++++++++++++++++++++++++++++++--
>   drivers/gpu/drm/i915/i915_irq.c         |   2 +
>   drivers/gpu/drm/i915/intel_lrc.c        |   2 +
>   drivers/gpu/drm/i915/intel_ringbuffer.c |   2 +
>   drivers/gpu/drm/i915/intel_ringbuffer.h |   2 +
>   6 files changed, 196 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index fbf591f..d013c6d 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2187,7 +2187,12 @@ void i915_gem_track_fb(struct drm_i915_gem_object *old,
>   struct drm_i915_gem_request {
>   	/** Underlying object for implementing the signal/wait stuff. */
>   	struct fence fence;
> +	struct list_head signal_link;
> +	struct list_head unsignal_link;
>   	struct list_head delayed_free_link;
> +	bool cancelled;
> +	bool irq_enabled;
> +	bool signal_requested;
>
>   	/** On Which ring this request was generated */
>   	struct drm_i915_private *i915;
> @@ -2265,6 +2270,11 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>   			   struct drm_i915_gem_request **req_out);
>   void i915_gem_request_cancel(struct drm_i915_gem_request *req);
>
> +void i915_gem_request_submit(struct drm_i915_gem_request *req);
> +void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req,
> +				       bool fence_locked);
> +void i915_gem_request_notify(struct intel_engine_cs *ring, bool fence_locked);
> +
>   int i915_create_fence_timeline(struct drm_device *dev,
>   			       struct intel_context *ctx,
>   			       struct intel_engine_cs *ring);
> diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> index 171ae5f..2a0b346 100644
> --- a/drivers/gpu/drm/i915/i915_gem.c
> +++ b/drivers/gpu/drm/i915/i915_gem.c
> @@ -1165,6 +1165,8 @@ static int __i915_spin_request(struct drm_i915_gem_request *req)
>
>   	timeout = jiffies + 1;
>   	while (!need_resched()) {
> +		i915_gem_request_notify(req->ring, false);
> +

This looks a bit heavyweight, hammering on the spinlock and 
interrupts on-off; why is it required to call this here instead of 
relying on the interrupts which have already been enabled?

>   		if (i915_gem_request_completed(req))
>   			return 0;
>
> @@ -1173,6 +1175,9 @@ static int __i915_spin_request(struct drm_i915_gem_request *req)
>
>   		cpu_relax_lowlatency();
>   	}
> +
> +	i915_gem_request_notify(req->ring, false);
> +
>   	if (i915_gem_request_completed(req))
>   		return 0;
>
> @@ -1214,9 +1219,14 @@ int __i915_wait_request(struct drm_i915_gem_request *req,
>
>   	WARN(!intel_irqs_enabled(dev_priv), "IRQs disabled");
>
> -	if (list_empty(&req->list))
> +	if (i915_gem_request_completed(req))
>   		return 0;
>
> +	/*
> +	 * Enable interrupt completion of the request.
> +	 */
> +	fence_enable_sw_signaling(&req->fence);
> +

There is duplicated user interrupt handling in this function now. Code 
which does irq get/put directly is still there, but I think it should be 
removed.

>   	if (i915_gem_request_completed(req))
>   		return 0;
>
> @@ -1377,6 +1387,19 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
>   	list_del_init(&request->list);
>   	i915_gem_request_remove_from_client(request);
>
> +	/* In case the request is still in the signal pending list */

When would this happen? (comment hint)

> +	if (!list_empty(&request->signal_link)) {
> +		/*
> +		 * The request must be marked as cancelled and the underlying
> +		 * fence as both failed. NB: There is no explicit fence fail
> +		 * API, there is only a manual poke and signal.
> +		 */
> +		request->cancelled = true;
> +		/* How to propagate to any associated sync_fence??? */
> +		request->fence.status = -EIO;
> +		fence_signal_locked(&request->fence);

Does the cancelled request have to stay on the signal list? If it could 
be moved outside of the irq hot list it might be beneficial for list 
length and simplifying the code in i915_gem_request_notify.

> +	}
> +
>   	i915_gem_request_unreference(request);
>   }
>
> @@ -2535,6 +2558,12 @@ void __i915_add_request(struct drm_i915_gem_request *request,
>   	 */
>   	request->postfix = intel_ring_get_tail(ringbuf);
>
> +	/*
> +	 * Add the fence to the pending list before emitting the commands to
> +	 * generate a seqno notification interrupt.
> +	 */
> +	i915_gem_request_submit(request);
> +
>   	if (i915.enable_execlists)
>   		ret = ring->emit_request(request);
>   	else {
> @@ -2654,25 +2683,135 @@ static void i915_gem_request_free(struct drm_i915_gem_request *req)
>   		i915_gem_context_unreference(ctx);
>   	}
>
> +	if (req->irq_enabled)
> +		req->ring->irq_put(req->ring);
> +
>   	kmem_cache_free(req->i915->requests, req);
>   }
>
> -static bool i915_gem_request_enable_signaling(struct fence *req_fence)
> +/*
> + * The request is about to be submitted to the hardware so add the fence to
> + * the list of signalable fences.
> + *
> + * NB: This does not necessarily enable interrupts yet. That only occurs on
> + * demand when the request is actually waited on. However, adding it to the
> + * list early ensures that there is no race condition where the interrupt
> + * could pop out prematurely and thus be completely lost. The race is merely
> + * that the interrupt must be manually checked for after being enabled.
> + */
> +void i915_gem_request_submit(struct drm_i915_gem_request *req)
>   {
> -	/* Interrupt driven fences are not implemented yet.*/
> -	WARN(true, "This should not be called!");
> -	return true;
> +	unsigned long flags;
> +
> +	/*
> +	 * Always enable signal processing for the request's fence object
> +	 * before that request is submitted to the hardware. Thus there is no
> +	 * race condition whereby the interrupt could pop out before the
> +	 * request has been added to the signal list. Hence no need to check
> +	 * for completion, undo the list add and return false.
> +	 */
> +	i915_gem_request_reference(req);
> +	spin_lock_irqsave(&req->ring->fence_lock, flags);

This only gets called from non-irq context so could use spin_lock_irq.

> +	WARN_ON(!list_empty(&req->signal_link));
> +	list_add_tail(&req->signal_link, &req->ring->fence_signal_list);
> +	spin_unlock_irqrestore(&req->ring->fence_lock, flags);
> +
> +	/*
> +	 * NB: Interrupts are only enabled on demand. Thus there is still a
> +	 * race where the request could complete before the interrupt has
> +	 * been enabled. Thus care must be taken at that point.
> +	 */
> +
> +	 /* Have interrupts already been requested? */
> +	 if (req->signal_requested)

Some code path can enable signaling before the execbuf has completed?

> +		i915_gem_request_enable_interrupt(req, false);
> +}
> +
> +/*
> + * The request is being actively waited on, so enable interrupt based
> + * completion signalling.
> + */
> +void i915_gem_request_enable_interrupt(struct drm_i915_gem_request *req,
> +				       bool fence_locked)
> +{
> +	if (req->irq_enabled)
> +		return;
> +
> +	WARN_ON(!req->ring->irq_get(req->ring));
> +	req->irq_enabled = true;

Probably shouldn't set the flag if irq_get failed, so as not to unbalance 
the irq refcount.

> +
> +	/*
> +	 * Because the interrupt is only enabled on demand, there is a race
> +	 * where the interrupt can fire before anyone is looking for it. So
> +	 * do an explicit check for missed interrupts.
> +	 */
> +	i915_gem_request_notify(req->ring, fence_locked);
>   }
>
> -static bool i915_gem_request_is_completed(struct fence *req_fence)
> +static bool i915_gem_request_enable_signaling(struct fence *req_fence)
>   {
>   	struct drm_i915_gem_request *req = container_of(req_fence,
>   						 typeof(*req), fence);
> +
> +	/*
> +	 * No need to actually enable interrupt based processing until the
> +	 * request has been submitted to the hardware. At which point
> +	 * 'i915_gem_request_submit()' is called. So only really enable
> +	 * signalling in there. Just set a flag to say that interrupts are
> +	 * wanted when the request is eventually submitted. On the other hand
> +	 * if the request has already been submitted then interrupts do need
> +	 * to be enabled now.
> +	 */
> +
> +	req->signal_requested = true;
> +
> +	if (!list_empty(&req->signal_link))
> +		i915_gem_request_enable_interrupt(req, true);
> +
> +	return true;
> +}
> +
> +void i915_gem_request_notify(struct intel_engine_cs *ring, bool fence_locked)

I think the name could be improved since this made me expect a request in 
the parameter list.

Maybe i915_gem_notify_requests ?

> +{
> +	struct drm_i915_gem_request *req, *req_next;
> +	unsigned long flags;
>   	u32 seqno;
>
> -	seqno = req->ring->get_seqno(req->ring, false/*lazy_coherency*/);
> +	if (list_empty(&ring->fence_signal_list))
> +		return;
> +
> +	if (!fence_locked)
> +		spin_lock_irqsave(&ring->fence_lock, flags);
> +
> +	seqno = ring->get_seqno(ring, false);

Could read the seqno outside the spinlock, although not that important 
since the primary caller of this is (or should be) the irq handler.

> +
> +	list_for_each_entry_safe(req, req_next, &ring->fence_signal_list, signal_link) {
> +		if (!req->cancelled) {
> +			if (!i915_seqno_passed(seqno, req->seqno))
> +				break;
> +		}
> +
> +		/*
> +		 * Start by removing the fence from the signal list otherwise
> +		 * the retire code can run concurrently and get confused.
> +		 */
> +		list_del_init(&req->signal_link);
> +
> +		if (!req->cancelled) {
> +			fence_signal_locked(&req->fence);
> +		}
> +
> +		if (req->irq_enabled) {
> +			req->ring->irq_put(req->ring);
> +			req->irq_enabled = false;
> +		}
> +
> +		/* Can't unreference here because that might grab fence_lock */
> +		list_add_tail(&req->unsignal_link, &ring->fence_unsignal_list);
> +	}
>
> -	return i915_seqno_passed(seqno, req->seqno);
> +	if (!fence_locked)
> +		spin_unlock_irqrestore(&ring->fence_lock, flags);
>   }
>
>   static const char *i915_gem_request_get_driver_name(struct fence *req_fence)
> @@ -2712,7 +2851,6 @@ static void i915_gem_request_fence_value_str(struct fence *req_fence, char *str,
>
>   static const struct fence_ops i915_gem_request_fops = {
>   	.enable_signaling	= i915_gem_request_enable_signaling,
> -	.signaled		= i915_gem_request_is_completed,
>   	.wait			= fence_default_wait,
>   	.release		= i915_gem_request_release,
>   	.get_driver_name	= i915_gem_request_get_driver_name,
> @@ -2795,6 +2933,7 @@ int i915_gem_request_alloc(struct intel_engine_cs *ring,
>   		goto err;
>   	}
>
> +	INIT_LIST_HEAD(&req->signal_link);
>   	fence_init(&req->fence, &i915_gem_request_fops, &ring->fence_lock,
>   		   ctx->engine[ring->id].fence_timeline.fence_context,
>   		   i915_fence_timeline_get_next_seqno(&ctx->engine[ring->id].fence_timeline));
> @@ -2832,6 +2971,11 @@ void i915_gem_request_cancel(struct drm_i915_gem_request *req)
>   {
>   	intel_ring_reserved_space_cancel(req->ringbuf);
>
> +	req->cancelled = true;
> +	/* How to propagate to any associated sync_fence??? */
> +	req->fence.status = -EINVAL;
> +	fence_signal_locked(&req->fence);

Same question from before, could you just move it to unsignaled list 
straight away? (And drop interrupts?)

> +
>   	i915_gem_request_unreference(req);
>   }
>
> @@ -2925,6 +3069,13 @@ static void i915_gem_reset_ring_cleanup(struct drm_i915_private *dev_priv,
>   		i915_gem_request_retire(request);
>   	}
>
> +	/*
> +	 * Tidy up anything left over. This includes a call to
> +	 * i915_gem_request_notify() which will make sure that any requests
> +	 * that were on the signal pending list get also cleaned up.
> +	 */
> +	i915_gem_retire_requests_ring(ring);
> +
>   	/* Having flushed all requests from all queues, we know that all
>   	 * ringbuffers must now be empty. However, since we do not reclaim
>   	 * all space when retiring the request (to prevent HEADs colliding
> @@ -2974,6 +3125,13 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
>
>   	WARN_ON(i915_verify_lists(ring->dev));
>
> +	/*
> +	 * If no-one has waited on a request recently then interrupts will
> +	 * not have been enabled and thus no requests will ever be marked as
> +	 * completed. So do an interrupt check now.
> +	 */
> +	i915_gem_request_notify(ring, false);
> +
>   	/* Retire requests first as we use it above for the early return.
>   	 * If we retire requests last, we may use a later seqno and so clear
>   	 * the requests lists without clearing the active list, leading to
> @@ -3015,6 +3173,15 @@ i915_gem_retire_requests_ring(struct intel_engine_cs *ring)
>   		i915_gem_request_assign(&ring->trace_irq_req, NULL);
>   	}
>
> +	/* Tidy up any requests that were recently signalled */
> +	spin_lock_irqsave(&ring->fence_lock, flags);

Could also use spin_lock_irq here I think.
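
I.e., assuming this path never runs with interrupts already disabled:

	spin_lock_irq(&ring->fence_lock);
	list_splice_init(&ring->fence_unsignal_list, &list_head);
	spin_unlock_irq(&ring->fence_lock);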

> +	list_splice_init(&ring->fence_unsignal_list, &list_head);
> +	spin_unlock_irqrestore(&ring->fence_lock, flags);
> +	list_for_each_entry_safe(req, req_next, &list_head, unsignal_link) {
> +		list_del(&req->unsignal_link);
> +		i915_gem_request_unreference(req);
> +	}
> +
>   	/* Really free any requests that were recently unreferenced */
>   	spin_lock_irqsave(&ring->delayed_free_lock, flags);
>   	list_splice_init(&ring->delayed_free_list, &list_head);
> @@ -5066,6 +5233,8 @@ init_ring_lists(struct intel_engine_cs *ring)
>   {
>   	INIT_LIST_HEAD(&ring->active_list);
>   	INIT_LIST_HEAD(&ring->request_list);
> +	INIT_LIST_HEAD(&ring->fence_signal_list);
> +	INIT_LIST_HEAD(&ring->fence_unsignal_list);
>   	INIT_LIST_HEAD(&ring->delayed_free_list);
>   }
>
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 68b094b..74f8552 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -981,6 +981,8 @@ static void notify_ring(struct intel_engine_cs *ring)
>
>   	trace_i915_gem_request_notify(ring);
>
> +	i915_gem_request_notify(ring, false);
> +
>   	wake_up_all(&ring->irq_queue);

This is the big one - I think that to solve the thundering herd
problem this patch should also remove ring->irq_queue, since I don't
think it needs to exist once this is in.

For example, adding a req->wait_queue on which __i915_wait_request 
would wait instead, with a wake_up_all on it from 
i915_gem_request_notify (rough sketch below)?

Because in its current form, when I ran this patch series it caused 
the number of context switches to go up more than six-fold on one 
workload. Interrupts also went up, but only by 50% in comparison, and 
time spent in irq handlers doubled.

But the context switches and the 10% more CPU time look the most 
dramatic.

So it would be interesting to see whether simply making the wait 
queue per-request could fix the majority of that.
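
Very rough sketch of what I mean (hypothetical, not in the posted
series; req->wait_queue is a new field):

	/* in struct drm_i915_gem_request: */
	wait_queue_head_t wait_queue;

	/* in i915_gem_request_alloc(): */
	init_waitqueue_head(&req->wait_queue);

	/* in i915_gem_request_notify(), after fence_signal_locked(): */
	wake_up_all(&req->wait_queue);

	/* in __i915_wait_request(), instead of sleeping on ring->irq_queue: */
	ret = wait_event_interruptible(req->wait_queue,
				       fence_is_signaled(&req->fence));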

>   }
>
> diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
> index 06a398a..76fc245 100644
> --- a/drivers/gpu/drm/i915/intel_lrc.c
> +++ b/drivers/gpu/drm/i915/intel_lrc.c
> @@ -1920,6 +1920,8 @@ static int logical_ring_init(struct drm_device *dev, struct intel_engine_cs *rin
>   	ring->dev = dev;
>   	INIT_LIST_HEAD(&ring->active_list);
>   	INIT_LIST_HEAD(&ring->request_list);
> +	INIT_LIST_HEAD(&ring->fence_signal_list);
> +	INIT_LIST_HEAD(&ring->fence_unsignal_list);
>   	INIT_LIST_HEAD(&ring->delayed_free_list);
>   	spin_lock_init(&ring->fence_lock);
>   	spin_lock_init(&ring->delayed_free_lock);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index e5573e7..1dec252 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -2158,6 +2158,8 @@ static int intel_init_ring_buffer(struct drm_device *dev,
>   	INIT_LIST_HEAD(&ring->request_list);
>   	INIT_LIST_HEAD(&ring->execlist_queue);
>   	INIT_LIST_HEAD(&ring->buffers);
> +	INIT_LIST_HEAD(&ring->fence_signal_list);
> +	INIT_LIST_HEAD(&ring->fence_unsignal_list);
>   	INIT_LIST_HEAD(&ring->delayed_free_list);
>   	spin_lock_init(&ring->fence_lock);
>   	spin_lock_init(&ring->delayed_free_lock);
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> index 77384ed..9d09edb 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> @@ -354,6 +354,8 @@ struct  intel_engine_cs {
>   	u32 (*get_cmd_length_mask)(u32 cmd_header);
>
>   	spinlock_t fence_lock;
> +	struct list_head fence_signal_list;
> +	struct list_head fence_unsignal_list;
>   };
>
>   bool intel_ring_initialized(struct intel_engine_cs *ring);
>

Regards,

Tvrtko

* Re: [PATCH 3/3] drm/i915: Prevent leaking of -EIO from i915_wait_request()
  2015-12-11  9:02               ` Chris Wilson
@ 2015-12-11 16:46                 ` Daniel Vetter
  0 siblings, 0 replies; 92+ messages in thread
From: Daniel Vetter @ 2015-12-11 16:46 UTC (permalink / raw)
  To: Chris Wilson, Daniel Vetter, intel-gfx, daniel.vetter

On Fri, Dec 11, 2015 at 09:02:18AM +0000, Chris Wilson wrote:
> On Thu, Dec 03, 2015 at 10:14:54AM +0100, Daniel Vetter wrote:
> > On Tue, Dec 01, 2015 at 11:05:35AM +0000, Chris Wilson wrote:
> > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
> > > index 4447e73b54db..73c61b94f7fd 100644
> > > --- a/drivers/gpu/drm/i915/intel_display.c
> > > +++ b/drivers/gpu/drm/i915/intel_display.c
> > > @@ -13315,23 +13309,15 @@ static int intel_atomic_prepare_commit(struct drm_device *dev,
> > >  
> > >  			ret = __i915_wait_request(intel_plane_state->wait_req,
> > >  						  true, NULL, NULL);
> > > -
> > > -			/* Swallow -EIO errors to allow updates during hw lockup. */
> > > -			if (ret == -EIO)
> > > -				ret = 0;
> > > -
> > > -			if (ret)
> > > +			if (ret) {
> > > +				mutex_lock(&dev->struct_mutex);
> > > +				drm_atomic_helper_cleanup_planes(dev, state);
> > > +				mutex_unlock(&dev->struct_mutex);
> > >  				break;
> > > +			}
> > >  		}
> > > -
> > > -		if (!ret)
> > > -			return 0;
> > > -
> > > -		mutex_lock(&dev->struct_mutex);
> > > -		drm_atomic_helper_cleanup_planes(dev, state);
> > >  	}
> > >  
> > > -	mutex_unlock(&dev->struct_mutex);
> > 
> > Sneaking in lockless waits! Separate patch please.
> 
> No, it is just badly written code. The wait is already lockless but the
> lock is dropped and retaken around the error paths in such a manner that
> you cannot see this from a glimpse.

Indeed, the lack of diff context got me all confused; I stand
corrected. Looks good.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

Thread overview: 92+ messages
2015-11-29  8:47 i915_wait_request scaling Chris Wilson
2015-11-29  8:47 ` [PATCH 01/15] drm/i915: Break busywaiting for requests on pending signals Chris Wilson
2015-11-30 10:01   ` Tvrtko Ursulin
2015-11-29  8:48 ` [PATCH 02/15] drm/i915: Limit the busy wait on requests to 10us not 10ms! Chris Wilson
2015-11-30 10:02   ` Tvrtko Ursulin
2015-11-30 10:08     ` Chris Wilson
2015-11-29  8:48 ` [PATCH 03/15] drm/i915: Only spin whilst waiting on the current request Chris Wilson
2015-11-30 10:06   ` Tvrtko Ursulin
2015-12-01 15:47     ` Dave Gordon
2015-12-01 15:58       ` Chris Wilson
2015-12-01 16:44         ` Dave Gordon
2015-12-03  8:52           ` Daniel Vetter
2015-11-29  8:48 ` [PATCH 04/15] drm/i915: Cache the reset_counter for the request Chris Wilson
2015-12-01  8:31   ` Daniel Vetter
2015-12-01  8:47     ` Chris Wilson
2015-12-01  9:15       ` Chris Wilson
2015-12-01 11:05         ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Chris Wilson
2015-12-01 11:05           ` [PATCH 2/3] drm/i915: Store the reset counter when constructing a request Chris Wilson
2015-12-03  8:59             ` Daniel Vetter
2015-12-01 11:05           ` [PATCH 3/3] drm/i915: Prevent leaking of -EIO from i915_wait_request() Chris Wilson
2015-12-03  9:14             ` Daniel Vetter
2015-12-03  9:41               ` Chris Wilson
2015-12-11  9:02               ` Chris Wilson
2015-12-11 16:46                 ` Daniel Vetter
2015-12-03  8:57           ` [PATCH 1/3] drm/i915: Hide the atomic_read(reset_counter) behind a helper Daniel Vetter
2015-12-03  9:02             ` Chris Wilson
2015-12-03  9:20               ` Daniel Vetter
2015-11-29  8:48 ` [PATCH 05/15] drm/i915: Suppress error message when GPU resets are disabled Chris Wilson
2015-12-01  8:30   ` Daniel Vetter
2015-11-29  8:48 ` [PATCH 06/15] drm/i915: Delay queuing hangcheck to wait-request Chris Wilson
2015-11-29  8:48 ` [PATCH 07/15] drm/i915: Check the timeout passed to i915_wait_request Chris Wilson
2015-11-30 10:14   ` Tvrtko Ursulin
2015-11-30 10:19     ` Chris Wilson
2015-11-30 10:27       ` Tvrtko Ursulin
2015-11-30 10:22   ` Chris Wilson
2015-11-30 10:28     ` Tvrtko Ursulin
2015-11-29  8:48 ` [PATCH 08/15] drm/i915: Slaughter the thundering i915_wait_request herd Chris Wilson
2015-11-30 10:53   ` Chris Wilson
2015-11-30 12:09     ` Tvrtko Ursulin
2015-11-30 12:38       ` Chris Wilson
2015-11-30 13:33         ` Tvrtko Ursulin
2015-11-30 14:30           ` Chris Wilson
2015-11-30 12:05   ` Tvrtko Ursulin
2015-11-30 12:30     ` Chris Wilson
2015-11-30 13:32       ` Tvrtko Ursulin
2015-11-30 14:18         ` Chris Wilson
2015-12-01 17:06           ` Dave Gordon
2015-11-30 14:26         ` Chris Wilson
2015-11-30 14:34   ` [PATCH v4] " Chris Wilson
2015-11-30 16:30     ` Chris Wilson
2015-11-30 16:40       ` Chris Wilson
2015-12-01 18:34     ` Dave Gordon
2015-12-03 16:22       ` [PATCH v7] " Chris Wilson
2015-12-07 15:08         ` Tvrtko Ursulin
2015-12-08 10:44           ` Chris Wilson
2015-12-08 14:03             ` Tvrtko Ursulin
2015-12-08 14:33               ` Chris Wilson
2015-11-23 11:34                 ` [RFC 00/12] Convert requests to use struct fence John.C.Harrison
2015-11-23 11:34                   ` [RFC 01/12] staging/android/sync: Support sync points created from dma-fences John.C.Harrison
2015-11-23 13:29                     ` Maarten Lankhorst
2015-11-23 13:31                     ` [Intel-gfx] " Tvrtko Ursulin
2015-11-23 11:34                   ` [RFC 02/12] staging/android/sync: add sync_fence_create_dma John.C.Harrison
2015-11-23 13:27                     ` Maarten Lankhorst
2015-11-23 13:38                       ` John Harrison
2015-11-23 13:44                         ` Tvrtko Ursulin
2015-11-23 13:48                           ` Maarten Lankhorst
2015-11-23 11:34                   ` [RFC 03/12] staging/android/sync: Move sync framework out of staging John.C.Harrison
2015-11-23 11:34                   ` [RFC 04/12] drm/i915: Convert requests to use struct fence John.C.Harrison
2015-11-23 11:34                   ` [RFC 05/12] drm/i915: Removed now redudant parameter to i915_gem_request_completed() John.C.Harrison
2015-11-23 11:34                   ` [RFC 06/12] drm/i915: Add per context timelines to fence object John.C.Harrison
2015-11-23 11:34                   ` [RFC 07/12] drm/i915: Delay the freeing of requests until retire time John.C.Harrison
2015-11-23 11:34                   ` [RFC 08/12] drm/i915: Interrupt driven fences John.C.Harrison
2015-12-11 12:17                     ` Tvrtko Ursulin
2015-11-23 11:34                   ` [RFC 09/12] drm/i915: Updated request structure tracing John.C.Harrison
2015-11-23 11:34                   ` [RFC 10/12] android/sync: Fix reversed sense of signaled fence John.C.Harrison
2015-11-23 11:34                   ` [RFC 11/12] drm/i915: Add sync framework support to execbuff IOCTL John.C.Harrison
2015-11-23 11:34                   ` [RFC 12/12] drm/i915: Cache last IRQ seqno to reduce IRQ overhead John.C.Harrison
2015-11-23 11:38                   ` [RFC 00/12] Convert requests to use struct fence John Harrison
2015-12-08 14:53                   ` [PATCH v7] drm/i915: Slaughter the thundering i915_wait_request herd Dave Gordon
2015-11-30 15:45   ` [PATCH] drm/i915: Convert trace-irq to the breadcrumb waiter Chris Wilson
2015-11-29  8:48 ` [PATCH 09/15] drm/i915: Separate out the seqno-barrier from engine->get_seqno Chris Wilson
2015-11-29  8:48 ` [PATCH 10/15] drm/i915: Remove the lazy_coherency parameter from request-completed? Chris Wilson
2015-11-29  8:48 ` [PATCH 11/15] drm/i915: Use HWS for seqno tracking everywhere Chris Wilson
2015-11-29  8:48 ` [PATCH 12/15] drm/i915: Reduce seqno/irq barrier to a clflush on legacy gen6+ Chris Wilson
2015-11-29  8:48 ` [PATCH 13/15] drm/i915: Stop setting wraparound seqno on initialisation Chris Wilson
2015-12-01 16:57   ` Dave Gordon
2015-12-04  9:36     ` Daniel Vetter
2015-12-04  9:51       ` Chris Wilson
2015-11-29  8:48 ` [PATCH 14/15] drm/i915: Only query timestamp when measuring elapsed time Chris Wilson
2015-11-30 10:19   ` Tvrtko Ursulin
2015-11-30 14:31     ` Chris Wilson
2015-11-29  8:48 ` [PATCH 15/15] drm/i915: On GPU reset, set the HWS breadcrumb to the last seqno Chris Wilson
