All of lore.kernel.org
 help / color / mirror / Atom feed
From: <gregkh@linuxfoundation.org>
To: airlied@linux.ie, chris@chris-wilson.co.uk, daniel@ffwll.ch,
	dri-devel@lists.freedesktop.org, greg@kroah.com,
	gregkh@linuxfoundation.org, intel-gfx@lists.freedesktop.org,
	jani.nikula@linux.intel.com, joonas.lahtinen@linux.intel.com,
	rodrigo.vivi@intel.com, sultan@kerneltoast.com
Cc: stable-commits@vger.kernel.org
Subject: Patch "drm/i915: Fix ref->mutex deadlock in i915_active_wait()" has been added to the 5.4-stable tree
Date: Fri, 10 Apr 2020 13:46:53 +0200	[thread overview]
Message-ID: <1586519213118220@kroah.com> (raw)
In-Reply-To: <20200407071809.3148-1-sultan@kerneltoast.com>


This is a note to let you know that I've just added the patch titled

    drm/i915: Fix ref->mutex deadlock in i915_active_wait()

to the 5.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     drm-i915-fix-ref-mutex-deadlock-in-i915_active_wait.patch
and it can be found in the queue-5.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From sultan@kerneltoast.com  Fri Apr 10 11:07:34 2020
From: Sultan Alsawaf <sultan@kerneltoast.com>
Date: Tue,  7 Apr 2020 00:18:09 -0700
Subject: drm/i915: Fix ref->mutex deadlock in i915_active_wait()
To: Greg KH <greg@kroah.com>
Cc: stable@vger.kernel.org, Jani Nikula <jani.nikula@linux.intel.com>, Joonas Lahtinen <joonas.lahtinen@linux.intel.com>, Rodrigo Vivi <rodrigo.vivi@intel.com>, David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>, Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, Sultan Alsawaf <sultan@kerneltoast.com>
Message-ID: <20200407071809.3148-1-sultan@kerneltoast.com>

From: Sultan Alsawaf <sultan@kerneltoast.com>

The following deadlock exists in i915_active_wait() due to a double lock
on ref->mutex (call chain listed in order from top to bottom):
 i915_active_wait();
 mutex_lock_interruptible(&ref->mutex); <-- ref->mutex first acquired
 i915_active_request_retire();
 node_retire();
 active_retire();
 mutex_lock_nested(&ref->mutex, SINGLE_DEPTH_NESTING); <-- DEADLOCK

Fix the deadlock by skipping the second ref->mutex lock when
active_retire() is called through i915_active_request_retire().

Note that this bug only affects 5.4 and has since been fixed in 5.5.
Normally, a backport of the fix from 5.5 would be in order, but the
patch set that fixes this deadlock involves massive changes that are
neither feasible nor desirable for backporting [1][2][3]. Therefore,
this small patch was made to address the deadlock specifically for 5.4.

[1] 274cbf20fd10 ("drm/i915: Push the i915_active.retire into a worker")
[2] 093b92287363 ("drm/i915: Split i915_active.mutex into an irq-safe spinlock for the rbtree")
[3] 750bde2fd4ff ("drm/i915: Serialise with remote retirement")

Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback")
Cc: <stable@vger.kernel.org> # 5.4.x
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/i915/i915_active.c |   27 +++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_active.h |    4 ++--
 2 files changed, 25 insertions(+), 6 deletions(-)

--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -120,13 +120,17 @@ static inline void debug_active_assert(s
 
 #endif
 
+#define I915_ACTIVE_RETIRE_NOLOCK BIT(0)
+
 static void
 __active_retire(struct i915_active *ref)
 {
 	struct active_node *it, *n;
 	struct rb_root root;
 	bool retire = false;
+	unsigned long bits;
 
+	ref = ptr_unpack_bits(ref, &bits, 2);
 	lockdep_assert_held(&ref->mutex);
 
 	/* return the unused nodes to our slabcache -- flushing the allocator */
@@ -138,7 +142,8 @@ __active_retire(struct i915_active *ref)
 		retire = true;
 	}
 
-	mutex_unlock(&ref->mutex);
+	if (!(bits & I915_ACTIVE_RETIRE_NOLOCK))
+		mutex_unlock(&ref->mutex);
 	if (!retire)
 		return;
 
@@ -155,13 +160,18 @@ __active_retire(struct i915_active *ref)
 static void
 active_retire(struct i915_active *ref)
 {
+	struct i915_active *ref_packed = ref;
+	unsigned long bits;
+
+	ref = ptr_unpack_bits(ref, &bits, 2);
 	GEM_BUG_ON(!atomic_read(&ref->count));
 	if (atomic_add_unless(&ref->count, -1, 1))
 		return;
 
 	/* One active may be flushed from inside the acquire of another */
-	mutex_lock_nested(&ref->mutex, SINGLE_DEPTH_NESTING);
-	__active_retire(ref);
+	if (!(bits & I915_ACTIVE_RETIRE_NOLOCK))
+		mutex_lock_nested(&ref->mutex, SINGLE_DEPTH_NESTING);
+	__active_retire(ref_packed);
 }
 
 static void
@@ -170,6 +180,14 @@ node_retire(struct i915_active_request *
 	active_retire(node_from_active(base)->ref);
 }
 
+static void
+node_retire_nolock(struct i915_active_request *base, struct i915_request *rq)
+{
+	struct i915_active *ref = node_from_active(base)->ref;
+
+	active_retire(ptr_pack_bits(ref, I915_ACTIVE_RETIRE_NOLOCK, 2));
+}
+
 static struct i915_active_request *
 active_instance(struct i915_active *ref, struct intel_timeline *tl)
 {
@@ -421,7 +439,8 @@ int i915_active_wait(struct i915_active
 			break;
 		}
 
-		err = i915_active_request_retire(&it->base, BKL(ref));
+		err = i915_active_request_retire(&it->base, BKL(ref),
+						 node_retire_nolock);
 		if (err)
 			break;
 	}
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -309,7 +309,7 @@ i915_active_request_isset(const struct i
  */
 static inline int __must_check
 i915_active_request_retire(struct i915_active_request *active,
-			   struct mutex *mutex)
+			   struct mutex *mutex, i915_active_retire_fn retire)
 {
 	struct i915_request *request;
 	long ret;
@@ -327,7 +327,7 @@ i915_active_request_retire(struct i915_a
 	list_del_init(&active->link);
 	RCU_INIT_POINTER(active->request, NULL);
 
-	active->retire(active, request);
+	retire(active, request);
 
 	return 0;
 }


Patches currently in stable-queue which might be from sultan@kerneltoast.com are

queue-5.4/drm-i915-fix-ref-mutex-deadlock-in-i915_active_wait.patch
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

WARNING: multiple messages have this Message-ID (diff)
From: <gregkh@linuxfoundation.org>
To: airlied@linux.ie, chris@chris-wilson.co.uk, daniel@ffwll.ch,
	dri-devel@lists.freedesktop.org, greg@kroah.com,
	gregkh@linuxfoundation.org, intel-gfx@lists.freedesktop.org,
	jani.nikula@linux.intel.com, joonas.lahtinen@linux.intel.com,
	rodrigo.vivi@intel.com, sultan@kerneltoast.com
Cc: stable-commits@vger.kernel.org
Subject: [Intel-gfx] Patch "drm/i915: Fix ref->mutex deadlock in i915_active_wait()" has been added to the 5.4-stable tree
Date: Fri, 10 Apr 2020 13:46:53 +0200	[thread overview]
Message-ID: <1586519213118220@kroah.com> (raw)
In-Reply-To: <20200407071809.3148-1-sultan@kerneltoast.com>


This is a note to let you know that I've just added the patch titled

    drm/i915: Fix ref->mutex deadlock in i915_active_wait()

to the 5.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     drm-i915-fix-ref-mutex-deadlock-in-i915_active_wait.patch
and it can be found in the queue-5.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@vger.kernel.org> know about it.


From sultan@kerneltoast.com  Fri Apr 10 11:07:34 2020
From: Sultan Alsawaf <sultan@kerneltoast.com>
Date: Tue,  7 Apr 2020 00:18:09 -0700
Subject: drm/i915: Fix ref->mutex deadlock in i915_active_wait()
To: Greg KH <greg@kroah.com>
Cc: stable@vger.kernel.org, Jani Nikula <jani.nikula@linux.intel.com>, Joonas Lahtinen <joonas.lahtinen@linux.intel.com>, Rodrigo Vivi <rodrigo.vivi@intel.com>, David Airlie <airlied@linux.ie>, Daniel Vetter <daniel@ffwll.ch>, Chris Wilson <chris@chris-wilson.co.uk>, intel-gfx@lists.freedesktop.org, dri-devel@lists.freedesktop.org, Sultan Alsawaf <sultan@kerneltoast.com>
Message-ID: <20200407071809.3148-1-sultan@kerneltoast.com>

From: Sultan Alsawaf <sultan@kerneltoast.com>

The following deadlock exists in i915_active_wait() due to a double lock
on ref->mutex (call chain listed in order from top to bottom):
 i915_active_wait();
 mutex_lock_interruptible(&ref->mutex); <-- ref->mutex first acquired
 i915_active_request_retire();
 node_retire();
 active_retire();
 mutex_lock_nested(&ref->mutex, SINGLE_DEPTH_NESTING); <-- DEADLOCK

Fix the deadlock by skipping the second ref->mutex lock when
active_retire() is called through i915_active_request_retire().

Note that this bug only affects 5.4 and has since been fixed in 5.5.
Normally, a backport of the fix from 5.5 would be in order, but the
patch set that fixes this deadlock involves massive changes that are
neither feasible nor desirable for backporting [1][2][3]. Therefore,
this small patch was made to address the deadlock specifically for 5.4.

[1] 274cbf20fd10 ("drm/i915: Push the i915_active.retire into a worker")
[2] 093b92287363 ("drm/i915: Split i915_active.mutex into an irq-safe spinlock for the rbtree")
[3] 750bde2fd4ff ("drm/i915: Serialise with remote retirement")

Fixes: 12c255b5dad1 ("drm/i915: Provide an i915_active.acquire callback")
Cc: <stable@vger.kernel.org> # 5.4.x
Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/gpu/drm/i915/i915_active.c |   27 +++++++++++++++++++++++----
 drivers/gpu/drm/i915/i915_active.h |    4 ++--
 2 files changed, 25 insertions(+), 6 deletions(-)

--- a/drivers/gpu/drm/i915/i915_active.c
+++ b/drivers/gpu/drm/i915/i915_active.c
@@ -120,13 +120,17 @@ static inline void debug_active_assert(s
 
 #endif
 
+#define I915_ACTIVE_RETIRE_NOLOCK BIT(0)
+
 static void
 __active_retire(struct i915_active *ref)
 {
 	struct active_node *it, *n;
 	struct rb_root root;
 	bool retire = false;
+	unsigned long bits;
 
+	ref = ptr_unpack_bits(ref, &bits, 2);
 	lockdep_assert_held(&ref->mutex);
 
 	/* return the unused nodes to our slabcache -- flushing the allocator */
@@ -138,7 +142,8 @@ __active_retire(struct i915_active *ref)
 		retire = true;
 	}
 
-	mutex_unlock(&ref->mutex);
+	if (!(bits & I915_ACTIVE_RETIRE_NOLOCK))
+		mutex_unlock(&ref->mutex);
 	if (!retire)
 		return;
 
@@ -155,13 +160,18 @@ __active_retire(struct i915_active *ref)
 static void
 active_retire(struct i915_active *ref)
 {
+	struct i915_active *ref_packed = ref;
+	unsigned long bits;
+
+	ref = ptr_unpack_bits(ref, &bits, 2);
 	GEM_BUG_ON(!atomic_read(&ref->count));
 	if (atomic_add_unless(&ref->count, -1, 1))
 		return;
 
 	/* One active may be flushed from inside the acquire of another */
-	mutex_lock_nested(&ref->mutex, SINGLE_DEPTH_NESTING);
-	__active_retire(ref);
+	if (!(bits & I915_ACTIVE_RETIRE_NOLOCK))
+		mutex_lock_nested(&ref->mutex, SINGLE_DEPTH_NESTING);
+	__active_retire(ref_packed);
 }
 
 static void
@@ -170,6 +180,14 @@ node_retire(struct i915_active_request *
 	active_retire(node_from_active(base)->ref);
 }
 
+static void
+node_retire_nolock(struct i915_active_request *base, struct i915_request *rq)
+{
+	struct i915_active *ref = node_from_active(base)->ref;
+
+	active_retire(ptr_pack_bits(ref, I915_ACTIVE_RETIRE_NOLOCK, 2));
+}
+
 static struct i915_active_request *
 active_instance(struct i915_active *ref, struct intel_timeline *tl)
 {
@@ -421,7 +439,8 @@ int i915_active_wait(struct i915_active
 			break;
 		}
 
-		err = i915_active_request_retire(&it->base, BKL(ref));
+		err = i915_active_request_retire(&it->base, BKL(ref),
+						 node_retire_nolock);
 		if (err)
 			break;
 	}
--- a/drivers/gpu/drm/i915/i915_active.h
+++ b/drivers/gpu/drm/i915/i915_active.h
@@ -309,7 +309,7 @@ i915_active_request_isset(const struct i
  */
 static inline int __must_check
 i915_active_request_retire(struct i915_active_request *active,
-			   struct mutex *mutex)
+			   struct mutex *mutex, i915_active_retire_fn retire)
 {
 	struct i915_request *request;
 	long ret;
@@ -327,7 +327,7 @@ i915_active_request_retire(struct i915_a
 	list_del_init(&active->link);
 	RCU_INIT_POINTER(active->request, NULL);
 
-	active->retire(active, request);
+	retire(active, request);
 
 	return 0;
 }


Patches currently in stable-queue which might be from sultan@kerneltoast.com are

queue-5.4/drm-i915-fix-ref-mutex-deadlock-in-i915_active_wait.patch
_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

  parent reply	other threads:[~2020-04-10 11:47 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-07  6:26 [PATCH 0/1] drm/i915: Fix a deadlock that only affects 5.4 Sultan Alsawaf
2020-04-07  6:26 ` [PATCH 1/1] drm/i915: Fix ref->mutex deadlock in i915_active_wait() Sultan Alsawaf
2020-04-14  8:13   ` Chris Wilson
2020-04-14  8:13     ` [Intel-gfx] " Chris Wilson
2020-04-14  8:13     ` Chris Wilson
2020-04-14 14:52     ` Sultan Alsawaf
2020-04-14 14:52       ` [Intel-gfx] " Sultan Alsawaf
2020-04-14 14:52       ` Sultan Alsawaf
2020-04-07  6:52 ` [PATCH 0/1] drm/i915: Fix a deadlock that only affects 5.4 Greg KH
2020-04-07  6:52   ` [Intel-gfx] " Greg KH
2020-04-07  6:52   ` Greg KH
2020-04-07  7:18   ` [PATCH v2] drm/i915: Fix ref->mutex deadlock in i915_active_wait() Sultan Alsawaf
2020-04-07 20:32     ` [PATCH v3] " Sultan Alsawaf
2020-04-11 11:39       ` Patch "drm/i915: Fix ref->mutex deadlock in i915_active_wait()" has been added to the 5.4-stable tree gregkh
2020-04-11 11:39         ` [Intel-gfx] " gregkh
2020-04-11 11:59       ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for Patch "drm/i915: Fix ref->mutex deadlock in i915_active_wait()" has been added to the 5.4-stable tree (rev2) Patchwork
2020-04-10  9:08     ` [PATCH v2] drm/i915: Fix ref->mutex deadlock in i915_active_wait() Greg KH
2020-04-10  9:08       ` [Intel-gfx] " Greg KH
2020-04-10  9:08       ` Greg KH
2020-04-10 14:15       ` Sultan Alsawaf
2020-04-10 14:15         ` [Intel-gfx] " Sultan Alsawaf
2020-04-10 14:15         ` Sultan Alsawaf
2020-04-10 14:17       ` Sultan Alsawaf
2020-04-10 14:17         ` [Intel-gfx] " Sultan Alsawaf
2020-04-10 14:17         ` Sultan Alsawaf
2020-04-11 11:39         ` Greg KH
2020-04-11 11:39           ` [Intel-gfx] " Greg KH
2020-04-11 11:39           ` Greg KH
2020-04-14  8:15           ` Chris Wilson
2020-04-14  8:15             ` [Intel-gfx] " Chris Wilson
2020-04-14  8:15             ` Chris Wilson
2020-04-14  8:23             ` Greg KH
2020-04-14  8:23               ` [Intel-gfx] " Greg KH
2020-04-14  8:23               ` Greg KH
2020-04-20  9:02               ` Joonas Lahtinen
2020-04-20  9:02                 ` [Intel-gfx] " Joonas Lahtinen
2020-04-20  9:02                 ` Joonas Lahtinen
2020-04-20 15:42                 ` Sultan Alsawaf
2020-04-20 15:42                   ` [Intel-gfx] " Sultan Alsawaf
2020-04-20 15:42                   ` Sultan Alsawaf
2020-04-21  8:04                   ` Joonas Lahtinen
2020-04-21  8:04                     ` [Intel-gfx] " Joonas Lahtinen
2020-04-21  8:04                     ` Joonas Lahtinen
2020-04-21 16:38                     ` Sultan Alsawaf
2020-04-21 16:38                       ` [Intel-gfx] " Sultan Alsawaf
2020-04-21 16:38                       ` Sultan Alsawaf
2020-04-21 20:55                       ` Jason A. Donenfeld
2020-04-21 20:55                         ` [Intel-gfx] " Jason A. Donenfeld
2020-04-21 20:55                         ` Jason A. Donenfeld
2020-04-14 14:35             ` Sultan Alsawaf
2020-04-14 14:35               ` [Intel-gfx] " Sultan Alsawaf
2020-04-14 14:35               ` Sultan Alsawaf
2020-04-10 11:46     ` gregkh [this message]
2020-04-10 11:46       ` [Intel-gfx] Patch "drm/i915: Fix ref->mutex deadlock in i915_active_wait()" has been added to the 5.4-stable tree gregkh
2020-04-10 11:56     ` [Intel-gfx] ✗ Fi.CI.BUILD: failure for " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1586519213118220@kroah.com \
    --to=gregkh@linuxfoundation.org \
    --cc=airlied@linux.ie \
    --cc=chris@chris-wilson.co.uk \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=greg@kroah.com \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=jani.nikula@linux.intel.com \
    --cc=joonas.lahtinen@linux.intel.com \
    --cc=rodrigo.vivi@intel.com \
    --cc=stable-commits@vger.kernel.org \
    --cc=sultan@kerneltoast.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.