All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chris Wilson <chris@chris-wilson.co.uk>
To: dri-devel@lists.freedesktop.org
Cc: intel-gfx@lists.freedesktop.org,
	"Bas Nieuwenhuizen" <bas@basnieuwenhuizen.nl>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Chris Wilson" <chris@chris-wilson.co.uk>,
	"Gustavo Padovan" <gustavo@padovan.org>,
	"Christian König" <christian.koenig@amd.com>,
	stable@vger.kernel.org
Subject: [PATCH 1/3] dma-buf/sw_sync: Avoid recursive lock during fence signal.
Date: Tue, 14 Jul 2020 21:06:44 +0100	[thread overview]
Message-ID: <20200714200646.14041-1-chris@chris-wilson.co.uk> (raw)

From: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Calltree:
  timeline_fence_release
  drm_sched_entity_wakeup
  dma_fence_signal_locked
  sync_timeline_signal
  sw_sync_ioctl

Releasing the reference to the fence in the fence signal callback
seems reasonable to me, so this patch avoids the locking issue in
sw_sync.

d3862e44daa7 ("dma-buf/sw-sync: Fix locking around sync_timeline lists")
fixed the recursive locking issue but caused an use-after-free. Later
d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free")
fixed the use-after-free but reintroduced the recursive locking issue.

In this attempt we avoid the use-after-free still because the release
function still always locks, and outside of the locking region in the
signal function we have properly refcounted references.

We furthermore also avoid the recurive lock by making sure that either:

1) We have a properly refcounted reference, preventing the signal from
   triggering the release function inside the locked region.
2) The refcount was already zero, and hence nobody will be able to trigger
   the release function from the signal function.

v2: Move dma_fence_signal() into second loop in preparation to moving
the callback out of the timeline obj->lock.

Fixes: d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free")
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/dma-buf/sw_sync.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 348b3a9170fa..807c82148062 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -192,6 +192,7 @@ static const struct dma_fence_ops timeline_fence_ops = {
 static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
 {
 	struct sync_pt *pt, *next;
+	LIST_HEAD(signal);
 
 	trace_sync_timeline(obj);
 
@@ -203,21 +204,32 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
 		if (!timeline_fence_signaled(&pt->base))
 			break;
 
-		list_del_init(&pt->link);
-		rb_erase(&pt->node, &obj->pt_tree);
-
 		/*
-		 * A signal callback may release the last reference to this
-		 * fence, causing it to be freed. That operation has to be
-		 * last to avoid a use after free inside this loop, and must
-		 * be after we remove the fence from the timeline in order to
-		 * prevent deadlocking on timeline->lock inside
-		 * timeline_fence_release().
+		 * We need to take a reference to avoid a release during
+		 * signalling (which can cause a recursive lock of obj->lock).
+		 * If refcount was already zero, another thread is already
+		 * taking care of destroying the fence.
 		 */
-		dma_fence_signal_locked(&pt->base);
+		if (!dma_fence_get_rcu(&pt->base))
+			continue;
+
+		list_move_tail(&pt->link, &signal);
+		rb_erase(&pt->node, &obj->pt_tree);
 	}
 
 	spin_unlock_irq(&obj->lock);
+
+	list_for_each_entry_safe(pt, next, &signal, link) {
+		/*
+		 * This needs to be cleared before release, otherwise the
+		 * timeline_fence_release function gets confused about also
+		 * removing the fence from the pt_tree.
+		 */
+		list_del_init(&pt->link);
+
+		dma_fence_signal(&pt->base);
+		dma_fence_put(&pt->base);
+	}
 }
 
 /**
-- 
2.20.1


WARNING: multiple messages have this Message-ID (diff)
From: Chris Wilson <chris@chris-wilson.co.uk>
To: dri-devel@lists.freedesktop.org
Cc: "Gustavo Padovan" <gustavo@padovan.org>,
	intel-gfx@lists.freedesktop.org, stable@vger.kernel.org,
	"Chris Wilson" <chris@chris-wilson.co.uk>,
	"Christian König" <christian.koenig@amd.com>
Subject: [PATCH 1/3] dma-buf/sw_sync: Avoid recursive lock during fence signal.
Date: Tue, 14 Jul 2020 21:06:44 +0100	[thread overview]
Message-ID: <20200714200646.14041-1-chris@chris-wilson.co.uk> (raw)

From: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Calltree:
  timeline_fence_release
  drm_sched_entity_wakeup
  dma_fence_signal_locked
  sync_timeline_signal
  sw_sync_ioctl

Releasing the reference to the fence in the fence signal callback
seems reasonable to me, so this patch avoids the locking issue in
sw_sync.

d3862e44daa7 ("dma-buf/sw-sync: Fix locking around sync_timeline lists")
fixed the recursive locking issue but caused an use-after-free. Later
d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free")
fixed the use-after-free but reintroduced the recursive locking issue.

In this attempt we avoid the use-after-free still because the release
function still always locks, and outside of the locking region in the
signal function we have properly refcounted references.

We furthermore also avoid the recurive lock by making sure that either:

1) We have a properly refcounted reference, preventing the signal from
   triggering the release function inside the locked region.
2) The refcount was already zero, and hence nobody will be able to trigger
   the release function from the signal function.

v2: Move dma_fence_signal() into second loop in preparation to moving
the callback out of the timeline obj->lock.

Fixes: d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free")
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/dma-buf/sw_sync.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 348b3a9170fa..807c82148062 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -192,6 +192,7 @@ static const struct dma_fence_ops timeline_fence_ops = {
 static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
 {
 	struct sync_pt *pt, *next;
+	LIST_HEAD(signal);
 
 	trace_sync_timeline(obj);
 
@@ -203,21 +204,32 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
 		if (!timeline_fence_signaled(&pt->base))
 			break;
 
-		list_del_init(&pt->link);
-		rb_erase(&pt->node, &obj->pt_tree);
-
 		/*
-		 * A signal callback may release the last reference to this
-		 * fence, causing it to be freed. That operation has to be
-		 * last to avoid a use after free inside this loop, and must
-		 * be after we remove the fence from the timeline in order to
-		 * prevent deadlocking on timeline->lock inside
-		 * timeline_fence_release().
+		 * We need to take a reference to avoid a release during
+		 * signalling (which can cause a recursive lock of obj->lock).
+		 * If refcount was already zero, another thread is already
+		 * taking care of destroying the fence.
 		 */
-		dma_fence_signal_locked(&pt->base);
+		if (!dma_fence_get_rcu(&pt->base))
+			continue;
+
+		list_move_tail(&pt->link, &signal);
+		rb_erase(&pt->node, &obj->pt_tree);
 	}
 
 	spin_unlock_irq(&obj->lock);
+
+	list_for_each_entry_safe(pt, next, &signal, link) {
+		/*
+		 * This needs to be cleared before release, otherwise the
+		 * timeline_fence_release function gets confused about also
+		 * removing the fence from the pt_tree.
+		 */
+		list_del_init(&pt->link);
+
+		dma_fence_signal(&pt->base);
+		dma_fence_put(&pt->base);
+	}
 }
 
 /**
-- 
2.20.1

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

WARNING: multiple messages have this Message-ID (diff)
From: Chris Wilson <chris@chris-wilson.co.uk>
To: dri-devel@lists.freedesktop.org
Cc: "Gustavo Padovan" <gustavo@padovan.org>,
	intel-gfx@lists.freedesktop.org, stable@vger.kernel.org,
	"Chris Wilson" <chris@chris-wilson.co.uk>,
	"Bas Nieuwenhuizen" <bas@basnieuwenhuizen.nl>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>
Subject: [Intel-gfx] [PATCH 1/3] dma-buf/sw_sync: Avoid recursive lock during fence signal.
Date: Tue, 14 Jul 2020 21:06:44 +0100	[thread overview]
Message-ID: <20200714200646.14041-1-chris@chris-wilson.co.uk> (raw)

From: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

Calltree:
  timeline_fence_release
  drm_sched_entity_wakeup
  dma_fence_signal_locked
  sync_timeline_signal
  sw_sync_ioctl

Releasing the reference to the fence in the fence signal callback
seems reasonable to me, so this patch avoids the locking issue in
sw_sync.

d3862e44daa7 ("dma-buf/sw-sync: Fix locking around sync_timeline lists")
fixed the recursive locking issue but caused an use-after-free. Later
d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free")
fixed the use-after-free but reintroduced the recursive locking issue.

In this attempt we avoid the use-after-free still because the release
function still always locks, and outside of the locking region in the
signal function we have properly refcounted references.

We furthermore also avoid the recurive lock by making sure that either:

1) We have a properly refcounted reference, preventing the signal from
   triggering the release function inside the locked region.
2) The refcount was already zero, and hence nobody will be able to trigger
   the release function from the signal function.

v2: Move dma_fence_signal() into second loop in preparation to moving
the callback out of the timeline obj->lock.

Fixes: d3c6dd1fb30d ("dma-buf/sw_sync: Synchronize signal vs syncpt free")
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Christian König <christian.koenig@amd.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 drivers/dma-buf/sw_sync.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 348b3a9170fa..807c82148062 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -192,6 +192,7 @@ static const struct dma_fence_ops timeline_fence_ops = {
 static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
 {
 	struct sync_pt *pt, *next;
+	LIST_HEAD(signal);
 
 	trace_sync_timeline(obj);
 
@@ -203,21 +204,32 @@ static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
 		if (!timeline_fence_signaled(&pt->base))
 			break;
 
-		list_del_init(&pt->link);
-		rb_erase(&pt->node, &obj->pt_tree);
-
 		/*
-		 * A signal callback may release the last reference to this
-		 * fence, causing it to be freed. That operation has to be
-		 * last to avoid a use after free inside this loop, and must
-		 * be after we remove the fence from the timeline in order to
-		 * prevent deadlocking on timeline->lock inside
-		 * timeline_fence_release().
+		 * We need to take a reference to avoid a release during
+		 * signalling (which can cause a recursive lock of obj->lock).
+		 * If refcount was already zero, another thread is already
+		 * taking care of destroying the fence.
 		 */
-		dma_fence_signal_locked(&pt->base);
+		if (!dma_fence_get_rcu(&pt->base))
+			continue;
+
+		list_move_tail(&pt->link, &signal);
+		rb_erase(&pt->node, &obj->pt_tree);
 	}
 
 	spin_unlock_irq(&obj->lock);
+
+	list_for_each_entry_safe(pt, next, &signal, link) {
+		/*
+		 * This needs to be cleared before release, otherwise the
+		 * timeline_fence_release function gets confused about also
+		 * removing the fence from the pt_tree.
+		 */
+		list_del_init(&pt->link);
+
+		dma_fence_signal(&pt->base);
+		dma_fence_put(&pt->base);
+	}
 }
 
 /**
-- 
2.20.1

_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

             reply	other threads:[~2020-07-14 20:06 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-14 20:06 Chris Wilson [this message]
2020-07-14 20:06 ` [Intel-gfx] [PATCH 1/3] dma-buf/sw_sync: Avoid recursive lock during fence signal Chris Wilson
2020-07-14 20:06 ` Chris Wilson
2020-07-14 20:06 ` [PATCH 2/3] dma-buf/sw_sync: Separate signal/timeline locks Chris Wilson
2020-07-14 20:06   ` [Intel-gfx] " Chris Wilson
2020-07-14 20:14   ` [PATCH v2] " Chris Wilson
2020-07-14 20:14     ` [Intel-gfx] " Chris Wilson
2020-07-14 20:06 ` [PATCH 3/3] dma-buf/selftests: Add locking selftests for sw_sync Chris Wilson
2020-07-14 20:06   ` [Intel-gfx] " Chris Wilson
2020-07-14 20:29   ` Bas Nieuwenhuizen
2020-07-14 20:29     ` [Intel-gfx] " Bas Nieuwenhuizen
2020-07-14 20:14 ` [PATCH 1/3] dma-buf/sw_sync: Avoid recursive lock during fence signal Bas Nieuwenhuizen
2020-07-14 20:14   ` [Intel-gfx] " Bas Nieuwenhuizen
2020-07-14 20:14   ` Bas Nieuwenhuizen
2020-07-14 20:27 ` [Intel-gfx] ✗ Fi.CI.CHECKPATCH: warning for series starting with [1/3] dma-buf/sw_sync: Avoid recursive lock during fence signal. (rev2) Patchwork
2020-07-14 20:49 ` [Intel-gfx] ✗ Fi.CI.BAT: failure " Patchwork
2020-07-15  8:57 ` [PATCH 1/3] dma-buf/sw_sync: Avoid recursive lock during fence signal Christian König
2020-07-15  8:57   ` [Intel-gfx] " Christian König
2020-07-15  8:57   ` Christian König

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200714200646.14041-1-chris@chris-wilson.co.uk \
    --to=chris@chris-wilson.co.uk \
    --cc=bas@basnieuwenhuizen.nl \
    --cc=christian.koenig@amd.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=gustavo@padovan.org \
    --cc=intel-gfx@lists.freedesktop.org \
    --cc=stable@vger.kernel.org \
    --cc=sumit.semwal@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.