From: Jordan Crouse <jcrouse@codeaurora.org>
To: Chris Wilson <chris@chris-wilson.co.uk>
Cc: linux-arm-msm@vger.kernel.org,
	"Gustavo Padovan" <gustavo@padovan.org>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linaro-mm-sig@lists.linaro.org,
	"Christian König" <christian.koenig@amd.com>,
	linux-media@vger.kernel.org
Subject: Re: [RFC PATCH v1] dma-fence-array: Deal with sub-fences that are signaled late
Date: Mon, 17 Aug 2020 10:24:49 -0600
Message-ID: <20200817162449.GC3221@jcrouse1-lnx.qualcomm.com>
In-Reply-To: <159730136458.14054.18114194663048046416@build.alporthouse.com>

On Thu, Aug 13, 2020 at 07:49:24AM +0100, Chris Wilson wrote:
> Quoting Jordan Crouse (2020-08-13 00:55:44)
> > This is an RFC because I'm still trying to grok the correct behavior.
> > 
> > Consider a dma_fence_array created with two fences and signal_on_any set to true.
> > A reference to dma_fence_array is taken for each waiting fence.
> > 
> > When the client calls dma_fence_wait() only one of the fences is signaled.
> > The client returns successfully from the wait and puts its reference to
> > the array fence but the array fence still remains because of the remaining
> > un-signaled fence.
> > 
> > Now consider that the unsignaled fence is signaled while the timeline is being
> > destroyed much later. The timeline destroy calls dma_fence_signal_locked(). The
> > following sequence occurs:
> > 
> > 1) dma_fence_array_cb_func is called
> > 
> > 2) array->num_pending is 0 (because it was set to 1 due to signal_on_any) so the
> > callback function calls dma_fence_put() instead of triggering the irq work
> > 
> > 3) The array fence is released which in turn puts the lingering fence which is
> > then released
> > 
> > 4) deadlock with the timeline
> 
> It's the same recursive lock as we previously resolved in sw_sync.c by
> removing the locking from timeline_fence_release().

Ah, yep. I'm working on a not-quite-ready-for-primetime version of a Vulkan
timeline implementation for drm/msm and I was doing something similar to how
sw_sync used to work in the release function. Getting rid of the recursive lock
in the timeline seems a better solution than this. Thanks for taking the time
to respond.
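
For anyone following along, here is a minimal userspace sketch of the
recursion described above. It is not kernel code and not the actual sw_sync
or drm/msm implementation: struct toy_lock, struct timeline and every function
name below are hypothetical stand-ins, and the "lock" is a single-threaded
model that reports a re-acquire with -EDEADLK instead of hanging. It only
illustrates why a release callback that takes the timeline lock cannot be
reached from a signal path that already holds it, and why dropping the lock
from the release path avoids the problem.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical non-recursive lock: a second acquire is reported, not blocked. */
struct toy_lock {
	bool held;
};

static int toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		return -EDEADLK;	/* a real spinlock would spin forever here */
	l->held = true;
	return 0;
}

static void toy_lock_release(struct toy_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Hypothetical timeline that tracks how many fences are still alive. */
struct timeline {
	struct toy_lock lock;
	int live_fences;
};

static void timeline_init(struct timeline *tl, int fences)
{
	tl->lock.held = false;
	tl->live_fences = fences;
}

/* Release path that takes the timeline lock: the problematic pattern. */
static int fence_release_locked(struct timeline *tl)
{
	int ret = toy_lock_acquire(&tl->lock);

	if (ret)
		return ret;
	tl->live_fences--;
	toy_lock_release(&tl->lock);
	return 0;
}

/* Release path without locking; in this sketch the caller holds the lock. */
static int fence_release_unlocked(struct timeline *tl)
{
	tl->live_fences--;
	return 0;
}

/*
 * Models timeline destroy: the fence is signaled with the lock held, the
 * signal drops the last reference, and the release callback runs in the
 * same context.
 */
static int timeline_destroy(struct timeline *tl,
			    int (*release)(struct timeline *))
{
	int ret = toy_lock_acquire(&tl->lock);

	if (ret)
		return ret;
	ret = release(tl);
	toy_lock_release(&tl->lock);
	return ret;
}
```

Calling timeline_destroy() with fence_release_locked reports -EDEADLK and
leaves the fence count untouched, while fence_release_unlocked completes
and drops the count.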

Jordan

> -Chris

-- 
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
